rlittl01 has quit [Read error: Connection reset by peer]
PyroPeter has quit [Ping timeout: 252 seconds]
PyroPeter has joined #riscv
theruran_ has quit [Quit: Connection closed for inactivity]
frost has quit [Ping timeout: 252 seconds]
jacklsw has quit [Quit: Back to the real world]
<Bluefoxicy>
I have now made the stupid divider even smaller.
<Bluefoxicy>
not sure why that was possible, but I came across karnaugh maps and boolean logic on Wikipedia, drew up truth tables because I don't know about all that, and reorganized the boolean logic to be simpler and have less logic depth.
<muurkha>
cool
<muurkha>
you know VHDL and Verilog synthesis tools can do that for you in a lot of cases, right?
<Bluefoxicy>
Yeah, I'm just trying to figure out what it looks like in practice
riff-IRC has quit [Read error: Connection reset by peer]
<muurkha>
congrats :)
<muurkha>
are you targeting FPGA LUTs, or all-NAND gates, or what?
<Bluefoxicy>
the problem here is SRT division uses an expensive look-up table to figure out whether to use {-2,-1,0,1,2} for a certain term that comes down to shifting, negating an addition, and substituting zero
<Bluefoxicy>
FPGA for my project
jacklsw has joined #riscv
<muurkha>
oh, I should learn about SRT division!
<Bluefoxicy>
I fit the SRT q generator into ONE CLB. Vivado somehow figured out how to do it in 3 LUTs, I thought I needed 4
riff-IRC has joined #riscv
<Bluefoxicy>
thing is I'm not using a table. I'm using an extremely fast and small bit of logic that always generates a valid q value
<Bluefoxicy>
I've read research up to 2018
<Bluefoxicy>
There's some stuff on arxiv about making a broken guess and using fuzzy logic while doing *both* possible calculations in parallel to reduce the size of the look-up table
<muurkha>
oh awesome
<Bluefoxicy>
my q generator is smaller than the addressing circuit for the look-up table >:|
<muurkha>
haha
<Bluefoxicy>
and I don't know why
<muurkha>
can you dump the bitstream Vivado came up with? I think there are xilinx bitstream reverse engineering tools floating around now
<Bluefoxicy>
nah I didn't produce a bitstream, I only went synth and implementation to see what it would do with it
<Bluefoxicy>
it's a completely different thing on fpga versus asic
<Bluefoxicy>
with FPGA I have the advantage that I can make certain decisions with just five bits of input, so each of three functions here is a 2-output LUT6
<Bluefoxicy>
with ASIC I don't know anything about ASIC :D
<muurkha>
yeah, I've read stuff but I don't really know anything
<muurkha>
(thus my nickname)
<Bluefoxicy>
but the boolean logic is simple and direct, and there are a few points where the circuit is going to have a stable signal before it gets SEL (so you want something that can switch fast) and others where you'll have SEL long before a stable signal (so you want to be able to pass a signal that's changing quickly, but you can switch the mux more slowly)
<muurkha>
the kind of situation where you might use pass transistors if you were doing an ASIC
<Bluefoxicy>
Intel, stupidly, put zeroes in the unused portions of their SRT look-up table in the Pentium
<Bluefoxicy>
the problem being they ALSO accidentally put zeroes in five USED cells which were SUPPOSED to be 2
<Bluefoxicy>
but that entire area is above the line where the value should be 2
<muurkha>
silly intel! no cookie!
<Bluefoxicy>
so filling those cells with 2 instead of 0 would have been perfectly fine and avoided this
<muurkha>
nowadays they probably have bugs like that every year but patch them with microcode
<muurkha>
there are a lot of switching devices that can do a crossbar switch kind of thing that can pass data a lot faster than it can switch
<Bluefoxicy>
The < signs on that page should be ≤ btw
<Bluefoxicy>
I asked the guy who wrote it. He said that's a mistake and he'll fix it eventually.
<muurkha>
the OG there is the electromechanical telephone switching network, but there are lots of others
<Bluefoxicy>
OG?
<muurkha>
original gangster
<Bluefoxicy>
ah
<muurkha>
(a voice signal might vary at 3 kHz, but the stepping relay can only change who it goes to at 0.1 Hz or less)
Sofia_ has joined #riscv
<Bluefoxicy>
you see the big, colorful graphic on that page?
<muurkha>
yeah
<Bluefoxicy>
there are regions where the value of any cell can be 1 or 2 (or -1 or -2) and it'll still work
<muurkha>
yeah
<Bluefoxicy>
so some people did things like make the table symmetrical so you can fold it, only need half the table
<muurkha>
right
<muurkha>
thus 1066
Sofia has quit [Ping timeout: 276 seconds]
<muurkha>
there are a lot of optical systems that have that property too, where the optical signal is varying at THz but the switching can only happen at MHz speeds
<muurkha>
and pass transistors have that property too
<muurkha>
or they did! nowadays I think nobody uses them, maybe that's wrong
Sofia_ is now known as Sofia
<Bluefoxicy>
I wonked about with it and got it so like if R is abcd.ef (I don't need the last digit), with [bcd.ef] being the original bits XOR'd with [a], then if I have (e AND NOT b) AND (c XOR d), I use 2 bits of R plus 3 of D to decide whether to shift (Q= 2 or -2)
<Bluefoxicy>
and otherwise, I know whether to shift based on a couple bits of R
<Bluefoxicy>
this is not exactly accurate
<Bluefoxicy>
but because the iteration is R[j+1] = 4×(R[j] - q*D), if q is zero, then it's 4×(R[j]), so I have a mux at the output of the adder
<Bluefoxicy>
it picks either the adder's output or R to shift left 2 bits.
<Bluefoxicy>
which means as long as I'm only incorrect about shifting (q being 2 or -2) when q is zero, it doesn't matter
<Bluefoxicy>
which, like you said, pass transistor logic there: I can figure out q is zero WAY before I can figure out if I should shift
<Bluefoxicy>
whereas R is going to be on the mux going into the adder immediately, and waiting for the shift signal to finish a few levels of logic, so I want that to be able to switch between stable signals quickly
<Bluefoxicy>
the result of all this is I get this handful of gates that just does the right thing
<Bluefoxicy>
it's annoying tbh.
<Bluefoxicy>
Division is hard.
<Bluefoxicy>
Addition, subtraction, and multiplication are just simple, fixed trees of adders. Easy. Nothing all that difficult. Might be a little delay, may need to register stages of multiplication to meet timing.
<Bluefoxicy>
Division is an iterative process with a bunch of decisions to make in each iteration >:|
<Bluefoxicy>
or in technical mathematical computer science terms, "hard"
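[annotation] A minimal sketch of the iteration quoted above, R[j+1] = 4×(R[j] - q×D), with q drawn from {-2,-1,0,1,2}. This is not Bluefoxicy's gate-level q generator. The digit selection here uses exact comparisons on full-precision fractions, which is an assumption made only to keep the model short and checkable. The names select_digit and srt_divide are hypothetical. The q = 0 case forwards R unchanged, mirroring the mux at the adder output described in the log.

```python
from fractions import Fraction


def select_digit(r, d):
    """Pick q in {-2,-1,0,1,2} so that |r - q*d| <= (2/3)*d.

    Assumes d > 0 and |r| <= (8/3)*d. Exact comparisons stand in for
    the truncated-bit logic or look-up table a real divider would use.
    """
    if 2 * r >= 3 * d:
        return 2
    if 2 * r >= d:
        return 1
    if 2 * r > -d:
        return 0
    if 2 * r > -3 * d:
        return -1
    return -2


def srt_divide(x, d, steps):
    """Run `steps` iterations of R[j+1] = 4*(R[j] - q*D).

    Assumes d > 0 and |x| <= (8/3)*d.
    Returns (quotient, digits, final_remainder), where
    x/d == quotient + final_remainder / (4**steps * d).
    """
    r = Fraction(x)
    d = Fraction(d)
    quotient = Fraction(0)
    weight = Fraction(1)
    digits = []
    for _ in range(steps):
        q = select_digit(r, d)
        # q == 0 bypasses the subtraction, like a mux that forwards R
        # instead of the adder output.
        residue = r if q == 0 else r - q * d
        r = 4 * residue  # shift left by two bits
        quotient += q * weight
        weight /= 4
        digits.append(q)
    return quotient, digits, r


if __name__ == "__main__":
    q, digits, r = srt_divide(Fraction(7, 10), Fraction(3, 4), 12)
    print(digits, float(q))
```

Because every digit leaves a residue of at most (2/3)×D, the shifted remainder stays within (8/3)×D, so after n steps the quotient is off by at most (8/3)/4^n.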
<Bluefoxicy>
I spent months trying to find a way to not have a division in my FM synthesizer at all ever. :\
<jrtc27>
hasn't been updated to say they were ratified, but that's the set that got sent off to the board for ratification
Ivii has quit [Remote host closed the connection]
smartin has quit [Ping timeout: 256 seconds]
mahmutov has quit [Ping timeout: 252 seconds]
<Bluefoxicy>
@muurkha, there's a point where I need to compute a band-limited waveform because my operators can generate trivial waveforms other than sine, and correcting the scaling requires some computation like K((pi*K^(2/3)/cubert(3) ÷ (4K×sin(f0×pi/fs)))^3)
<Bluefoxicy>
and I can't directly generate 1/sin
<Bluefoxicy>
I also need to compute 1/(KK')Ksin(w0)×K'sinh((Kln(2)×B×w0)÷(2Ksin(w0))) for band-pass/band-cut filters
<Bluefoxicy>
the constants can be pre-computed and hard-wired
<Bluefoxicy>
but everywhere I need cosecant can only possibly calculate sine and reciprocate the result, unless someone here has a magic CORDIC function that spits out cosecant natively
<dh`>
i'm sure you can derive a series for cosecant
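[annotation] A minimal sketch of what such a series could look like, using the standard Laurent expansion csc(x) = 1/x + x/6 + 7x^3/360 + 31x^5/15120 + 127x^7/604800 + ..., valid for 0 < |x| < pi. The function name csc_series and the choice to stop after five terms are assumptions for illustration. Note that the leading 1/x term is itself a reciprocal, so the series does not remove the division on its own.

```python
import math

# Coefficients of x**1, x**3, x**5, x**7 in the expansion of csc(x) - 1/x.
CSC_COEFFS = (1 / 6, 7 / 360, 31 / 15120, 127 / 604800)


def csc_series(x):
    """Approximate csc(x) with a truncated Laurent series.

    Assumes 0 < |x| < pi; accuracy degrades as |x| approaches pi.
    """
    total = 1.0 / x  # leading term still needs a reciprocal
    power = x
    x_squared = x * x
    for coeff in CSC_COEFFS:
        total += coeff * power
        power *= x_squared
    return total


if __name__ == "__main__":
    for x in (0.1, 0.3, 1.0):
        print(x, csc_series(x), 1.0 / math.sin(x))
```

The truncation error grows with x, so a wider input range (for example f0 close to fs/2) would need more terms or a different approximation.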