ChanServ changed the topic of #prjunnamed to: FPGA toolchain project · rule #0 of prjunnamed: no one should ever burn out building software · https://github.com/prjunnamed/prjunnamed · logs: https://libera.irclog.whitequark.org/prjunnamed
gatecat[m] has quit [Quit: Idle timeout reached: 172800s]
widlarizerEmilJT has quit [Quit: Idle timeout reached: 172800s]
povikMartinPovie has joined #prjunnamed
<povikMartinPovie> hey
<povikMartinPovie> so here’s the rebel base
<whitequark[cis]> :p
<Wanda[cis]> hi
<Wanda[cis]> join us, we have cookies
<Wanda[cis]> hm
<Wanda[cis]> ... or at least chocolate. chocolate will do.
<whitequark[cis]> also cursed macros
<Wanda[cis]> chomps chocolate
<Wanda[cis]> hm. amazing just how much the prjunnamed IR ends up looking like Amaranth NIR
<Wanda[cis]> I was thinking we'd avoid having to use late nets like NIR does and here I am figuring out how to add them
<Wanda[cis]> ah well. turns out NIR was an unnamed dress rehearsal in disguise all along
<whitequark[cis]> no i'm pretty sure that was an explicit goal of doing it like this
<whitequark[cis]> to sneakily validate it in amaranth so that we have a known-good design when starting unnamed
<whitequark[cis]> and to have an easy lowering from amaranth to unnamed
<Wanda[cis]> possibly, I have no memory of this
<Wanda[cis]> but sounds like something we'd do
<whitequark[cis]> i remember feeling very clever about it.
<whitequark[cis]> therefore it must be true
<povikMartinPovie> do you know what a mapped network will look like inside unnamed?
<Wanda[cis]> we have TargetCell for that
<povikMartinPovie> if I were to play with LUT resynthesis, is the groundwork already laid to represent the before and after?
<Wanda[cis]> basically like an instance, except strongly typed by a prototype
<Wanda[cis]> and with more efficient memory representation
<Wanda[cis]> as well as our first techmapping pass
<povikMartinPovie> and you map already, or is there a way to import a mapped network from, say, yosys?
<Wanda[cis]> (FF mapping will follow in a few hours, though I may have to redesign the IR to add late nets first)
<Wanda[cis]> well we have import/export in yosys JSON format
<Wanda[cis]> and we have verified that it can roundtrip a mapped netlist (a glasgow bitstream) through the target cell representation
<Wanda[cis]> so far we map IOBs; I'm currently busy writing the FF mapping pass and beefing up the IR a bit while I'm at it; once we're done with this and a few flop optimizations that Cat is doing, we intend to write a simple cutless LUT + CARRY mapper and see if this thing can handle actually synthesizing a glasgow applet
<povikMartinPovie> intriguing
<whitequark[cis]> we'd need memories for a glasgow applet
<Wanda[cis]> <povikMartinPovie> "if I were to play with LUT..." <- btw this project is in such intense flux that I'd not recommend playing with writing passes without really close coordination
<Wanda[cis]> oh right. memories.
<whitequark[cis]> although i could lower FIFO sizes and then it should be possible to bitblast them
<Wanda[cis]> was never good at that
<povikMartinPovie> haha
<whitequark[cis]> LUT resynthesis sounds interesting but we don't even have a LUT cell yet
<whitequark[cis]> (we should)
<povikMartinPovie> aren't you the author of memory_libmap, Wanda?
<whitequark[cis]> i think that was the joke
<Wanda[cis]> I am. that was a joke about dissociative amnesia.
<povikMartinPovie> well anyway the energy barrier of rust might be too high, so we will see if I start anything at all
<Wanda[cis]> oh yeah, we're also having a bit of a fun time fighting with rust bullshit
<Wanda[cis]> it's better than C++ bullshit, but it's by no means free of bullshit
<povikMartinPovie> but your project may be the ideal place to contribute the two resynth algorithms I know
<whitequark[cis]> which ones?
<povikMartinPovie> mfs2 and lutdc in abc lingo I think?
<whitequark[cis]> none of these words are in the bible
<povikMartinPovie> both are pretty neat and they complement each other in my experience
<povikMartinPovie> whitequark[cis]: I know…
<povikMartinPovie> I can find the papers if you are interested
<whitequark[cis]> i am
<povikMartinPovie> ok, I will get to it later when not on the phone
<whitequark[cis]> i've been looking at alanmi papers this night actually
<povikMartinPovie> this one works on a post-mapped cut which has a suspiciously high lut count vs the number of inputs to the cut, collapses the cut into a single lut table, then tries to generate a new lut network for the cut: https://people.eecs.berkeley.edu/~alanmi/publications/2023/iwls23_lut.pdf
<povikMartinPovie> that's the main one on my mind
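The "collapse the cut into a single lut table" step described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the `Lut` struct, the signal numbering, and the `collapse` function are hypothetical and are not taken from prjunnamed, toymap, or the linked paper. It simulates a small, topologically ordered LUT network on every assignment of the cut inputs and records the output of the last LUT.

```rust
/// One LUT in a small mapped network (hypothetical representation).
/// Signals are numbered so that 0..num_inputs are the cut inputs and
/// num_inputs + i is the output of the i-th LUT.
struct Lut {
    /// Signal numbers feeding this LUT; inputs[0] is the least significant index bit.
    inputs: Vec<usize>,
    /// Bit k of `table` is the LUT output when its inputs, read as a number, equal k.
    table: u64,
}

/// Collapse a topologically ordered LUT network into one truth table:
/// bit k of the result is the last LUT's output under cut input assignment k.
fn collapse(num_inputs: usize, luts: &[Lut]) -> u64 {
    assert!(num_inputs <= 6, "the collapsed table must fit in a u64");
    assert!(!luts.is_empty());
    let mut result = 0u64;
    for assignment in 0..(1u64 << num_inputs) {
        // Start with the values of the cut inputs for this assignment.
        let mut signals: Vec<bool> =
            (0..num_inputs).map(|i| (assignment >> i) & 1 == 1).collect();
        // Evaluate each LUT in order and append its output to the signal list.
        for lut in luts {
            let mut index: u32 = 0;
            for (pos, &src) in lut.inputs.iter().enumerate() {
                if signals[src] {
                    index |= 1u32 << pos;
                }
            }
            signals.push((lut.table >> index) & 1 == 1);
        }
        if signals[signals.len() - 1] {
            result |= 1u64 << assignment;
        }
    }
    result
}
```

A resynthesis pass would then try to decompose the collapsed table into fewer LUTs than the original cut used; that part is the substance of the paper and is not attempted here.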
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark pushed 1 commit to main [+0/-0/±4] https://github.com/prjunnamed/prjunnamed/compare/e7aef7d63fa4...d2d1cafdb86e
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark d2d1caf - Implement folding of inverters into FF control inputs.
<whitequark[cis]> oh, neat
<povikMartinPovie> I wrote an implementation of that here https://github.com/povik/toymap/blob/master/post.cc but I also have a cleaner rewrite on my disk
<povikMartinPovie> for the second algorithm the reference paper might be this one: https://people.eecs.berkeley.edu/~alanmi/publications/2010/todaes10_exdc.pdf
<povikMartinPovie> but it might be better just to see the code for it, one second
<povikMartinPovie> this one requires a performant SAT solver; for each LUT in the network, you propose some changes to its inputs (e.g. dropping an input or substituting the input with something else if that saves you a LUT), and then you repeatedly query the SAT solver to try to recover the look-up table which would be required for the new set of inputs to preserve the function of the original network
<povikMartinPovie> usually you find out it doesn't exist, there's no way to reimplement the original network having rewired the inputs, but when you find it exists, you save a LUT
<povikMartinPovie> basically: you use the SAT solver to find out if there are nodes in the mapped network you can remove, as long as you adjust the LUTs which have been using this node
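The core question in that description, "does a look-up table exist for the rewired inputs", can be illustrated without a SAT solver on a tiny window. The sketch below swaps in exhaustive enumeration over at most 6 window inputs, which is an assumption made for brevity and not how the algorithm is described above; the function name `recover_lut` and the truth-table encoding are hypothetical. A table exists exactly when no two input assignments agree on every candidate signal yet disagree on the target.

```rust
/// Try to recover a look-up table computing `target` from `candidates`.
/// Every truth table is over the same `num_inputs` window inputs:
/// bit k is the signal's value under input assignment k.
/// Returns None if no such table exists.
fn recover_lut(num_inputs: usize, target: u64, candidates: &[u64]) -> Option<u64> {
    assert!(num_inputs <= 6 && candidates.len() <= 6);
    // entries[p] is Some(v) once candidate pattern p is known to require output v.
    let mut entries: Vec<Option<bool>> = vec![None; 1usize << candidates.len()];
    for assignment in 0..(1u64 << num_inputs) {
        // Gather the candidate values under this assignment into a LUT index.
        let mut pattern = 0usize;
        for (pos, &cand) in candidates.iter().enumerate() {
            if (cand >> assignment) & 1 == 1 {
                pattern |= 1usize << pos;
            }
        }
        let want = (target >> assignment) & 1 == 1;
        if let Some(have) = entries[pattern] {
            if have != want {
                // Same candidate values, different required output: impossible.
                return None;
            }
        }
        entries[pattern] = Some(want);
    }
    // Patterns that never occur are don't-cares; they are filled with 0 here.
    let mut table = 0u64;
    for (k, entry) in entries.iter().enumerate() {
        if *entry == Some(true) {
            table |= 1u64 << k;
        }
    }
    Some(table)
}
```

With window inputs a, b, c encoded as 0xAA, 0xCC, 0xF0, the target (a AND b) XOR c is 0x78; it can be rebuilt from the candidates [a AND b, c] but not from [a, c]. A real implementation would pose the same question to a SAT solver so that it scales beyond windows that can be enumerated.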
<whitequark[cis]> interesting
<whitequark[cis]> do you apply this on every LUT, or only on the paths with negative slack?
<povikMartinPovie> both are area-saving algorithms, primarily
<povikMartinPovie> the first one can be adapted to improving timing, too
<povikMartinPovie> the adaptation is the subject of a follow-up paper: https://people.eecs.berkeley.edu/~alanmi/publications/2024/iwls24_acd.pdf
<povikMartinPovie> I haven't read it so I don't know if it contains anything interesting, or if what it does can be extrapolated from the original
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark pushed 2 commits to main [+0/-0/±3] https://github.com/prjunnamed/prjunnamed/compare/d2d1cafdb86e...7eb36fb9be37
<_whitenotifier-4> [prjunnamed/prjunnamed] wanda-phi b1f6c09 - Add void pseudo-cells.
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark 7eb36fb - Simplify control nets of `iob` and target cells.
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark pushed 1 commit to main [+0/-0/±2] https://github.com/prjunnamed/prjunnamed/compare/7eb36fb9be37...556c86dd348e
<_whitenotifier-4> [prjunnamed/prjunnamed] wanda-phi 556c86d - Use void cells for Yosys JSON importing.
<povikMartinPovie> <whitequark[cis]> "LUT resynthesis sounds interesti..." <- fwiw it makes more sense to me for the resynthesis algorithms to operate on the target cell instead of an intermediate cell like $lut
<povikMartinPovie> the post-mapping transformation needs to understand the delays and areas of the target cells anyway for good results
<Wanda[cis]> perhaps
<povikMartinPovie> the way that's handled with Yosys, where you operate on the intermediate cell, but you need to look up the definitions for the target cell, is awkward to me
<Wanda[cis]> but there's the idea I've been toying with since my first unnamed IR design 2 or 3 years ago
<Wanda[cis]> of just ... using a generic $lut cell all the way through the flow
<Wanda[cis]> why bother defining SB_LUT4 as a target cell at all when there's literally nothing to it other than being a 4-input LUT
<Wanda[cis]> it's not any different from an ECP5 LUT4 cell either
<povikMartinPovie> you define the propagation delays on inputs, and the area
<povikMartinPovie> the area doesn't matter if LUT4 is all you got, if you have LUT5 to... you get the idea
<Wanda[cis]> yeah but they are queried from the target anyway
<povikMartinPovie> s/to/too/
<Wanda[cis]> it's about removing redundant representations
<Wanda[cis]> oh btw, the way that unnamed IR works is that you select a target when you first create the netlist
<Wanda[cis]> the target is a trait object that, among other things, provides the set of valid target cells and hooks for a bunch of passes
<Wanda[cis]> it would be perfectly possible to have a completely generic CellRepr::Lut cell kind, and just have a target hook that gives you timings
<Wanda[cis]> the main problem with that idea is that there are also weird target-dependent LUT-like things that you kiiinda want in your LUT optimizations as well as plain LUTs
<Wanda[cis]> Lattice has CCU2*; SB has SB_CARRY + SB_LUT4 which has constraints that a plain SB_LUT4 doesn't; then there's the Xilinx and Altera multi-output LUTs
<Wanda[cis]> so perhaps ultimately you don't gain that much from having a shared representation just for the easy case? idk.
<povikMartinPovie> you could also make the case for a generic Lut cell to use in the input, with no fixed relationship to target lut cells
<povikMartinPovie> though you might have a different way to represent that already, and I hear you are not shopping around for more cells unless really necessary
<Wanda[cis]> I'm not sure why we'd want lut cells on input
<Wanda[cis]> you can represent a LUT already; it is a mux tree
<povikMartinPovie> yeah, that's what I mean
<Wanda[cis]> if anything, I'm wondering whether we should have a binary mux cell that has more than 1-bit select input
<Wanda[cis]> $bmux in yosys terms I think?
<povikMartinPovie> yes
<Wanda[cis]> a LUT is trivially represented as an instance of this cell with const data input
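To make the "LUT is a bmux with const data input" remark concrete, here is a minimal Rust sketch. The functions `bmux` and `lut` are hypothetical illustrations and do not reflect the prjunnamed IR; the word layout (word number `select` taken out of a concatenated data vector) is an assumption modelled on how Yosys's `$bmux` is commonly described.

```rust
/// Binary mux with a multi-bit select: `data` is 2^select.len() words of
/// `width` bits each, concatenated with word 0 first. Returns the word
/// whose number is given by `select` (select[0] is the least significant bit).
fn bmux(data: &[bool], width: usize, select: &[bool]) -> Vec<bool> {
    assert_eq!(data.len(), width << select.len());
    let index = select
        .iter()
        .enumerate()
        .fold(0usize, |acc, (i, &s)| acc | ((s as usize) << i));
    data[index * width..(index + 1) * width].to_vec()
}

/// A k-input LUT is a 1-bit-wide bmux whose data input is the constant table
/// and whose select input is the LUT's inputs.
fn lut(table: &[bool], inputs: &[bool]) -> bool {
    bmux(table, 1, inputs)[0]
}
```

For example, `lut(&[false, true, true, false], &[a, b])` computes a XOR b, and the same `bmux` with a wider `width` covers the array-indexing case.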
<Wanda[cis]> ohh right, I added the $bmux cell to yosys in the first place
<povikMartinPovie> figures, I know it was a late addition
<povikMartinPovie> that's why the Verilog frontend seems to ignore it and opt for shift cells instead
<povikMartinPovie> which doesn't always work out well
<Wanda[cis]> well verilog doesn't really have anything equivalent
<Wanda[cis]> I think you always have to transform from a shift?
<povikMartinPovie> array[idx] where the inner dimension isn't pow-of-2 is a much better fit for bmux than a shift cell
<povikMartinPovie> of course you can always expand into a mux tree, it depends on how coarse grain you want your IR
<Wanda[cis]> what do you mean by an array?
<Wanda[cis]> like SV multi-dimensional memory? or an SV packed array?
<povikMartinPovie> both here, I think; the end result is the same
<Wanda[cis]> not necessarily
<Wanda[cis]> one could reasonably involve being mapped to memory
<Wanda[cis]> but... yeah, okay, array indexing in SV I guess fits the bill for bmux
<povikMartinPovie> yup
<povikMartinPovie> why's Trit called that? what does the last t stand for?
<whitequark[cis]> it's like Bit but there's three values
<whitequark[cis]> if there was also 'z i would've called it Quit
<whitequark[cis]> you should let me name more things. I love naming things
<povikMartinPovie> this one's well done
<Wanda[cis]> whitequark[cis]: <del>how about we start with the project</del>
Guest63 has joined #prjunnamed
Guest63 has quit [Client Quit]
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark pushed 2 commits to main [+0/-0/±5] https://github.com/prjunnamed/prjunnamed/compare/556c86dd348e...aa3dd318b714
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark c8af317 - Add more flip-flop simplifications.
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark aa3dd31 - Add flip-flop reset and enable recognition.
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark pushed 1 commit to main [+0/-0/±3] https://github.com/prjunnamed/prjunnamed/compare/aa3dd318b714...bb9156ec0238
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark bb9156e - Add flip-flop reset and enable recognition.
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark pushed 1 commit to main [+0/-0/±4] https://github.com/prjunnamed/prjunnamed/compare/bb9156ec0238...18edf00ed2c9
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark 18edf00 - Add flip-flop reset and enable recognition.
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark pushed 1 commit to main [+0/-0/±4] https://github.com/prjunnamed/prjunnamed/compare/18edf00ed2c9...37295334b86e
<_whitenotifier-4> [prjunnamed/prjunnamed] whitequark 3729533 - Add flip-flop reset and enable recognition.
<whitequark[cis]> fourth time's the charm
RowanG[m] has quit [Quit: Idle timeout reached: 172800s]