<Adrien[m]>
Back to the discussion about mapping and ABC
<Adrien[m]>
There is a feature request I opened some time ago in the Yosys repo, about optimizing combinatorial logic when all inputs can fit in one level of LUTs
<Adrien[m]>
I am still trying to understand if this should be a task for yosys, or for ABC
<Adrien[m]>
Certainly someone in this channel has clear understanding of the boundaries between yosys and ABC and could explain this to me ?
<Adrien[m]>
I may be able to assign an intern to this development, if only I could see through these software stacks a bit more clearly :D
<corecode>
is your proposal a simple (hierarchical) karnaugh map type thing?
<corecode>
does it have to use a hierarchy?
<Adrien[m]>
Yes, complete conversion of the component instance to plain lookup tables
<Adrien[m]>
That optimization is also called memoization in the software compilation field, if I'm correct
<Adrien[m]>
The hierarchy provides the boundaries of interest, as defined by the developer
<Adrien[m]>
To cope with nonexistent or insufficient cut-searching algorithms in mapping tools
<Adrien[m]>
in that case, yosys and whatever backend mapping tools it is using.
<corecode>
i think it's a good option. doesn't have to be a standard pass
<corecode>
have it operate on the hierarchy, run it before the tree is flattened
<corecode>
rather than annotating boundaries and rediscovering them
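A minimal sketch of the tabulation Adrien describes, assuming the submodule is purely combinational and small enough to enumerate. The `tabulate` helper and the `majority` example are hypothetical names used only for illustration; nothing here is an existing Yosys pass.

```python
def tabulate(func, num_inputs):
    """Evaluate a combinational function on every input combination and
    pack the results into a LUT init word.

    Bit `index` of the result is the output for the input combination whose
    i-th input equals bit i of `index` (input 0 is the least significant).
    """
    init = 0
    for index in range(1 << num_inputs):
        bits = [(index >> i) & 1 for i in range(num_inputs)]
        if func(*bits):
            init |= 1 << index
    return init


def majority(a, b, c):
    # Hypothetical submodule body: a small cloud of gates.
    return (a & b) | (a & c) | (b & c)


init = tabulate(majority, 3)
print(f"LUT3 init = 0x{init:02X}")
```

The cost is exponential in the number of inputs, which is why the idea only applies when all inputs of the instance fit in one LUT (or a small, fixed number of them).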
<lofty>
Adrien[m]: I would disagree with your proposal, actually.
<lofty>
Suppose Yosys performs this transformation
<lofty>
ABC and especially ABC9 flows work in and-inverter graph form
<lofty>
So when Yosys has to give the design to ABC, it has to blow the LUT back into gates
<lofty>
Which is not that different from representing it as gates to begin with
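To illustrate lofty's point, here is a rough sketch of what "blowing a LUT back into gates" can look like: a truth table is expanded into AND nodes and inverted edges by Shannon expansion. The `Aig` class, the AIGER-style literal encoding, and the choice of Shannon expansion are assumptions made for this example; it is not a description of what Yosys actually emits for ABC.

```python
class Aig:
    """Tiny and-inverter graph. A literal is 2 * node_id + inverted,
    node 0 is constant false, nodes 1..num_inputs are primary inputs."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.ands = []    # (lit_a, lit_b); node id = num_inputs + 1 + index
        self.cache = {}   # structural hashing of AND nodes

    def input_lit(self, i):
        return 2 * (i + 1)

    def and_(self, a, b):
        if a > b:
            a, b = b, a
        if a == 0:
            return 0              # false & x
        if a == 1:
            return b              # true & x
        if a == b:
            return a              # x & x
        if a ^ 1 == b:
            return 0              # x & !x
        if (a, b) not in self.cache:
            self.ands.append((a, b))
            self.cache[(a, b)] = 2 * (self.num_inputs + len(self.ands))
        return self.cache[(a, b)]

    def mux(self, sel, hi, lo):
        if hi == lo:
            return hi
        # sel ? hi : lo  ==  !( !(sel & hi) & !(!sel & lo) )
        return self.and_(self.and_(sel, hi) ^ 1, self.and_(sel ^ 1, lo) ^ 1) ^ 1

    def evaluate(self, lit, inputs):
        node, inverted = lit >> 1, lit & 1
        if node == 0:
            value = 0
        elif node <= self.num_inputs:
            value = inputs[node - 1]
        else:
            a, b = self.ands[node - self.num_inputs - 1]
            value = self.evaluate(a, inputs) & self.evaluate(b, inputs)
        return value ^ inverted


def lut_to_aig(aig, init, k):
    """Shannon-expand a k-input truth table on its most significant input."""
    if k == 0:
        return init & 1           # literal 0 is false, literal 1 is true
    half = 1 << (k - 1)
    lo = lut_to_aig(aig, init & ((1 << half) - 1), k - 1)
    hi = lut_to_aig(aig, init >> half, k - 1)
    return aig.mux(aig.input_lit(k - 1), hi, lo)


aig = Aig(3)
out = lut_to_aig(aig, 0xE8, 3)    # 0xE8 is the 3-input majority function
ok = all(
    aig.evaluate(out, [(i >> b) & 1 for b in range(3)]) == (0xE8 >> i) & 1
    for i in range(8)
)
print(len(aig.ands), "AND nodes, matches LUT:", ok)
```

Once the LUT is in this form, the mapper is free to cut through it like any other logic, so the boundary the designer intended is no longer visible.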
<corecode>
so how can it be done
<lofty>
It can't.
<lofty>
If you make this optimisation you have to compromise at the Yosys/ABC boundary somehow
<lofty>
Either you blow the LUT to gates
<corecode>
that's silly tho to say that this optimization cannot be done
<corecode>
maybe it can't be done with the current structure
<lofty>
See, saying things like "it's silly to say this cannot be done" is the kind of thing that gets you a brusque "patches welcome" in response
<lofty>
so instead of ignoring my experience in the area would you like to rephrase?
<corecode>
can ports be marked to have an opportunistic cutting point for lut mapping later?
<lofty>
No.
<corecode>
why not?
<lofty>
You can't communicate this to ABC
<corecode>
nono, through
<lofty>
Let me guess
<corecode>
does the lut mapping happen in abc?
<whitequark[cis]>
you know, a long time ago I was told that a LUT mapper other than abc cannot be done
<corecode>
whitequark[cis]: and?
<whitequark[cis]>
and then it turned out that a nearly trivial pass converting fine gates to small LUTs and then combining those small LUTs into bigger ones gets you somewhere in the ballpark of abc. I think it was about 2x area and slightly lower Fmax
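The "combine small LUTs into bigger ones" step can be sketched roughly as follows. This is only an illustration of the idea under simple assumptions (single-output LUTs, one LUT feeding another, signals identified by name); `merge_luts` is a hypothetical helper, not the implementation of the `opt_lut` pass.

```python
def lut_eval(init, bits):
    """Look up a LUT init word; bits[0] is the least significant index bit."""
    index = sum(bit << i for i, bit in enumerate(bits))
    return (init >> index) & 1


def merge_luts(inner, outer, inner_out, k):
    """Fold `inner` into `outer` if the result still fits in a k-input LUT.

    `inner` and `outer` are (init, [input signal names]); `inner_out` is the
    name of inner's output, which appears among outer's inputs.
    Returns (init, inputs) for the merged LUT, or None if it needs > k inputs.
    """
    inner_init, inner_ins = inner
    outer_init, outer_ins = outer
    merged_ins = sorted(set(inner_ins) | (set(outer_ins) - {inner_out}))
    if len(merged_ins) > k:
        return None
    merged_init = 0
    for index in range(1 << len(merged_ins)):
        env = {name: (index >> i) & 1 for i, name in enumerate(merged_ins)}
        env[inner_out] = lut_eval(inner_init, [env[n] for n in inner_ins])
        if lut_eval(outer_init, [env[n] for n in outer_ins]):
            merged_init |= 1 << index
    return merged_init, merged_ins


and2 = (0b1000, ["a", "b"])       # n = a & b
xor2 = (0b0110, ["n", "c"])       # y = n ^ c
print(merge_luts(and2, xor2, "n", k=4))
print(merge_luts(and2, xor2, "n", k=2))
```

A greedy loop that applies this wherever the inner LUT has a single fanout is about as simple as LUT packing gets, which fits the "nearly trivial" description above.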
<lofty>
Your proposal is that Yosys instead does LUT mapping, which runs into the issue that you cannot then pass information into ABC; *all* optimisations would have to be entirely in Yosys
<corecode>
so what needs to happen to communicate this to abc?
<whitequark[cis]>
I think there's a lot of area to be explored in LUT mapping rather than defeatedly accepting that it can only be done by abc
<lofty>
Perhaps I should better say that to do this would be a serious effort requiring multiple people
<whitequark[cis]>
(I've attempted to explore it but at the time my health took a nosedive and so my attempt remains in the Yosys tree but it's not really usable)
<lofty>
whitequark[cis]: ABC's algorithms are state of the art. I'm all for reinventing the wheel to learn how wheels are made, but to get up to ABC parity is a challenge.
<whitequark[cis]>
I like a challenge
<whitequark[cis]>
I think I'm not alone in that, too
<lofty>
I've been paid pretty good money to try to match ABC by myself
<lofty>
I failed.
<corecode>
so can you modify abc?
<whitequark[cis]>
when I fail I stand up and try again
<whitequark[cis]>
eventually with enough persistence you succeed
<whitequark[cis]>
and advance the state of the art
<lofty>
when I fail I look for different jobs that can put food on the table
<lofty>
Anyway
<Adrien[m]>
You know I'm OK with a "patches welcome" answer :D
<Adrien[m]>
As long as I can get a few guidelines about what part of what tool is doing what, because I'm new to all these internals and probably won't have time to deep dive myself.
<lofty>
If someone writes a priority cut mapping pass for Yosys, they have one piece of the optimisation triforce
<whitequark[cis]>
I'd like to see an improvement on abc (which by the way can compete with it not only in performance but also in usability) and it would be unfortunate if your failure discouraged others from doing it; so I'll do my best to encourage :)
<corecode>
why can't you add this to abc?
<lofty>
The other two pieces are structural choices and sequential mapping
<lofty>
corecode: ABC's code is miserable to work with.
<lofty>
It is C code trying to be C++
<corecode>
ok
<whitequark[cis]>
it is also written by one person with a quirky style that's difficult to understand if you don't have his context in your head somehow
<corecode>
thats not a good reason not to try to implement something so seemingly trivial
<lofty>
Yes, "seemingly trivial" is the thing
<whitequark[cis]>
there are very few tests; take a look at recent commits to abc and you'll see what I mean
<Adrien[m]>
In many proprietary tools there is support for tool-specific attributes that developers can add to the RTL, saying how this module or instance should be handled
<Adrien[m]>
We could be selective at least this way, about tabulating some component instances
<Adrien[m]>
BTW having proper handling of usual KEEP or DONT_TOUCH attributes would be much welcome, so there may be a broader context in which that tabulating feature would fit nicely.
<lofty>
ABC is built up of a variety of "managers" which represent the netlist in different ways
<lofty>
These are all highly optimised and fragile and there is little documentation on them
<Adrien[m]>
lofty: I've had issues with it, so I should take another look at it if it's supposed to work
<jix>
yosys's (* keep *) just makes sure the signal stays around; it doesn't prevent optimizations from rewriting the fan-out of that signal to not use it anymore
<jix>
which is fine if you want to make sure you still have that signal in simulation or FV traces, but doesn't cover a lot of use cases people have
<lofty>
whitequark[cis]: my failure was educational to me and I would very much like to try again, but first I have to remove my association of the task with a very bad work environment.
<corecode>
yea, it's code that does a lot of things
<lofty>
corecode: and now it must do even more
<corecode>
could it go into another manager?
<corecode>
maybe we should first figure out why it fails to get to the optimum solution
<corecode>
and sometimes it does not
<corecode>
but yea definitely looks like an abc task if you want to use abc
<lofty>
I think that might be a much easier problem to solve than trying to communicate things to ABC...
<lofty>
My guess? There's probably a bug with how it handles whiteboxes.
<corecode>
Adrien[m]: maybe that's a first task for an intern. create several test cases that succeed/fail when trivial aspects are changed, and record all kinds of internal data from yosys and abc
<lofty>
I did my best to create a set of test cases in my own LUT mapper
<lofty>
They're kind of painful to debug
<Adrien[m]>
<jix> "yosys's (* keep *) just makes..." <- Aha yes indeed. And the DONT_TOUCH has the purpose of ensuring both "sides" of the signal are kept separate.
<Adrien[m]>
Originally I wanted to guide the yosys/abc mapper by putting such attributes where I intended to have my LUT boundaries, this is another way of guiding the mapper about where interesting boundaries are supposed to be.
<povik>
12:43 < whitequark[cis]> and then it turned out that a nearly trivial pass converting fine gates to small LUTs and then combining those small LUTs into bigger ones gets you somewhere in the ballpark of abc. I think it was about 2x area and slightly lower Fmax
<povik>
is that what flowmap is?
<Adrien[m]>
The DONT_TOUCH (if it's not implemented or honored yet) has about the same level of importance as the hierarchy boundary feature, to my feeling
<Adrien[m]>
Now I'm used to designing my own netlist-level components and instantiating my LUTs, so that I can let the automated tools yosys/abc handle the rest of the less-critical glue logic... it's extremely time-consuming !
<whitequark[cis]>
<povik> "12:43 < Catherine> and then it..." <- nono, flowmap is the more complicated one
<whitequark[cis]>
the simple one is a techmap file (I don't quite recall what it's called) plus the opt_lut pass
<whitequark[cis]>
I can dig it out for you later; busy right now
<whitequark[cis]>
the result was way better than what I thought it would be and it provided ample motivation to do something that wasn't completely trivial
<whitequark[cis]>
unfortunately for flowmap I completely fucked up the internal representation for the pass; it needs a complete rewrite
<whitequark[cis]>
I was working at the edge of my competency at the time and made a few design mistakes I didn't realize until much later
<povik>
would the different representation improve QoR?
<whitequark[cis]>
"it's complicated"
<whitequark[cis]>
with a different representation flowmap-r (note the -r) would actually work properly, and we could get results and evaluate them comparing against abc
<povik>
i tested the toy priority cut mapper against flowmap also and it seemed easy to beat
<povik>
ah, i don't think i invoked the -r variety
<whitequark[cis]>
so the reason I picked the flowmap paper is because it was the first one in my search results with an actual algorithm in it
<povik>
right
<whitequark[cis]>
my approach to this problem was essentially "spray and pray"; implement a bunch of papers, understand the tradeoffs, then combine it all into something nice
<whitequark[cis]>
flowmap itself is basically unusable as an algorithm; for more than 4-LUTs the runtime becomes intractable
<povik>
i was surprised how close i got to abc with the toy, if you haven't checked that already
<whitequark[cis]>
it was never intended to be the solution to replacing abc, it was just some thing I wrote and thought it could be useful to upstream
<whitequark[cis]>
povik: have not! got a link and some test results?
<povik>
that's what we were discussing with lofty yesterday
<whitequark[cis]>
the reason I wrote flowmap is to pave the path to better LUT mapping essentially
<whitequark[cis]>
at the time, I talked to gatecat and they told me that they don't think replacing abc is feasible at all
<whitequark[cis]>
and I wanted to prove them wrong (in a friendly way)
<whitequark[cis]>
not only by getting some results myself but by inspiring others to try too
<whitequark[cis]>
toymap seems exciting!
<povik>
:)
<povik>
i wrote it over a few days trying to make techmapping seem less magical
<povik>
when i saw it's cutting up an AIG in one of Mishchenko's papers i knew i wanted to try it myself
<whitequark[cis]>
very excited
<jix>
I'd really love to see (experiments with) more use of e-graphs in synthesis to avoid pass order problems or problems with optimizations undoing some manually tuned parts of the design or just undoing good choices of prior optimizations ... I know that abc already supports choices but it's not clear to me where abc actually makes use of them and there's not really any such concept in yosys
* povik
googles e-graphs
<jix>
The https://egraphs-good.github.io/ paper is a good intro and resparked their popularity but SMT solvers have been using e-graphs since forever
<tpb>
Title: egg (at egraphs-good.github.io)
<povik>
ok, printing that out for later
<povik>
uh, 33 pages
<jix>
the basic idea is that instead of doing rewrites you keep the previous and new version around using something that you can think of as choice nodes but using a nice representation to work with and keep things relatively compact
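A very small sketch of the idea jix describes, assuming a deliberately simplified e-graph: equivalent expressions share an e-class, and the decision about which one to use is deferred to a cost function at extraction time. The `EGraph` class and the cost tables are hypothetical. Congruence closure (the "rebuild" step of real implementations such as egg) and rewrite rules are omitted, so this only shows the "keep both versions around" part.

```python
class EGraph:
    def __init__(self):
        self.parent = []     # union-find over e-class ids
        self.nodes = {}      # canonical e-class id -> set of e-nodes
        self.hashcons = {}   # e-node (op, child ids...) -> e-class id

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def add(self, op, *children):
        node = (op, *(self.find(c) for c in children))
        if node in self.hashcons:
            return self.find(self.hashcons[node])
        cid = len(self.parent)
        self.parent.append(cid)
        self.nodes[cid] = {node}
        self.hashcons[node] = cid
        return cid

    def merge(self, a, b):
        """Record that e-classes a and b compute the same function."""
        a, b = self.find(a), self.find(b)
        if a != b:
            self.parent[b] = a
            self.nodes[a] |= self.nodes.pop(b)
        return a

    def best_costs(self, op_cost):
        """Cheapest e-node per e-class (tree cost), computed to a fixed point."""
        best = {}
        changed = True
        while changed:
            changed = False
            for cid, enodes in self.nodes.items():
                for op, *kids in enodes:
                    kids = [self.find(k) for k in kids]
                    if all(k in best for k in kids):
                        cost = op_cost[op] + sum(best[k][0] for k in kids)
                        if cid not in best or cost < best[cid][0]:
                            best[cid] = (cost, (op, *kids))
                            changed = True
        return best


eg = EGraph()
a, b = eg.add("a"), eg.add("b")
xor_form = eg.add("xor", a, b)
and_or_form = eg.add(
    "or",
    eg.add("and", a, eg.add("not", b)),
    eg.add("and", eg.add("not", a), b),
)
root = eg.merge(xor_form, and_or_form)   # both versions are kept

cheap_xor = {"a": 0, "b": 0, "not": 0, "and": 1, "or": 1, "xor": 1}
costly_xor = dict(cheap_xor, xor=5)
print(eg.best_costs(cheap_xor)[root])
print(eg.best_costs(costly_xor)[root])
```

Neither form is thrown away when the other is discovered, so a later stage with a different cost model can still pick the version an earlier rewrite would have destroyed.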
<povik>
right, what would be called structural choices in papers related to abc
<jix>
except that AFAICT from using abc, it's not really embracing it and doesn't use an internal representation tuned to make heavy use of it
<povik>
i guess one way to see the effect of it is to re-import the aig just before lut mapping and compare the results
<lofty>
jix: well, things like abc9 flow3 depend pretty heavily on structural choices, so it seems to work fairly well in my experience
<jix>
lofty: I haven't looked at that in particular as I mostly do FV stuff, and for that a lot of stuff doesn't support them at all and IIRC also some of the aig to aig optimizations would just discard them
<jix>
lofty: what's the corresponding paper to flow3 (if there is a specific one)?
<corecode>
so how does abc deal with LUT blocks?
<corecode>
i.e. when i use a target specific lut instance
<povik>
you mean if you explicitly instance a target-specific LUT cell in your design, then run abc synthesis?
<povik>
i think those will be treated as blackboxes, maybe abc9 can consider those properly in the timing model
<corecode>
so that could be used for Adrien's idea
<Adrien[m]>
I very often instantiate bare LUTs, carry, DSP, RAMB18/36 etc in my RTL designs
<Adrien[m]>
just to obtain precisely what I want to obtain
<Adrien[m]>
yosys and whatever is behind it seem to know that this exists as-is in the FPGA.
<Adrien[m]>
and it is kept untouched in the design
<corecode>
Adrien[m]: i mean a pass before abc that will turn sections into blackbox luts
<corecode>
then abc can't optimize them away
<lofty>
corecode: blackbox cells are implemented by hiding them from ABC
<lofty>
which means they can't be optimised away, but they also mislead ABC as to the actual length of the path
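A toy model of the effect lofty describes, assuming unit delay per cell: when a box is hidden, its output looks like a fresh primary input with arrival time 0, so everything downstream appears shallower than it really is. The netlist, the node names, and `arrival_times` are all hypothetical, and this is not how Yosys or ABC actually represent boxes.

```python
def arrival_times(fanins, order, hidden=()):
    """Unit-delay arrival time of every node, visited in topological order.

    Nodes listed in `hidden` are invisible to the mapper, so their outputs
    are treated like primary inputs (arrival time 0).
    """
    arrival = {}
    for node in order:
        ins = fanins.get(node, [])
        if node in hidden or not ins:
            arrival[node] = 0
        else:
            arrival[node] = 1 + max(arrival[i] for i in ins)
    return arrival


# a, b, c, d are primary inputs; the path of interest is g1 -> box -> g2.
fanins = {"g1": ["a", "b"], "box": ["g1", "c"], "g2": ["box", "d"]}
order = ["a", "b", "c", "d", "g1", "box", "g2"]

real = arrival_times(fanins, order)
seen = arrival_times(fanins, order, hidden={"box"})
print("real depth of g2:", real["g2"])
print("depth of g2 as seen with box hidden:", seen["g2"])
```

With the box visible and carrying a delay, g2 sits at the end of a three-level path; with the box hidden it looks like a one-level path, so a timing-driven mapper has no reason to prioritise it.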
<Adrien[m]>
<lofty> "which means they can't be..." <- Good to know. Thanks for pointing this out.
<Adrien[m]>
Then it would be very useful for ABC to recognize the primitives in its input design.
<Adrien[m]>
Do you mean the mapper has some way of taking timing into account to determine appropriate boundaries for what should be grouped into LUTs ?
<Adrien[m]>
and if that is the case, does it try to favor the critical paths in priority ?
<Adrien[m]>
Finding documentation for abc internals does not seem easy...
<lofty>
[19:44:49] povik: i think those will be treated as blackboxes, maybe abc9 can consider those properly in the timing model <--- you could treat them as whiteboxes, I suppose
<lofty>
[23:01:21] Adrien[m]: Do you mean the mapper has some form of taking timing into account to determine appropriate boundaries for what should be grouped into LUTs ? <--- yes, either LUT level or timing information provided to ABC (which is what forms ABC9)
<lofty>
[23:03:39] Adrien[m]: and if that is the case, does it try to favor the critical paths in priority ? <--- yes
<lofty>
The priority cut algorithm will always provide a depth-optimal result (modulo bugs)
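A compact sketch of cut-based depth labelling, for readers who want to see where the depth result comes from. Assumptions: the network is a graph of two-input AND nodes given in topological order, every LUT has unit delay, and all K-feasible cuts are enumerated exhaustively. A real priority-cut mapper keeps only a bounded number of cuts per node and adds area recovery, neither of which is shown. `map_depth` and the node names are hypothetical.

```python
from itertools import product


def map_depth(ands, inputs, k):
    """Return (depth, best_cut) for every AND node.

    `ands` maps node -> (fanin_a, fanin_b) in topological order; edge
    inversions are ignored because they do not affect cuts. Requires k >= 2.
    """
    cuts = {n: [frozenset([n])] for n in inputs}
    depth = {n: 0 for n in inputs}
    best = {}
    for node, (a, b) in ands.items():
        # Every cut of the node is the union of one cut from each fanin.
        merged = {
            ca | cb
            for ca, cb in product(cuts[a], cuts[b])
            if len(ca | cb) <= k
        }
        best[node] = min(
            merged,
            key=lambda cut: (max(depth[leaf] for leaf in cut), len(cut), sorted(cut)),
        )
        depth[node] = 1 + max(depth[leaf] for leaf in best[node])
        # The trivial cut lets fanouts use this node as a LUT boundary.
        cuts[node] = [frozenset([node])] + list(merged)
    return depth, best


# A 4-input AND built as a chain: n1 = a & b, n2 = n1 & c, n3 = n2 & d.
ands = {"n1": ("a", "b"), "n2": ("n1", "c"), "n3": ("n2", "d")}
inputs = ["a", "b", "c", "d"]

for k in (2, 3, 4):
    depth, best = map_depth(ands, inputs, k)
    print(f"K={k}: depth of n3 = {depth['n3']}, cut = {sorted(best['n3'])}")
```

Each node picks the cut whose deepest leaf is shallowest, so the label it gets is the minimum LUT depth achievable for that node under the stated assumptions.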