whitequark[cis] changed the topic of #prjunnamed to: FPGA toolchain project · rule #0 of prjunnamed: no one should ever burn out building software · https://prjunnamed.org · https://github.com/prjunnamed/prjunnamed · logs: https://libera.irclog.whitequark.org/prjunnamed
<whitequark[cis]> <mei[m]> "oh! so it's like that for you..." <- haha no it's like hauling rocks. i just do it anyway and eat the cost
<povikMartinPovie> if we had AndNot cells, it would enable representing an AIG with a single cell per AND gate of the AIG, modulo inversions at the boundary
<povikMartinPovie> this comes into play when you are writing mappers directly operating on unnamed IR -- complementing is somewhat free, so you don't want to distinguish between a fanout connected through an inverter and one that's not; if you can normalize the source netlist so that it's free of internal inverters, it can help
<povikMartinPovie> I think And, Or, Andnot, Xor are a complete set to represent any two-input function modulo complementing an output?
<jix> another possibility is to use 2-luts, which give you one degree of freedom too much, but then evaluate the combinational part once using all zero inputs and normalize the output polarities of all those 2-luts to be zero
<jix> and that's exactly equivalent to using only And, Or, AndNot and Xor
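(A quick standalone way to convince yourself of that claim, not prjunnamed code: the eight 2-input truth tables with f(0,0) = 0 are exactly the constant, the two projections, And, the two AndNot orientations, Xor, and Or, and every other table is the complement of one of them. The (a << 1) | b bit-index convention is just an assumption for this sketch.)

```rust
fn main() {
    // Truth tables are 4-bit masks; bit index is (a << 1) | b (an assumed
    // convention for this sketch only).
    let normal_forms = [
        (0b0000u8, "const 0"),
        (0b1100, "a"),       // projection
        (0b1010, "b"),       // projection
        (0b1000, "a & b"),   // And
        (0b0100, "a & !b"),  // AndNot
        (0b0010, "!a & b"),  // AndNot, arguments swapped
        (0b0110, "a ^ b"),   // Xor
        (0b1110, "a | b"),   // Or
    ];
    for table in 0u8..16 {
        // Normalize output polarity: if f(0, 0) = 1, complement the table.
        let complemented = table & 1 != 0;
        let normal = if complemented { !table & 0xf } else { table };
        let (_, name) = normal_forms.iter().find(|(t, _)| *t == normal).unwrap();
        println!("{table:04b} = {}{name}", if complemented { "not " } else { "" });
    }
}
```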
<whitequark[cis]> i like the 2-lut option more
<whitequark[cis]> we've discussed it before
<povikMartinPovie> the 2-lut thing can just as well be an abstraction (using ControlNets maybe?), I'm not sure it's worth the bother of introducing it into the basic set of cells
<whitequark[cis]> i've also thought about adding ControlNets into the mix, yeah
<povikMartinPovie> is there a chance you will add Andnot? I'm inclined to add it via a local patch and continue my work
<jix> (the evaluation with all zero inputs is done if you want to normalize polarities even across combinational logic using representations where you can't push inverters through, if that's not a concern you can use rewrite rules alone ofc)
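(A toy sketch of that whole-netlist normalization, not prjunnamed code: a 2-LUT netlist in topological order, where flipping a node's polarity gets absorbed into its fanouts' truth tables, so after the pass every internal signal evaluates to 0 under all-zero inputs and inversions only survive as per-node output flags. The names and the (a << 1) | b bit-index convention are made up for the sketch.)

```rust
#[derive(Clone, Copy)]
enum Src {
    Input(usize), // primary input, defined to be 0 in the all-zero evaluation
    Node(usize),  // output of an earlier Lut2 node (topological order assumed)
}

struct Lut2 {
    table: u8, // 4-bit truth table, bit index = (a << 1) | b
    a: Src,
    b: Src,
}

/// Returns, for each node, whether its output polarity got flipped.
fn normalize(nodes: &mut [Lut2]) -> Vec<bool> {
    let mut flipped = vec![false; nodes.len()];
    for i in 0..nodes.len() {
        let (a, b) = (nodes[i].a, nodes[i].b);
        // If a fanin was flipped, absorb the inversion into this node's table
        // by swapping the corresponding truth-table halves.
        if let Src::Node(j) = a {
            if flipped[j] {
                let t = nodes[i].table;
                nodes[i].table = ((t & 0b0011) << 2) | ((t & 0b1100) >> 2);
            }
        }
        if let Src::Node(j) = b {
            if flipped[j] {
                let t = nodes[i].table;
                nodes[i].table = ((t & 0b0101) << 1) | ((t & 0b1010) >> 1);
            }
        }
        // All fanins now evaluate to 0 under all-zero inputs, so this node's
        // all-zero value is just bit 0 of its table; normalize it to 0.
        if nodes[i].table & 1 != 0 {
            nodes[i].table = !nodes[i].table & 0xf;
            flipped[i] = true;
        }
    }
    flipped
}

fn main() {
    // nodes[0] = !(x & y) (Nand), nodes[1] = nodes[0] | z (Or)
    let mut nodes = vec![
        Lut2 { table: 0b0111, a: Src::Input(0), b: Src::Input(1) },
        Lut2 { table: 0b1110, a: Src::Node(0), b: Src::Input(2) },
    ];
    let flipped = normalize(&mut nodes);
    // nodes[0] becomes And with a flipped output; nodes[1] absorbs that flip
    // into its table and then gets its own polarity normalized as well.
    assert_eq!((nodes[0].table, flipped[0]), (0b1000, true));
    assert_eq!((nodes[1].table, flipped[1]), (0b0100, true));
}
```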
<whitequark[cis]> it seems pretty application-specific and thus far we've avoided pinning ourselves to AIG-specific restrictions
<whitequark[cis]> i think i want to understand your plan more first
<whitequark[cis]> can you tell us how it will simplify your job?
<whitequark[cis]> and in particular, why it should be a part of the general repertoire of cells rather than a pass-specific thing?
<whitequark[cis]> for example, the LUT mapper maintains a table of LUT dispositions and some other internal state, which it does not need for interchange with anything else; would this not serve you well?
<whitequark[cis]> the other possibility is just "looking through" an inverter, which is very cheap in our representation
<povikMartinPovie> sure, I'm porting a standard cell mapper which so far is working off its own AIG representation of the input, which is built at the start of the pass from unnamed IR
<povikMartinPovie> I realized it could be simplified by working closer to unnamed IR; there's not that much gained from building the internal representation other than dealing with the inverter thing
<jix> given that the IR doesn't maintain fanout indices (afaik?), don't you need to build that index on the fly anyway and could use an index that looks through inverters when needed?
<whitequark[cis]> it does not, yeah
<povikMartinPovie> I guess
<whitequark[cis]> (we planned at the beginning to do so and discussed it several times since, but it never seemed particularly valuable given you can do it in a single pass and like five lines of code)
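(For what it's worth, here is roughly what an index that looks through inverters could look like, over a deliberately toy netlist type; none of these names are prjunnamed API, since the real net indices are kept crate-private. The point is only that each fanout entry can carry an inversion flag, so Not cells disappear from the index instead of showing up as nodes.)

```rust
use std::collections::HashMap;

// Toy stand-ins; the real Net/Cell types in prjunnamed have different shapes.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct Net(u32);

enum Cell {
    Not(Net),
    And(Net, Net),
    // ... other cells elided
}

/// Fanout index: driver net -> (sink cell index, inverted?) entries,
/// built in one pass, collapsing chains of Not cells along the way.
fn build_fanout_index(cells: &[(Net, Cell)]) -> HashMap<Net, Vec<(usize, bool)>> {
    // Map each Not cell's output back to its input.
    let not_of: HashMap<Net, Net> = cells
        .iter()
        .filter_map(|(out, cell)| match cell {
            Cell::Not(a) => Some((*out, *a)),
            _ => None,
        })
        .collect();
    // Resolve a net to its non-inverted driver, accumulating polarity.
    let look_through = |mut net: Net| {
        let mut inverted = false;
        while let Some(&src) = not_of.get(&net) {
            net = src;
            inverted = !inverted;
        }
        (net, inverted)
    };
    let mut fanouts: HashMap<Net, Vec<(usize, bool)>> = HashMap::new();
    for (index, (_out, cell)) in cells.iter().enumerate() {
        let inputs = match cell {
            Cell::Not(_) => vec![], // inverters are folded away, not indexed
            Cell::And(a, b) => vec![*a, *b],
        };
        for input in inputs {
            let (net, inverted) = look_through(input);
            fanouts.entry(net).or_default().push((index, inverted));
        }
    }
    fanouts
}

fn main() {
    // %2 = not %1 ; %3 = and %0 %2  =>  %3 sees %1 through an inverter
    let cells = [
        (Net(2), Cell::Not(Net(1))),
        (Net(3), Cell::And(Net(0), Net(2))),
    ];
    let index = build_fanout_index(&cells);
    assert_eq!(index[&Net(1)], vec![(1usize, true)]);
    assert_eq!(index[&Net(0)], vec![(1usize, false)]);
}
```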
<povikMartinPovie> the input wouldn't need to be AIG, but if it's coarser than an AIG it would restrict the options of the mapper, since as a cut mapper it's always packing a whole number of input nodes into the selected gate
<povikMartinPovie> not sure if that would be useful to anyone, e.g. to force a mux not to be split across gates
<whitequark[cis]> by "coarser" do you mean "more than 2-input per cell"?
<povikMartinPovie> yes, or allowing xor/xnor
<whitequark[cis]> hm, so you need to lower the netlist first and then build a fanout index
<whitequark[cis]> i think i'm inclined to agree with @jix that it would probably not be very beneficial to add this cell to our repertoire
<whitequark[cis]> among other things i dislike that it's non-commutative which makes merging harder, and i'm not sure how to satisfy simplify's monotonicity guarantees if it's going to be mapped to
<whitequark[cis]> and if it's just used and understood by this pass it seems too pass-specific to make the entire project care about it
<jix> fwiw, that's not what I said (but it might be a direct conclusion from what I said plus other things that you consider a given)
<whitequark[cis]> oh, yes, sorry
<povikMartinPovie> some amount of indexing of the design is required inside the mapper in any case, but when it's a single cell standing in the way of a 1:1 correspondence between unnamed IR and the nodes which the mapper is packing, I want to bring it up
<whitequark[cis]> i meant that i agree with you saying that building a fanout index yourself would be necessary and as a part of that you can look through inverters
<whitequark[cis]> and then make some additional conclusions from that
<povikMartinPovie> this 1:1 correspondence is of some value inside the mapper, maybe to the user too
<povikMartinPovie> I have a version of the mapper which builds up a complete AIG for itself from the unnamed design and finds a mapping; I'm considering moving it closer to the IR before I add the final step of moving the mapping back into unnamed
<jix> whitequark[cis]: I'm also not disagreeing, I just don't think I can make any calls on what's beneficial to the project as a whole
<whitequark[cis]> i guess to summarize my thoughts, i feel that because UIR is an interchange format (with other tools; with whoever is reading it; between passes) first, there is a relatively high burden to adding a new cell; we could probably stand to reduce the cell count from the current one
<whitequark[cis]> (tangentially, we probably don't need so many shifts and divisions each being their individual cell)
<whitequark[cis]> an obvious way to reduce it is to replace our bitwise cells with Lut1 and Lut2 cells
<whitequark[cis]> (muxes are pattern matched over in passes like FSM inference, so they should in any case remain their own thing)
<whitequark[cis]> since you need a preprocessing pass anyway, this would allow you to get a netlist that has a 1:1 correspondence with the mapper nodes just as well, and you can bail out (panic) if the netlist isn't in the right form, which i think you need anyway as it is
<povikMartinPovie> > this would allow you to get a netlist that has a 1:1 correspondence with the mapper nodes just as well
<povikMartinPovie> that's true, if you don't mean that this netlist can be held as unnamed IR
<povikMartinPovie> what I'm asking for (want to have discussed) is extending the IR so that it's possible to do so
<jix> I think "this" referred to making unnamed use Lut1 and Lut2 for all 1/2-input 1-output bitlevel cells?
<whitequark[cis]> yes
<povikMartinPovie> ah, ok, sorry
<povikMartinPovie> we're on the same page here
<whitequark[cis]> actually, it would be more general than that
<whitequark[cis]> or, sorry, let me clarify something
<whitequark[cis]> if by 1/2 input 1 output you mean "1/2 operands" not "1/2 nets" then we are talking about the same thing
<whitequark[cis]> (I think it would not be the right move to use Lut1/Lut2 only for bit-level cells)
<whitequark[cis]> enum Cell { Lut2(u8, Value, Value) } basically
<jix> I wasn't thinking about word-level at all when writing that, but my impression was that you want to avoid unnecessary duplication of bit-level and word-level functionality so I would agree
<whitequark[cis]> we would need a bunch of infrastructure to make the text IR more readable (i suppose we could represent %0:1 = lut2 1000 %1 %2 as %0:1 = lut2 and %1 %2 for example), and a decent amount of work on the existing passes
<whitequark[cis]> (actually, something like %0:1 = and !%1 %2 might be truer to the spirit here; recognize and specially print all of the most common functions then handle the rest with a truth table)
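(To make that direction concrete, here is one possible shape of the special-casing in the printer, assuming a hypothetical Lut2(u8, Value, Value) cell and a (a << 1) | b truth-table bit order; both the cell and the exact masks are only being discussed here, not decided, so treat all of this as illustrative.)

```rust
// Print a Lut2 in the "%0:1 = and !%1 %2" style: recognize the common
// functions, including forms reachable by complementing an input, and fall
// back to a raw truth table for everything else.
fn print_lut2(table: u8, a: &str, b: &str) -> String {
    match table & 0xf {
        0b1000 => format!("and {a} {b}"),
        0b0010 => format!("and !{a} {b}"),
        0b0100 => format!("and {a} !{b}"),
        0b0001 => format!("and !{a} !{b}"),  // i.e. Nor
        0b1110 => format!("or {a} {b}"),
        0b0110 => format!("xor {a} {b}"),
        0b1001 => format!("xor !{a} {b}"),   // i.e. Xnor
        0b0111 => format!("nand {a} {b}"),
        other => format!("lut2 {other:04b} {a} {b}"), // fall back to the table
    }
}

fn main() {
    assert_eq!(print_lut2(0b1000, "%1", "%2"), "and %1 %2");
    assert_eq!(print_lut2(0b0010, "%1", "%2"), "and !%1 %2");
    assert_eq!(print_lut2(0b1011, "%1", "%2"), "lut2 1011 %1 %2");
}
```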
<whitequark[cis]> anyway, i'm neither committing nor rejecting anything here, i want to hear what Wanda thinks first
<whitequark[cis]> these are my personal thoughts on what would be the most useful
<Wanda[cis]> I don't think we should have more 2-input cells
<Wanda[cis]> in fact, the Or cell is on thin ice
<Wanda[cis]> it's enough for mapping already; you can just store (net, inversion) pairs
<povikMartinPovie> on another topic, do you want to add a method which exposes the size of the cells array?
<whitequark[cis]> i'm curious where that comes up?
<povikMartinPovie> I would know the highest index a net can have, and could use a linear array to store some per-net information
<povikMartinPovie> I wouldn't need HashMap lookups in an inner loop
<whitequark[cis]> but net indices are not accessible outside of the netlist crate, are they?
<povikMartinPovie> ah, pub(crate)
<whitequark[cis]> yes. i insisted on this approach from the very beginning because i felt that Rust graph manipulation implementations tend to overuse indices in a way that makes them harder to work with
<whitequark[cis]> hm
<jix> you could provide a `DenseNetMap<T>` to enable this pattern without exposing that implementation detail
<whitequark[cis]> yes, that's what i was thinking about
<povikMartinPovie> works for me if my idea of what it is is correct
<povikMartinPovie> I assume it would borrow the design for the duration of its existence
<whitequark[cis]> there's a few ways it could go
<whitequark[cis]> i would probably just prototype this with a HashMap; we do this a lot in existing code already
<jix> I just meant a wrapper around a Vec and that wouldn't need any borrowing
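(Presumably something like the following; the real thing would live where it can see the pub(crate) net index, and the constructor would take whatever size accessor ends up being exposed, so every name here is just a sketch of the shape rather than an existing API.)

```rust
// Toy stand-in for the opaque net handle; in prjunnamed the index stays
// pub(crate), so this map would be constructed by the netlist crate itself.
#[derive(Clone, Copy)]
struct Net(u32);

/// Dense per-net storage: a Vec indexed by the net's index, with a
/// HashMap-like surface so the index itself never leaks into pass code.
struct DenseNetMap<T> {
    slots: Vec<Option<T>>,
}

impl<T> DenseNetMap<T> {
    /// `net_count` would come from the design (e.g. the size of the cells
    /// array, or the highest net index plus one); no borrow of the design
    /// needs to outlive construction.
    fn new(net_count: usize) -> Self {
        DenseNetMap { slots: (0..net_count).map(|_| None).collect() }
    }

    fn insert(&mut self, net: Net, value: T) -> Option<T> {
        self.slots[net.0 as usize].replace(value)
    }

    fn get(&self, net: Net) -> Option<&T> {
        self.slots.get(net.0 as usize).and_then(Option::as_ref)
    }
}

fn main() {
    let mut depth: DenseNetMap<u32> = DenseNetMap::new(16);
    depth.insert(Net(3), 7);
    assert_eq!(depth.get(Net(3)), Some(&7));
    assert_eq!(depth.get(Net(4)), None);
}
```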
<whitequark[cis]> i've mostly avoided spending too much effort on fine-grained optimizations in favor of defining an architecture that allows them later
<whitequark[cis]> like, swapping one type of map for another is really not a high effort thing so i'm content with not doing the best performing thing upfront
<povikMartinPovie> if you tell me there will likely be a way to do this later I'll just use a HashMap for now
<whitequark[cis]> yep
<whitequark[cis]> i think it's useful to have a repertoire of passes that are written using common Rust abstractions first and then extract the most useful patterns out of them later
<whitequark[cis]> i suppose the exception to this was CellRepr, but i felt that it's something that should be designed in from the outset because it changes the interface so much
<whitequark[cis]> this is partly because i think this results in better designs, and partly for aesthetic reasons (i like a codebase that was grown incrementally like this a lot more than i like one which front-loaded a lot of expected fine-grained optimization work, much of which may not even have any meaningful effect on runtime)
<whitequark[cis]> i'm not sure if you've looked into it but right now i can synthesize boneless in 500ms and minerva in 2000ms, which is like... i think less than yosys takes to load the ice40 techmap.v?
<whitequark[cis]> and most of that comes out of running canonicalize like 15 times to keep fixing up one cell
<povikMartinPovie> sure, but I want to be not too far off from ABC whose mapper is fairly optimized
<whitequark[cis]> right
<whitequark[cis]> i'm on board with that obviously, so let's collect some benchmarks once it's ready for that
<povikMartinPovie> yes.
<widlarizerEmilJT> I think that when you have an IR that's comfy to traverse, you're not going to suffer on runtime or development velocity or memory consumption when your inverters aren't folded into ands
<whitequark[cis]> tbh i think the really big initial gains will be from using some sort of SmallVec type thing for Value
<whitequark[cis]> Vec is 24 bytes (which is too big, nobody wants >4 billion nets in a Value...), if we find a way to cut that down and also inline the simple cases (a constant; a single net) that should improve how quickly we can traverse the IR
<whitequark[cis]> `enum ValueRepr { Net(Net), Slice { nets: Box<[Net]>, cap: u32 } }` would probably already improve things a lot
<whitequark[cis]> SmallVec isn't the right solution because the size/capacity are still both usize
<whitequark[cis]> but also these things are fiddly enough that i'd really want benchmarks first before committing to anything
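(For the record, a trivial probe of those sizes; Net being a u32 newtype is an assumption here, and the layout the compiler actually picks for the enum is part of why benchmarks should come first. The win for the single-net and constant cases would mostly be skipping the heap allocation and the pointer chase rather than the struct size itself.)

```rust
use std::mem::size_of;

#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Net(u32); // assumed representation, not necessarily the real one

#[allow(dead_code)]
enum ValueRepr {
    Net(Net),
    Slice { nets: Box<[Net]>, cap: u32 },
}

fn main() {
    // Vec<Net> is ptr + len + cap, i.e. 24 bytes on a 64-bit target.
    println!("Vec<Net>:   {} bytes", size_of::<Vec<Net>>());
    // Whatever the enum ends up at depends on niche/layout decisions, hence
    // printing rather than asserting a number here.
    println!("ValueRepr:  {} bytes", size_of::<ValueRepr>());
}
```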