#amaranth-lang on 2024-03-23 — irc logs at libera.irclog.whitequark.org

2024-02-21 07:31 whitequark[cis] changed the topic of #amaranth-lang to: Amaranth hardware definition language · weekly meetings: Amaranth each Mon 1700 UTC, Amaranth SoC each Fri 1700 UTC · play https://amaranth-lang.org/play/ · code https://github.com/amaranth-lang · logs https://libera.irclog.whitequark.org/amaranth-lang · Matrix #amaranth-lang:matrix.org

00:00 <whitequark[cis]> zyp[m]: or maybe ask for a pair of ports whose addresses are tied together

00:00 <whitequark[cis]> however, one reason to have a readwrite port is because enables aren't really independent in that case

00:00 <tpw_rules> they seeeeeeeem to be according to altera docs

00:00 <zyp[m]> whitequark[cis]: how do you express that as a signature?

00:01 <zyp[m]> and if you do, how does it differ from a RW port?

00:01 <whitequark[cis]> tpw_rules: not *always* independent

00:01 <tpw_rules> i think she means sort of like a socket interface which returns a pair of connected sockets: you ask for a pair of connected ports

00:01 <whitequark[cis]> zyp[m]: you don't, they just are physically same

00:01 <galibert[m]> tpw_rules: transparency makes it an expensive pass through

00:01 <whitequark[cis]> lib.wiring never says that all the stuff you get as signals in an interface object refers to like, different signals

00:01 <whitequark[cis]> it just says what the structure is

00:01 <tpw_rules> that would also be hard to get multiple ports from, unless you can ask for n tandemed read ports with your write port

00:01 <zyp[m]> whitequark[cis]: so in other words they're not very usable with `wiring.connect`

00:03 <tpw_rules> wiring.connect doesn't care, it's just a bit arbitrary which one's address really matters (the latest one)

00:03 <whitequark[cis]> not as individual separate ports no

00:03 <zyp[m]> tpw_rules: `wiring.connect` doesn't care, but you end up driving `addr` twice

00:04 <tpw_rules> whitequark[cis]: do you have an example offhand of not independent conditions? i am curious what a sample constraint would be

00:04 <whitequark[cis]> ask Wanda

00:04 <tpw_rules> e.g. i think an en + rd/wr switch would be like... 75% dual port

00:04 <whitequark[cis]> well, that's actually what i'm talking about

00:04 <whitequark[cis]> en+we is very different from re+we

00:04 <whitequark[cis]> because the former can't express re=we=1

00:05 <whitequark[cis]> but I don't fully understand it...

00:05 <tpw_rules> yes, but i would harshly judge a vendor who claimed that was true dual port

00:05 <tpw_rules> (my considerably valuable input)

00:06 <tpw_rules> and, perhaps, amaranth should not expose it like that

00:07 <tpw_rules> but anyway, i'm happy about a resolution on the clock issue, help from amaranth shouldn't be necessary. i am okay with punting on true dual port for now, but it would be nice to have a language-sanctioned solution

00:08 <tpw_rules> to me i don't think a sat solver is necessary, i think it should only be true dual port where port_r.addr.eq(x) and port_w.addr.eq(y) and `x is y`

00:10 <tpw_rules> (which might be some simple same-net check in the IR?)

00:10 <whitequark[cis]> we've also considered that

00:10 <whitequark[cis]> basically, it's hard to come up with a very satisfying solution. I think we'll need to do it iteratively

00:10 <whitequark[cis]> so maybe we just don't have TDP at first. well, iCE40 doesn't have TDP at all :D

00:10 <whitequark[cis]> also I still wonder if inference cannot be made to work

00:11 <tpw_rules> yes, you can throw yourself at the mercy of the tools

00:11 <tpw_rules> i wonder if it would end up needing hand crafted templates

00:11 <tpw_rules> i'm not sure what the full extent of quartus's inference capabilities are, i know it rejects non power of two sizes. and we're not doing weird stuff like byte enables. but idk about many read port duplication. perhaps i need to move those tests up on my priority list

00:12 <whitequark[cis]> tpw_rules: if the only problem is NPOT memories then the existing get_memory hook is absolutely trivial to adjust for that

00:12 <tpw_rules> i've also had problems with transparency

00:13 <whitequark[cis]> which?

00:13 <tpw_rules> my quartus will not infer transparent=True for cyclone v

00:14 <tpw_rules> then there's the real issue that motivated my interest in all this, which is wrappers to make the init behavior match. but i think those could be just wrappers indeed

00:14 <whitequark[cis]> I see

00:16 <tpw_rules> as i understand now amaranth just tells yosys "memory" and lets yosys generate all the verilog. i think for reliable inference we would need to fix up yosys a bunch and/or have amaranth just write out known-good snippets

00:16 <whitequark[cis]> known-good snippets aren't something i'm necessarily willing to do at this stage

00:16 <whitequark[cis]> because amaranth doesn't actually do verilog, only yosys does

00:16 <tpw_rules> ok, so that's not "inference" to you

00:16 <tpw_rules> good to know

00:16 <whitequark[cis]> it's a layering issue

00:16 <tpw_rules> yes

00:17 <whitequark[cis]> it would have to go into yosys' write_verilog backend

00:17 <whitequark[cis]> -mode quartus or whatever

00:17 <tpw_rules> does RTLIL support arrays that are not memory? could amaranth instead use not memory RTLIL objects and set up something that it knows will be written out to verilog how quartus likes?

00:18 <whitequark[cis]> the closest thing to that is an RTLIL memory object.

00:19 <whitequark[cis]> unless you're like, writing verilog separately and then instantiating it, which is something I'll only consider if it can be shown that inference will never work with quartus

00:19 <tpw_rules> which gives yosys full control back of how to write out the verilog. i suppose it's not out of the question to teach yosys how to do it and then have amaranth rely on that

00:19 <whitequark[cis]> (which is doubtful)

00:21 <tpw_rules> okay well i gotta run to dinner, thanks for the discussion. it would definitely be nice to keep work and complexity out of amaranth for this

00:21 <whitequark[cis]> see you

00:21 <tpw_rules> amaranth being able to actually generate memories so easily is a big part of the value to me indeed

00:39 <Wanda[cis]> re enables: the motivating example is Xilinx, which has EN and WE signals (WE being per-byte)

00:41 <Wanda[cis]> there are two modes: plain (read_en = EN, write_en = EN & WE, cannot express write without read), or `NO_CHANGE` (read_en = EN && WE == 0, write_en = EN & WE, cannot express write and read at the same time)

00:41 <Wanda[cis]> so whichever mode you pick, there is one combination you cannot express

00:43 <Wanda[cis]> for yosys memory inference I basically gave up and decided to use a SAT solver: if SAT solver determined that `write_en && !read_en` is UNSAT, I use the first mode; if SAT solver determines that `write_en && read_en` is UNSAT, I use the second mode; otherwise, I reject the port combination
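
The mode selection described above can be sketched in Python. This is only an illustration: it replaces the SAT solver with exhaustive enumeration over a handful of inputs, and `pick_xilinx_mode`, `read_en`, `write_en`, and `n_inputs` are hypothetical names, not part of yosys or Amaranth.

```python
from itertools import product

def pick_xilinx_mode(read_en, write_en, n_inputs):
    """Pick an EN/WE mode by brute force (a stand-in for the SAT queries).

    read_en and write_en are hypothetical callables that map a tuple of
    n_inputs booleans to a boolean.
    """
    write_without_read = False
    write_with_read = False
    for bits in product([False, True], repeat=n_inputs):
        r, w = read_en(bits), write_en(bits)
        write_without_read |= w and not r
        write_with_read |= w and r
    if not write_without_read:
        return "plain"        # read_en = EN, write_en = EN & WE
    if not write_with_read:
        return "NO_CHANGE"    # read_en = EN & ~WE, write_en = EN & WE
    return None               # neither mode can express this port pair

# A port whose write always implies a read fits the plain mode.
mode = pick_xilinx_mode(lambda b: b[0], lambda b: b[0] and b[1], 2)
```

Enumeration is exponential in the number of inputs, which is one reason a real implementation would reach for a solver instead.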

00:45 <Wanda[cis]> note that my approach was determined by a simple constraint: I needed to be able to infer memories after the original Verilog has long been converted to a mux tree, and a bunch of optimization passes have already run on it

00:47 <Wanda[cis]> another sound approach would have been defining some blessed Verilog code patterns that you're supposed to use, and recognizing that directly (I think this is what most toolchains do?), but that'd have required me to hook into the Verilog frontend in a much earlier stage, and that is just... holy fuck you don't want to touch the yosys Verilog frontend.

00:47 lf has quit [Ping timeout: 264 seconds]

00:47 lf_ has joined #amaranth-lang

00:49 <Wanda[cis]> further, yosys memory inference was also very much designed for Verilog in the first place, obviously

00:50 <Wanda[cis]> the memory inference pass would've been very different if it was designed to consume some higher-level description of memories rather than trying to divine it from Verilog statements

00:51 <Wanda[cis]> thus, arguably, we shouldn't be basing Amaranth memory design on yosys at all, since its constraints do not really apply.

00:51 <Wanda[cis]> this is why yosys never got a read-write memory port: there's simply no way to express such a thing in Verilog

00:52 <Wanda[cis]> you can have a memory write, and a memory read, and then you can have some magic that recognizes they can be combined

00:54 <Wanda[cis]> now, if I were to do a greenfield design for high-level structures for memory description (as if I could ever do this while having my mind contaminated by yosys), read-write ports could be a very good idea; the really annoying part though is that you need like three variants of them

00:54 <Wanda[cis]> ReadWritePortIndependentEnable, ReadWritePortWriteImpliesRead, ReadWritePortExclusiveReadWrite

00:54 <Wanda[cis]> (with less horrifying names hopefully)

00:55 <Wanda[cis]> and the latter two would probably have EN + WE signals like Xilinx

00:56 <Wanda[cis]> anything else, unfortunately, will involve SAT solving bullshit

00:56 <Wanda[cis]> or having significant parts of hw functionality inaccessible

00:57 <Wanda[cis]> the problem with all this is that having a legitimately hardware-independent description is hard because there are so many tiny semantic differences.

01:00 <Wanda[cis]> another annoying one: does asserting reset on the read-write port override write enable? idk, depends on the vendor.

01:03 <Wanda[cis]> I have spent several months looking at various memory primitives across multiple vendors and classifying their behavior, and the result of that was this hellish thing: https://github.com/YosysHQ/yosys/blob/main/passes/memory/memlib.md

01:04 <Wanda[cis]> it doesn't fully capture all possibilities, because some vendors do even crazier bullshit, but it captures the .... reasonably sane subset

01:05 <Wanda[cis]> (did you know old Altera blockrams *capture* write address & data at clock posedge, but *perform* the write at negedge? fun.)

01:06 <Wanda[cis]> (did you know the async reset on the same blockrams also applies to the write registers, not just read registers? kill me.)

01:06 <whitequark[cis]> honestly i suspect we'll just have to do the SAT bullshit all over again

01:07 <Wanda[cis]> I... don't know

01:08 <Wanda[cis]> I think if you added the ReadWritePort with the three possibilities I outlined (exclusive read/write, independent read/write, write implies read), then defined some conservative semantics for reset, you could do memory lowering entirely without SAT

01:08 <whitequark[cis]> that's pretty gross though

01:08 <Wanda[cis]> shrug

01:09 <Wanda[cis]> perhaps

01:09 <Wanda[cis]> idk how to make it work

01:09 <whitequark[cis]> so many platform details leaking unnecessarily

01:09 <Wanda[cis]> but it may be worth it

01:09 <Wanda[cis]> like

01:09 <Wanda[cis]> I ... really don't like the SAT-based thing

01:09 <Wanda[cis]> it feels so fucking fragile

01:10 <whitequark[cis]> i think the ReadWritePort thing is also fragile in a different, similar way

01:10 <whitequark[cis]> cause when you go to another platform it breaks if you're not making it happy

01:10 <Wanda[cis]> so does SAT

01:10 <Wanda[cis]> the difference is that with ReadWritePort we can actually document the combinations expected to work in a sane way

01:11 <whitequark[cis]> we don't even have testbenches for anything

01:11 <whitequark[cis]> basically, i'm not ready to commit to either of these solutions until we actually understand the problem space

01:11 <whitequark[cis]> not just on the hardware side, but also on "what the toolchains actually do" side, and "what are the desirable configurations people actually use" side

01:11 <Wanda[cis]> the problem space is horrifying.

01:12 <Wanda[cis]> also, another major wart

01:12 <Wanda[cis]> if you want your memories to infer well, you have to give up on determinism

01:12 <Wanda[cis]> particularly TDP

01:12 <Wanda[cis]> what's the transparency behavior between the two TDP ports on common BRAMs? fuck if I know

01:13 <Wanda[cis]> I spent a lot of time reading documentation and sim models trying to figure that out, and most vendors just go "lol idk"

01:13 <whitequark[cis]> hmm, so our behavior is currently defined as "you get new value if transparent_for, old value otherwise"

01:13 <whitequark[cis]> and we may have to go to "you get new value if transparent_for, unspecified value otherwise"
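
The two read rules just stated can be written as a tiny Python model. This is a sketch under assumptions, not the Amaranth simulator: `sync_read`, `UNSPECIFIED`, and the `old_value_guaranteed` flag are made up for illustration, and only a single colliding write is modelled.

```python
UNSPECIFIED = object()  # stand-in for an "unspec" simulator value

def sync_read(mem, raddr, write, transparent_for, old_value_guaranteed=True):
    """Value a synchronous read port returns on one clock edge.

    write is None or a hypothetical (port_name, addr, data) tuple for a
    write landing on the same edge; transparent_for is a set of port names.
    """
    if write is not None:
        port, waddr, wdata = write
        if waddr == raddr:
            if port in transparent_for:
                return wdata            # transparent: new value
            if not old_value_guaranteed:
                return UNSPECIFIED      # the relaxed rule being considered
    return mem[raddr]                   # no collision, or old value

mem = [10, 20]
new_value = sync_read(mem, 1, ("w", 1, 99), {"w"})
old_value = sync_read(mem, 1, ("w", 1, 99), set())
```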

01:14 <Wanda[cis]> like, "get old value" tends to be supportable for intra-port transparency

01:15 <Wanda[cis]> for inter-port... I thiiiink Xilinx can do it, and that's it?

01:15 <Wanda[cis]> I mean, for BRAMs; for LUTRAMs it just happens everywhere by default

01:16 <whitequark[cis]> i think we just need to teach unspec values to the simulator and aggressively emit those on anything that looks nondeterministic that we can't practically just fix

01:16 <whitequark[cis]> which is a blocker for the desirable uninitialized memories feature

01:17 <whitequark[cis]> so it kinda has to happen anyway

01:17 <whitequark[cis]> anyway, i'm checking out

01:18 <Wanda[cis]> also as an argument against SAT

01:19 <Wanda[cis]> with ReadWritePort, whether your memory infers or not will depend (with suitable implementation) on the memory ports you created and nothing else

01:19 <Wanda[cis]> with SAT-based approach, your success or failure can depend on some logic 10 modules across from the one containing the memory

01:23 <whitequark[cis]> yes, I understand the downsides

01:23 <Wanda[cis]> people are already surprised by lesser shit happening!

01:24 <Wanda[cis]> like the thing where we infer BRAMs sometimes for memories with comb read ports

01:24 <whitequark[cis]> I mean, that seems like a valid and desirable optimization

01:24 <Wanda[cis]> because it turns out that yosys managed to merge some random flop that happens to be on the read path into the memory

01:32 <tpw_rules> Wanda[cis]: altera only supports old data for inter-port

01:32 <tpw_rules> (and only new data for intra-port)

01:36 <Wanda[cis]> hm

01:37 <Wanda[cis]> is it "new data" or "new data on we-enabled lanes, 'x on non-we-enabled lanes, unless no write, in which case current data"? I recall one vendor being batshit like this

01:37 <Wanda[cis]> it may have been altera

01:37 <tpw_rules> uh that sounds close, but i kinda skimmed past that part since amaranth doesn't have lane support

01:38 <Wanda[cis]> uh? of course it has lane support

01:38 <tpw_rules> i guess that might come up with mixed width

01:38 <Wanda[cis]> it's called granularity

01:38 <Wanda[cis]> mixed width doesn't have to be involved in this

01:38 <tpw_rules> buh you're right, idk how i forgot that

01:40 <tpw_rules> anyway at least new data is patchable with some extra logic, bearing in mind your comment about the haters

01:42 <tpw_rules> sigh. i'm kinda negative on SAT too for my 2 cents

01:50 <Wanda[cis]> new data is patchable if you can rely on non-we-enabled lanes actually reading back the current data

01:50 <tpw_rules> according to the manual that's a selectable option

01:51 <Wanda[cis]> ... okay, yeah, according to my notes it does work on altera

01:52 <tpw_rules> oh wait that might be family specific

01:52 <tpw_rules> but you can still do new data if you use an old data mode, it's just more limiting

01:53 <Wanda[cis]> I looked at every family for my notes

01:54 <tpw_rules> the doc i have seems to say that on the native intra-port mode, some families allow you to read current data for non-we-enabled lanes. some mandate returning x

01:57 <Wanda[cis]> oh.

01:57 <Wanda[cis]> right.

01:57 <Wanda[cis]> M4K

01:57 <Wanda[cis]> I grepped for the wrong thing

01:58 <tpw_rules> M10K mandates x

01:58 <whitequark[cis]> Wanda: haha oh god we might actually want `TrueSettle`

01:58 <Wanda[cis]> cyclone, cyclone II, stratix, stratix II, arria gx

01:59 <Wanda[cis]> Catherine: ... what is it now

01:59 <whitequark[cis]> I'm currently debugging put_stream and get_stream functions in the OBI codebase, which definitely do not function as they should

01:59 <Wanda[cis]> OBI?

01:59 <whitequark[cis]> doesn't matter

02:00 <whitequark[cis]> point is, there is a weird race condition, and it goes away if i double or triple up `yield Settle` in the `add_testbench` wrapper

02:00 <Wanda[cis]> augh

02:00 <Wanda[cis]> ... I think I may have said something regarding chainsaws

02:01 <whitequark[cis]> chainsaws?

02:01 <Wanda[cis]> and fucking thereof.

02:01 <cr1901> That sounds dangerous

02:01 <Wanda[cis]> anyway.

02:01 <Wanda[cis]> what if we just

02:01 <Wanda[cis]> (always a good beginning, I know)

02:02 <Wanda[cis]> always schedule processes in front of testbenches?

02:02 <whitequark[cis]> yes I suspect we have to do something like that

02:03 <Wanda[cis]> like, if there's a schedulable process (and pyrtl emitted stuff counts as process), run it; only look at testbenches when that queue runs out

02:03 <Wanda[cis]> hm

02:03 <Wanda[cis]> there... should also be delta cycles in there somewhere

02:03 <Wanda[cis]> ugh

02:03 <Wanda[cis]> yeah

02:03 <Wanda[cis]> chainsaw please

02:04 <whitequark[cis]> fucking gtk3 gtkwave. garbage because it's gtkwave and because it's gtk3! double the horribleness

02:04 Degi_ has joined #amaranth-lang

02:04 <whitequark[cis]> still doesn't support removing a trace with Delete because no one competent ever touched that UI

02:04 Degi has quit [Ping timeout: 255 seconds]

02:04 Degi_ is now known as Degi

02:04 <cr1901> Thoughts on Surfer?

02:05 <whitequark[cis]> can't stand gesture based interfaces

02:05 <cr1901> very reasonable

02:05 <whitequark[cis]> basically unusable for me with the current input model

02:06 * whitequark[cis] uploaded an image: (162KiB) < https://catircservices.org/_matrix/media/v3/download/matrix.org/wHePqcngDmRmodGbZbrQopeo/Screenshot_20240323_020554.png >

02:06 <cr1901> I don't use it full time either, but I have it installed in the hopes that it'll eventually be good enough for me

02:06 <whitequark[cis]> i instrumented the simulator to advance vcd time whenever it does a delta cycle, and also show when it executes coro and rtl processes

02:06 <whitequark[cis]> you can clearly see how sometimes that happens in the same delta cycle

02:06 <whitequark[cis]> this is Not Correct

02:07 <whitequark[cis]> actually it's not 100% clear what the causality on that is, let's see

02:08 <tpw_rules> is that arrow like, a delta function

02:14 <whitequark[cis]> it's just a vcd thing i put in a bunch of places

02:14 * whitequark[cis] uploaded an image: (172KiB) < https://catircservices.org/_matrix/media/v3/download/matrix.org/PxIAoZqISUXsUiaEdsbrIkBg/Screenshot_20240323_021407.png >

02:14 <whitequark[cis]> ok, so here gtkwave glitched out displaying cursors, but fortunately these are actually the exact two places i wanted to highlight

02:14 <whitequark[cis]> 1st time get_testbench gets called, it's doing yield stream.valid. 2nd time it's called, it's doing yield stream.payload (more or less)

02:15 <whitequark[cis]> in between these, the pyrtl processes do their thing, and update the state

02:22 <whitequark[cis]> there's also a second race condition here, which is not an amaranth issue

02:22 <whitequark[cis]> namely, we have m.d.comb += self.dac_stream.valid.eq(self.dwell_stream.valid)

02:23 <whitequark[cis]> and then we have put_testbench set dwell_stream.valid, and get_testbench check dac_stream.valid

02:23 <whitequark[cis]> naturally they end up running in parallel

02:24 <zyp[m]> huh, I like those arrows, I kinda would like to be able to make them from testbenches

02:24 <whitequark[cis]> it's the vcd "event" feature

02:24 <whitequark[cis]> it's been requested before but is kind of tricky to expose in a non-awful way

02:25 <whitequark[cis]> it's also completely unsupported by cxxrtl

02:25 <whitequark[cis]> ideally what i want to see is annotations, where you can hover around a timeline and at various instantaneous moments (which can happen within zero time too) you can see something happening, like a process sending a command

02:25 <whitequark[cis]> and guess what!!! the CXXRTL protocol already supports those, they're called "diagnostics"

02:26 <whitequark[cis]> and the CXXRTL protocol also lets you distinguish happens-before relationship within zero time

02:26 <whitequark[cis]> so I think the play here is to add CXXRTL protocol support to pysim and then use Surfer or whatever as a frontend

02:26 <zyp[m]> Surfer can ingest CXXRTL protocol?

02:27 <whitequark[cis]> yeah

02:27 <whitequark[cis]> it's not public yet since the protocol is fairly unstable still

02:27 <whitequark[cis]> and it requires you to use CXXRTL from a branch

02:27 <whitequark[cis]> but yeah

02:27 <zyp[m]> nice, and yeah, that'd make sense

02:28 <whitequark[cis]> anyway back to testbenches

02:28 <whitequark[cis]> what do we do about this mess?

02:28 <whitequark[cis]> I see two issues at hand

02:29 <whitequark[cis]> 1. testbench processes racing with pyrtl processes under... some conditions?

02:29 <whitequark[cis]> actually, hold on

02:30 <whitequark[cis]> yeah I'm wrong about the first race. the testbench processes wake up and immediately execute Settle

02:31 <whitequark[cis]> yeah, I think the only issue here is basically that if one testbench does e.g. "x.eq(1), y.eq(2), z.eq(3)" and another does "y, y, y"

02:32 <whitequark[cis]> then neither of those operations is atomic at all, even though both could trivially be

02:34 <whitequark[cis]> I think the way we schedule testbenches is fundamentally broken

02:35 <whitequark[cis]> they're preemptible at every single yield, and worse, they are guaranteed to be preempted at every single yield if that's at all possible

02:36 <whitequark[cis]> you know how people say that concurrent programming with asyncio is easier than threads because asyncio tasks can get preempted only at await points and threads can get preempted anywhere? well, if you make "variable get" and "variable set" await points, you basically make threads out of asyncio

02:37 <whitequark[cis]> in fact not just threads, but threads with a scheduler that is actively being evil
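
The asyncio point can be reproduced with the standard library alone. In this sketch (all names are illustrative), a writer task updates two values; whether a reader can observe the half-finished update depends only on whether an await point sits between the two writes.

```python
import asyncio

async def writer(state, yield_between):
    state["x"] = 1
    if yield_between:
        await asyncio.sleep(0)   # await point: other tasks may run here
    state["y"] = 2

async def reader(state, seen):
    seen.append((state["x"], state["y"]))

async def run(yield_between):
    state = {"x": 0, "y": 0}
    seen = []
    await asyncio.gather(writer(state, yield_between), reader(state, seen))
    return seen

atomic = asyncio.run(run(False))  # no await between the writes
torn = asyncio.run(run(True))     # await between the writes
```

Without the await, the reader sees both writes together; with it, the reader runs in the gap and sees `x` updated but `y` not. Making every get and set an await point puts such a gap between every pair of operations.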

02:37 <zyp[m]> combinational signals have to preempt testbenches, whether they are handled by pyrtl or a process

02:37 <whitequark[cis]> elaborate?

02:39 <zyp[m]> code has m.d.comb += a.eq(b), testbench does yield b.eq(whatever) then yield a

02:40 <zyp[m]> or equivalently async for x in sim.changed(b): sim.set(a, x)

02:41 <whitequark[cis]> yes, of course

02:41 <whitequark[cis]> sim.set(a) where a is awaited on must be a preemption point, I don't dispute that

02:41 <whitequark[cis]> however I don't think that any other sim.set() or any sim.get() should be!

02:41 <whitequark[cis]> that's just making your life harder for no reason

02:53 <zyp[m]> hmm

02:55 <zyp[m]> so if I've got `async def foo(sim): await sim.tick(); await sim.set(x, whatever)` and `async def bar(sim): await sim.tick(); await sim.get(x)`, the implicit settle will make bar read what foo just wrote, right?

02:57 <zyp[m]> but if I've instead got async def foo(sim): await sim.tick(); await sim.set(y, something); await sim.set(x, whatever) it can be preempted between the two sets?

02:58 <Wanda[cis]> I think we can make it even stricter

02:58 <Wanda[cis]> if you have sim.set in testbench, it can get preempted by processes, but not by other testbenches

02:59 <whitequark[cis]> yes, I was just talking about this on a voice call with Isabel

02:59 <whitequark[cis]> there's actually a case where you sim.set() in one testbench and await sim.changed() on the same thing in another

03:06 <whitequark[cis]> ok, so as a first attempt, I'll make yield eq() return True if that triggered something

03:06 <whitequark[cis]> and then make the testbench wrapper only settle on Assign if it returns True

03:27 * whitequark[cis] uploaded an image: (196KiB) < https://catircservices.org/_matrix/media/v3/download/matrix.org/zKfbRAtsLaVHKUUQRhJsUAdk/Screenshot_20240323_032710.png >

03:32 <whitequark[cis]> Wanda: I guess we can think of `Print` and `Assert` as a type of testbench that lives in RTL

03:32 <whitequark[cis]> and schedule them after all other RTL processes, together with testbenches

03:33 <Wanda[cis]> ... do we get that check phase after all?

03:33 <whitequark[cis]> no

03:33 <Wanda[cis]> it's equivalent though

03:33 <whitequark[cis]> actually, not necessarily?

03:34 <whitequark[cis]> depending on how preemption ends up working (and this can actually cause comb assertions to pass/fail)

03:34 <Wanda[cis]> hm

03:34 <whitequark[cis]> so let me explain first what made me rethink it

03:35 <Wanda[cis]> "testbenches are (or, should be) only scheduled if the set of runnable processes (incl. RTL) is empty": true or false?

03:36 <whitequark[cis]> so you know how CXXRTL works right? you have an eval/commit loop. I think that is actually a perfectly fine design and requires no adjustment, for the following reason: the eval/commit loop is a way of updating shared mutable state that has internal side effects semantically represented by parallel processes

03:36 <whitequark[cis]> the eval/commit loop ensures that the side effects are fully deterministic

03:36 <whitequark[cis]> (let's ignore $print and $check here for a second)

03:38 <whitequark[cis]> outside of the core eval/commit loop, you have the testbenches, usually one in CXXRTL, but you may easily want more than that. the testbenches do not participate in the eval/commit loop and their execution isn't necessarily deterministic; they serialize in whichever way they happen to be written

03:40 <whitequark[cis]> the fact that the stuff inside the eval/commit loop looks like "process triggered by async event" and the stuff outside the eval/commit loop looks like "process triggered by async event" is, I think, a red herring

03:40 <whitequark[cis]> they're fundamentally different kinds of entities

03:42 <whitequark[cis]> the former is an implementation detail of how netlists are translated to C++. logically, you give the circuit a map {input=>value} and you get back a map {output=>value}; the circuit is a function in discrete time. you could even make it a pure function if you do ({input=>value},state)->({output=>value},state')
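
The "circuit as a pure function" view can be shown with a toy example. The 4-bit counter below is hypothetical and not taken from the discussion; it only illustrates the `({input=>value}, state) -> ({output=>value}, state')` shape.

```python
def counter_step(inputs, state):
    """One clock edge of a hypothetical 4-bit counter with an enable input."""
    outputs = {"count": state["count"]}
    if inputs["en"]:
        next_state = {"count": (state["count"] + 1) % 16}
    else:
        next_state = {"count": state["count"]}
    return outputs, next_state

# Drive three enabled cycles; nothing is mutated, state is threaded through.
state = {"count": 0}
trace = []
for _ in range(3):
    outputs, state = counter_step({"en": True}, state)
    trace.append(outputs["count"])
```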

03:43 <whitequark[cis]> the latter is an inalienable part of the programming model of testbenches

03:44 <whitequark[cis]> the way I've been originally planning to integrate CXXRTL with Amaranth was by introducing a "reactor". I haven't actually added one because I wasn't sure of the design, but now I'm getting more confident in what it should be

03:47 <whitequark[cis]> the CXXRTL reactor would get a bunch of CXXRTL modules, wire them up together (so that you can compose big modules out of smaller modules with separate compilation), let you add a clock driver (mostly just for performance reasons), and then let you wait on netlist wires

03:49 <whitequark[cis]> new blackboxes actually depend on building the reactor, since you want your blackbox to be invoked on a clock and gated with some enable, and you don't want to manually run it from the toplevel

03:52 <whitequark[cis]> the reasons I haven't added a reactor were mainly: (a) I did not know how to wire together separate cxxrtl::modules efficiently, since you could have arbitrary comb connections and I wanted to avoid injecting delta cycles as much as possible, and (b) it wasn't clear exactly what the concurrency semantics would be otherwise

03:56 <whitequark[cis]> it seemed to me like there would need to be one big event loop that both cxxrtl::modules register themselves in, and testbenches/blackboxes do, and they'd just all kind of mash together. but that's unsatisfying because of clock gating: a clock gating primitive like BUFGCE is one of the few places where you really, unavoidably care about ordering of events in zero physical time

04:07 <whitequark[cis]> (back)

04:12 <whitequark[cis]> the way I think about it now is: the reactor is the boundary between the deterministic and the nondeterministic. everything inside is a bunch of mutable state that's guarded by the eval/commit loop guarantees, and which should be completely deterministic (provably so if you don't allow arbitrary code in eval()). everything outside is the nondeterministic world with I/O, testbenches, synchronization via mutexes, channels, whatever other means, which the reactor notifies about state changes via subscriptions

04:13 <whitequark[cis]> actually "reactor" at this point is a misnomer because it shouldn't even have an event loop (the eval/commit loop isn't an event loop), it's just an efficient, imperative interface over what's naturally a pure function

04:17 <whitequark[cis]> so you have the "reactor" (i'll keep calling it that to hopefully reduce confusion) which tells you when stuff changed, and a scheduler of some sort which actually calls back into the outside world, as two separate modules

04:19 <whitequark[cis]> both Print and Assert are IO but they're embedded within RTL, creating this confusing arrangement that was the impetus for trying to add a check phase

04:21 <whitequark[cis]> consider a module like this:

04:21 * whitequark[cis] sent a code block: https://catircservices.org/_matrix/media/v3/download/catircservices.org/SaAbWLvHXDIETZXAXXwKRNKy

04:21 <whitequark[cis]> * ```... (full message at <https://catircservices.org/_matrix/media/v3/download/catircservices.org/gMHIyUHgrjaZyAmKzuOhrHyV>)

04:22 <whitequark[cis]> (imagine that m is translated into one piece of code and BUF is translated into another piece of code; doesn't matter what the simulator details are)

04:23 <whitequark[cis]> it's actually not enough to have a check phase within the step function for m, because to correctly process this, you need to consider that there may be comb feedback through the outside of m. in this case, the "reactor" would (in practice) run delta cycles for both m and BUF to fixpoint, and only after that the assertion would be checked

04:24 <whitequark[cis]> i.e.: you can't do `void step() { do eval(); while(commit()); check(); }` in m and BUF separately

04:26 <whitequark[cis]> you have to do uh... `bool step() { bool changed = false; do eval(); while(commit() && (changed = true)); return changed; }` and `void fixpoint() { do ; while(m.step() || BUF.step()); m.check(); BUF.check(); }`

04:47 <whitequark[cis]> anyway, I think that instead of a check phase, CXXRTL should have a scheduler, and schedule impure things like comb assert and print together with IO and testbenches and away from the deterministic core. (actually I suspect that clocking should work in kind of some similar way, but I'm too tired right now to think about it)

04:47 Guest68 has joined #amaranth-lang

04:47 Guest68 has quit [Client Quit]

04:48 <whitequark[cis]> all of the above applies to the Amaranth simulator of course
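
The joint fixpoint described above can be sketched in Python. The linked code block is not reproduced in the log, so the two modules here are an assumed stand-in: `m` drives `a` from `inp` and asserts `a == b`, while `BUF` drives `b` from `a`. The `Module` class and its methods are hypothetical, not CXXRTL or Amaranth API.

```python
class Module:
    def __init__(self, func, check=None):
        self.func = func                        # comb logic: wires -> updates
        self.check = check or (lambda wires: None)
        self.pending = {}

    def eval(self, wires):
        self.pending = self.func(wires)

    def commit(self, wires):
        changed = False
        for name, value in self.pending.items():
            if wires[name] != value:
                wires[name] = value
                changed = True
        return changed

    def step(self, wires):
        """Run this module's delta cycles; report whether anything changed."""
        changed = False
        while True:
            self.eval(wires)
            if not self.commit(wires):
                return changed
            changed = True

def fixpoint(modules, wires):
    # The list forces every module to step in each round before any() looks.
    while any([mod.step(wires) for mod in modules]):
        pass
    for mod in modules:                         # checks run only after convergence
        mod.check(wires)

results = []
m = Module(lambda w: {"a": w["inp"]},
           check=lambda w: results.append(w["a"] == w["b"]))
buf = Module(lambda w: {"b": w["a"]})
wires = {"inp": 1, "a": 0, "b": 0}
fixpoint([m, buf], wires)
```

Checking inside `m.step()` alone would observe `a == 1` while `b` is still 0, which is the transient failure that a per-module check phase cannot avoid.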

05:05 <whitequark[cis]> so the way I imagine things working is: the core of the simulator is just a passive chunk of state. outside of it, there is a scheduler that runs testbenches in whichever way and order seem necessary. testbench doing sim.get() just reads a bit of the state; testbench doing sim.set() writes a bit of the state, potentially kicking off side effects within and as such affecting other bits of the state, which in turn potentially alters readiness of other testbenches (but does not directly run them). testbenches include ones added with add_testbench and also combinatorial Print/Assert

05:08 <whitequark[cis]> like... for all i know, testbenches could be scheduled by asyncio itself? that seems silly to do in most cases, but conceptually it would work

05:13 <whitequark[cis]> zyp: I think `sim.get()` has no need to be awaitable under any circumstances, and `sim.set()` shouldn't be a preemption point for either `add_process` (no need to, as everything's double buffered) or `add_testbench` (no need to, as it isn't a preemption point), so doesn't need to be awaitable either

05:14 <whitequark[cis]> * zyp: I think `sim.get()` has no need to be awaitable under any circumstances, and `sim.set()` shouldn't be a preemption point for either `add_process` (no need to, as everything's double buffered) or `add_testbench` (no need to, as preemption is undesirable in this context), so doesn't need to be awaitable either

05:19 <zyp[m]> so what happens when a testbench sets one signal and reads another that has a combinatorial path between them? when does it get updated if there's no await statements involved during or between those operations?

05:20 <whitequark[cis]> sim.set() runs eval/commit until converged

05:20 <zyp[m]> and if you've got a combinatorial add_process?

05:21 <whitequark[cis]> sim.set() does two different things in add_process() (where it just sets next) and add_testbench() (where it runs delta cycles)

05:21 <whitequark[cis]> if you're in the outer loop handling testbenches, sim.set() runs the inner loop; if you're in the inner loop, it schedules a pending change for the next loop iteration
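
[Editorial sketch, not part of the conversation.] A minimal model of the two behaviours of `sim.set()` described above, assuming a double-buffered state and a flag that records whether the inner (delta cycle) loop is running. `ToySim` and its attributes are hypothetical names; unlike a real simulator, this toy reruns every process on every delta cycle instead of only the ones whose inputs changed.

```python
class ToySim:
    def __init__(self, init, processes):
        self.cur = dict(init)    # values visible to get()
        self.nxt = dict(init)    # values written by set()
        self.processes = processes
        self.in_inner_loop = False

    def get(self, name):
        return self.cur[name]

    def set(self, name, value):
        self.nxt[name] = value
        if not self.in_inner_loop:
            # Testbench context: run delta cycles until converged.
            self._settle()
        # Process context: the write stays pending until the next commit.

    def _commit(self):
        changed = self.nxt != self.cur
        self.cur = dict(self.nxt)
        return changed

    def _settle(self):
        self.in_inner_loop = True
        try:
            while self._commit():
                for process in self.processes:
                    process(self)
        finally:
            self.in_inner_loop = False


def inverter(sim):   # behaves like a comb process: b = ~a
    sim.set("b", 1 - sim.get("a"))

def follower(sim):   # behaves like a comb process: c = b
    sim.set("c", sim.get("b"))

sim = ToySim({"a": 0, "b": 1, "c": 1}, [inverter, follower])
sim.set("a", 1)             # called from a "testbench": settles before returning
b_after = sim.get("b")      # no await needed between set() and get()
c_after = sim.get("c")
```

Here the testbench-side `set()` returns only after `b` and `c` have followed `a`, which answers zyp's question about a combinatorial path between a written and a read signal in this simplified model.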

05:23 <zyp[m]> hmm, maybe, I'm too tired to picture how everything fits together (why am I even awake)

05:23 <whitequark[cis]> it's actually quite easy to implement

05:32 <Wanda[cis]> yeah that's how I pictured all of it, too

05:33 <Wanda[cis]> neither get nor set needs to actually be a coroutine

05:33 <Wanda[cis]> (or awaitable)

05:33 <whitequark[cis]> this design I described above actually makes sense

05:59 <whitequark[cis]> zyp: ah, there is one consequence of changing things like this

05:59 <whitequark[cis]> I think add_process maybe shouldn't have Delay anymore

06:00 <whitequark[cis]> I can actually justify it without invoking the simulator implementation. add_process (going forward) is intended for behavioral replacement of RTL constructs, and RTL constructs can't do anything like Delay. so it should be made equal in power to it.

06:02 <zyp[m]> it's useful if you want to simulate phase shifting/delay primitives

06:02 <whitequark[cis]> yes, I know

06:02 <whitequark[cis]> you can still do that using add_testbench

06:03 <whitequark[cis]> (really, add_process ~ add_rtl_process, add_testbench ~ add_io_process)

06:43 <_whitenotifier-6> [amaranth] whitequark opened pull request #1231: Only preempt simulator testbenches on explicit wait points - https://github.com/amaranth-lang/amaranth/pull/1231

06:43 <whitequark[cis]> https://github.com/amaranth-lang/amaranth/pull/1231

06:43 <_whitenotifier-5> [amaranth] whitequark edited pull request #1231: Only preempt simulator testbenches on explicit wait points - https://github.com/amaranth-lang/amaranth/pull/1231

07:11 <_whitenotifier-5> [amaranth] codecov[bot] commented on pull request #1231: Only preempt simulator testbenches on explicit wait points - https://github.com/amaranth-lang/amaranth/pull/1231#issuecomment-2016390226

07:33 <_whitenotifier-6> [amaranth] whitequark opened pull request #1232: Allow visualizing delta cycles in VCD dumps - https://github.com/amaranth-lang/amaranth/pull/1232

07:34 <_whitenotifier-6> [amaranth] whitequark edited pull request #1232: Allow visualizing delta cycles in VCD dumps - https://github.com/amaranth-lang/amaranth/pull/1232

07:35 <_whitenotifier-5> [amaranth] codecov[bot] commented on pull request #1232: Allow visualizing delta cycles in VCD dumps - https://github.com/amaranth-lang/amaranth/pull/1232#issuecomment-2016396442

07:52 <_whitenotifier-5> [amaranth] whitequark edited pull request #1232: Allow visualizing delta cycles in VCD dumps - https://github.com/amaranth-lang/amaranth/pull/1232

08:17 <_whitenotifier-6> [amaranth] whitequark edited pull request #1231: Only preempt simulator testbenches on explicit wait points - https://github.com/amaranth-lang/amaranth/pull/1231

09:24 <_whitenotifier-5> [amaranth] whitequark opened pull request #1233: Dump simulation testbench commands as VCD waveforms - https://github.com/amaranth-lang/amaranth/pull/1233

09:27 <_whitenotifier-6> [amaranth] codecov[bot] commented on pull request #1233: Dump simulation testbench commands as VCD waveforms - https://github.com/amaranth-lang/amaranth/pull/1233#issuecomment-2016424455

09:45 <whitequark[cis]> zyp: Wanda: I think I discovered a serious issue with the simulator, namely, the `read_stream` is unimplementable with either the old or the new interface

09:45 <whitequark[cis]> * new interface when using `add_testbench`

09:45 <whitequark[cis]> consider this:

09:45 * whitequark[cis] uploaded an image: (193KiB) < https://catircservices.org/_matrix/media/v3/download/matrix.org/hKupWWFEGjfYGdYcKcUhoqwy/Screenshot_20240323_094253.png >

09:46 <whitequark[cis]> the stream protocol requires doing a very specific thing for it to work correctly: values of signals valid and payload must be sampled at exactly the posedge of the associated clock

09:47 <whitequark[cis]> in fact, i think that at all other times, the values of those signals can be completely undefined?

09:48 <whitequark[cis]> anyway, on the screenshot, get_testbench is reading from the dac_stream. it awaited tick(), and got back control with a state where post-tick() propagation has already finished

09:48 <galibert[m]> isn't that true of every synchronous signal? The only state that matters is at posedge?

09:49 <whitequark[cis]> so it gets a weird intermediate value as a result, where the x coordinate is from the next point, and the rest of payload is from the previous one

09:49 <whitequark[cis]> now, consider a slightly different arrangement (I swapped add_testbench calls) that "works":

09:50 * whitequark[cis] uploaded an image: (109KiB) < https://catircservices.org/_matrix/media/v3/download/matrix.org/SGidvInFwtCYfNeMwYRLwYia/Screenshot_20240323_095005.png >

09:51 <whitequark[cis]> here, get_testbench is also sampling a weird and wrong intermediate value. however, because it runs after put_testbench (note: the gateware has a comb assignment of valid from the stream put_testbench writes to, to the stream get_testbench reads from), and nothing else runs in that cycle, this weird intermediate value happens by chance to match the value at the next posedge, and things "work"

09:52 <whitequark[cis]> except that get_testbench is actually getting the values out almost one cycle earlier than it should be able to have them

09:53 <whitequark[cis]> this is completely broken! we can't ship this

09:55 <whitequark[cis]> zyp: I propose altering the domain trigger object in this way: `await tick(sample=[stream.ready, stream.valid])` returns the values of those signals at the moment of the tick, and not at a later point; the same for `await tick().until()`, the condition is evaluated at the moment of the tick

09:57 <galibert[m]> Easier than collating the writes

10:03 <whitequark[cis]> actually, I'm going to suggest something even more radical. await tick(*sample, domain="sync", context=None)

10:04 notgull has joined #amaranth-lang

10:05 Guest26 has joined #amaranth-lang

10:06 <whitequark[cis]> here's why. right now, sim.get() has two completely different behaviors depending on the context where it's called from. if it's in add_process it gives you the value exactly at the wait point (e.g. right at the moment when await tick() fired), even though values may have changed since then; if it's in add_testbench it gives you the current value, and there's no way to get the value at the wait point

10:06 <whitequark[cis]> this sucks. it's extremely confusing, and it's why we need the weird @testbench_helper functionality

10:06 <whitequark[cis]> what if: we banned using sim.get() in add_process processes?

10:07 <whitequark[cis]> then the only ways to sample values become: await changed(*sample) and await tick(*sample), corresponding to a comb and a sync process

10:09 <whitequark[cis]> and, because await tick() doesn't actually behave any different in add_process vs add_testbench, we no longer need to differentiate the context for the helper functions, because you can write them in a way that they just work universally

10:11 <whitequark[cis]> (maybe await tick(domain="sync", *, context=None).sample(*signals) is a bit more palatable; same general idea. also it needs an __aiter__.)

10:14 <whitequark[cis]> now the difference between add_process and add_testbench becomes: add_process is race-free but no sim.get() or sim.delay() (I think delay has to go from add_process after all); add_testbench can race (with other testbenches only) but you can do sim.get() or sim.delay(); it is recommended to do any kind of IO via add_testbench and only use add_process for behavioral models that are self-contained and don't

10:14 <whitequark[cis]> have any shared non-signal state with anything else (because add_process is only race-free in regards to signals)

10:21 <galibert[m]> Doesn’t that mean testbench is active and process passive?

10:21 <whitequark[cis]> yeah I think so

10:22 <galibert[m]> And can you do combinatorial in process? IIRC ack on wishbone tends to be comb?

10:29 <whitequark[cis]> yes

10:29 <galibert[m]> that sounds cool

10:32 Guest26 has quit [Ping timeout: 250 seconds]

10:45 <whitequark[cis]> ok, so I prototyped await tick().sample() (with the old interface and a bunch of really gross, unupstreamable hacks)

10:47 <whitequark[cis]> take a look at this

10:48 <whitequark[cis]> here, get_testbench is waiting on a Tick() until the async trigger

10:48 * whitequark[cis] uploaded an image: (43KiB) < https://catircservices.org/_matrix/media/v3/download/matrix.org/vosNPGmTTnpBrgLOlurdmpYA/Screenshot_20240323_104742.png >

10:48 <whitequark[cis]> right after the posedge, it samples the signals it was asked to

10:48 * whitequark[cis] uploaded an image: (48KiB) < https://catircservices.org/_matrix/media/v3/download/matrix.org/csuUsqhXVROTOAONvZTxejcU/Screenshot_20240323_104832.png >

10:49 <whitequark[cis]> after that it waits for the simulation to settle

10:49 * whitequark[cis] uploaded an image: (40KiB) < https://catircservices.org/_matrix/media/v3/download/matrix.org/CzSRnhCdOedCnYPFSszYDosY/Screenshot_20240323_104902.png >

10:49 <whitequark[cis]> and finally after settling it returns control to the user process

10:49 * whitequark[cis] uploaded an image: (48KiB) < https://catircservices.org/_matrix/media/v3/download/matrix.org/sSVuEoIKwAIljgHQdmzZKkvK/Screenshot_20240323_104933.png >

10:50 <whitequark[cis]> this approach has actually resolved the race condition that existed in this design; any ordering of get_testbench and put_testbench produces more or less the same behavior (they differ slightly at the very beginning)
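
[Editorial sketch, not part of the conversation.] The sequence walked through above (capture at the edge, let the simulation settle, then return control) can be illustrated with a synchronous toy. `ToyClockedSim` is hypothetical, and the real proposal is an awaitable `tick().sample(...)`; this only shows why the sampled value differs from what `get()` returns once control is back in the testbench.

```python
class ToyClockedSim:
    def __init__(self):
        # One register (counter) and one comb output derived from it (payload).
        self.state = {"counter": 0, "payload": 0}

    def get(self, name):
        return self.state[name]

    def tick(self, *sample):
        # 1. Capture the requested signals as they are at the clock edge.
        sampled = tuple(self.state[name] for name in sample)
        # 2. The edge updates the register, then comb logic settles.
        self.state["counter"] += 1
        self.state["payload"] = self.state["counter"] * 10
        # 3. Only now does control return to the caller.
        return sampled


sim = ToyClockedSim()
sim.tick()                           # counter=1, payload=10
(at_edge,) = sim.tick("payload")     # value a synchronous consumer would see
after_settle = sim.get("payload")    # value after post-tick propagation
```

A stream consumer that used `after_settle` would be reading data one cycle early, which is the failure mode described for `get_testbench` earlier in the log.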

12:04 <galibert[m]> Nice

12:06 <galibert[m]> That’s going to be so useful

12:52 jn__ is now known as jn

15:16 <_whitenotifier-6> [rfcs] whitequark opened pull request #64: Amend RFC #36 with a concrete concurrency model - https://github.com/amaranth-lang/rfcs/pull/64

15:17 <whitequark[cis]> zyp: Wanda: I wrote an amendment for RFC 36 https://github.com/amaranth-lang/rfcs/pull/64/files

15:19 <galibert[m]> What would be a sane way to replace an Instance of a hard block with something that simulates it when you're doing simulation? I'm thinking replacing my instances of m10k in true dual port mode by a standard Memory for sim

15:20 <whitequark[cis]> that sounds reasonable

15:20 <_whitenotifier-6> [rfcs] whitequark commented on pull request #64: Amend RFC #36 with a concrete concurrency model - https://github.com/amaranth-lang/rfcs/pull/64#issuecomment-2016523254

15:21 <galibert[m]> how should I detect that I'm in simulation to do that?

15:21 <whitequark[cis]> look at the platform

15:21 <galibert[m]> None?

15:21 <whitequark[cis]> I think it's None in simulation currently
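
[Editorial sketch, not part of the conversation.] The suggestion above amounts to branching on `platform` inside `elaborate`. The class below is hypothetical and returns placeholder strings instead of real Amaranth objects so that it runs without Amaranth installed; in a real design the two branches would build a behavioural memory and an `Instance` of the hard block respectively. As noted in the log, `platform` being `None` in simulation is current behaviour that may change.

```python
class DualPortRam:
    """Hypothetical wrapper around a vendor dual-port RAM block."""

    def elaborate(self, platform):
        if platform is None:
            # No platform: assume simulation and use a behavioural stand-in.
            return "behavioural memory model"
        # A real platform was supplied: use the vendor primitive.
        return "instance of the vendor hard block"


sim_choice = DualPortRam().elaborate(None)
hw_choice = DualPortRam().elaborate(object())  # any non-None platform stand-in
```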

15:21 <galibert[m]> Ok, that's going to change iirc, but not right now

15:24 balrog_ is now known as balrog

16:15 <tpw_rules> whitequark[cis]: should DUID's increment have a lock? what if you're somehow multithreading signal construction in a test suite or so?

16:15 <whitequark[cis]> we should just get rid of those entirely

16:16 <tpw_rules> ok, i was about to use it as a name source for https://github.com/amaranth-lang/amaranth/issues/1223

16:17 <tpw_rules> but it looks like the only thing that uses it is SignalKey which only checks for it on a Signal

16:18 <tpw_rules> does removing it need to go through deprecation or can it just be yanked out as part of that?

16:18 <whitequark[cis]> it's not public

16:19 <tpw_rules> how do the $nnn suffixes get added when two names are the same? is that amaranth or yosys?

16:19 <whitequark[cis]> RTLIL backend I think

16:20 <tpw_rules> ok i'll yank it out

16:22 <whitequark[cis]> careful with SignalKey though, if you replace it with e.g. id() it may introduce nondeterminism

16:22 <tpw_rules> yes. i need equivalent functionality for the signal names so i was thinking of just migrating it into Signal for now

16:23 <whitequark[cis]> why do you need that for signal names?

16:23 <tpw_rules> it seemed an easy way to name private signals

16:24 <whitequark[cis]> I don't think that's a good way to do it

16:24 <tpw_rules> would they only be named by a backend and keep a name of "" in .name?

16:24 <whitequark[cis]> the RTLIL backend should just have a counter to emit $nnn

16:24 <whitequark[cis]> like it already emits suffixes

16:24 <whitequark[cis]> tpw_rules: yeah

16:24 <tpw_rules> okay, sounds good

16:24 <whitequark[cis]> skipped for VCD and stuff

16:25 <tpw_rules> so they wouldn't be written to a vcd either? i thought you just wanted a flag which disabled showing them

16:25 <whitequark[cis]> um

16:26 <whitequark[cis]> how would that even work?

16:26 <tpw_rules> i hadn't gotten to that part yet

16:26 <whitequark[cis]> VCD has no such functionality

16:26 <whitequark[cis]> and why write stuff you'll never show, anywau

16:26 <tpw_rules> are there other waveform viewers that need to check?

16:26 <whitequark[cis]> s/anywau/anyway/

16:27 <whitequark[cis]> naw

16:37 <tpw_rules> should the repr as just "(sig )"? or "(sig <private>)"?

16:37 <tpw_rules> they*

16:39 <whitequark[cis]> `(sig)` seems fine

16:39 <tpw_rules> ok, the current formatting would have a space after "sig" but i can remove that

16:49 <Wanda[cis]> yeah the DUID stuff needs to go, but removing it is tricky, and I don't think it belongs to the same patchset

16:49 <Wanda[cis]> what you need to do is essentially just allowing empty name, then filtering such signals in VCD writer and in the naming pass in hdl._ir.Design

16:50 <Wanda[cis]> the RTLIL backend needs no changes at all, I think

16:50 <Wanda[cis]> it's already made to handle anonymous nets

16:50 <Wanda[cis]> (the ones that happen all the time when you just pass the result of an operator directly to another one)

16:52 <Wanda[cis]> actually you don't even need to allow empty name, it's already allowed because we don't bother to check the name other than "it's a string"

16:52 <tpw_rules> currently it has the same meaning as None there to Signal

16:53 <Wanda[cis]> oh

16:53 <tpw_rules> so it will get the variable name

16:53 <Wanda[cis]> we check by boolean?

16:53 <Wanda[cis]> yeaah that needs a fix

16:53 <tpw_rules> yes, i did already

16:55 <tpw_rules> what do you mean by filter in the naming pass? i currently just taught the rtlil pass to treat a name of "" as None and then give it a name there

16:56 <Wanda[cis]> have you looked at the Design class?

16:57 <Wanda[cis]> and the naming pass in it?

16:57 <tpw_rules> yes, it looks like i can modify _add_name

16:58 <Wanda[cis]> no

16:58 <Wanda[cis]> _add_name should not be called for these signals at all

16:58 <Wanda[cis]> because they don't have names

16:59 <Wanda[cis]> you need a very simple change

17:00 <tpw_rules> ok, let me digest the logic a little further

17:00 <Wanda[cis]> _assign_names, the loop for signal in frag_info.used_signals:

17:00 <Wanda[cis]> just... continue here if the signal's name is ""

17:00 <tpw_rules> yes, i was just looking there. just skip signals without names there?

17:00 <Wanda[cis]> yup

17:00 <Wanda[cis]> and you don't need any change in the RTLIL backend
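
[Editorial sketch, not part of the conversation.] The change being described is a single `continue` in the loop over used signals. The code below is a stand-alone model of that idea, not the actual `_assign_names` from `hdl._ir`; `ToySignal` and `assign_names` are hypothetical names.

```python
class ToySignal:
    def __init__(self, name):
        self.name = name


def assign_names(used_signals):
    """Build a signal -> name mapping, leaving private signals out."""
    names = {}
    for signal in used_signals:
        if signal.name == "":
            continue  # private signal: it gets no entry in the name mapping
        names[signal] = signal.name
    return names


data = ToySignal("data")
scratch = ToySignal("")          # private (anonymous) signal
names = assign_names([data, scratch])
```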

17:00 <tpw_rules> okay. how do anonymous nets end up getting processed through here?

17:00 <Wanda[cis]> through where?

17:01 <Wanda[cis]> it's a naming pass, if a net is anonymous, it's none of its business

17:01 <tpw_rules> the Design class. i guess nets end up being Values or something, not Signals?

17:01 <Wanda[cis]> no

17:01 <Wanda[cis]> that happens later, in NIR converter

17:02 <Wanda[cis]> and _nir.Net / _nir.Value are anonymous by design

17:02 <Wanda[cis]> names are external annotations

17:02 <tpw_rules> so frag_info.used_signals is exactly the list of Signal objects the user has used in their design

17:02 <Wanda[cis]> and for anonymous signals we simply don't emit them

17:03 <Wanda[cis]> tpw_rules: in the particular fragment even, not in the design

17:03 <tpw_rules> okay. i guessed that signals might be things which are not Signals, but that's where net comes in. that change makes sense then, thank you

17:04 <Wanda[cis]> so how it works is

17:04 <Wanda[cis]> the NIR nets / values are completely nameless

17:04 <Wanda[cis]> and the names are outside

17:05 <Wanda[cis]> the pre-NIR `_assign_names` pass creates a mapping of (fragment, signal) -> name

17:05 <Wanda[cis]> then while constructing the NIR we also construct a dict of signal -> NIR value

17:06 <Wanda[cis]> and we copy the earlier mapping from _assign_names to NIR

17:06 <tpw_rules> is it possible for a private Signal to end up as a port?

17:06 <Wanda[cis]> then comes the RTLIL backend. the RTLIL backend is mainly concerned about emitting actual NIR, not the names, so it'll construct an anonymous (ie. auto-named) wire for every cell output

17:07 <Wanda[cis]> but it will then also emit a wire for every signal present in the name dictionary for a given fragment, and assign the raw anon value to it

17:08 <Wanda[cis]> with a further optimization that if a wire needed for NIR cell output exactly matches a signal, we'll use the signal wire instead of emitting an anonymous one

17:08 <Wanda[cis]> tpw_rules: yes, and in this case you get an auto-named port

17:12 <tpw_rules> so used_io_ports exclusively contains IOPort

17:12 <Wanda[cis]> ... as the name suggests, yes

17:14 <tpw_rules> oh i see, ports already are all auto-named anyway, it's a later pass that puts the names back on

17:14 <Wanda[cis]> Verilog-level port creation is done in separate passes at the end of _ir.py

17:15 <Wanda[cis]> _compute_ports specifically

17:15 <Wanda[cis]> for the ports that are values (as opposed to IO values)

17:15 <Wanda[cis]> they're not all auto-named, we reuse a signal name if one happens to match

17:16 <tpw_rules> more directly, why don't we have to check for a signal with a private name in the "Reserve names for top-level ports" loop

17:16 <Wanda[cis]> oh hm.

17:17 <Wanda[cis]> actually that's a good idea

17:17 <Wanda[cis]> oh wait, no

17:17 <Wanda[cis]> we don't

17:18 <Wanda[cis]> at this point, top-level port names are already decided

17:18 <tpw_rules> it looks like a port name can only come from the ports= argument of prepare? which only come from a Platform.build call or similar?

17:18 <Wanda[cis]> not quite

17:18 <Wanda[cis]> port names passed to ports= can be None, which means "name them from the signal"

17:18 <Wanda[cis]> see _assign_port_names

17:18 <Wanda[cis]> there we should have the check for anon signal

17:19 antoinevg[m] has quit [Quit: Idle timeout reached: 172800s]

17:22 <tpw_rules> yeah it is possible to break the RTLIL builder if you do something weird like `ports={"": (rx.err, PortDirection.Output)}`. does that get filed under "you get what you deserve"?

17:23 <tpw_rules> (passing that through the CLI to Platform.build)

17:23 <Wanda[cis]> it kinda does, but also we do want a better diagnostic for it

17:23 <Wanda[cis]> _assign_port_names would be a good place to make sure all port names make sense

17:23 <tpw_rules> uh and actually fixing that for loop does not fix that problem

17:25 <Wanda[cis]> also: we need to check that all our names are actually valid RTLIL identifiers

17:25 <Wanda[cis]> (ie. don't contain whitespace)

17:25 <Wanda[cis]> which should probably be in Signal / IOPort / ... constructor

17:26 <tpw_rules> i think that's a later patch, i also have the issue for accidentally doing that to module names

17:26 <Wanda[cis]> yeah

17:31 <_whitenotifier-6> [amaranth] wanda-phi reviewed pull request #1231 commit - https://github.com/amaranth-lang/amaranth/pull/1231#discussion_r1536666314

17:33 notgull has quit [Ping timeout: 260 seconds]

17:35 <tpw_rules> so, i don't even need to do anything to the VCD writer now?

17:35 notgull has joined #amaranth-lang

17:36 <Wanda[cis]> hmmmm

17:36 <Wanda[cis]> reject anonymous signals in traces?

17:36 <Wanda[cis]> for main VCD, the namer change will handle it already

17:36 <tpw_rules> yeah, i was thinking that too actually

17:38 <tpw_rules> yeah it currently tries to write them out but i think it makes a syntax error

17:38 <tpw_rules> is there a way to grab more useful info than "you attempted to trace a private signal"? it doesn't repr to anything useful

17:39 <Wanda[cis]> not really

17:39 <Wanda[cis]> they are anonymous after all

17:39 <Wanda[cis]> but, well, the place where one passes traces= is pretty obvious, and you don't usually have that many signals in it

17:39 <Wanda[cis]> so it's not like you have to go looking through the entire hierarchy or anything

17:41 <tpw_rules> ok. if users complain later maybe we can auto-name them

17:41 <_whitenotifier-5> [amaranth] wanda-phi reviewed pull request #1231 commit - https://github.com/amaranth-lang/amaranth/pull/1231#discussion_r1536667774

17:48 <_whitenotifier-6> [amaranth] wanda-phi reviewed pull request #1231 commit - https://github.com/amaranth-lang/amaranth/pull/1231#discussion_r1536668736

17:49 <_whitenotifier-6> [amaranth] wanda-phi reviewed pull request #1231 commit - https://github.com/amaranth-lang/amaranth/pull/1231#discussion_r1536668861

17:58 <_whitenotifier-5> [amaranth] tpwrules opened pull request #1234: Allow name of "" to denote a private Signal - https://github.com/amaranth-lang/amaranth/pull/1234

17:59 <tpw_rules> Wanda[cis]: i forget what you said your status was on https://github.com/amaranth-lang/amaranth/issues/1100 but it was a problem for the quartus fix. is there a way i could do this or collaborate with you on it? i am not really familiar with RTLIL at all

17:59 <Wanda[cis]> the code is done, it now needs tests

17:59 <Wanda[cis]> lots of tests

18:00 <Wanda[cis]> my intent is to work on them as soon as I'm done with reviewing Cat's PRs

18:00 <_whitenotifier-5> [amaranth] codecov[bot] commented on pull request #1234: Allow name of "" to denote a private Signal - https://github.com/amaranth-lang/amaranth/pull/1234#issuecomment-2016562482

18:01 <tpw_rules> okay, sounds good. thanks again

18:02 <_whitenotifier-6> [amaranth] wanda-phi reviewed pull request #1234 commit - https://github.com/amaranth-lang/amaranth/pull/1234#discussion_r1536670659

18:05 <_whitenotifier-5> [amaranth] wanda-phi reviewed pull request #1234 commit - https://github.com/amaranth-lang/amaranth/pull/1234#discussion_r1536670992

18:16 <_whitenotifier-5> [amaranth] tpwrules reviewed pull request #1234 commit - https://github.com/amaranth-lang/amaranth/pull/1234#discussion_r1536672306

18:17 <_whitenotifier-5> [amaranth] tpwrules reviewed pull request #1234 commit - https://github.com/amaranth-lang/amaranth/pull/1234#discussion_r1536672348

18:33 <_whitenotifier-5> [amaranth] wanda-phi reviewed pull request #1234 commit - https://github.com/amaranth-lang/amaranth/pull/1234#discussion_r1536674341

20:10 <Wanda[cis]> Catherine: so regarding non-RFC 64

20:11 <Wanda[cis]> the core concurrency model is sound and I agree with it

20:12 <Wanda[cis]> I find .sample() useful too, and I consider it interesting how we're reinventing systemverilog program (aka our testbench) and now clocking input variables

20:13 <Wanda[cis]> however, I'm very much not sold on two things

20:13 <Wanda[cis]> add_process functionality reduction and the proposed interaction of until / repeat with sample

20:15 <Wanda[cis]> I believe we explicitly wanted delay in processes, so that we can implement delay lines

20:16 <_whitenotifier-6> [amaranth] tpwrules reviewed pull request #1234 commit - https://github.com/amaranth-lang/amaranth/pull/1234#discussion_r1536687061

20:16 <Wanda[cis]> it may be the case that we want to provide the functionality in a different way (.set with a delay?), but removing it entirely goes directly against the discussion on RFC 36

20:20 <Wanda[cis]> as for removing sim.get: it makes things weirdly asymmetric, and prevents you from implementing some FF types easily

20:21 <Wanda[cis]> you also say that sim.set can be used to copy the value of a signal without looking at it, but that's not what the RFC says

20:21 <Wanda[cis]> the RFC says that set argument needs to be int, or ShapeCastable.const-convertible

20:23 <Wanda[cis]> (this also means the DDR register example in the RFC was invalid all along, which, oops)

20:23 <Wanda[cis]> * (this also means the DDR register example in the RFC has been invalid all along, which, oops)

20:24 <Wanda[cis]> further: .sample cannot be combined with .edge, which is a major problem given sim.get removal in process — how would you amend the DDR buffer example to add clock enable or sync reset, for example?

20:25 <Wanda[cis]> also: we now have no way at all to read memory in processes

20:26 <Wanda[cis]> (we also have no way at all to watch a memory for changes, which I consider a separate defect; I may consider amending RFC 62 to add such a thing

20:26 <Wanda[cis]> * a thing)

20:27 <Wanda[cis]> overall, I believe our processes should be powerful enough to, in principle, implement all of PyRTL compiler with them

20:28 <Wanda[cis]> and the VCD writer as well

20:35 <Wanda[cis]> * add_process functionality reduction and the proposed interaction of until with sample

20:36 <Wanda[cis]> as for until, it now has very weird semantics

20:37 <Wanda[cis]> essentially the first time it performs the check, it does sim.get, then it goes through sampling

20:38 <tpw_rules> hm, does RTLIL support unicode identifiers? it says no ascii below 32 but what about above 127

20:38 <Wanda[cis]> this means that, in the absence of other testbench modifications of the condition, the first and second checks are actually the same

20:39 <Wanda[cis]> tpw_rules: it does (by not disallowing them). however, whatever toolchain ends up consuming the generated Verilog may disagree.

20:40 <tpw_rules> Wanda[cis]: ok, i'm trying to come up with identifier rules. so chars 32-126 are okay, must not start with a \ or $ ?

20:40 <Wanda[cis]> starting with \ or $ is perfectly fine

20:40 <Wanda[cis]> chars 33-126 (not 32-126) are okay

20:41 <Wanda[cis]> as for 128+ ... I think they should be allowed too

20:41 <tpw_rules> oh, yes, 32 is space

20:41 <Wanda[cis]> like, it's 2024, international identifiers are a thing

20:41 <tpw_rules> i'd be okay with that, yeah

20:41 <Wanda[cis]> and if it causes problems for some shit toolchain down the line, I believe the correct solution is adding a flag that sanitizes the verilog output, perhaps implemented in yosys

20:42 <Wanda[cis]> and set it from the platform

20:42 <tpw_rules> okay. it's also still strictly better than currently

20:43 <galibert[m]> Yeah, cursed identifier names for cursed HDL

20:43 <Wanda[cis]> I'd go for something like "is printable and not whitespace"

20:43 <Wanda[cis]> instead of looking at char codes

20:43 <Wanda[cis]> it works out to 33-126 for ascii range

20:43 <galibert[m]> testing printability in unicode is annoying

20:44 <Wanda[cis]> we have Python.

20:44 <tpw_rules> it does come with the unicode database, and i think relies on unicode rules for internal names already
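
[Editorial sketch, not part of the conversation.] The "is printable and not whitespace" rule can be written with Python's built-in `str.isprintable()` and `str.isspace()`, which consult the Unicode database. The function name is hypothetical, and this is not necessarily the check that ended up in Amaranth; the empty name (the private-signal case discussed earlier) is deliberately left to the caller.

```python
def has_only_valid_name_chars(name: str) -> bool:
    """True if every character is printable and not whitespace."""
    return all(ch.isprintable() and not ch.isspace() for ch in name)


# Which ASCII code points pass the check?
ascii_ok = [c for c in range(128) if has_only_valid_name_chars(chr(c))]

plain = has_only_valid_name_chars("rx$err")
spaced = has_only_valid_name_chars("rx err")
tabbed = has_only_valid_name_chars("rx\terr")
international = has_only_valid_name_chars("żółw")
```

For the ASCII range this works out to code points 33 through 126, matching the range given above.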

20:46 <tpw_rules> i'll fiddle with that and file a PR

20:46 <tpw_rules> not sure if that's RFC-worthy, my primary goal is fixing explosions using ""

20:46 <Wanda[cis]> it's not RFC-worthy, just do a PR

20:46 <tpw_rules> ok

20:52 <Wanda[cis]> Catherine: (continuing RFC 64 review) the `stream_recv` example is just plain broken if `valid` is already `1`, since `until.sample` will return `None` as data in that case

20:53 <Wanda[cis]> further, the example as given may tick twice with ready set to 1, eating a value

20:53 <Wanda[cis]> likewise for stream_send, the value can get sent twice

20:55 <Wanda[cis]> I think these examples just expose the fundamental brokenness of until-sample interaction, I'm not quite sure what it should be, though

21:00 <_whitenotifier-5> [amaranth] tpwrules commented on issue #1209: Zero-length submodule name breaks things - https://github.com/amaranth-lang/amaranth/issues/1209#issuecomment-2016601640

21:13 <Wanda[cis]> also, interaction of sample with repeat is... slightly weird

21:14 <Wanda[cis]> return a list of tuples of sampled signals, one tuple per tick? meh.

21:14 <Wanda[cis]> idc really

21:16 <Wanda[cis]> I also see some potential use for a symmetric feature to sample: await sim.tick(...).update(sig, new_value), which corresponds to SystemVerilog clocking output variables

21:16 <Wanda[cis]> doing this would perform the updates right after the tick, ensuring all other testbenches see the new values

21:17 <Wanda[cis]> but this is... less well-motivated than sample

21:17 <galibert[m]> why would sample return None?

21:18 <Wanda[cis]> galibert: have you read the proposed amendment?

21:18 <galibert[m]> No, only the discussion here

21:18 <Wanda[cis]> then I'd recommend doing so before engaging in discussion.

21:19 <galibert[m]> Didn't realize there was an amendment

21:19 <Wanda[cis]> I literally mentioned the RFC number

21:19 <Wanda[cis]> * the RFC PR number

21:19 <galibert[m]> oh, 64?

21:19 <Wanda[cis]> yes

21:20 <galibert[m]> thanks

21:21 <tpw_rules> Wanda[cis]: people seem to want spaces in FSM names based on https://github.com/amaranth-lang/amaranth/pull/595 which get transformed into signal names

21:23 <Wanda[cis]> ... then people should get used to names that don't contain spaces

21:23 <Wanda[cis]> the accepted character set is already ridiculously rich

21:25 <galibert[m]> Oh, that "None" if the simulation has not advanced. Why would the simulation not advance in the first place?

21:25 <tpw_rules> okay

21:26 <Wanda[cis]> like, neither VCD nor RTLIL can accept space-having names, so there's literally nothing we can do with them

21:26 <Wanda[cis]> they're also not valid in Verilog names

21:27 <galibert[m]> sim.tick().until(...).sample(...) is supposed to wait at least until the next tick, no?

21:27 <Wanda[cis]> .... okay I think VHDL names can contain spaces, so there's actually prior art for that

21:27 <Wanda[cis]> but VHDL names are also several kinds of batshit insane, so I refuse to continue that line of thought any further

21:28 <galibert[m]> There's a lot of prior art for converting spaces to underscores when needed too

21:29 <Wanda[cis]> galibert[m]: can you *please* read the RFC

21:30 <galibert[m]> I did now

21:30 <galibert[m]> until returns immediately if the condition is ok, but until is supposed to be tested after tick elapses

21:31 <Wanda[cis]> that's not what the RFC says

21:31 <Wanda[cis]> > If condition is initially true, await will return immediately without advancing simulation.

21:31 <Wanda[cis]> that's also not what the example implementation says

21:35 <galibert[m]> I don't understand the example implementation. But a version working like you say makes no sense, which may be why I didn't interpret it that way

21:36 <galibert[m]> since a simple "do a thing when that variable is on at the tick" is just not implementable

21:36 <Wanda[cis]> that is exactly why I'm complaining about it!

21:37 <galibert[m]> Ok, then I agree with you :-)

21:38 <galibert[m]> I understood it as linear composition, where each level of trigger does its thing and passes to the next one, trying again if the next one fails (recursively)

21:39 <galibert[m]> So tick() waits one tick, then calls until(), which passes or fails; if it fails, tick tries again after a tick; if it passes, sample takes the values and always passes

21:39 <galibert[m]> you could even add a changed in there

21:40 <galibert[m]> you need to have the whole chain to pass, otherwise the first one advances time and tries again

21:40 <galibert[m]> (or waits for the time to advance in the comb case)

21:40 <Wanda[cis]> that's ... not how it works

21:41 <galibert[m]> that's a pity

21:41 <galibert[m]> it made sense
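
The two readings discussed above can be contrasted with a toy model. This is plain Python, not the Amaranth simulator API: the names `chained_wait`, `immediate_wait`, and `State`, and the idea of passing the simulation in as a pre-recorded list of per-tick states, are all assumptions made for illustration. `chained_wait` follows galibert's "linear composition" description, and `immediate_wait` follows the sentence Wanda quotes.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

State = Dict[str, int]  # hypothetical: signal name -> value at one instant
Result = Optional[Tuple[int, Tuple[int, ...]]]  # (ticks elapsed, sampled values)


def chained_wait(initial: State, ticks: List[State],
                 until: Callable[[State], bool],
                 sample: Sequence[str]) -> Result:
    """Linear-composition reading: always advance one tick first, then test
    the condition, retrying on every later tick. `initial` is never tested."""
    for elapsed, state in enumerate(ticks, start=1):
        if until(state):
            return elapsed, tuple(state[name] for name in sample)
    return None  # the recorded trace ended before the condition held


def immediate_wait(initial: State, ticks: List[State],
                   until: Callable[[State], bool],
                   sample: Sequence[str]) -> Result:
    """Quoted-sentence reading: a condition that is already true returns
    at once, without advancing the simulation."""
    if until(initial):
        return 0, tuple(initial[name] for name in sample)
    return chained_wait(initial, ticks, until, sample)


initial = {"valid": 1, "data": 7}
ticks = [{"valid": 0, "data": 8}, {"valid": 1, "data": 9}]


def is_valid(state: State) -> bool:
    return state["valid"] == 1


print(chained_wait(initial, ticks, is_valid, ["data"]))    # (2, (9,))
print(immediate_wait(initial, ticks, is_valid, ["data"]))  # (0, (7,))
```

With `valid` already high before the first tick, the second model samples without any tick elapsing, which is the case both participants object to.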

21:59 Maja has joined #amaranth-lang

22:15 notgull has quit [Ping timeout: 245 seconds]

23:27 <_whitenotifier-5> [amaranth] tpwrules opened pull request #1235: Enforce naming rules on core HDL - https://github.com/amaranth-lang/amaranth/pull/1235

23:29 mcc111[m] has joined #amaranth-lang

23:29 <mcc111[m]> I have a cursed question... (full message at <https://catircservices.org/_matrix/media/v3/download/catircservices.org/POwKvmbIFVAXwgQWYWoozgBc>)

23:30 <Wanda[cis]> you need the specific interpreter

23:30 <mcc111[m]> ok

23:31 <Wanda[cis]> also whatever happens in tcl generally happens after Amaranth is done producing the netlist

23:32 <mcc111[m]> but there's that Amaranth Build System thing right

23:33 <Wanda[cis]> (plus we have support for remote builds, so in general the Python interpreter and tcl interpreter involved don't even run on the same machine)

23:35 <mcc111[m]> that does sound desirable

23:50 <mcc111[m]> today I installed the "Libero" toolchain on my ubuntu laptop

23:50 <mcc111[m]> it was a miserable experience

23:56 <Darius> you can use Tkinter to run Tcl inside Python, but yes you will need the right Tcl interpreter
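
A minimal sketch of what Darius describes, assuming a Python build that ships `tkinter`. `tkinter.Tcl()` creates a Tcl interpreter without opening any Tk window; the helper name `run_tcl` is hypothetical. The interpreter here is whichever Tcl that Python was built against, not a vendor toolchain's own interpreter, which is the limitation raised above.

```python
def run_tcl(script: str) -> str:
    """Evaluate a Tcl script in an interpreter embedded in this Python process.

    Hypothetical helper; it uses the Tcl bundled with Python, not a vendor's.
    """
    import tkinter  # imported lazily: some minimal Python builds omit it

    interp = tkinter.Tcl()  # Tcl only, no Tk GUI
    return interp.eval(script)


if __name__ == "__main__":
    # Show which Tcl version is embedded in this Python build.
    print(run_tcl("info patchlevel"))
```

Commands that a vendor tool adds to its own Tcl shell would not be available in this interpreter.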