<_whitenotifier-6>
[amaranth-lang/amaranth-lang.github.io] github-merge-queue[bot] 7fe1fc8 - Deploying to main from @ amaranth-lang/amaranth@8c65a79cdd2d1841992bbc1c6b7948ba697ae712 🚀
notgull has joined #amaranth-lang
notgull has quit [Ping timeout: 272 seconds]
_whitelogger_ has joined #amaranth-lang
jn_ has quit [*.net *.split]
balrog has quit [*.net *.split]
josuah has quit [*.net *.split]
Bluefoxicy has quit [*.net *.split]
_whitelogger has quit [*.net *.split]
Bluefoxicy_ is now known as Bluefoxicy
<whitequark[cis]>
2 kwords so far on the "minimal" stream RFC
<underst0rm[m]>
I am so looking forward to the stream RFC. I have been playing with Zynq and wrote some AXI4-Lite and AXI-Stream peripherals with amaranth. Since it was my first amaranth and FPGA work, it was relatively painful. I hope to use interfaces and the amaranth_soc CSR next time (somehow I missed them both)
Chips4MakersakaS has quit [Quit: Idle timeout reached: 172800s]
<whitequark[cis]>
4972 words (I have an interesting definition of "almost")
<galibert[m]>
I guess it goes well with your definition of minimal :-)
<whitequark[cis]>
tpw_rules: what is a payload property?
<tpw_rules>
(i would not want to fill my code with `payload` everywhere)
<tpw_rules>
stream.payload.something
<whitequark[cis]>
oh, you write stream.p.something instead
<whitequark[cis]>
this is useful because it means that you can nest arbitrary stuff within payload, even if it has ready or valid as attributes
<tpw_rules>
yes, that is a fair point. can there be stream.something with a __getattr__ that checks `p`?
<tpw_rules>
__getattr__ on the stream
<whitequark[cis]>
technically? yes. practically? this means we can never extend stream.Interface again with anything ever
<whitequark[cis]>
e.g. the stream transformers from "future work" become impossible, because someone may have used that name and it'll break their code
<tpw_rules>
that is also a good point
<whitequark[cis]>
it also creates an ambiguity for the reader
<tpw_rules>
let me read it properly
<whitequark[cis]>
"hmm, stream.x. is x a part of the stream protocol, or something completely unrelated?"
<whitequark[cis]>
for the same reason the CSR API uses .f
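(A minimal sketch of the access pattern under discussion, assuming the RFC's `stream.Signature` with `.p` as the shorthand for `.payload`:)

```python
from amaranth.hdl import *
from amaranth.lib import data, stream

# the payload can nest fields named `valid` or `ready` without clashing,
# because they live under .p rather than on the interface itself
layout = data.StructLayout({"data": 8, "valid": 1})
src = stream.Signature(layout).create()

m = Module()
m.d.comb += src.p.data.eq(0xAA)  # clearly part of the payload
m.d.comb += src.valid.eq(1)      # clearly part of the stream protocol
```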
<tpw_rules>
it does say we lock in that the "stream signature always has exactly these three members."
<whitequark[cis]>
yes but it could have non-member attributes
<tpw_rules>
ok, i guess a member is a thing that gets connected and an attribute is just a .whatever
<whitequark[cis]>
yes
<whitequark[cis]>
attribute is a python concept, member is a lib.wiring concept
<tpw_rules>
lots of terminology :)
<galibert[m]>
I'm not sure what the "similar logic which applies to valid = Const(1)" is
<galibert[m]>
valid always 1 does not mean "I do not support backpressure", it just means "I always have data available", no?
<whitequark[cis]>
there are streams with no backpressure (ready = Const(1)) and streams which always have data available
<whitequark[cis]>
and you could have interconnect which expects data to be available on every cycle, with or without backpressure
<whitequark[cis]>
this comes up in SerDes implementations
<galibert[m]>
Sure. So it means a transmitter with valid:1 can be connected to anything, right?
<whitequark[cis]>
yeah, the roles are swapped
<whitequark[cis]>
a receiver with valid:1 expects a transmitter with valid:1
<galibert[m]>
In fact a transmitter has no way to say it does not support backpressure, if I understand correctly
<whitequark[cis]>
it has, ready:1
<galibert[m]>
huh?
<galibert[m]>
a transmitter does not drive ready, only the receiver does?
<whitequark[cis]>
whitequark[cis]: for example if you make a multiplier or like a FIR filter or something which does not track pipeline bubbles, you could have valid:1 on the input (and also valid:1 on the output). however it would still support ready as that is just a clock enable for the entire thing
<whitequark[cis]>
galibert[m]: `lib.wiring` allows tying inputs to constants
<whitequark[cis]>
for exactly this reason
<whitequark[cis]>
if you tie an input to a constant, you can only be connected to something whose output is that exact constant
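(A sketch of the constant-tying described above, assuming the RFC's `always_valid=` parameter ties the `valid` member to `Const(1)`:)

```python
from amaranth.hdl import *
from amaranth.lib import stream
from amaranth.lib.wiring import connect

m = Module()
# transmitter that promises data on every cycle: valid is tied to Const(1)
tx = stream.Signature(unsigned(8), always_valid=True).create()
# receiver that requires data on every cycle: its valid *input* is also
# tied to Const(1), so it can only be connected to an always-valid source
rx = stream.Signature(unsigned(8), always_valid=True).flip().create()
connect(m, tx, rx)  # ok; raises if tx's valid were a plain signal instead
```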
<Wanda[cis]>
(you have a mismatched RFC number in the header)
<whitequark[cis]>
it's a very clever bit of lib.wiring that has seen little appreciation... yet :)
<galibert[m]>
Ahhhhhhhhhhhhhhhhhhhhh
<galibert[m]>
Ok, that's a beautiful trick
<whitequark[cis]>
Wanda[cis]: fixed
<whitequark[cis]>
lib.wiring is hands down some of my most beautiful work, and it's all for good reason
<whitequark[cis]>
and that particular feature is a Chekhov's gun
<galibert[m]>
Ok, I realize it was described but I just didn't understand the text
<galibert[m]>
Tell me if I'm wrong, but I think it's possible to implement a 0-latency fifo with the rules, right?
<galibert[m]>
I like that rfc, honestly. It's simple, to the point, yet quite flexible
<polysci_00232[m]>
lovely rfc to wake up to. Look forward to reading this afternoon
jfng[m] has quit [Quit: Idle timeout reached: 172800s]
<Wanda[cis]>
hmm
<Wanda[cis]>
it'd be interesting to have some way to encode the "must not depend combinatorially on" relationship in the language
<Wanda[cis]>
and be able to verify it e.g. in NIR via a similar mechanism to the (upcoming) comb loop detection
<galibert[m]>
Also a “must be in that clock domain if not constant”
<Wanda[cis]>
that's kinda blocked on the "clock domains in interfaces" thing
<galibert[m]>
True
<Wanda[cis]>
oh it can actually be done as soon as we have comb loop detection, without any extra stuff
<Wanda[cis]>
just introduce a fake dependency the other way
<Wanda[cis]>
all the pieces are in place already, it just needs a simple DFS
<galibert[m]>
Nice
<Wanda[cis]>
> Once the transmitter asserts valid, it may not deassert it until a transfer is performed.
<Wanda[cis]>
shouldn't this be "must not"? (or even MUST NOT) if we want to cosplay the other RFCs
<whitequark[cis]>
changed
<zyp[m]>
I'm a bit uncertain whether I like rule 4; on one hand it allows the recipient to process the payload for multiple cycles without storing a copy, on the other hand it disallows discarding a pending transfer in favor of newer data
<zyp[m]>
or to be more precise, it disallows the sender to discard data, which means that any discarding has to happen on the receiver side, with the receiver always being ready
<whitequark[cis]>
I think rule 4 is quite important; for one, if we dispose of it, then adapting a component to AXI-Stream will require a functional check that wasn't there before
<whitequark[cis]>
and the RFC is written so that if your component deals in bytes, you can adapt it to AXI-Stream without any logic gates
<whitequark[cis]>
(a functional check and, potentially, a register)
<whitequark[cis]>
but also, I think it's important for streams to be defined as "push-only" (except reset, where it's unavoidable) as that simplifies reasoning about semantics
<whitequark[cis]>
for example, a down-converter can start issuing transfers with individual bytes upon receiving a word, and only assert ready once it's done
<whitequark[cis]>
without rule 4, this is impossible, the FSM inside will get desynchronized unless it registers it first (and adds latency, or complicates the critical path)
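(A minimal sketch, not code from the RFC, of such a down-converter: rule 4 guarantees the input payload stays put while `valid` is held, so the two halves can be sent on consecutive cycles without copying the word into a register first:)

```python
from amaranth.hdl import *
from amaranth.lib import stream, wiring
from amaranth.lib.wiring import In, Out

class DownConverter(wiring.Component):
    i: In(stream.Signature(16))
    o: Out(stream.Signature(8))

    def elaborate(self, platform):
        m = Module()
        hi = Signal()  # which half of the input word goes out next
        m.d.comb += [
            self.o.valid.eq(self.i.valid),
            self.o.payload.eq(Mux(hi, self.i.payload[8:], self.i.payload[:8])),
            # the input word is consumed only when its second half is sent
            self.i.ready.eq(self.o.ready & hi),
        ]
        with m.If(self.o.valid & self.o.ready):
            m.d.sync += hi.eq(~hi)
        return m
```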
<galibert[m]>
Errr, isn't rule 4 missing an "until the payload contents are transferred"? Otherwise you can't have back-to-back transfers
<whitequark[cis]>
F5
<whitequark[cis]>
it was
<zyp[m]>
whitequark[cis]: IIRC I once got burned on litex' downconverter doing exactly that
<galibert[m]>
oh ok
<zyp[m]>
so yeah, I can see the argument for it
<whitequark[cis]>
also a big problem with litex (as elaborated on in the RFC) is that it's so under-defined it's not clear what you can even do
<zyp[m]>
yeah, clear rules that I don't like are way better than no rules
<galibert[m]>
zyp: sender can discard data, it just has to do it just after a transfer rather than just before
<zyp[m]>
what I mean is that it can't discard pending data
<galibert[m]>
mostly those rules maximize the decoupling between the two players
<galibert[m]>
unimportant, but I think "to assertion" is a typo for "to an assertion". It's there multiple times
<whitequark[cis]>
I wrote it that way on purpose, it reads fine to me
<whitequark[cis]>
"assertion" is like... participle?
<whitequark[cis]>
yep, the signature is the same, the interface is different
<tpw_rules>
ok, i wonder if there is a better verb than "tie". i wasn't sure if you meant just a combinational assignment
<whitequark[cis]>
"tie off" is the industry term for this
<whitequark[cis]>
actually, no, you have a point
<whitequark[cis]>
it's ambiguous with just a comb assignment
<tpw_rules>
yeah it's very different from m.d.comb += ready.eq(1)
<tpw_rules>
(and .eq(Const(1)) for that matter)
bl0x[m] has joined #amaranth-lang
<bl0x[m]>
I just stumbled over the mention of data port instead of payload port in the section on the litex protocol, point 1.
<tpw_rules>
is it possible to express that in the attribute definition pattern?
<whitequark[cis]>
bl0x[m]: fixed
<whitequark[cis]>
tpw_rules: no, you just assign it in the constructor after `super()`
<whitequark[cis]>
or in this particular case stream.Interface does it for you
<tpw_rules>
with those constructor parameters, okay
<zyp[m]>
yeah, you don't express the individual stream interface members
<whitequark[cis]>
yeah, that's one half of what the parameters are for. the other half is so that __eq__ fails if their always-x doesn't match
<tpw_rules>
can that not defer to checking for Const(1)ness?
<bl0x[m]>
In fact both the linked wiki entry and the implementation show that the litex stream data port is also called payload. To be fixed in points 1 and 5.
<whitequark[cis]>
no, because the __eq__ is on a signature
<whitequark[cis]>
bl0x: oh, you're right, no idea how I missed that
<whitequark[cis]>
probably hours of looking at axi and avalon
<zyp[m]>
the litex stream payload virtually always contains a data member, and there's a getattr that forwards to the payload, so .data is shorthand for .payload.data
<zyp[m]>
well «always» is a bit strong, but pretty much whenever you don't have a good reason to use other names
<whitequark[cis]>
we could add shorthands like that but only on an include-list basis, so that we are able to add a bunch of names on the Interface
<whitequark[cis]>
it's also easier to read the code if you see .p in it, because that is a hard separation between "is this a part of the stream itself?" vs "is this a part of the payload?". same as .f in CSR
<whitequark[cis]>
so adding def data(self): return self.payload.data is kind of weird but I see no reason to oppose it on principle, if for some reason .p isn't enough
<whitequark[cis]>
but I want to gain experience with streams first, and also probably ship stream transformers first.
<tpw_rules>
i don't understand something about rationale C2/C3: what does "not connect a single member to multiple members" mean, and how would you connect to a FIFO with option C3?
<whitequark[cis]>
anything you build with lib.data is a single member, therefore if you create a FIFO with the same width and the FIFO gives you a ready,valid,payload interface, you can connect them even if the FIFO's payload is just unsigned(whatever) and your thing is a struct or a signed number
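(A small sketch of why the widths line up: any `lib.data` layout casts to a plain unsigned shape of its total width:)

```python
from amaranth.hdl import Shape, unsigned
from amaranth.lib import data

# a struct payload is still a single port member, and its shape casts to
# unsigned(8), the same as a FIFO whose payload is just 8 bits wide
rgb = data.StructLayout({"r": 3, "g": 3, "b": 2})
assert Shape.cast(rgb) == unsigned(8)
```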
<zyp[m]>
<whitequark[cis]> "so adding def data(self..." <- it doesn't really make sense for Amaranth streams, we don't have to have an aggregate `payload` with a single `data` member, we'll only have an aggregate `payload` if it actually needs to be aggregate
<whitequark[cis]>
ah I see
<tpw_rules>
ok, so it's saying that `connect` can't jam all the sub-members of the payload interface into the one payload member on the FIFO
<whitequark[cis]>
yeah. or more generally, it won't jam an interface member into a port member
<tpw_rules>
i think that sentence should be like "will not connect multiple sub-members to a single member" because the way it reads now is like... the entire definition of connect
* tpw_rules needs to learn more about shapes
<whitequark[cis]>
that's true but inaccurate (insufficiently complete)
<whitequark[cis]>
lol I forgot I put "Altera (Intel (Altera))" into the text
<whitequark[cis]>
the next time Intel buys it I should expand that to four linked list items
<tpw_rules>
should `payload` have a reserved name for future expansion of the omitted features? or should that be a separate member which has 0 length? (which would break the FIFO case again)
<whitequark[cis]>
have you read the very last section?
<tpw_rules>
not yet
<whitequark[cis]>
it addresses that
<zyp[m]>
apart from the rule I mentioned earlier (where I agree that it makes sense to stay compatible with AXI-Stream), I don't think I've got any concerns about this RFC
<bl0x[m]>
whitequark[cis]: I liked that bit 🤣
<zyp[m]>
payload/ready/valid is all very noncontroversial
<bl0x[m]>
Does the transmitter have to deassert `valid` after every transmission? Or does point 1 in the guide-level explanation mean that the agreement is such that when both `ready` and `valid` are asserted in one clock cycle, the transfer is assumed to be completed? Meaning that a receiver signaling `ready` must consume the payload in one clock cycle.
<tpw_rules>
very minor typo: "furhter"
<galibert[m]>
bl0x: the latter, no valid toggling
<zyp[m]>
bl0x[m]: the transmitter has to deassert `valid` iff `ready` is asserted and it has no more data to send on the next cycle
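(A toy source illustrating those rules; `valid` stays asserted between transfers and drops only once there is nothing left to send:)

```python
from amaranth.hdl import *
from amaranth.lib import stream, wiring
from amaranth.lib.wiring import Out

class TenBytes(wiring.Component):
    o: Out(stream.Signature(8))

    def elaborate(self, platform):
        m = Module()
        count = Signal(4)
        m.d.comb += [
            self.o.valid.eq(count < 10),  # no valid toggling between bytes
            self.o.payload.eq(count),
        ]
        # payload and valid may only change on the cycle after a transfer
        with m.If(self.o.valid & self.o.ready):
            m.d.sync += count.eq(count + 1)
        return m
```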
<bl0x[m]>
Ok, I assumed so. That's why receiver and transmitter must run on a common clock
<galibert[m]>
yep, it's pure sync
<bl0x[m]>
Any CDC is outside the scope.
<galibert[m]>
well you can have a cdc-ing component in the middle of a stream
<zyp[m]>
stream CDC would happen in a component with two stream interfaces
<galibert[m]>
yeah that
<galibert[m]>
makes it explicit
<zyp[m]>
i.e. an async FIFO
<bl0x[m]>
Yes, very good.
<tpw_rules>
whitequark[cis]: okay, i read the last section (and the rest) and i don't think my point is completely answered. the way it's proposed means that you couldn't accidentally double-name something, but you would have to refactor your design if it used a name and you wanted to add a transformer, and it could be accidentally connected
<tpw_rules>
does that also imply that payload has to be a StructLayout, at least to use a transformer?
<zyp[m]>
the transformer would wrap the original payload shape in a suitable aggregate
<zyp[m]>
StructLayout if it adds sideband signals, ArrayLayout if it adds lanes, etc…
<tpw_rules>
how could that work if the original payload was anonymous?
<tpw_rules>
like my shape can just be `4`, right?
<tpw_rules>
(payload_shape specifically)
<galibert[m]>
unsigned(4)
<tpw_rules>
that would also break all code i think
<bl0x[m]>
Would the payload structure only be visible in python? Or also when exported to verilog?
<zyp[m]>
tpw_rules: yes, so you'd have e.g a `StructLayout({'content': 4, 'first': 1, 'last': 1})`
<whitequark[cis]>
it is intended that adding a transformer breaks all code that doesn't have a transformer there
<whitequark[cis]>
bl0x[m]: exported to Verilog *where*?
<tpw_rules>
zyp[m]: ok, that is plausible. the way the RFC is written does not suggest a wrap though
<whitequark[cis]>
tpw_rules: there are literally examples of it...
<whitequark[cis]>
actually I found a typo
<whitequark[cis]>
updated
<tpw_rules>
so if you just had a Packet transformer, then your previous packet_shape would become the data member of the new shape, and then have additional first and last members
<whitequark[cis]>
yes
<tpw_rules>
so transformers literally transform the existing stream shape (and semantics). makes sense
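(A purely hypothetical sketch of that wrapping; the actual Packet transformer design is deliberately left open, as discussed below:)

```python
from amaranth.lib import data

# hypothetical helper: nest the original payload shape whole, as a single
# `data` member, and add framing flags alongside it
def packet_layout(payload_shape):
    return data.StructLayout({
        "data":  payload_shape,
        "first": 1,
        "last":  1,
    })

# a stream whose payload_shape was just `4` would become
# StructLayout({"data": 4, "first": 1, "last": 1})
```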
<bl0x[m]>
whitequark[cis]: Not sure. At some point one needs to address e.g. the different lanes. These fields are exposed in python as described, but would that also be transparent when a stream+transformer is converted to verilog? Or would the converted module only have the opaque payload visible?
<tpw_rules>
could you add a 5th example with just a Packet? that seems to be the first thing many would want
<whitequark[cis]>
added another example with just a packet
<tpw_rules>
thank you
<whitequark[cis]>
bl0x[m]: in the signature there will only be the opaque payload visible; however, we have an upcoming "component metadata" RFC (accepted but not implemented) which will expose such internal structure as JSON so you could e.g. write an Emacs script that generates a SystemVerilog struct for you
<whitequark[cis]>
inside of the implementation there's probably going to be aliases to the different bits of the payload member, so that waveform viewers pick them up
<bl0x[m]>
Ok, got it. Thanks
<zyp[m]>
if you're exporting a verilog module with a stream interface, it could probably make sense to have an adapter that broke out any transformer aggregates to individual signals
<whitequark[cis]>
yeah, basically the same as if you were dealing with AXI
<tpw_rules>
okay, i think this all looks good. thanks again for your work
<zyp[m]>
that probably goes for other things that want to expose a lib.data shape on a verilog module interface too
<whitequark[cis]>
we could even build something that automatically introspects lib.data
<whitequark[cis]>
we kind of do this already for lib.wiring.Component specifically
<bl0x[m]>
whitequark[cis]: That would be so convenient
<tpw_rules>
i have a mild preference for having a "last" member, i saw the old thread and don't want to cause trauma, but it's immediately necessary for AXI and as that thread mentions is sort of a primitive. i understand that it encourages errors too....
<zyp[m]>
that's what I mean, you could have something that introspects a lib.data shape and turns it into a unidirectional lib.wiring signature
<galibert[m]>
tpw_rules: you don't always want packetizing
<whitequark[cis]>
tpw_rules: literally never happening
<tpw_rules>
whitequark[cis]: expected response
<whitequark[cis]>
even if I was fine with it on principle, there's no consensus on whether it should be first/last/both
<whitequark[cis]>
I would probably mandate both or neither, and the Packet transformer will probably not actually have them configurable
<tpw_rules>
according to the thread last is the minimum necessary; that's all AXI uses
<whitequark[cis]>
but you can't reconstruct first from last
<whitequark[cis]>
so it's not really true
<tpw_rules>
ok, that directly contradicts the thread
<zyp[m]>
I'd really prefer to have it configurable, you could argue for and against having either depending on the use
<whitequark[cis]>
actually nevermind, you can, it just requires a stateful adapter
<whitequark[cis]>
but anyway see @zyp's reply for the "no consensus" part
<tpw_rules>
or logic in the receiver
<tpw_rules>
if they could make use of first
<whitequark[cis]>
I'm tempted to just leave the specific design of the Packet transformer on the table until everyone agrees on something specific ]:3
<tpw_rules>
what does your hat signify
<zyp[m]>
reconstructing first from last means you can't output the last token of one packet before you've got the first token of the next, which is a dealbreaker for some uses
<whitequark[cis]>
it's evil horns
<tpw_rules>
ah
<whitequark[cis]>
yeah, first from last is unworkable, last from first is viable
<zyp[m]>
sorry, last from first, I mean
<zyp[m]>
first from last is the easy one
<whitequark[cis]>
what I was thinking of is that you can't reconstruct first from last in the specific case of also having lane strobes, and transmitting first with no strobes enabled
<whitequark[cis]>
ie signaling start of packet without actually transmitting any data
<tpw_rules>
which is why i decree only last and you can reconstruct first if you want it. are there cases where last is hard to generate?
<tpw_rules>
i guess that wouldn't matter if "both" was the answer and "first" is unworkable, you always have to generate it anyway
<zyp[m]>
you might not know that a given token is last
<tpw_rules>
why not
<zyp[m]>
consider receiving e.g. a zero-delimited bytestream (COBS or similar)
<zyp[m]>
when you get a byte, you don't know whether it's the last byte until you've checked whether the next byte is a zero
<whitequark[cis]>
(that's actually almost the same thing I was thinking of, but much better reasoned)
<tpw_rules>
so then you just couldn't connect a COBS receiver to something that needed last?
<zyp[m]>
sometimes you also want to use the first signal as a «discard an incomplete packet and start over» signal
<whitequark[cis]>
but how would you use this in a standardized interconnect?
<bl0x[m]>
zyp[m]: Mostly to save bandwidth, right? Packet error handling should not be the concern of the protocol on this level.
<whitequark[cis]>
because that is actually incompatible with Packet(), which carries an implicit assumption that each packet is complete
<whitequark[cis]>
I guess this actually spells out the terms for acceptance of a future Packet transformer (and all other transformers): they have to show the value in standardizing them by building out standardized interconnect and thinking the consequences through
<bl0x[m]>
Gotta go, thanks for all the effort! I'm really looking forward to this.
<zyp[m]>
<bl0x[m]> "Mostly to save bandwidth, right?..." <- «discard» is a bit loosely used here, for e.g. CRC it means that you'd reset the checksum before processing the next token
<whitequark[cis]>
but sometimes actually discard, like with a packet FIFO
<zyp[m]>
yep
<whitequark[cis]>
btw, it is expected that 0.5 will likely ship without any stream interconnect at all
<whitequark[cis]>
maybe FIFO adaptation for streams, still not quite decided
<whitequark[cis]>
(there's the downside that if we later have to add a separate transformer-aware FIFO things get quite confusing)
<whitequark[cis]>
(also FIFOs aren't components and... it gets messy)
<tpw_rules>
would it still ship with the interfaces?
<whitequark[cis]>
that's the idea yeah
<whitequark[cis]>
stream.Signature and stream.Interface, nothing else
<zyp[m]>
still a good starting point
<tpw_rules>
yes
<zyp[m]>
I figure while awaiting transformers, I'll inherit stream.Signature and provide similar functionality
<cr1901_>
equivalent playground encoding is 3525 chars
<whitequark[cis]>
gzip -9?
<cr1901_>
Same: 1216 (FWIW, "gzip -1" is 1293)
<tpw_rules>
how many bytes before base64?
Lunaphied[m] has joined #amaranth-lang
<Lunaphied[m]>
LZ4?
<cr1901_>
tpw_rules: 900
<tpw_rules>
hm, my gzip is 5 bytes more efficient
<tpw_rules>
or i dropped a newline in copypasta or somethin
<cr1901_>
Lunaphied[m]: lz4 is 1751
<Lunaphied[m]>
oof okay
<cr1901_>
xz and bzip2 do worse than gzip here
<cr1901_>
(by a little bit)
<tpw_rules>
zstd at max does 897 bytes
<cr1901_>
There's probably Just Not Enough text in my example to make a huge difference
<cr1901_>
But I think my example would be representative of a medium-sized snippet you'd put into playground
<tpw_rules>
using a dictionary trained on the amaranth repo (minus dotfiles and top level files) is 735 bytes
<tpw_rules>
(the dictionary itself is 110K)
<cr1901_>
That's cool! How do you get a custom dict for zstd?
<tpw_rules>
there's a --train flag, read the manpage
<cr1901_>
that's 735 before or after base64?
<tpw_rules>
before
<cr1901_>
So ~980 after base64
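(The measurement being compared, reproduced with the Python standard library; `snippet.py` stands in for the example source:)

```python
import base64
import gzip

raw = open("snippet.py", "rb").read()
packed = gzip.compress(raw, compresslevel=9)
encoded = base64.urlsafe_b64encode(packed)  # URL-safe form for the playground
print(len(raw), len(packed), len(encoded))  # base64 costs a ~4/3 expansion
```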
<tpw_rules>
ooh, 724 bytes!! idk if that's quite worth the effort, depending on library setup
<tpw_rules>
(i also removed the docs)
<Wanda[cis]>
wouldn't training on some codebase written on top of Amaranth be better than training on Amaranth itself?
<tpw_rules>
there are examples and tests which i figured would be close enough, but theoretically yes. i don't think the training is that sophisticated though, it's still just looking for likely matches
<nates93[m]>
typo in rfc ", but this is actually incorrect: unlike param in LiteX, TUSER in AXI-Stream in specifies ancillary data per-byte or per-transfer, not per-packet)" ("in AXI-Stream ~~in~~")
<whitequark[cis]>
tpw_rules: so, most of the things you dislike are in fact things that I came up with and insisted on (out of band)
<tpw_rules>
whitequark[cis]: hm, i see
<whitequark[cis]>
some of the things JF is going to fix tomorrow as I raised them already in internal review
<whitequark[cis]>
but the under-opinionatedness, while a group effort, is something I took to an extreme, very intentionally
<whitequark[cis]>
I think we need a way to expose what is basically just a stream (either direction) to the SoC, and a basic UART is exactly that, so they're a great match
<tpw_rules>
i don't think it has to be 16550-alike but i think there's a weird halfway point that we are on right now which doesn't seem right either way
<whitequark[cis]>
it's not halfway at all
<tpw_rules>
like do lots of streams have baud rates?
<tpw_rules>
or have to be enabled/disabled?
<whitequark[cis]>
just about any stream that talks to the outside world, in fact
<whitequark[cis]>
divisors rather than baud rates, but yeah
<tpw_rules>
if it's about the streams, i think it would be better to put buffering in the peripheral?
<whitequark[cis]>
that would make it less basic :p
<tpw_rules>
does it have a 1 char buffer right now or are the registers just glued directly to the stream port?
<whitequark[cis]>
directly to the stream port
<tpw_rules>
ok, then amaranth-stdio is so weird it doesn't actually work
<whitequark[cis]>
it doesn't matter at all what amaranth-stdio currently does
<whitequark[cis]>
cause we're just going to scrap the amaranth-stdio UART and start over
<galibert[m]>
I think your clockrate computation is incorrect?
<whitequark[cis]>
galibert: hm?
<galibert[m]>
0xe000 would give a clockrate of 7 times the base clock?
<whitequark[cis]>
oh, right
<whitequark[cis]>
well, someone who isn't me in the evening is going to fix that~
<galibert[m]>
huhuhu :-)
<tpw_rules>
i think that dramatically changes the output of your baud.py script
<galibert[m]>
psc is a power-of-two pre-divider I presume?
<whitequark[cis]>
yes
<zyp[m]>
so the division factor is effectively a form of floating point :)
<whitequark[cis]>
correct, it was intended to be that
<whitequark[cis]>
3 bit exponent, 13 bit mantissa
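(A sketch of that divisor format; the exact field placement is an assumption:)

```python
# assumed encoding: 3-bit power-of-two prescaler in the top bits,
# 13-bit mantissa in the low bits
def effective_divisor(word):
    psc = word >> 13      # exponent
    div = word & 0x1FFF   # mantissa
    return div << psc     # mantissa * 2**exponent

assert effective_divisor(0x2005) == 10  # psc=1, div=5: divide by 10
```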
<tpw_rules>
should the peripheral expand it back to an integer?
<whitequark[cis]>
I figured that that's probably enough for most clocks and bauds
<zyp[m]>
tpw_rules: sounds unlikely to make sense
<whitequark[cis]>
tpw_rules: you should ask these questions after the RFC is edited to address my feedback tomorrow
<whitequark[cis]>
(it really should've been posted in a draft state)
<whitequark[cis]>
tpw_rules: also, basically all peripherals in SoC are (and should be) just thin wrappers over stdio stuff
<tpw_rules>
ok, i'm concerned about the equation in baud.py being wrong too
<whitequark[cis]>
adding CSR support, local memory where it makes sense, DMA support where it makes sense
<tpw_rules>
whitequark[cis]: i frankly don't like that you have to manually find the right stdio thing and plug it in, that should be a default
<whitequark[cis]>
tpw_rules: amaranth-soc is not litex
<whitequark[cis]>
you can copy example code if you wish (which we should have for every peripheral), but it's a toolkit for people who want to have configurable hardware, not a toolkit for people who want turnkey linux on an fpga
<tpw_rules>
what other PHYs do you imagine having and being compatible?
<whitequark[cis]>
from the beginning of the project, like literally the moment when i created the repos, i wanted the split to give a lot more emphasis to using peripherals in a non-SoC context
<zyp[m]>
tpw_rules: USB CDC-ACM
<whitequark[cis]>
tpw_rules: an easy answer is "the same UART but in a different clock/power domain"
<tpw_rules>
zyp[m]: with that register set??
<zyp[m]>
sure, why not?
<whitequark[cis]>
but CDC-ACM also falls under the intended compatible things, yep
<tpw_rules>
that makes divisor even more confusing...
<tpw_rules>
cause it's not about the peripheral, it's about the PHY
<zyp[m]>
I've got LUNA ACM hooked up to litex UARTs
<whitequark[cis]>
it's impractical to ship a UART which doesn't have a divisor at all
<whitequark[cis]>
so sometimes it's just a reserved 16 bit register
<whitequark[cis]>
and sometimes it's a divisor with an implementation-specific formula to it
<tpw_rules>
that's what i mean about weird middle place. why not just have it be read/write/enable/status and have a mechanism to add baud rate or CDC-ACM port or parity or whatever, if you want a generic stream to registers thing
<whitequark[cis]>
I don't want an extensible thing, I want a basic UART
<whitequark[cis]>
and the cost of carrying around a dummy 16-bit register is essentially nil
<whitequark[cis]>
and that lets you use the exact same assembly (!) with both the USB CDC-ACM impl and the physical UART impl
<whitequark[cis]>
the fact that this can be achieved by making it essentially just a pair of streams (one register cluster per stream) is an extremely nice bonus that allows you to reuse the same register layout for other stuff that's essentially just a pair of streams
<Wanda[cis]>
wait, by "CDC-ACM impl", do you mean something that has a stream on one end, and USB signals on the other end, and is a valid USB device that a host can enumerate?
<galibert[m]>
IME async serial blocks tend to use 16x oversampling, at least on receive. It's kinda hard to make reliable otherwise
<Wanda[cis]>
that's... holy meowing god, nice
<whitequark[cis]>
Wanda: like how an arduino presents a serial console on USB with just serialRead() and serialWrite()
<tpw_rules>
but there's not really a way to expand it, or does it get more dummy registers?
<whitequark[cis]>
galibert[m]: yeah, this is why the divisor format is left unspecified in the RFC
<tpw_rules>
the only thing that applies to the CDC-ACM port is interrupts and DMA
<tpw_rules>
it's definitely specified in the RFC?
<whitequark[cis]>
it's not (after the edit tomorrow)
<tpw_rules>
ok
<galibert[m]>
magic
<whitequark[cis]>
whitequark[cis]: so you'd look up the formula for your particular system (which you need to do anyway, due to how clock trees work) and then it will have a factor of 16 somewhere
<whitequark[cis]>
tpw_rules: the only things we anticipate adding to this particular peripheral is interrupt support, whenever we get interrupts, though even that isn't quite clear (you generally want a high/low water mark on the FIFO for interrupts which this design doesn't really offer)
<tpw_rules>
how do the overflow and error mix with buffering? is the assumption you have to read the entire buffer contents before clearing those sticky bits?
<tpw_rules>
okay, so the future possibilities need trimming too
<whitequark[cis]>
probably do want an interrupt just so that you can boot Linux on it
<whitequark[cis]>
the "future possibilities" definitely do not refer to the basic UART itself, that's an omission in the text. they talk about a different peripheral, mostly (just an "UART" with no "basic", or maybe "RS232UART" or something)
<whitequark[cis]>
* or something. UART16550)
<tpw_rules>
they say two separate things, but i will wait for the revision
<whitequark[cis]>
tpw_rules: they're purely informative and not something you're expected to act on in code. basically you check the flags if something else doesn't quite work to find out if it was because you fucked up or someone else fucked up
<whitequark[cis]>
or like light a LED or something
<whitequark[cis]>
debug functionality
<whitequark[cis]>
(thinking about it more, yeah definitely do want interrupts)
<tpw_rules>
still, how do i return the peripheral to a usable state after that?
<whitequark[cis]>
do you recall the semantics for RW1C?
<galibert[m]>
power cycle :-P
<tpw_rules>
yes
<tpw_rules>
it's obvious how to un-set those flags
<whitequark[cis]>
oh
<whitequark[cis]>
the peripheral continues functioning as before with the flags set
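(A sketch of such a status register, assuming amaranth-soc's CSR field actions; the names are illustrative:)

```python
from amaranth.hdl import unsigned
from amaranth_soc import csr

# RW1C: hardware sets a bit; software writes 1 to that bit to clear it,
# and writing 0 leaves it untouched, so flags can be cleared individually
class Status(csr.Register, access="rw"):
    error:    csr.Field(csr.action.RW1C, unsigned(1))
    overflow: csr.Field(csr.action.RW1C, unsigned(1))
```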
<tpw_rules>
how do i know i've gotten all the bad data out?
<zyp[m]>
just like with every other uart
<whitequark[cis]>
the receiver sets error xor sends data
<zyp[m]>
it'd be protocol/framing specific
<whitequark[cis]>
and yeah you deal with this in a higher level protocol like xmodem
<tpw_rules>
xor?
<whitequark[cis]>
either sets the error flag, or sends a data byte
<zyp[m]>
see an overflow? drop data until you get another frame header or whatever
<whitequark[cis]>
so there's no "bad data"
<tpw_rules>
so once there's an error you get no more data?
<whitequark[cis]>
and yes, zyp is exactly right, in an xmodem implementation you could treat error/overflow flags as "clear, drop the entire buffer"
<whitequark[cis]>
no?
<tpw_rules>
okay, that's what i asked originally
<tpw_rules>
well, i think you would want drop entire buffer, clear
<whitequark[cis]>
error flag and receive stream are completely separate
<whitequark[cis]>
whenever there's an error the stdio thing pulses an error line, and the soc thing sets a flag. that's it. no other behavior
<whitequark[cis]>
whenever there's a non-error received byte the stdio thing sets it as payload (if ready) or pulses overflow flag (if not ready)
<whitequark[cis]>
tpw_rules: just to be clear by "buffer" i mean "in your xmodem implementation in software"
<whitequark[cis]>
the hardware does absolutely nothing to help you with this
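(A software-side sketch of that recovery; the helpers and bit masks are hypothetical:)

```python
ERROR, OVERFLOW = 0x1, 0x2  # assumed status bit positions

def service_uart(read_status, clear_status, drop_buffer, request_resend):
    if read_status() & (ERROR | OVERFLOW):
        drop_buffer()                   # discard the partial packet...
        clear_status(ERROR | OVERFLOW)  # ...clear the RW1C flags...
        request_resend()                # ...and resync (e.g. xmodem NAK)
```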
<zyp[m]>
support for break frames should maybe go on the future possibilities list
<whitequark[cis]>
zyp: in my view if you want that you want a full 16550 implementation or something like that
<whitequark[cis]>
it's not necessary for the use case of "serial console for SoC bringup" or "vmlinux bootloader" and everything else needs a bigger and better UART
<zyp[m]>
maybe, I'm not really familiar with the 16550 register interface
<whitequark[cis]>
the problem with the 16550 interface is that it's very 80s and pretty crufty (although well supported in just about any OS, including many RTOSes)
<whitequark[cis]>
so it's a good way to become compatible, but bad way to spend limited developer resources
<zyp[m]>
IIRC modbus RTU uses break frames, and I can imagine people wanting to do that on a FPGA
<whitequark[cis]>
maybe galibert or another 80s enjoyer can submit one :p
<galibert[m]>
break detection is just framing error + reading the rx line and seeing it's still 0
<galibert[m]>
you don't need it explicitly in the peripheral
<tpw_rules>
whitequark[cis]: i think that's still under-specified, because then those flags mean you have an unspecified problem in the future and no way to get rid of it. is it true that, in the current setup, so long as rx.rdy is 0, you can clear the flags and not get data from before the error? does that need to be documented?
<galibert[m]>
framing error being stop bit not being one
<whitequark[cis]>
> because then those flags mean you have an unspecified problem in the future and no way to get rid of it
<whitequark[cis]>
what?
<tpw_rules>
well, at least combined with buffers
<galibert[m]>
also IME again "error" is just annoying. It's better to have separate lines for the different error types, which are parity, framing, overflow and underflow
<tpw_rules>
suppose i have a 64 byte fifo between the peripheral and the PHY, and the error flag gets set. i have to read at least 65 bytes to be sure that i'm past the error, because i don't know the fifo level
<whitequark[cis]>
galibert[m]: once more you're not the target audience for this peripheral; a 16550-style UART is in scope and we're happy to add it
<whitequark[cis]>
(also I don't think you can get underflow at all)
<galibert[m]>
Yeah, underflow needs very special conditions which usually don't exist
<whitequark[cis]>
tpw_rules: basically, if you want reliable UART communication, your higher-level framing like XMODEM needs to handle that, which doesn't require an `error` or `overflow` flag at all
<zyp[m]>
if this is supposed to be super minimal, you could also argue for leaving out the baudrate register and leaving that up to the gateware
<whitequark[cis]>
I wouldn't mind removing error entirely, but it seems like if your UART doesn't work, like if you got baud wrong, you'd want to learn that, which you can only do by looking at error
<whitequark[cis]>
zyp[m]: potentially, yes
<whitequark[cis]>
I don't think it's a good tradeoff though
<tpw_rules>
whitequark[cis]: it could be a benefit to have bounds and improve efficiency
<galibert[m]>
all the complexity is in the receiver
<whitequark[cis]>
tpw_rules: this isn't how anything works really
<tpw_rules>
well i will wait until tomorrow, i have to run for dinner
<whitequark[cis]>
whitequark[cis]: but also that's probably the one part where I'm least certain what the right decision is
<zyp[m]>
I just figure if you're gonna have baudrate config, it'd make sense to get the other reasonable PHY config signals into the interface while you're at it
<whitequark[cis]>
it's possible that a better tradeoff would be "generic stream RX / stream TX endpoint, plus optional FIFO with high/low water mark, plus optional interrupts, plus optional DMA" and then you attach that as a sidecar to a UART PHY config word
<whitequark[cis]>
that basically gets you way ahead of the 16550 with its buggy FIFO handling (remember to use 16550A! or maybe 16550AF!!)
<zyp[m]>
I think that could make sense
<whitequark[cis]>
and you can use the same thing for, for example, Ethernet
<whitequark[cis]>
and you could shove some of the status bits, like "break frame received", into the high bits of the FIFO (which makes your FIFO annoyingly wide potentially, but then you get precise break handling for something like Modbus RTU)
<whitequark[cis]>
unfortunately having a very configurable thing comes with the tradeoff that your firmware may not be aware of what the configuration is
<whitequark[cis]>
and also we can't really ship anything that relies on interrupts right now, much less DMA
<whitequark[cis]>
anyway. I think my broader impression of the recent SoC discussions is that we're design-by-committeeing things into a really inefficient place and maybe the RFC process is just not being effective at this stage of the project lifecycle
<whitequark[cis]>
Rust wasn't run by RFC for the first few years of its existence and maybe SoC/stdio need that phase too
<whitequark[cis]>
it's definitely working well for the core subsystem though
<whitequark[cis]>
Amaranth 0.1 has some really stupid things, like Record and the amazing mess I made of the platform interface, but it also got the ecosystem where it is now so it's impossible to say if it was bad or not
<whitequark[cis]>
I suppose that if the conclusion is that we just need something to tide us over until we have a methodology and a small library of minimally usable peripherals set up, basic UART minus the divisor works basically as well as basic UART with the divisor, and we can ship either
<adamgreig[m]>
0.1 had a lot of good stuff too though!
<adamgreig[m]>
did 0.1 even have platform..? I feel like I ended up making my own platform stuff at the start
<whitequark[cis]>
yes, the overarching point is that projects very early on in the lifecycle probably don't need input from every stakeholder
<whitequark[cis]>
0.1 definitely had platform
<whitequark[cis]>
whitequark[cis]: but just a good enough thing to start with and iterate on
<whitequark[cis]>
that's not to say that the entire evolution should be iteration-driven, it is both reasonable and necessary to have a sky-high goal in mind and work towards that
<whitequark[cis]>
it's just that people have strong opinions about which exact itch they want to scratch right now and projects early in the lifecycle are fragile and easily disrupted by those
<whitequark[cis]>
actually, the platform stuff was created in more or less that way, just not in the open
<adamgreig[m]>
yea, I think it's hard to have overall design cohesion from piecemeal RFCs, and also hard for someone to author or review one do-it-all rfc
<zyp[m]>
components are also easier to iterate on than infrastructure
<adamgreig[m]>
I was thinking of early nmigen I think, at least at that point I was writing my own platform from zero including requests and pins and resources and such, until I later ported it to the then-new amaranth platform?
<whitequark[cis]>
ah maybe!
<whitequark[cis]>
adamgreig[m]: the reason Amaranth has a high degree of design cohesion is because I spend dozens of hours discussing RFCs with their authors, sometimes openly, sometimes in private
<whitequark[cis]>
they are a product of a higher level design process, not the process itself
<whitequark[cis]>
well, they're part of the process, but not the whole
<whitequark[cis]>
and not even the majority of it
<adamgreig[m]>
right
<adamgreig[m]>
but you don't think that approach is being effective for soc/stdio?
<whitequark[cis]>
the way I describe it in general terms: the RFC process is there to serve the project, not to be democratic. its goal is a better design. we call the things we do "voting" because people really liked that term, but it's not really accurate in that the ultimate outcome is determined solely by the responsible maintainer
<galibert[m]>
amaranth is being used, -soc not that much, -stdio not at all, so it's getting very design-by-commitee rather than design-by-experience
<whitequark[cis]>
adamgreig[m]: I think the way we're currently working on SoC is only partially ineffective and getting less effective
<whitequark[cis]>
I think the work done on the CSR infra was quite good
<zyp[m]>
and also the metadata
<whitequark[cis]>
amaranth-soc has never even had its 0.1 release
<whitequark[cis]>
it doesn't have to be perfect, it doesn't have to serve everyone's use cases, it doesn't even have to be good
<whitequark[cis]>
and because of that, I'm more worried about inefficiency of collaboration in this manner rather than the fact that the basic UART is weirdly shaped
<whitequark[cis]>
who cares, we can remove it by the time 0.3 exists
<whitequark[cis]>
there's still usefulness in the RFC process, external review nearly always helps at least to some extent, but I'm really not happy with the amount of "it doesn't serve my use case personally" that we've been getting
<whitequark[cis]>
it's not about your use case personally, it's about laying a foundation and coming up with useful methodologies
<whitequark[cis]>
which aren't necessarily pursuits that have immediate practical use, kind of like basic science
<whitequark[cis]>
yes, you can't detect a BREAK frame. who cares? we don't even have any way to test the BSP generator (that we've yet to come up with) with such basic functionality as "a peripheral with two register clusters"
<galibert[m]>
Don't mix "It doesn't serve my use case" and "I don't understand what the use case could be" though
<whitequark[cis]>
I don't
<whitequark[cis]>
that said, it's also true that I've focused a lot in wanting to hear everyone's use cases in the core subsystem RFCs (because the core language is relatively mature and it's time to ditch old crutches and fly) so the same assumptions/expectations carried over to the SoC subsystem RFCs
<whitequark[cis]>
which isn't mature at all and so the same considerations do not apply
<adamgreig[m]>
I think this was sort of talked about earlier, but with the streams rfc, if I had a transmitter that generates a new payload every cycle so is always valid, and receivers should always use the fresh value (i.e. we don't want to buffer the very first value for ages until someone transfers it and then suddenly have a new one), the only way to comply with the other requirements is to require ready=1 on the receiver so that a transfer will happen every cycle, right?
<zyp[m]>
yes
<adamgreig[m]>
i've occasionally had a transmitter where valid=1 and it just updates payload every cycle and receivers "know" they'll have to capture it each cycle, I guess
<whitequark[cis]>
yeah, I think always_valid should imply always_ready
<adamgreig[m]>
hm, I can imagine always_valid but not always_ready still making sense, it just means transfers are entirely driven by the receiver
<whitequark[cis]>
oh wait right
<whitequark[cis]>
ok, you know what, I am going to sleep, I'm deliriously tired
<whitequark[cis]>
talk to you tomorrow
<adamgreig[m]>
but crucially the transmitter isn't allowed to update the payload except on a cycle where ready=1, so if it wants to always be updating the payload, it then has to require ready=1
<adamgreig[m]>
goodnight!
<whitequark[cis]>
adamgreig[m]: yeah
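(In signature terms, assuming the RFC's parameters, the free-running case reads:)

```python
from amaranth.hdl import unsigned
from amaranth.lib import stream

# a source that refreshes its payload every cycle must also require a
# receiver that transfers every cycle, so both constants get tied
sig = stream.Signature(unsigned(8), always_valid=True, always_ready=True)
```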
<whitequark[cis]>
nini
<galibert[m]>
'night
<adamgreig[m]>
streams rfc looks really good to me otherwise, definitely fills a long-awaited gap
<whitequark[cis]>
sweet!
<tpw_rules>
zyp[m]: this is my point, i like the idea with a sidecar. you subclass/compose it and can add your own config options. i had tried to share that earlier