<d1b2>
<cptncodon> Thanks for the update @esden, and thank you all for all your hard work. Looking forward to getting mine (be it next year or whenever 🙂 ).
<d1b2>
<esden> Thank you @cptncodon I appreciate it! 🙂 We are all doing our best here.
<d1b2>
<vvx7> just dropping by to say wow the case looks awesome. appreciate all the hard work you cats put into this ^_^
<d1b2>
<esden> Was about to post it o. Desk. The case did indeed come out nicely. I really hope all the 1k units we ordered will pass our inspection. 🧐
<d1b2>
<esden> lol... (thanks auto correct) I meant to say post it on fedi >_<
<sorear>
quick, somebody start the hot new social network Desk
<d1b2>
<esden> /me lists a laundry list of requirements where at least some of them are mutually exclusive 😄
<d1b2>
<perigoso.> Boi, that is one good looking thingamabob
<d1b2>
<esden> @sorear then it should be renamed to D so that it can't be searched in the apple app store ... just for good measure 😄
<whitequark[cis]>
attie: esden: ping
<whitequark[cis]>
the modification I want to make before esden starts shipping is to move the API level into the bcdDevice field
<whitequark[cis]>
right now bcdDevice is: 0x0000 for devices without flashed firmware (the FX2 config block only), 0xA000 for devices with completely erased flash, and 0x01RN where RN is something like C3 for devices with flashed firmware
<whitequark[cis]>
what i want is to make the low byte of bcdDevice in the FX2 config block the primary/only revision field, the revision field in the glasgow config block obsolete/reserved, and the high byte to indicate the API level of a firmware (if there is firmware), or 00 if there's no firmware
<whitequark[cis]>
or in short: high byte = software rev, low byte = hardware rev
<whitequark[cis]>
all of which is available whether there is or isn't the glasgow config block
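A minimal sketch of the proposed layout (high byte = API level, low byte = hardware revision). The function names are hypothetical and are not taken from the Glasgow codebase; the sketch only assumes that an API level of 0x00 means "no firmware", as described above.

```python
def pack_bcd_device(api_level: int, hw_rev: int) -> int:
    """Combine an API level (high byte) and a hardware revision (low byte)."""
    if not 0 <= api_level <= 0xFF or not 0 <= hw_rev <= 0xFF:
        raise ValueError("both fields must fit in one byte")
    return (api_level << 8) | hw_rev


def unpack_bcd_device(bcd_device: int) -> tuple[int, int]:
    """Split a 16-bit bcdDevice value into (api_level, hw_rev)."""
    if not 0 <= bcd_device <= 0xFFFF:
        raise ValueError("bcdDevice is a 16-bit field")
    return bcd_device >> 8, bcd_device & 0xFF


# Assumed example: a revC3 board (low byte 0xC3) running firmware with API level 1.
value = pack_bcd_device(0x01, 0xC3)
api_level, hw_rev = unpack_bcd_device(value)
has_firmware = api_level != 0x00  # 0x00 in the high byte means "no firmware"
```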
<d1b2>
<ewenmcneill> whitequark: FYI, as it comes through the bridge to Discord, the "@" of Attie is doing a "shortest prefix" match, and matching someone else. So: ping @ATtie ^^^^
<d1b2>
<perigoso.> poor AT getting randomly pinged, ping @attiegrande, alas
<d1b2>
<ewenmcneill> Yeah, I noticed after I sent it my ping also failed. My guess this is also related to the Discord username thing: Attie is attiegrande as a username now
<sorear>
is having to wrap when revG becomes a thing a big deal?
<whitequark[cis]>
by 2035 we'll figure something out >_>
<whitequark[cis]>
but also yeah, we could have 0x0? mean revG?
nafod has quit [Ping timeout: 245 seconds]
<d1b2>
<rwhitby> Is there any testing / guinea pig help I can provide with a revC2 device? Testing macOS installs, etc? I plan to forward-port the glasgow-addons control.fusb302 applet for the USB-PD add-on to latest main at some point ...
<d1b2>
<rwhitby> (Just caught up on over 2500 messages in this channel since last I looked)
<d1b2>
<rwhitby> The two items that I am personally interested in are:
<d1b2>
<rwhitby> 1) Background python tasks in parallel with the CLI
<d1b2>
<esden> I think it is a good idea to install the current master of glasgow and see if/how well it works for you. There were a bunch of fixes, including a port of all the applets to native Amaranth.
<d1b2>
<rwhitby> 2) Multiple applets operating in parallel (e.g. USB-PD applet to put the target in a debug mode over USB-PD messaging and then UART or I2C applet running on the SBU signals)
<d1b2>
<rwhitby> ah yes, I will need to make sure the control.fusb302 applet is similarly ported so it doesn't become stale
<whitequark[cis]>
to have (2) working a significant amount of infrastructural work is required
<whitequark[cis]>
it will take months
<d1b2>
<rwhitby> yes, that is understood, and appreciate that it is well off the bottom of the priority list
<whitequark[cis]>
it's actually pretty high up (unless that's what you mean?)
<d1b2>
<rwhitby> oh, well that's good news then.
<whitequark[cis]>
it just requires a ton of engineering work, and until recently I was not in a situation where it was safe to live
<d1b2>
<esden> I think multiple applets at the same time and out of tree applets are the two most requested things I have seen.
<whitequark[cis]>
so not exactly a place where one would do a lot of unpaid engineering work
<whitequark[cis]>
esden: once attie reviews #357 we will have the technical basis for out of tree applets
<whitequark[cis]>
however they will still be disallowed by policy until we can have a saner applet API
<whitequark[cis]>
i might enable them e.g. if the environment variable "GLASGOW_OUT_OF_TREE_APPLETS_ARE_UNSUPPORTED_AND_WILL_BE_BROKEN_BY_FUTURE_UPDATES" is set or something like that
<d1b2>
<rwhitby> yes, I've been developing control.fusb302 in-tree so far, but will be happy to move it out of tree and guinea pig the out of tree applet infrastructure
<whitequark[cis]>
I don't think it requires much testing
<whitequark[cis]>
basically, all of the in-tree applets are currently treated as if they were out-of-tree applets defined by the "glasgow" package already
<d1b2>
<crobi> Excited to hear an update today! And more importantly that Catherine is back and doing better <3
<whitequark[cis]>
there's an if entry_point.dist.name != "glasgow": check there that doesn't let you load any actual out-of-tree applet
<whitequark[cis]>
that's the only difference between "in-tree" and "out-of-tree".
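A rough sketch of the gate described above, assuming the opt-in environment variable floated earlier in the discussion. `Dist` and `EntryPoint` are hypothetical stand-ins that only mirror the `entry_point.dist.name` attribute access quoted above; `loadable_applets` is likewise an invented name, not the actual Glasgow code.

```python
import os
from dataclasses import dataclass

# Hypothetical opt-in variable, as suggested (not decided) in the discussion.
OPT_IN_VAR = (
    "GLASGOW_OUT_OF_TREE_APPLETS_ARE_UNSUPPORTED_"
    "AND_WILL_BE_BROKEN_BY_FUTURE_UPDATES"
)


@dataclass
class Dist:
    name: str  # name of the package that declared the entry point


@dataclass
class EntryPoint:
    name: str  # applet name
    dist: Dist


def loadable_applets(entry_points, environ=None):
    """Return the applet names that the policy check would let through."""
    environ = os.environ if environ is None else environ
    allow_out_of_tree = OPT_IN_VAR in environ
    return [
        ep.name
        for ep in entry_points
        if ep.dist.name == "glasgow" or allow_out_of_tree
    ]


applets = [
    EntryPoint("uart", Dist("glasgow")),
    EntryPoint("control-fusb302", Dist("glasgow-addons")),
]
```

With an empty environment only the in-tree applet is returned; setting the variable lets both through.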
<whitequark[cis]>
for the onlookers: multiple simultaneous applets are hard for the same basic reason TCP and HTTP can be hard (and why QUIC and SPDY were developed)
<whitequark[cis]>
it is really easy for one applet to be unfair to others, or for there to be head-of-line blocking
<whitequark[cis]>
this is a complex engineering problem that no one in the entire world has completely solved
<d1b2>
<rwhitby> Is there a canonical end-user install doc you'd like me to follow as a "dumb user" to find missing bits?
<d1b2>
<rwhitby> Or just do the venv install as I've done previously.
<whitequark[cis]>
rwhitby: I don't really like equating not being specialized in software with lacking intelligence, but that being said, yes I do have such a doc
<sorear>
does "background tasks in parallel with the CLI" mean that you want to run an applet uninterrupted while you reconfigure glasgow to use different applets in parallel? ...
<whitequark[cis]>
sorear: wanna talk about the ideas I have for communication protocols / on-FPGA interconnect?
<d1b2>
<rwhitby> I can report glasgow install on macOS according to instructions is fast, painless, warning-free, and the control.fusb302 applet rebased against latest master works without any changes.
<d1b2>
<rwhitby> @sorear I mean having something in the background responding to incoming USB-PD messages while in the CLI you can do things like send new messages.
<whitequark[cis]>
this is technically already supported
<whitequark[cis]>
nothing stops you from making a REPL that runs something in the background too
<whitequark[cis]>
e.g. look at the UART applet (which runs a tty/pty/socket not a REPL, but same idea)
<d1b2>
<rwhitby> thanks for the pointers, I'll take a look
<sorear>
whitequark[cis]: (protocols) sure
<whitequark[cis]>
okay, let me introduce some baseline constraints:
<whitequark[cis]>
1. we are sending octets from endpoint A to endpoint B, where one of them is on the PC and another of them is on the FPGA
<whitequark[cis]>
2. there are many such endpoints, and they all have to share the bottleneck of the FX2<>FPGA interface, which allows transferring one octet (either direction) per 20.83 ns
<whitequark[cis]>
3. we want all of these transfers to share this bottleneck "fairly", where "fairly" means "without frustrating the user of the tool, as much as possible" but is obviously subject to further elaboration
<whitequark[cis]>
4. we want the actual octets that go from A to B to appear on the FX2<>FPGA interface unmodified (mostly because our USB drivers are in Python and it would be expensive to broadly modify the octets with some sort of line code or sth)
<whitequark[cis]>
that's all four.
<whitequark[cis]>
right now we have, effectively (ignoring the 4-endpoint interface that we have to remove because it causes Windows-related issues) two buffers that are processed in a round-robin interface, meaning we have one FPGA>PC and one PC>FPGA pipe
<whitequark[cis]>
each time a 512-byte packet of data (or a smaller one capped by an end-of-packet signal) is processed, the other direction is handled, repeat endlessly
<sorear>
does USB and the FX2 FIFO give us meaningful and usable framing semantics or is it just a full-duplex reliable byte stream?
<whitequark[cis]>
the existing scheme is severely limiting in ways that are described at length on the issue tracker and I won't rehash that here
<whitequark[cis]>
sorear: excellent question
<whitequark[cis]>
the answer is "no" and also "it's complicated"
<sorear>
i feel like it would be a very good thing if someone who knew the first thing about (a) USB (b) network flow schedulers was answering these questions, but i'll make do
<sorear>
(instead of me)
<whitequark[cis]>
the reason the answer is mainly "no" is because if you want it to be anything but a reliable byte stream, you instantly end up with an insane amount of overhead in software and also overhead that is mandated by the USB controller (not specification)
<sorear>
i can pretend this is a question about interactive CPU scheduling, i sort of understand that
<whitequark[cis]>
so if we only had the FX2 and the FPGA and the USB host was e.g. entirely programmed by us, we could use USB level packets as a framing
<d1b2>
<rwhitby> Is the GLASGOW_TOOLCHAIN variable setting intended to be native,builtin or system,builtin? (just want to know which bit to raise the PR against, doco or code)
<whitequark[cis]>
but because we have Linux, and XHCI, and libusb in the middle, we have no real option besides treating it as just a byte stream, somewhat aggravatingly
<d1b2>
<rwhitby> (doco says native, code requires system as the keyword)
<whitequark[cis]>
sorear: you'll see soon enough why I'm fine with you talking about it
<whitequark[cis]>
so now that we're moving from "one endpoint per side & direction" (four in total) to more endpoints, obviously some kind of framing and scheduling is necessary
<whitequark[cis]>
at first, I naively thought that I'd add endpoint address and length, then route/frame it on the device, a few iterations of this are described in https://github.com/GlasgowEmbedded/glasgow/issues/354
<d1b2>
<rwhitby> system,builtin seems to work as intended, will look into the /.config and /.local issues (probably a missing environment variable for oss-cad-tools at my end)
<whitequark[cis]>
rwhitby: no, it's a known issue, the cause is that I am doing "hermetic" builds and e.g. nextpnr-ice40 is run without any $HOME variable
<whitequark[cis]>
but oss-cad-tools wrap it in a shell script that breaks without $HOME
<whitequark[cis]>
I think this should be fixed in oss-cad-tools
<d1b2>
<rwhitby> ah, yes, I remember reading about that in the backlog now.
<sorear>
probably a good thing in the long run if we're serious about TCP, although that makes me wonder about session semantics (if your connection drops/the CLI is killed, can the CLI reconnect/restart without resetting the applets? may be two different questions)
<whitequark[cis]>
sorear: but there are various icky design decisions when adding address/length that most people would probably not think much about, like which order should they go
<whitequark[cis]>
so I started to think about why it is these two particular things that I'm adding, and what is their fundamental significance
<sorear>
if the address is logically inside the length and you're losing the address a byte at a time a la wormhole routing you have to worry about rewriting the lengths...
<sorear>
if I understood 354 correctly the lengths would be added and removed at the FIFO boundary and everything inside the FPGA would be 9-bit in both directions?
<sorear>
I am unsure if the diagrams with their unidirectional arrows are actually representing bidirectional flows
<sorear>
…weren't we using or planning 16-bit FIFO operation at some point? may complicate this
<whitequark[cis]>
(9-bit in both directions) well, it's not clear. going from PC to FPGA having 9-bit flows (data + eop) is convenient, but the other way not so much
<whitequark[cis]>
(16-bit FIFO operation) no, electrically not possible
<whitequark[cis]>
though anything with FX3 will use a 32-bit FIFO
<whitequark[cis]>
(fundamental significance) continuing the exposition
<whitequark[cis]>
given the constraint (4) above, we can look at the stream of endpoint data octets going through the bottleneck, and see that each one of them is associated with routing information
<whitequark[cis]>
but this association does not need to even be explicit in the octet stream. for example, consider an FPGA with 3 endpoints that transmits an octet stream like 11 22 33 11 22 33 ... where 11 is an octet sent by 1st endpoint, and so on
<whitequark[cis]>
this is a scheme that has zero bandwidth overhead
<whitequark[cis]>
however, it potentially (if any of the 3 endpoints does not produce data at the exact same rate as the other two) has infinite latency overhead
<whitequark[cis]>
this made me realize that it does not actually matter exactly how the framing looks (it can be arbitrarily weird) as long as it does the job of "smearing" the associated routing information for the data octets as thinly over the octet stream as it is feasible
<whitequark[cis]>
moreover, the only purpose the framing bits have is to synchronize the models of the interconnect on the sender side and the receiver side
<whitequark[cis]>
basically, the framing is a compression algorithm for octet routing information!
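To make the zero-overhead example concrete, here is a small sketch of the implicit 11 22 33 11 22 33 scheme, where the routing information lives entirely in the position of each octet. The function names are hypothetical. The stall that the real scheme would suffer when endpoints produce data at different rates is modelled here as an error, which is an assumption of the sketch.

```python
def interleave(streams):
    """Fixed round-robin: one octet per endpoint in turn, no framing octets."""
    if len({len(stream) for stream in streams}) > 1:
        # In hardware this would be an unbounded stall rather than an error.
        raise ValueError("endpoints must produce data at exactly the same rate")
    return bytes(octet for group in zip(*streams) for octet in group)


def deinterleave(data, n_endpoints):
    """Recover each endpoint's octets purely from their position."""
    return [data[i::n_endpoints] for i in range(n_endpoints)]


wire = interleave([b"\x11\x11", b"\x22\x22", b"\x33\x33"])
```

The wire carries exactly as many octets as the endpoints produced, which is the "zero bandwidth overhead" property; the cost is that no endpoint may ever fall behind.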
<sorear>
insofar as we are treating _each endpoint_ as a full-duplex byte stream, yes
<whitequark[cis]>
it does not require being full-duplex
<whitequark[cis]>
and in fact i consider, for routing purposes, all links to be dual simplex
<whitequark[cis]>
or rather at most dual simplex, since the directions are independent
<whitequark[cis]>
so if you look at the specific framing proposals in #354, you can think of these as building blocks of a compression algorithm
<whitequark[cis]>
"address followed by 0x00" compresses "where the next unspecified number of bytes is routed" into this prefix string
<whitequark[cis]>
"length as uleb128 (or uint32 or w/e)" compresses "when the state of the router must be reset"
<whitequark[cis]>
it actually need not be followed by another address. you can think of a router that, after passing length bytes through itself, simply switches to the next endpoint
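A sketch of the two building blocks combined into one frame: an address terminated by 0x00, then a ULEB128 length, then the payload. The exact layout, the requirement that address octets be nonzero, and the function names are assumptions made for illustration; #354 discusses several variants.

```python
def uleb128_encode(value: int) -> bytes:
    """Encode a non-negative integer as ULEB128 (7 bits per octet, LSB first)."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        octet = value & 0x7F
        value >>= 7
        if value:
            out.append(octet | 0x80)  # continuation bit: more octets follow
        else:
            out.append(octet)
            return bytes(out)


def uleb128_decode(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode a ULEB128 integer; return (value, offset past the last octet)."""
    result = 0
    shift = 0
    while True:
        octet = data[offset]
        offset += 1
        result |= (octet & 0x7F) << shift
        if not octet & 0x80:
            return result, offset
        shift += 7


def frame(address: bytes, payload: bytes) -> bytes:
    """Prefix the payload with addr + NUL and a ULEB128 length."""
    if 0 in address:
        raise ValueError("0x00 terminates the address, so it cannot appear in it")
    return address + b"\x00" + uleb128_encode(len(payload)) + payload


def unframe(data: bytes, offset: int = 0):
    """Return (address, payload, offset of the next frame)."""
    end = data.index(0, offset)
    length, start = uleb128_decode(data, end + 1)
    return data[offset:end], data[start:start + length], start + length
```

In compression terms, the address prefix is paid once per run of octets and the length tells the router when to forget it, so longer runs amortize the overhead better.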
<whitequark[cis]>
I think this compression analogy is very powerful because it clearly shows that there cannot be any perfect framing because our choice of framing inherently reflects our uncertainty of where the bytes will go at which time
<whitequark[cis]>
so instead of trying to build some one thing that is supposed to work for all endpoints (and fails at it), I think that endpoints must declare what kind of data is expected, and the interconnect and framing will be generated from it
<whitequark[cis]>
for example, an endpoint that corresponds to a 1-bit register will declare "packet oriented, 1 octet packet max" (for writes, supposing that a read request is a 0 octet packet where the response is written on the complementary pipe, or something like that)
<whitequark[cis]>
inside the FPGA, if you decide to prefix something with a length, and want a forward progress guarantee, there are two options: either you know the length beforehand (e.g. because you will generate that many octets), or because you have a buffer of that many octets, which you will fill and then drain
<whitequark[cis]>
(it's the same on the PC, but on FPGA, buffers are particularly expensive)
<whitequark[cis]>
for stream endpoints, we want to have max throughput, which creates an additional issue: head-of-line blocking
<sorear>
to what extent do we want to support using the FPGA side of applets with a host side other than the python/libusb (or python/tcp) stack? allowing infinitely variable framing that is declared in the python code works against that
<sorear>
i will save questions, carry on
<whitequark[cis]>
if you have e.g. a sole endpoint in an FPGA, it would be good if that would generate data, uninterrupted, at max line rate, but that doesn't work anymore if you want to have e.g. regularly sampled statistics about that data stream
<sorear>
(max line rate) agree
<whitequark[cis]>
the statistics endpoint could in that case have a constraint: "serviced at least every 1000 ms", and then the multiplexer will have a bucket that will count up to 48000000 and once it overflows, switch to the other endpoint. this will also influence the length it will prefix to the data sent from the high throughput endpoint
<whitequark[cis]>
the length can be at most 48000000 of course, but it will be less than that unless the high throughput endpoint can guarantee that when it begins generating a packet, it will always send 48000000 octets at the line rate
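The arithmetic behind the 48000000 figure, as a sketch. It assumes the one-octet-per-cycle, 48 MHz FX2<>FPGA interface from the constraints above; `max_burst_octets` and `Bucket` are hypothetical names, and the bucket is a software model of a counter that would live in gateware.

```python
FIFO_HZ = 48_000_000  # one octet per cycle, i.e. one octet per ~20.83 ns


def max_burst_octets(latency_ms: int) -> int:
    """Most octets another endpoint may send before this one must be serviced."""
    return FIFO_HZ * latency_ms // 1000


class Bucket:
    """Counts octets sent by other endpoints; overflow forces a switch."""

    def __init__(self, limit: int):
        self.limit = limit
        self.count = 0

    def tick(self, octets: int = 1) -> bool:
        """Account for transferred octets; return True when it is time to switch."""
        self.count += octets
        if self.count >= self.limit:
            self.count = 0
            return True
        return False


# "Serviced at least every 1000 ms" translates into a 48000000-octet budget.
stats_bucket = Bucket(max_burst_octets(1000))
```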
<_whitenotifier-8>
[GlasgowEmbedded/glasgow] rwhitby e1da219 - README: Update the doco to match PR#353 ("native"->"system")
<_whitenotifier-8>
[glasgow] github-merge-queue[bot] created branch gh-readonly-queue/main/pr-365-3a783fad9d3f06b026dbb1bb0adac1f0f3352c6d - https://github.com/GlasgowEmbedded/glasgow
<whitequark[cis]>
this will rarely be true, so usually you will stick e.g. a 512 byte buffer in front of it which will be sent once it fills or e.g. once the applet hits "flush" (in our current scheme)
<whitequark[cis]>
there is a number of options possible here if we use something like COBS framing
<whitequark[cis]>
(infinitely variable framing) I think we can stick a self-description of the framing into the FPGA, it will be concise enough since it's O(n) where n is the size of interconnect
<whitequark[cis]>
so with this in mind, the base elements will be something like:
<whitequark[cis]>
- demultiplexer: consumes addr+NUL, connects one of the output streams to the input stream, can be reset on EOP on input stream, on EOP on output stream (usually more efficient, but requires trusting the output stream to report EOP soon enough), or potentially on some other event
<whitequark[cis]>
- multiplexer: the inverse of demultiplexer, prepends addr+NUL, can be reset on EOP on input streams (in which case the full packet will be transmitted) or EOP on the output stream (in which case the packet will be torn apart and it will be necessary for it to have its length be known in some other way)
<whitequark[cis]>
- bufferer: consumes up to a maximum amount of octets into internal buffer, emits length plus octets
<whitequark[cis]>
- lengther: emits length by trusting the source of the data, then emits the octets from it
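As an illustration of one of these elements, a software model of the "bufferer": it collects up to a maximum number of octets, then emits the length followed by the octets. The 2-octet little-endian length and the flush at end of input are assumptions of this sketch; in gateware this would be a memory plus a counter rather than a Python generator.

```python
import struct


def bufferer(octets, max_octets=512):
    """Yield length-prefixed packets holding at most max_octets octets each.

    max_octets must not exceed 65535, since the length prefix is 2 octets.
    """
    buffer = bytearray()
    for octet in octets:
        buffer.append(octet)
        if len(buffer) == max_octets:
            # Buffer is full: the length is now known, so emit the packet.
            yield struct.pack("<H", len(buffer)) + bytes(buffer)
            buffer.clear()
    if buffer:
        # Models an explicit "flush" once the source has nothing more to send.
        yield struct.pack("<H", len(buffer)) + bytes(buffer)


packets = list(bufferer(b"abcde", max_octets=2))
```

A "lengther" would skip the buffer entirely and emit a length it was told in advance, which is cheaper but only safe when the source can be trusted to deliver that many octets.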
<whitequark[cis]>
(EOP on output stream) suppose you're stuffing FX2 buffers. the FX2 arbiter will tell you once the buffer is done, and you know for a fact that whatever you send next will be on a multiple of 512 byte index in the received data
<whitequark[cis]>
you can use this to cut down on sending some length bytes
<whitequark[cis]>
(yes, I lied, you can sort of make use of USB packets as framing. like I said it's complicated)
<whitequark[cis]>
for the gateware registers, where we always know exactly what the length is, buffering isn't necessary at all, you can just generate/use the length on the fly once you know what register it is
<whitequark[cis]>
or omit it completely for that matter
<whitequark[cis]>
e.g. you know that register 1 is 1-byte, register 2 is 2-byte, register 3 is 1-byte, you can do 01 aa 02 ab cd 03 ff to write them all
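A sketch of parsing such a length-free register write stream. The octet string 01 aa 02 ab cd 03 ff parses cleanly when register 2 is taken to be 2 octets wide, which is what this sketch assumes; the width table and the function name are hypothetical.

```python
# Assumed register widths in octets, keyed by register index.
REGISTER_WIDTHS = {1: 1, 2: 2, 3: 1}


def parse_register_writes(data: bytes, widths: dict) -> list:
    """Split a stream of (index, value) writes using only the known widths."""
    writes = []
    offset = 0
    while offset < len(data):
        index = data[offset]
        width = widths[index]  # the index alone determines the length
        value = data[offset + 1:offset + 1 + width]
        if len(value) != width:
            raise ValueError(f"truncated write to register {index}")
        writes.append((index, value))
        offset += 1 + width
    return writes


writes = parse_register_writes(bytes.fromhex("01aa02abcd03ff"), REGISTER_WIDTHS)
```

No length octets appear on the wire because both sides share the same width table, which is the "synchronized model" idea from earlier applied to registers.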
<whitequark[cis]>
there is another topic: ordering
<whitequark[cis]>
requiring everything to happen in-order is completely impractical, because... imagine you are writing 1 gigabyte into the UART. perfectly reasonable, yes?
<sorear>
i'm still not quite following the multiplexer/demultiplexer logic. checking my assumptions, these are both simplex components, and the "output stream" is write-only? checking how EOPs and backpressure work for the fx2 interface
<whitequark[cis]>
yes, each of these components is simplex, the output stream is write-only but it has the classic ready/valid signaling of AXI stream
<sorear>
gigabyte of data on uart is totally reasonable. people still use slip right?
<whitequark[cis]>
probs
<whitequark[cis]>
so if everything is in order, now you cannot (by fiat) do anything with the device until the entire gigabyte is gone down the TX line
<whitequark[cis]>
this is not ideal.
<sorear>
and EOP is in the same direction as ready, but has different semantics that are too complicated to explain here
<whitequark[cis]>
EOP in this case is just a sideband, really; it's more generic than what's in the AXI spec
<whitequark[cis]>
EOP is "some function of the bytes"
<whitequark[cis]>
(ordering) I think endpoints need to be separated into "asynchronous groups"
<whitequark[cis]>
within a single async group (declared when creating the endpoints), everything is in-order
<whitequark[cis]>
however, async groups themselves are guaranteed to be interleaved in a way that makes sure every individual endpoint in each async group is serviced to fulfill its latency requirement
<esden[cis]>
whitequark[cis]: Yes it is since the fade before enumerate patch
<esden[cis]>
I meant to mention it sorry
<whitequark[cis]>
"fade before enumerate patch"?
<whitequark[cis]>
ohh
<sorear>
how are we shipping libusb? are we in a position to easily ship C framing code alongside it?
<whitequark[cis]>
sorear: we rely on python-libusb1, which ships (on Windows) unmodified upstream libusb1 DLLs, and on Linux/macOS uses the system ones
<whitequark[cis]>
it does not have any C code
<whitequark[cis]>
and our installation procedure does not rely on having a C compiler
<whitequark[cis]>
now... I did publish ziglang on PyPI, which means we have an easily accessible C compiler, even on Windows, but dropping a DLL in a temporary directory and immediately LoadLibrary'ing it is seen as sus
<whitequark[cis]>
that being said, you're touching on something else I wanted to bring up
<whitequark[cis]>
network transparent Glasgow would be very cool, but right now we have a bunch of USB requests that are vital
<whitequark[cis]>
and my idea is to squish them into the same sort of system, basically just make these requests look like another endpoint that could be on an FPGA or wherever
<whitequark[cis]>
on the FX2, there is a free pair of endpoints (EP1IN/EP1OUT) that we can use. technically it's not allowed to use them for BULK endpoints because they're only 64 byte, so we could use control endpoints instead, or we could say "sending more than 64 bytes is UB". it doesn't matter a whole lot what we do here
<sorear>
i don't think it's unreasonable to keep using the USB requests for revABCD, and translate them to endpoint data in the glasgow server process; it's not like we're going to implement cdcacm on the fx2
<whitequark[cis]>
I don't understand
<whitequark[cis]>
oh, you mean, dedicated USB control requests?
<whitequark[cis]>
switching to parsing them on the FX2 will likely save us a considerable amount of firmware space because the current "giant if/elif" approach is actually quite wasteful
<whitequark[cis]>
and this means the Rust process that shuffles data between USB and some sort of socket now can be blind to the exact USB details besides "this is the local FX2 pipe, this is the FPGA FX2 pipe"
<sorear>
I mean that if parsing the data stream is too much work for the 8051 it doesn't semantically have to be done there; I haven't thought about USB transactions in much detail and I still haven't actually read the USB spec
<whitequark[cis]>
right. it's less work, probably
<sorear>
i see
<whitequark[cis]>
the idea here is that the Rust process would expose each asynchronous group (where packets go in-order) as its own TCP port, probably by parsing the metadata
<sorear>
how are we doing 480mbps with the 8-bit fifo?
<whitequark[cis]>
480 Mbps is the line rate
<sorear>
what, is it using manchester or something and only getting 240 Mbit/s
<whitequark[cis]>
max BULK bandwidth is approximately 336 Mbps and it is limited by the scheduling considerations
<whitequark[cis]>
meanwhile the 8-bit FIFO at 48 MHz allows us to hit 384 Mbps in theory, which exceeds the number above
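The numbers quoted here can be checked with a few lines of arithmetic. The 336 Mbps BULK figure is taken as given from the discussion; only the FIFO figures are derived.

```python
FIFO_WIDTH_BITS = 8
FIFO_CLOCK_HZ = 48_000_000
BULK_MBPS = 336  # approximate maximum quoted in the discussion, not derived here

# Theoretical FIFO throughput: 8 bits on every 48 MHz clock cycle.
fifo_mbps = FIFO_WIDTH_BITS * FIFO_CLOCK_HZ / 1e6

# Time per octet on the FX2<>FPGA interface, in nanoseconds.
octet_period_ns = 1e9 / FIFO_CLOCK_HZ

# Margin of the FIFO over the achievable BULK bandwidth.
headroom_mbps = fifo_mbps - BULK_MBPS

print(f"{fifo_mbps:.0f} Mbps FIFO, {octet_period_ns:.2f} ns per octet, "
      f"{headroom_mbps:.0f} Mbps headroom over BULK")
```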
<whitequark[cis]>
USB uses NRZI with bit stuffing, where six ones in sequence trigger insertion of a fake bit
<whitequark[cis]>
it's like the cheap knockoff of 8b10b
<whitequark[cis]>
anyway, it's well documented we can hit maximum USB 2 bandwidth with the Glasgow and Python
<whitequark[cis]>
actually, it requires a USB 3 controller to do that, since EHCI has its own overhead, or possibly scheduling issues, that doesn't let it do that
<whitequark[cis]>
with a USB 3 controller you can hit the theoretical maximum easily
<whitequark[cis]>
re: framing for asynchronous groups, someone (marcan iirc?) proposed that the first byte of every USB packet would be the endpoint number
<whitequark[cis]>
every time you send or receive a non-full-sized USB packet it kills your bandwidth because the controller starts scheduling other devices more often
<whitequark[cis]>
so you need to have a "real" framing there
<sorear>
i remember that
<whitequark[cis]>
what makes it a little easier is that the FX2 (and the USB) works in 512-byte chunks, and you know you can buffer at least 512 bytes into the FX2 each time it has a, well, spare buffer
<whitequark[cis]>
so we can evict a bunch of buffers from the FPGA into the FX2, leaving just the associated counters (which would free up a LOT of BRAM)
<sorear>
are we using master-fifo or slave-fifo mode?
<sorear>
whether the control signals and clocks are generated on the fx2 or fpga, i think, but I've only read a few pages of the fx2 manual this year
<whitequark[cis]>
the FPGA is generating the control signals
<whitequark[cis]>
that is the fastest way to talk to the FX2
<whitequark[cis]>
also, I think 16-bit mode might actually be slower, or not let you know when you have a partial read, or something
<whitequark[cis]>
it's useless unless you're making like, a cheap USB ATAPI converter
<sorear>
i understand that now
<sorear>
using the UART example, TX and RX probably need to be separate async domains because one can stall without the other. we want the configuration to be synchronous to _both_ - it's not useful to send a byte and a baud rate change and have the byte be sent at a nondeterministic baud rate.
<sorear>
would it make sense for any such applet to have one set of config registers per async domain?
<sorear>
would it make sense to extend that to require one _reset_ per async domain, and no global reset?
<whitequark[cis]>
these are really good (and hard) questions. for context, right now we use the fact that a certain USB request (Set Interface) is required to flush the buffers (and it's serialized wrt other USB requests). this is a very big hammer though
<sorear>
i'm thinking there may be relevant examples of applets that can't be broken into completely independent async domains, and require non-transitive relationships (A and B are both synchronous with C, but asynchronous with each other)
<whitequark[cis]>
and it wouldn't work unless I used an out-of-band command to assert applet resets
<whitequark[cis]>
I think the ideal here is that Set Interface would be used exactly once, at startup, and that it would cause the FX2 to reset the entire FPGA automatically (probably the only actual I2C register)
<sorear>
does Set Interface have an explicit acknowledgement which is ordered relative to all other device->host messages?
<whitequark[cis]>
like all control requests, yes
<whitequark[cis]>
but in this particular case it also resets the in-kernel buffers and such
<whitequark[cis]>
specifically it will interrupt any scheduled transfers
<whitequark[cis]>
so by the time Set Interface returns you know there is nothing in-flight (provided the Glasgow-side queues are also emptied)
<whitequark[cis]>
re not being possible to break into completely independent async domains: yeah, probably
<whitequark[cis]>
it's difficult to even figure out how to serialize a reset
<whitequark[cis]>
I don't fully know
<whitequark[cis]>
I think the answer might be "we need to transfer an end-of-packet/end-of-stream over to the PC side"
<whitequark[cis]>
and with UART the packets would just be infinitely long unless explicitly terminated (e.g. by a reset)
<sorear>
i think that most configuration register writes need to be acknowledged so that they can be ordered relative to IN packets whose interpretation they affect
<whitequark[cis]>
I don't know how you would expose this in software
<whitequark[cis]>
like, yes, I can see there being a protocol like... a write is width bytes, responded with 0 bytes, a read is 0 bytes, responded with width bytes
<whitequark[cis]>
nice and symmetric and it puts markers into the overall stream
<whitequark[cis]>
except, is that really the case?
<whitequark[cis]>
you'd need the write to be performed at the moment when the reply is leaving the endpoint
<whitequark[cis]>
I guess you could do that yeah
<whitequark[cis]>
(expose this in software) ah, await reg.write(...) does it
<whitequark[cis]>
the await statement doesn't return until the write is acknowledged
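A sketch of how the symmetric protocol could look from the software side: a write is `width` octets answered by 0 octets, a read is 0 octets answered by `width` octets, and `await reg.write(...)` only returns once the acknowledgement arrives. `FakeRegister` is a hypothetical in-memory stand-in, with asyncio queues in place of the real pipes; it is not an existing Glasgow API.

```python
import asyncio


class FakeRegister:
    """In-memory stand-in for a gateware register behind a pair of pipes."""

    def __init__(self, width: int):
        self.width = width
        self._value = bytes(width)
        self._requests = asyncio.Queue()
        self._responses = asyncio.Queue()

    async def serve(self):
        """Device side: nonempty packet is a write, empty packet is a read."""
        while True:
            packet = await self._requests.get()
            if packet:
                self._value = packet
                await self._responses.put(b"")  # zero-octet acknowledgement
            else:
                await self._responses.put(self._value)

    async def write(self, value: bytes):
        if len(value) != self.width:
            raise ValueError("value must be exactly `width` octets")
        await self._requests.put(value)
        ack = await self._responses.get()  # does not return until acknowledged
        assert ack == b""

    async def read(self) -> bytes:
        await self._requests.put(b"")
        return await self._responses.get()


async def demo() -> bytes:
    reg = FakeRegister(width=2)
    server = asyncio.create_task(reg.serve())
    await reg.write(b"\x12\x34")  # ordered before anything that follows
    value = await reg.read()
    server.cancel()
    await asyncio.gather(server, return_exceptions=True)
    return value
```

Because the acknowledgement travels on the same pipe as read responses, it also acts as a marker that orders the write relative to later IN data.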
<whitequark[cis]>
I'm starting to like this
<sorear>
any chance we can continue this tomorrow? i'm having trouble focusing on the requirements at this point
<whitequark[cis]>
sure
<whitequark[cis]>
see you
<esden[cis]>
<esden[cis]> "Great I will test it in a few" <- Ok, tested it, it seems to be doing the right thing now.
GNUmoon has quit [Read error: Connection reset by peer]
GNUmoon has joined #glasgow
<d1b2>
<attiegrande> I'd like to draw up my thinking on the routing stuff... see what you think.
<whitequark[cis]>
sure
<d1b2>
<attiegrande> I was going for more - a router consumes a byte, and addresses the downstream. it's cut through until downstream "releases" it
<d1b2>
<attiegrande> then a "consumer(?)" signals back to indicate it's there and wants data... it can take whatever logic it likes to determine length, before releasing a "busy/more" type signal back upstream
<d1b2>
<attiegrande> I'm always better with diagrams, will try to put something together before the meeting
<d1b2>
<attiegrande> (this results in a series of bytes that describe the route into the FPGA, and then an endpoint that might be variable length (e.g: standardize on uleb128), fixed length with no len field in the payload, or a zero length "signal", etc...)
<d1b2>
<attiegrande> for return path, routers add the port address
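One possible reading of this routing idea, as a sketch: each router consumes one address octet on the way in, and on the way back each router prepends its port address. The nested-dict topology, the endpoint names, and the function names are all hypothetical; the "release" signalling between router and consumer is not modelled.

```python
def route(tree, data: bytes):
    """Walk nested routers, consuming one address octet per hop."""
    node = tree
    offset = 0
    while isinstance(node, dict):  # a dict models a router, anything else an endpoint
        node = node[data[offset]]
        offset += 1
    return node, data[offset:]  # endpoint plus the octets left for it


def return_path(ports, payload: bytes) -> bytes:
    """Each router on the way back prepends its port address."""
    for port in reversed(ports):  # innermost router adds its port first
        payload = bytes([port]) + payload
    return payload


# Assumed topology: port 1 leads to a second router, port 2 to a register bank.
TREE = {1: {0: "uart", 1: "i2c"}, 2: "registers"}
endpoint, rest = route(TREE, bytes([1, 1, 0xAA]))
```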
nyanotech has quit [Quit: No Ping reply in 180 seconds.]
<d1b2>
<esden> aka, just factory or factory and flash and if we have everything in place for that to happen.
<d1b2>
<attiegrande> yes, very sorry - I've had a cursory look at a couple of PRs, but haven't been able to get my head into them properly yet
<electronic_eel>
esden: don't you need a working selftest for shipping? or did someone implement that already?
<d1b2>
<attiegrande> selftest does indeed need some attention
<d1b2>
<esden> What I have is good enough to ship the first units. (loopback works)
<d1b2>
<attiegrande> ah, great
<d1b2>
<esden> and voltage test also works
<d1b2>
<esden> which tests the chain
<d1b2>
<esden> We are missing LVDS IO check and Trigger port test.
<d1b2>
<esden> this is what I would like to see.
<electronic_eel>
yes, that should be ok for the first units
<electronic_eel>
for testing lvds i think we'd need some extra test board or test jig
<d1b2>
<esden> For the units that pass the current tests, the probability that they work fully is high enough for me to go ahead and ship them. If someone ends up having a problem with LVDS I am planning to replace their units. But very few people will be using them from the get go.
<d1b2>
<esden> for LVDS I "just" need some type of cable to connect to another glasgow and a test routine.
<d1b2>
<esden> it is more tricky than that as a normal ribbon cable does not fit next to the IO connector so a riser or adapter PCB is needed
<electronic_eel>
esden: do you have a working test jig for glasgow?
<electronic_eel>
i remember you posted some pictures from a chinese test jig supplier some time ago
<d1b2>
<esden> no, I am doing the tests using a ribbon cable I made
<d1b2>
<esden> I did not have the time to build a full jig.
<d1b2>
<esden> But I had time to start prototyping a process for making a jig on a simpler project. aka Black Magic Probe... it is not finished either.
<d1b2>
<esden> (I got that done last week finally... but I now know what tolerances to have to build something similar for glasgow. Which will need top and bottom connections unfortunately which makes things harder.)
<electronic_eel>
wouldn't some kind of loopback pcb be enough for testing the lvds? where you connect one lane back to another lane. but not one right next to it so it would detect shorts
<d1b2>
<esden> Potentially. Yes. I have not thought through all the scenarios that it does or does not cover. We want to detect shorts between pins and opens of a pin.
<d1b2>
<esden> I guess if the connections are between pins that are far enough from each other this could work.
<d1b2>
<attiegrande> I was wondering about a series of shift-registers (or similar) on port A/B, which then interfaces with all the LVDS pins
<d1b2>
<attiegrande> (these pins can be operated in standard TTL mode, right? might that miss any potential issues)
<electronic_eel>
yes, connecting a+b ports to lvds pins would also work for testing
<electronic_eel>
you have to feed the correct voltage into the lvds port. but that is easy from the a+b ports
<d1b2>
<esden> Ideally we would like to get to a stage where we have a setup that can run through factory flash selftest print_serial in one go without any hardware shuffling
<d1b2>
<esden> factory needs hard reset... I know
<d1b2>
<attiegrande> yes, true... the supporting system could run the shift regs
<d1b2>
<attiegrande> (or a second Glasgow)
<d1b2>
<attiegrande> LVDS <-> LVDS
<d1b2>
<esden> Yeah that was my idea, to use a Glasgow to test the Glasgows.
<d1b2>
<attiegrande> nice
<d1b2>
<esden> We can add extra hardware if needed to mux things.
<d1b2>
<esden> Selftest is important for end users to check system validity.
<d1b2>
<esden> But here making the boards we can have an extra Glasgow in the test jig.
<d1b2>
<attiegrande> The supporting glasgow could also interface with the Trigger I/O
<d1b2>
<attiegrande> I'd be very keen to have any production jigs / setup documented in the repo... happy to discuss that with you later
<Wanda[cis]>
<del>imagining the Portal 2 scene with turret testing facility, with someone substituting the template glasgow with an evil defective glasgow and destroying a whole production run</del>
<d1b2>
<esden> If we add some solid state switches we can also measure the power rail voltages. But something dumb and simple to get started where we just hook all io of one Glasgow to another is totally sufficient.
<d1b2>
<attiegrande> great
<d1b2>
<attiegrande> is there anything else you'd like to discuss on the production testing side?
<d1b2>
<attiegrande> sounds like you have things in-hand... any specific actions / requests?
<esden[cis]>
Wanda: HEHE :D
<d1b2>
<attiegrande> re portal: 🙈
<electronic_eel>
looks like there are more people that are in here three times ;)
<d1b2>
<esden> uuugh... so test jig building is 100% on me... it has to be done by me, applet for a glasgow to test another glasgow is something that could be done by someone else if they are so inclined, to speed things up and help out.
<galibert[m]>
they need to have two glasgows though, which is not the case of most people :-)
<d1b2>
<attiegrande> my minutes so far state that selftest needs attention, but isn't a blocker (yet), and is more for end-users... is that correct?
<d1b2>
<attiegrande> a 2-part prodtest type applet is what you're referring to, with the jig... correct?
<d1b2>
<esden> The current test that we have is good enough to start shipping ... I am hoping they will all be fine, but if not I will definitely replace the units. (at this point shipping something is more important than having a perfect test)
<d1b2>
<esden> @galibert that is true, this is why the selftest remains. But we are making thousands of those boards and need a more efficient and less fiddly test setup.
<d1b2>
<esden> so my request is for us aka 1BitSquared ... not for end users and those that make a few boards by hand
<galibert[m]>
Sure, I was more answering to the "applet for a glasgow to test another glasgow is something that could be done by someone else if they are so inclined"
<d1b2>
<esden> ohh right... true
<d1b2>
<esden> Some folks have multiple boards... developing the test would work between two different revC boards that are not the exact number
<d1b2>
<attiegrande> I could probably help with this, but might appreciate any interface PCBs you produce
<d1b2>
<esden> also... a blind implementation, that is not perfect is better than me having to find time to write the whole thing 🙂
<d1b2>
<attiegrande> let's discuss it further outside the meeting?
<d1b2>
<esden> ok sure, sorry. And yes, I will let you know @attiegrande. If you or someone else wants to work on this, they will get all the necessary hardware to help with the work.
<d1b2>
<attiegrande> no worries - good to know (i have multiple glasgows, so should be good to go 🤞)
<d1b2>
<attiegrande> ok, next topic - #362 / check if device is busy before loading firmware
<electronic_eel>
i think a lvds loopback would allow you to test the lvds with just one glasgow. that would make it easier for other people than 1b2 to use this too (and also help developing this)
<d1b2>
<attiegrande> @whitequark - i took a look at this the other day, and tried to reproduce my issue with the pre-PR code, but couldn't
<d1b2>
<attiegrande> do you have a repro for this / could you remind me?
<whitequark[cis]>
open an uart tty session, bump the API level elsewhere, run glasgow list
<d1b2>
<attiegrande> ahh, of course, sorry - bumping the API is probably the key here
<d1b2>
<attiegrande> I'll get back into that one and finish the review
<d1b2>
<attiegrande> next topic - #357 / plugin system for applets
<d1b2>
<esden> @electronic_eel I am happy to make an adapter for that kind of test too. We don't have to leave it at GG<->GG we can also improve the palette of self tests.
<d1b2>
<attiegrande> anything open to discuss here, or just waiting on my review & input?
<whitequark[cis]>
latter
<d1b2>
<attiegrande> (and thanks for the detailed commit messages!)
<galibert[m]>
applets, are they pc side, fpga side or both?
<d1b2>
<attiegrande> ok, sorry / thanks for patience
<sorear>
i wonder how far you could get on testing just by measuring the parasitic capacitances of the I/Os in a clean dry environment
<d1b2>
<attiegrande> @galibert - both
<whitequark[cis]>
galibert: the crowdsupply page has a good diagram
<d1b2>
<esden> It is pretty complete as of this moment. The cases package currently seems to be stuck in Japan with some shipping exception, I am working with the vendor to resolve it.
<d1b2>
<esden> Thanks @galibert I really appreciate that. 🙂
<d1b2>
<esden> I am still working on finalizing the serial number sticker printing situation. Made some progress on it today as shown in the photo posted earlier.
<galibert[m]>
Who's getting SN #42? ;-)
<d1b2>
<esden> Cathrine fixed a led test merge issue. And did a provisioning flashing fix so that the firmware is ready to go onto the early bird backer glasgows.
<galibert[m]>
the flashing is for the 8051?
<d1b2>
<esden> I will likely finish testing (with the tests we currently have) and flashing the early bird glasgows tomorrow. (if my serial number sticker software work is wrapped up today)
<Wanda[cis]>
galibert[m]: yes
<galibert[m]>
make sense
<d1b2>
<attiegrande> very cool - well done!
<d1b2>
<attiegrande> that's 200x units iirc?
<d1b2>
<esden> My family offered to help out verifying and packing the cases if they arrive while I am gone for BlackHat and DefCon in a week. So that we can ship the glasgows and cases asap without experiencing a delay of me travelling.
<galibert[m]>
(just ordered my unit, wonder how real the feb 29th date is, update seems to show that things are much further than that)
<d1b2>
<esden> we will be shipping 204 early bird glasgows (there was a glitch in CS software and more got ordered in early bird than 200) and 1000 cases to Mouser.
<d1b2>
<attiegrande> ha
<galibert[m]>
(much further advanced I mean)
<d1b2>
<esden> the 1000 cases is obviously dependent on all of them passing QA
<d1b2>
<attiegrande> ofc
<d1b2>
<attiegrande> fancy putting a date on shipping to Mouser? 🙃
<d1b2>
<attiegrande> ... and any visibility on their turnaround?
<d1b2>
<esden> lol ... NO 😛
<d1b2>
<attiegrande> (totally fine to say NO!)
<d1b2>
<attiegrande> haha
<d1b2>
<esden> but from experience it usually takes mouser two weeks from the product hitting their dock to them shipping the stuff
<d1b2>
<attiegrande> ok, good to know
<d1b2>
<attiegrande> anything else to add on this topic?
<d1b2>
<esden> I will definitely let you all know here when we ship the boxes to Mouser... I know you all are eager to know.
<electronic_eel>
oh, i didn't expect them to take that long for just stocking the stuff
<d1b2>
<esden> yeah, me neither ... they are extremely slow and confused on the regular
<d1b2>
<esden> completely out of my control unfortunately
<electronic_eel>
yeah, sure
<d1b2>
<attiegrande> of course, "it is what it is"
<d1b2>
<attiegrande> great, thanks for the update
<d1b2>
<attiegrande> any other business before we come on to #354 / routing?
<d1b2>
<attiegrande> (if we even want to discuss here / now)
<d1b2>
<esden> I don't have anything more on my end. 🙂
<d1b2>
<attiegrande> ok, thanks all
<d1b2>
<attiegrande> many thanks for all your work recently everyone, @whitequark in particular
<d1b2>
<attiegrande> Let's leave #354 / routing for now?
<d1b2>
<esden> For those lurking, if you can afford it, please consider supporting @whitequark she does need financial help. https://www.patreon.com/whitequark
<d1b2>
<attiegrande> 👆 thanks @esden
<whitequark[cis]>
yeah we don't have to discuss #354 right now
<d1b2>
<attiegrande> ok
<d1b2>
<attiegrande> i'll do what I can to get my head into those PRs soon
<d1b2>
<attiegrande> @rwhitby - I wasn't about, but thanks for running through the getting started... extra points for doing it on a Mac, as I suspect we don't have great coverage in the team
<d1b2>
<attiegrande> I'd do me best for the same, but literally "don't have the ability" when it comes to Apple stuff... I suspected a friend / fingers crossed were involved
<d1b2>
<attiegrande> *my
<whitequark[cis]>
hm, I should update the CI so that it doesn't rebuild the software on docs changes
<d1b2>
<attiegrande> is that a global CI thing, or can it be restricted to jobs? (e.g: I'm hoping we'll get to building & publishing docs at some point)
<whitequark[cis]>
oh, but it's a required check...
<whitequark[cis]>
it's per workflow
<d1b2>
<attiegrande> ok
<whitequark[cis]>
I guess a more important question is why does the cache not work
<d1b2>
<attiegrande> can "cannot set I/O port(s) A voltage to 3.3 V" be reworded to "cannot set I/O port(s) A voltage to 3.3 V (limit is x.x)" or something?
<whitequark[cis]>
right now the specific type of failure is just not reflected at all anywhere
<electronic_eel>
yeah, telling the user that there is a limit programmed would be nice
<whitequark[cis]>
and it could be a generic I2C failure for example
<d1b2>
<attiegrande> oh i see
<whitequark[cis]>
I guess on an error the driver should read the voltage limit and check if the requested voltage exceeds it
<whitequark[cis]>
(otherwise it's another roundtrip that slows down device configuration)
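A minimal sketch of that driver-side check, assuming a device object with a setter that raises on failure and a separate call that reads the programmed limit. `FakeDevice`, `set_voltage_mv`, `get_voltage_limit_mv`, and `set_voltage` are hypothetical names for illustration, not the actual Glasgow API. The limit is read only on the error path, so the success path costs no extra roundtrip.

```python
class FakeDevice:
    """Stand-in for a device with a programmed voltage limit (illustrative only)."""

    def __init__(self, limit_mv: int):
        self.limit_mv = limit_mv
        self.limit_reads = 0  # counts extra roundtrips spent reading the limit

    def set_voltage_mv(self, port: str, millivolts: int) -> None:
        # The firmware reports only a generic failure, with no reason attached.
        if millivolts > self.limit_mv:
            raise IOError("command failed")

    def get_voltage_limit_mv(self, port: str) -> int:
        self.limit_reads += 1
        return self.limit_mv


def set_voltage(device, port: str, millivolts: int) -> None:
    """Set a port voltage; on failure, check whether the limit explains it."""
    try:
        device.set_voltage_mv(port, millivolts)
    except IOError as err:
        # Only now pay for the extra roundtrip.
        limit_mv = device.get_voltage_limit_mv(port)
        message = f"cannot set I/O port(s) {port} voltage to {millivolts / 1000:.1f} V"
        if millivolts > limit_mv:
            message += f" (limit is {limit_mv / 1000:.1f} V)"
        raise ValueError(message) from err
```

If the requested voltage is within the limit, the error is reported without the suffix, since the cause could be something else such as a generic I2C failure.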
<whitequark[cis]>
I'll do this in a sec
<whitequark[cis]>
thank you electronic_eel for your work, and sorry for the confusion
<electronic_eel>
yeah, checking from the driver should be easy to implement. extending the error message from firmware would be much more work
<electronic_eel>
whitequark[cis]: no worries
<whitequark[cis]>
and it expands firmware size too
<whitequark[cis]>
I want to switch to a pipe-style communication with the firmware too, just like it's planned for applets
<whitequark[cis]>
that should actually make the firmware smaller
<whitequark[cis]>
and we could define some sort of RPC protocol that would be usable even to non-Python hosts potentially
<electronic_eel>
whitequark[cis]: no more control messages?
<galibert[m]>
No control, only data
<whitequark[cis]>
yes pretty much
<whitequark[cis]>
API level is already folded into bcdDevice
<electronic_eel>
so the firmware would get its own in / out ports for that?
<whitequark[cis]>
there is an issue that EP1IN/EP1OUT cannot strictly speaking be used as a bulk endpoint because maxPacketSize for BULK HS must be 512, but those endpoints are only 64
<whitequark[cis]>
and I guess the FPGA bitstream is bigger than 64 bytes
<whitequark[cis]>
so we might have to use just one control endpoint in the end
<galibert[m]>
Or it's a very small fpga
<whitequark[cis]>
you can transmit up to 4K to a control endpoint (not sure why 4K exactly)
<ali_as>
Scots single knife, PAL inside.
<whitequark[cis]>
so there would have to be some packetization code that slices it into ... I don't know, probably 64 byte chunks in the end because EP0BUF is also 64 bytes long and you might as well do that
<whitequark[cis]>
electronic_eel: USB cursedness aside, yes, the firmware would get its own in/out port.
<electronic_eel>
or would the packetization go to the control endpoint? since we already have to do something similar to send the fpga bitstream?
<galibert[m]>
the ports would be a protocol construct independent of the channel which is just a reliable stream of bytes?
<whitequark[cis]>
galibert: oh?
<whitequark[cis]>
I mean that there would be a reliable stream of bytes to the FPGA and also a similar one to the FX2
<whitequark[cis]>
this is important for network transparency
<galibert[m]>
you're making the fx2 and the fpga independent usb endpoints?
<whitequark[cis]>
well they already are, I'm making the FX2 use a stream of bytes type protocol instead of USB control messages
<sorear>
how are we handling reset ordering without SetInterface?
<whitequark[cis]>
you still have SetInterface
<sorear>
do we want to be able to combine the two streams of bytes into one stream of bytes?
<whitequark[cis]>
but it would be invoked e.g. when there is a new TCP connection to the FPGA pipe exposed by the Rust proxy
<galibert[m]>
I thought there was only one flow to/from the fx2, and the fx2 managed the routing to the fpga as needed
<whitequark[cis]>
combining the two is probably useful for things like SSH forwarding and awful firewall
<whitequark[cis]>
galibert: I don't know what is "routing to the FPGA"
<galibert[m]>
giving it the bytes, telling it what part of the chip they're for
<whitequark[cis]>
I don't understand
<galibert[m]>
I suspect I don't understand and am messing up accordingly :-)
<whitequark[cis]>
you're using terminology that's nowhere else to be found
<whitequark[cis]>
which is why it's confusing
<ali_as>
Meaning why is there a separate end point and not a protocol for both operating with one end point.
<sorear>
having administered mosh I'd rather not deal with something that uses multiple listening port numbers, but N sockets to 1 destination port identified by something in the handshake could be an option
<galibert[m]>
yeah. You know what, I'll look at the existing code, which will be much better to actually understand what I'm talking about
<whitequark[cis]>
sorear: yeah it would be a pain, putting a socket multiplexer in front of this thing is probably just fine
<galibert[m]>
Because I don't really see the point in exporting my own confusion :-)
<whitequark[cis]>
sorear: it even fits into our existing scheme where you have endpoint prefixed to the rest of communication
<whitequark[cis]>
so I think it's good. one socket it is then
<whitequark[cis]>
0th byte of address selects {FX2, FPGA}; for the FPGA, 1st byte of address selects {async group 1, async group 2, ...}; etc
<whitequark[cis]>
I feel like I've reinvented a part of SNMP
<electronic_eel>
you are missing the asn.1
<electronic_eel>
you don't get the true snmp feel without asn.1
<whitequark[cis]>
I do need some sort of serialization schema
<whitequark[cis]>
currently I'm considering doing some sort of DIY serialization format, with the expectation that it will be easy to parse on the FPGA using Amaranth tools
<whitequark[cis]>
most serialization formats aren't really suitable because they're built for random access
<electronic_eel>
here we have packets with a defined start and end that could be used, right?
<whitequark[cis]>
for the FPGA, USB doesn't really give you useful framing for a number of reasons
<whitequark[cis]>
for the FX2, yes, we could easily coax USB into giving us framing
<whitequark[cis]>
e.g. wLength<64 means packet is done, wLength==64 means there is more
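A minimal sketch of that framing rule, assuming 64-byte chunks to match the EP0BUF size mentioned earlier. The helper names are hypothetical. One consequence of the rule is that a message whose length is an exact multiple of 64 needs a trailing zero-length chunk to mark the end.

```python
CHUNK_SIZE = 64  # matches the 64-byte EP0BUF discussed above


def split_into_chunks(message: bytes) -> list[bytes]:
    """Slice a message so that a chunk shorter than CHUNK_SIZE ends it."""
    chunks = [message[i:i + CHUNK_SIZE] for i in range(0, len(message), CHUNK_SIZE)]
    # An empty message, or one ending on a full chunk, needs an explicit terminator.
    if not chunks or len(chunks[-1]) == CHUNK_SIZE:
        chunks.append(b"")
    return chunks


def reassemble(chunks: list[bytes]) -> bytes:
    """Concatenate chunks up to and including the first short one."""
    message = bytearray()
    for chunk in chunks:
        if len(chunk) > CHUNK_SIZE:
            raise ValueError("chunk exceeds CHUNK_SIZE")
        message += chunk
        if len(chunk) < CHUNK_SIZE:
            return bytes(message)
    raise ValueError("stream ended without a short chunk")
```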
<whitequark[cis]>
one downside of using control requests for this is that on Linux, two processes can send control requests to a device simultaneously
<whitequark[cis]>
this would not be a problem if we made EP1IN/OUT into an endpoint
<electronic_eel>
that was the issue attie mentioned