<whitequark[cis]>
the yellow-orange wire is GND+TX_CLK
<whitequark[cis]>
before it was... something else. i think the black one?
<whitequark[cis]>
basically it was not a wiring harness of honor
<Wanda[cis]>
... absolutely no priest would bless this thing
<whitequark[cis]>
good
redstarcomrade has quit [Read error: Connection reset by peer]
<josHua[m]>
it sounds like somebody needs a littleprobe!
<whitequark[cis]>
wtf is a littleprobe
<josHua[m]>
littleprobe is my $0.80 solder-in transmission line probe. most of the logs for it are on Discord, and there are definitely improvements that should be made for rev2
<josHua[m]>
but roughly it is a flex with a SMA connector, and a 450 ohm resistor in series with the SMA, and a 3D printed base to hold the whole thing down that you bluetak to your board. so solder it in, bluetak it down, and connect an SMA and then connect a SMA-to-BNC to your scope (and, if your scope does not have 50ohm termination built in, add a 50 ohm terminator). good enough for 375 MT/s LVDS.
<josHua[m]>
forms a 10x probe
<josHua[m]>
it can probably be done with just a SMA connector and a 450 ohm resistor and some wires, and some hot melt. or cutting some coax in half and soldering an axial THT resistor in, also works. but this made me feel like a pro for fabbing out a flex and for having a solder-in flex probe, and that's the most important thing
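The divider arithmetic behind that "10x" figure, as a quick sketch (assuming the 450 ohm series resistor and 50 ohm scope termination described above):

```python
# Resistive divider formed by the probe's series resistor driving the
# scope's 50 ohm termination, per the littleprobe description above.
R_SERIES = 450.0  # ohms, in series behind the SMA
R_TERM = 50.0     # ohms, scope-side 50 ohm termination

# Attenuation ratio seen at the scope: (450 + 50) / 50 = 10, i.e. a 10x probe
attenuation = (R_SERIES + R_TERM) / R_TERM
```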
<whitequark[cis]>
but there's no scope probes in sight
<whitequark[cis]>
nor transmission lines
<josHua[m]>
I assure you there are always transmission lines!
<josHua[m]>
(the reply was to the Rigol scope shot, though)
<whitequark[cis]>
i'm on matrix and don't see replies
<josHua[m]>
ahh
<whitequark[cis]>
it sometimes works, but so rarely that it's not worth considering
<josHua[m]>
well, now you know to what I meant to reply
<whitequark[cis]>
it might work in one direction only?
<whitequark[cis]>
but yeah i actually assembled a chain of SB_LUT4 because i did not want to remove a single resistor from this board
<whitequark[cis]>
i definitely did not want to bother with a solder-on probe or hot melt
<josHua[m]>
yeah, it does work matrix -> discord. obviously it does not work -> irc, but in this case, I tabbed over to discord so that I could reply. hah.
<whitequark[cis]>
which made it worse!
<joshua_>
a lesson has been learned
<Wanda[cis]>
wheeeeee we now compute proper FCS in gateware
<Wanda[cis]>
I like the amaranth.crc.catalog.CRC32_ETHERNET Just Solving The Problem
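As a software reference model (not the gateware itself): the Ethernet FCS is CRC-32/ISO-HDLC, the same algorithm as the CRC32_ETHERNET catalog entry and as Python's `zlib.crc32`, so gateware output can be sanity-checked from the host side:

```python
import zlib

def ethernet_fcs(frame: bytes) -> int:
    """CRC-32 as used for the Ethernet FCS (reflected, polynomial
    0x04C11DB7, init and xorout 0xFFFFFFFF); identical to zlib.crc32."""
    return zlib.crc32(frame) & 0xFFFFFFFF
```

The standard check vector for this CRC family is `ethernet_fcs(b"123456789") == 0xCBF43926`.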
<whitequark[cis]>
adamgreig: yes, absolutely fantastic work
<whitequark[cis]>
only needs a little touch-up for stream compatibility in the next release :D
<whitequark[cis]>
omnitechnomancer: glasgow basically acts as a MAC
<whitequark[cis]>
i'm honestly shocked that anything that works over HTTP works at all
<whitequark[cis]>
HTTPS, in fact
<whitequark[cis]>
much less, like, Element, which barely functions at all as it is
<Wanda[cis]>
watching the video thumbnail load was an experience
<whitequark[cis]>
i get realtime typing notifications because Wanda's machine sends an HTTPS request over the glasgownet
<Wanda[cis]>
it works great for small packets
<Wanda[cis]>
large packets, however....
<Wanda[cis]>
oh nice
<Wanda[cis]>
going to 100Mbit actually works and improves the experience
cr1901_ has joined #glasgow
cr1901 has quit [Ping timeout: 268 seconds]
<Wanda[cis]>
hmmmmmm
<Wanda[cis]>
the timing on this thing is actually good enough that we could actually think of making it do 1Gbit
<whitequark[cis]>
Info: Max frequency for clock 'multiplexer.U$$0.U$$0.rx_clk_$glb_clk': 138.83 MHz (PASS at 12.00 MHz)
<Wanda[cis]>
but it'll involve a bunch of infrastructure work that requires Amaranth 0.5...
<whitequark[cis]>
we actually forgot to constrain the clock properly, oops. but nextpnr managed to do 125 MHz with all the async FIFOs and stuff anyway
<whitequark[cis]>
yes, the reason I've decided to get an RGMII PHY instead of an RMII PHY despite wanting only 10/100 initially is that RMII is kind of ... bad
<whitequark[cis]>
wikipedia goes into detail on how it's bad exactly
<whitequark[cis]>
RGMII is just a very nicely designed interface, much easier to work with even at 10/100
<ewenmcneill[m]>
Yes, RGMII does seem surprisingly elegant (compared with the others) skimming that Wikipedia page.
<whitequark[cis]>
tbh it's more that RMII is unusually bad
<whitequark[cis]>
SGMII is quite good, GMII is fine too
<Wanda[cis]>
performance was limited by the python side getting blocked on log output to the VS code integrated terminal
<whitequark[cis]>
closing the terminal improved it considerably
notgull has joined #glasgow
<Wanda[cis]>
I have no idea wtf is going on with VS code terminal, but I have the impression it is... not very good
<Wanda[cis]>
(not just from this particular event)
<ewenmcneill[m]>
Wait, is this 100Mbps Ethernet going through USB to Python code on the PC side? And still working well enough that it's not worth connecting to WiFi?! (The Python on the PC side of the path is more surprising to me than the FPGA / wiring loom side TBH.)
<whitequark[cis]>
yes
<whitequark[cis]>
moreover, this is my python code:
<Wanda[cis]>
... you know what, actually it's probably packet loss
<Wanda[cis]>
because 1500 byte packets seem to ... not work well
<whitequark[cis]>
is this loss?
<Wanda[cis]>
(1480 byte packets seem to work perfectly on the other hand, and I have no idea why)
<Wanda[cis]>
Cat.
<Wanda[cis]>
FWIW the uplink here should be able to saturate fast ethernet
<whitequark[cis]>
yeah, both ways (650/100 is what i'm getting because wifi, wired it would be 1000/100)
<whitequark[cis]>
when we add proper packetization and slightly increase buffering (right now the MAC can't even fit a single 1500 octet packet without splitting it between the FX2 FIFO and the default 512-byte FPGA FIFO) we should be able to saturate it
<ewenmcneill[m]>
There are people with, eg, ADSL connections who would be very happy to be getting 30Mbps down / 3 Mbps up....
<whitequark[cis]>
for sure
<Wanda[cis]>
(oh, and FWIW "upload" is actually "phy to glasgow")
<ewenmcneill[m]>
Wanda: what tool are you using for testing packet size that works? (I ask because, eg, every ping implementation has its own interpretation of "what the size parameter means" and if it includes L2 and/or L3 and/or ping overhead.)
<Wanda[cis]>
ping -s
<ewenmcneill[m]>
On Linux?
<Wanda[cis]>
yeah I'm confused by these numbers myself
<Wanda[cis]>
yes, linux
<ewenmcneill[m]>
Okay, so the default Linux ping normally has "-s" specify the "ping data payload", onto which it adds its own overhead.
<Wanda[cis]>
yeah whatever
<Wanda[cis]>
I'm just... still surprised by the cutoff being somewhere between 1400 and 1500
<Wanda[cis]>
because why would it be there
<ewenmcneill[m]>
It matters because IIRC 1472 + ping overhead + ICMP overhead + ethernet framing gets to "full ethernet frame".
<ewenmcneill[m]>
So if you go beyond that you start getting fragmented frames which maybe aren't being handled well?
<Wanda[cis]>
uhhh
<Wanda[cis]>
wait
<Wanda[cis]>
i'm actually going past the MTU?
<Wanda[cis]>
I um.
<Wanda[cis]>
okay I may have fucked up
<omnitechnomancer>
the ethernet framing is generally not included in the 1500 MTU btw
<ewenmcneill[m]>
Just checked on Ubuntu: "ping -Mdo -s 1473" (which sets the "do not fragment" bit, with "-Mdo") is too big to send.
galibert[m] has quit [Quit: Idle timeout reached: 172800s]
<ewenmcneill[m]>
Yes, the Ethernet framing isn't counted toward the 1500, but also the "standard ethernet frame" is normally either 1514 (untagged) or 1518 (with VLAN tag).
<omnitechnomancer>
indeed
<ewenmcneill[m]>
And there's 8 octets of ICMP/ping overhead, plus 20 bytes of IPv4 overhead.
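The arithmetic being worked out above, spelled out (standard untagged IPv4/ICMP sizes, no IP options):

```python
MTU = 1500          # standard Ethernet MTU: the L3 packet size limit
IPV4_HEADER = 20    # IPv4 header without options
ICMP_HEADER = 8     # ICMP echo request/reply header

# largest `ping -s` payload that still fits in one unfragmented frame
max_ping_payload = MTU - IPV4_HEADER - ICMP_HEADER  # 1472

ETH_HEADER = 14     # dst MAC + src MAC + ethertype
VLAN_TAG = 4        # optional 802.1Q tag

frame_untagged = MTU + ETH_HEADER             # 1514 on the wire (sans FCS)
frame_tagged = MTU + ETH_HEADER + VLAN_TAG    # 1518 with a VLAN tag
```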
<omnitechnomancer>
let us not speak of jumbo frames
<whitequark[cis]>
the whole point of this project is making a NIC that can cope with super jumbo frames
<whitequark[cis]>
(65536 octet frames)
<whitequark[cis]>
(i need this for ... something)
<whitequark[cis]>
as for Python, well, people underestimate how fast Python can be, but more importantly people really underestimate the effort I put into building this Python thing
<ewenmcneill[m]>
I guess 64k super jumbo frames would reduce the Python overhead on packets per second 🙂
<whitequark[cis]>
by default it's one-copy, so once it reads data from USB (the kernel copying it into userspace) it never copies it again unless you request so explicitly
<ewenmcneill[m]>
That's very impressive to get Python down to only one copy (outside Python: kernel -> userspace).
<whitequark[cis]>
if you read from USB, my Python framework gives data to you in memoryview objects, in the same chunks as it arrived from USB
<whitequark[cis]>
if you ask for a specific amount, it chunks that memoryview cleverly so that no data is actually copied
<whitequark[cis]>
so you can e.g. read 2 bytes for the packet size, then read 2000 bytes of payload, and this never copies anything
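A minimal illustration of the zero-copy idea (not the actual framework API; the buffer contents and length prefix here are made up):

```python
import struct

# Pretend this arrived as one USB chunk: a 2-byte big-endian length
# field, then payload. All names here are hypothetical.
chunk = memoryview(bytearray(b"\x00\x04abcdefgh"))

size = struct.unpack(">H", chunk[:2])[0]   # parse the length field: 4
payload = chunk[2:2 + size]                # a view of b"abcd"

# Slicing a memoryview never copies: both views share one buffer.
assert payload.obj is chunk.obj
```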
<whitequark[cis]>
it's not micro-optimized much, but it's extremely macro-optimized for performance
<whitequark[cis]>
you can rather easily saturate the 42 MB/s USB HS bus with just normal Python code
<whitequark[cis]>
glasgow run benchmark does just that, and the code there is very simple
<whitequark[cis]>
the USB requests are heavily pipelined, and the pipelining parameters have been tuned for a sweet spot in the throughput-latency tradeoff
<whitequark[cis]>
because not only do I give you throughput, I also aim for sub-millisecond median latency, yes, still talking about Python
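The pipelining idea, as a generic sketch (this is not the Glasgow framework's API; `submit` and the queue depth are stand-ins): keep several requests in flight so the device never sits idle waiting for the host to issue the next read.

```python
import asyncio
from collections import deque

async def read_pipelined(submit, n, depth=4):
    """Collect n chunks in order, keeping up to `depth` requests in
    flight at all times. `submit` is a coroutine function that issues
    one read and returns its data."""
    window = deque(asyncio.ensure_future(submit())
                   for _ in range(min(depth, n)))
    issued = len(window)
    out = []
    while window:
        out.append(await window.popleft())  # oldest request completes first
        if issued < n:                      # refill the window immediately
            window.append(asyncio.ensure_future(submit()))
            issued += 1
    return out
```

Tuning `depth` is the throughput-versus-latency knob: a deeper window hides more host-side latency but holds more data in buffers.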
<whitequark[cis]>
well, it only gets most of that performance on Windows, that's a work in progress (and I think I'm hitting a libusb bug? unsure)
<whitequark[cis]>
basically, you can write real-time media applications that are sensitive to roundtrip latency, in Python, with the glasgow framework, and they mostly work with fairly naive code to the extent that e.g. nanographs were able to build an entire electron microscope frontend without thinking about topics like pipelining or queueing essentially at all
<ewenmcneill[m]>
That all sounds like a very elegant design. And I agree, "macro-optimised" (ie, designed properly) is the thing that most matters for efficient Python-controlled code.
<whitequark[cis]>
I've been really aggressive with various architectural optimizations, to the point where a fairly complex design, where right off the bat you have four buffers (Python/libusb, XHCI, FX2, FPGA) with varying sizes and bottlenecks between them, feels as simple as a socket: .send() a bunch of bytes, then get them from a FIFO on the FPGA side
<whitequark[cis]>
the FX2 arbiter is actually optimized to the extent where it has no idle cycles
<whitequark[cis]>
using a combination of skid buffers, aggressive pipelining, clever FSM design, and continuous evaluation of scheduling decisions
<whitequark[cis]>
plus on top of that you get to pick the easiest mode where data can't get stuck in the buffers, or the high-performance mode where you can control where USB packet boundaries go, to some extent
<whitequark[cis]>
I got really frustrated with unreliable FX2 communication so I sat down and did it properly (I think it took me like six months of real time to get it to that point, but it never bothered me again)
<whitequark[cis]>
the audio-yamaha-opx applet lets you load a series of commands for e.g. OPL3 and listen to the result using web audio and web sockets, without actually waiting for the whole song to complete
<whitequark[cis]>
it retrieves PCM data from the synthesizer and queues it in chunks for the audio player, which seamlessly transitions from one chunk to the next as they get loaded
<ewenmcneill[m]>
I remain impressed at how well you can truly understand a problem space, and then build something which is "simple to use", but performs optimally behind the scenes.
<whitequark[cis]>
there's also a feature which lets you voltage-glitch the synthesizer, which must happen relatively in real time, so the chunks are rather small to keep the response latency of the entire chain, from clicking your mouse button to hearing the difference, somewhere under... it depends on the settings, but it's less than 500ms in all conditions, iirc
<ewenmcneill[m]>
That skill brings a lot of "usable value" to a lot of other people. Thank you.
<whitequark[cis]>
you're welcome
<whitequark[cis]>
actually it looks like i misremembered and voltage glitching isn't real-time
<whitequark[cis]>
well, it could be made real-time
<whitequark[cis]>
also if you run it from the CLI, it has yamaha synthesizer audio shaders (in Python)
<whitequark[cis]>
a small Python file which gets to preprocess the commands before they reach the synthesizer, which lets you make them more interesting
<whitequark[cis]>
oh, and the VGM player is cycle-accurate, despite the VGM format strictly speaking not being such, as reading/writing takes some cycles to fully process. it has a phase error accumulator to make sure there's absolutely no long-term phase drift
<whitequark[cis]>
(in VGM, writes are non-physical in that the format assumes they can be instant)
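One way to sketch that phase-error accumulator (hypothetical cycle cost and event format, not the applet's actual code): register writes consume real chip cycles that the VGM timeline pretends are free, so each wait is shortened by the accumulated error, keeping long-term drift bounded instead of growing without limit.

```python
WRITE_COST = 32  # hypothetical cycles one register write really takes

def schedule(events):
    """events: sequence of ('write',) or ('wait', n_cycles) tuples.
    Returns the actual wait lengths after absorbing write costs."""
    error = 0   # cycles we are ahead of the ideal VGM timeline
    waits = []
    for ev in events:
        if ev[0] == "write":
            error += WRITE_COST        # write took time VGM says is free
        else:
            n = ev[1]
            pay = min(error, n)        # absorb error into this wait
            waits.append(n - pay)
            error -= pay
    return waits
```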
<FireFly>
glasgow arrived earlier, but alas, work meetings.. playing with it will have to wait til after work I guess
notgull has quit [Ping timeout: 256 seconds]
notgull has joined #glasgow
theorbtwo[m] has joined #glasgow
<theorbtwo[m]>
Crazy question time... One thing I've always wanted to do is play with my boiler's control protocol, eBUS, which is essentially a UART with strange logic levels - 9-12v for 0, 15-24v for 1. (Apparently chosen so that you can parasitically power accessories over the same pair of pins as your data, which seems like a poor choice, but there you go.) If I understand correctly, the S pins (INA233) are good for up to 36v input... But
<theorbtwo[m]>
there's no way to output more than 5v?
<whitequark[cis]>
yeah
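For the eBUS levels above, a receive-side decode sketch (the threshold and framing are assumptions: the midpoint of the 12-15v dead band, and a plain 8N1 UART frame):

```python
def ebus_bits(samples, threshold=13.5):
    """Map line-voltage samples (one per bit time) to logic levels:
    9-12 V reads as 0, 15-24 V as 1, split at 13.5 V."""
    return [1 if v >= threshold else 0 for v in samples]

def decode_uart_byte(levels):
    """Decode one 8N1 UART frame: start bit 0, 8 data bits LSB first,
    stop bit 1."""
    if levels[0] != 0 or levels[9] != 1:
        raise ValueError("bad framing")
    value = 0
    for i, bit in enumerate(levels[1:9]):
        value |= bit << i
    return value
```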
<Xesxen>
Yay, my glasgow arrived :)
redstarcomrade has quit [Read error: Connection reset by peer]
Attie[m] has joined #glasgow
<Attie[m]>
re glasgownet: that is incredibly cool... good job both!
<whitequark[cis]>
Ewen McNeill: lmfao, the Konsole terminal that's printing the debug (very brief, one line per packet, about the same as non-verbose tcpdump) output of the glasgow MAC is consuming like 5X more CPU time than the Python process of the glasgow MAC
<ari>
xD
<whitequark[cis]>
the latter isn't even in top 5 most of the time
<whitequark[cis]>
and that's with Wanda's laptop very cheerfully transmitting an amount of packets that would make a 56k modem explode
<whitequark[cis]>
> This lid was designed for and tested with a revC1, but should be easily modified for future revisions.
<theorbtwo[m]>
D'oh!
<whitequark[cis]>
it's also using f360 so it's a pain to modify
<gruetzkopf>
i could go do that
<gruetzkopf>
i have both F360 and one each of revC1 and revC3
<theorbtwo[m]>
Awesome. I would offer, but I don't do f360 ... and I'm more interested in a boltable case rather than a magnetic one.
q3k[cis] has joined #glasgow
<q3k[cis]>
i've posted some freecad files of a modified version a few months ago
<q3k[cis]>
i can dig them up if anyone's interested
<q3k[cis]>
(but they're based on the f3d version)
<whitequark[cis]>
tbh i don't find freecad all that better than f360 personally
<q3k[cis]>
if i had a windows/macOS machine i'd probably use f360, too
<q3k[cis]>
but the brain worms tell me to keep using linux
<vegard_e[m]>
I didn't realize f360 didn't have a linux version before I set up a linux pc for working with a CNC machine at the workshop I rent with friends, it's a bit surprising they don't have one, considering they've even got a browser-based version available for students
<vegard_e[m]>
nowadays I try to design stuff for 3d printing in build123d, but for CNC machining I haven't yet found a good alternative to f360's CAM (not that I have done much CNC machining yet)
<q3k[cis]>
freecad has a cam, and it's certainly one of the cam of all time
<q3k[cis]>
(i've done some cuts with it, the experience is as with the rest of freecad - it elicits a mix of confusion and anger)
<q3k[cis]>
(but it's better than no cam, and also the f360 cam is getting progressively enshittified unless you pay for a $$$ license)
<josHua[m]>
I have found that small changes like 'add a cutout' are not the end of the world to do with a step-file-output import-into-onshape flow
<ari>
i managed to install and run f360 on linux, but the rendering was effed up
<ari>
(half the screen black, and things like that)
<vegard_e[m]>
I'm intending to try ocp-freecad-cam at some point
<Wanda[cis]>
yay my internet connection got an upgrade
<Wanda[cis]>
(we now have proper packetization in the RX direction, and it shows)
<Wanda[cis]>
(glasgow RX is my upload)
<whitequark[cis]>
by "RX" she means "from Wanda's laptop via CAT5 to Glasgow"
<ewenmcneill[m]>
Wow, 39Mbps up is a huge improvement on the 3Mbps up of ~12 hours ago! (And I see from backscroll that "what if we didn't let the terminal consume lots of time logging a line per packet" is a non-trivial part of that speed up 😬 )
<whitequark[cis]>
it wasn't even my code!
<tpw_rules>
how does it get looped into the system?