azonenberg changed the topic of #scopehal to: libscopehal, libscopeprotocols, and glscopeclient development and testing | https://github.com/glscopeclient/scopehal-apps | Logs: https://libera.irclog.whitequark.org/scopehal
veegee has joined #scopehal
Degi_ has joined #scopehal
Degi has quit [Ping timeout: 252 seconds]
Degi_ is now known as Degi
bvernoux has joined #scopehal
<_whitenotifier-6> [scopehal-sigrok-bridge] perigoso commented on pull request #2: use https instead of ssh for submodules - https://github.com/glscopeclient/scopehal-sigrok-bridge/pull/2#issuecomment-1635715621
<d1b2> <johnsel> hey @azonenberg not super related this question but it has been quite silent here for a while and people might take an interest. You are working on a DIY switch, correct? Do you have some insight into how to start making use of the SFP+ on the KC705 I got? Any projects I should take a look at or general approach tips?
<d1b2> <johnsel> now I should specify that I want to do 10G
<azonenberg> johnsel: So basically, a SFP+ is just a differential pair to light converter
<azonenberg> it has no intelligence, just some thresholding and a few simple feedback loops for sensitivity and tx power level
<d1b2> <johnsel> yup, the question is how do I build up something inside the FPGA to talk to/over it 🙂
<azonenberg> There's a few 3.3V GPIOs for things like enabling/disabling the transmit, detecting that a module is present, detecting faults
<azonenberg> an optional i2c bus that contains a descriptor EEPROM and (usually, but not required by spec) some sensors
<azonenberg> The actual data is 10Gbase-R coded
<azonenberg> Which is to say, 64/66b coded ethernet frames
<azonenberg> I have an open source MAC/PCS in my antikernel-ipcores repo that integrates nicely with a 7 series GTX
<azonenberg> oops
<azonenberg> XGMACWrapper is just a shell around those two to save you the trouble of instantiating the two modules directly
<d1b2> <johnsel> that's very useful already
<azonenberg> What you end up with is, on the internal-facing side, a data bus consisting of 32 data bits, a 312.5 MHz clock, a valid flag, and a bytes-valid counter
<azonenberg> plus a start flag that is asserted during the preamble (so you can reset per-packet state machines)
<azonenberg> and then at the end of a packet either commit goes high, indicating good checksum and everything went fine
<azonenberg> or drop goes high, indicating the packet was corrupted/malformed and should be ignored
<azonenberg> TX is the same bus sans drop flag, once you start sending you have to finish sending it
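A minimal Python sketch of how a consumer of the receive bus described above might assemble packets, assuming one record per clock cycle. The field names (`start`, `valid`, `bytes_valid`, `data`, `commit`, `drop`) and the most-significant-byte-first ordering are illustrative assumptions, not the actual port names or byte layout used in antikernel-ipcores.

```python
from dataclasses import dataclass


@dataclass
class RxCycle:
    """One clock cycle on a hypothetical model of the 32-bit receive bus."""
    start: bool = False    # asserted during the preamble
    valid: bool = False    # data and bytes_valid are meaningful this cycle
    bytes_valid: int = 0   # how many of the 4 bytes carry packet data
    data: int = 0          # 32-bit word, assumed most significant byte first
    commit: bool = False   # end of packet, checksum was good
    drop: bool = False     # end of packet, corrupted or malformed


def receive_packets(cycles):
    """Collect only the committed packets from a sequence of bus cycles."""
    packets = []
    current = bytearray()
    for c in cycles:
        if c.start:
            current = bytearray()          # reset per-packet state
        if c.valid:
            word = c.data.to_bytes(4, "big")
            current += word[:c.bytes_valid]
        if c.commit:
            packets.append(bytes(current))
            current = bytearray()
        elif c.drop:
            current = bytearray()          # discard the bad packet
    return packets
```

The point of the commit/drop pair is that the consumer buffers speculatively and only acts on the data once the end-of-packet flag says the checksum was good.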
<azonenberg> on the other side, it expects to talk to the 7 series transceiver wizard configured for 10Gbase-R with, iirc, the asynchronous 64/66b gearbox
<azonenberg> also note that my XGMIIBus interface is not 802.3 compliant XGMII
<azonenberg> i swapped the lane numbering left to right, so that bytes would show up in a human readable order in logic analyzer / simulation traces
<azonenberg> and it's also single rate 312.5 MHz vs DDR 156.25 MHz since nobody uses ddr signals inside an fpga
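As a quick check of the clock figures above, this small Python sketch shows that a 32-bit bus at 312.5 MHz single data rate carries the same 10 Gbit/s as a 32-bit bus at 156.25 MHz double data rate, and what the 64/66b coding overhead does to the serial line rate. The variable names are just for illustration.

```python
from fractions import Fraction

BUS_WIDTH_BITS = 32
SDR_CLOCK_HZ = Fraction(3125, 10) * 10**6     # 312.5 MHz, one transfer per cycle
DDR_CLOCK_HZ = Fraction(15625, 100) * 10**6   # 156.25 MHz, two transfers per cycle

sdr_rate = BUS_WIDTH_BITS * SDR_CLOCK_HZ
ddr_rate = BUS_WIDTH_BITS * DDR_CLOCK_HZ * 2

# 64/66b coding sends 66 line bits for every 64 payload bits.
line_rate = sdr_rate * Fraction(66, 64)

print(f"single rate bus: {float(sdr_rate) / 1e9} Gbit/s")
print(f"DDR bus:         {float(ddr_rate) / 1e9} Gbit/s")
print(f"serial line:     {float(line_rate) / 1e9} Gbit/s")
```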
<d1b2> <johnsel> Thanks, that's super useful already. I haven't looked very carefully, but it looked like you have some IPv4 packet related things written already, correct?
<azonenberg> I have a full IPv4, ICMP, ARP, and UDP stack
<azonenberg> It's intended as an embedded server, so it lacks client support for most of these protocols
<azonenberg> e.g. it can respond to incoming pings, but not initiate an echo request
<azonenberg> it also has a TCP server that is a WIP, it works great as long as you never drop a packet from the FPGA to the client
<azonenberg> it will correctly send ACKs and everything else so client-to-FPGA packet loss is well tolerated
<azonenberg> but it doesn't retransmit anything sent in the opposite direction
<d1b2> <johnsel> hmmm, do you have something you use to benchmark it?
<azonenberg> Not currently. I'm not actually using the stack for anything serious yet
<azonenberg> what i've actually used more seriously is the software tcp/ip stack, azonenberg/staticnet
<azonenberg> which is basically the same level of completion
<azonenberg> no tcp retransmits, no client support, no ipv6
<azonenberg> the difference is, this one has a ssh server implementation attached to it
<azonenberg> it's super bare bones and has no OS or library dependencies, in particular it explicitly does not use dynamic memory allocation
<d1b2> <johnsel> I see, not on a microblaze or other cpu core inside a FPGA I assume right?
<azonenberg> everything is based on fixed sized packet pools that are statically allocated
<azonenberg> It could hypothetically run on such
<azonenberg> but the intended use case is stm32h7
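The fixed-size, statically allocated packet pools mentioned above can be modelled roughly like this. It is only a behavioural sketch in Python (which has no true static allocation); the class and method names are hypothetical and are not taken from staticnet. All buffers are created once up front, and running out of buffers is reported to the caller instead of triggering a new allocation.

```python
class PacketPool:
    """Fixed number of fixed-size buffers, all allocated once up front."""

    def __init__(self, num_buffers, buffer_size):
        self._buffers = [bytearray(buffer_size) for _ in range(num_buffers)]
        self._free = list(range(num_buffers))   # indices of unused buffers

    def alloc(self):
        """Return (index, buffer), or None if the pool is exhausted."""
        if not self._free:
            return None
        idx = self._free.pop()
        return idx, self._buffers[idx]

    def free(self, idx):
        """Return a buffer to the pool so it can be reused."""
        if idx in self._free:
            raise ValueError("double free")
        self._free.append(idx)
```

Because the pool never grows, worst-case memory use is known in advance, which is the main attraction on a microcontroller.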
<azonenberg> i have a driver for the stm32h7 crypto accelerator to speed up SSH already, although it doesn't have elliptic curve functionality
<azonenberg> so i either do that in software or (in progress) integrate with an fpga curve25519 accelerator
<azonenberg> The intent for the all-FPGA stack is to be used on the open hardware scopes, since there's no way the stm32 tcp/ip stack can get remotely close to saturating a 10G link with packet data
<azonenberg> What i am beginning to explore is linking them
<azonenberg> so that things like arp, icmp, etc are handled on the MCU
<azonenberg> and low bandwidth management traffic like scpi goes to it
<azonenberg> but high speed stuff like the waveform sample datapath is all FPGA
<azonenberg> rather than having the waveform data and the management be considered two separate hosts with their own ip/mac i want to look into sharing state and packet data
<azonenberg> such that certain ports/protocols are implemented in software and others in hardware
<azonenberg> and you can trade back and forth depending on fpga area vs performance requirements
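One way to picture the split described above is a lookup table that decides, per protocol and port, whether a packet goes to the MCU software stack or stays in FPGA logic. This Python sketch is purely illustrative: the port numbers, the use of TCP for both services, and the `route` helper are assumptions, not part of the existing code.

```python
MCU, FPGA = "mcu", "fpga"

# Hypothetical port assignments, for illustration only.
SCPI_PORT = 5025
WAVEFORM_PORT = 50101

ROUTES = {
    ("arp", None): MCU,               # low-rate housekeeping in software
    ("icmp", None): MCU,
    ("tcp", SCPI_PORT): MCU,          # low bandwidth management traffic
    ("tcp", WAVEFORM_PORT): FPGA,     # high speed sample datapath
}


def route(protocol, port=None, default=MCU):
    """Pick a handler: exact (protocol, port) match first, then protocol-wide."""
    if (protocol, port) in ROUTES:
        return ROUTES[(protocol, port)]
    return ROUTES.get((protocol, None), default)
```

Moving an entry between `MCU` and `FPGA` is the software analogue of the area-versus-performance trade mentioned above.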
<d1b2> <johnsel> yeah you've told me about it before, it's an interesting idea
<azonenberg> anyway the reasons for using the external mcu are that it has a lot of sram (so doesnt compete with fpga block ram)
<azonenberg> it has a random number generator (so no need to use sketchy RNGs in the FPGA for crypto)
<azonenberg> and it can clock significantly faster than a typical softcore
<d1b2> <johnsel> You're basically doing Zynq but discrete now, haha
<azonenberg> Yes
<azonenberg> and with a cortex-M not an A
<d1b2> <johnsel> Anyway I'm looking into 10G for my scope project, so if I do build something useful I'll PR it back
<azonenberg> i like bare metal not linux
<azonenberg> And with the FPGA and MCU being explicitly decoupled
<azonenberg> e.g. the mcu cannot reprogram the FPGA unless you create an interface for it to do so
<azonenberg> one of the things i liked about the stm32h735 is that one of the package options (which i have not got my hands on yet, it's out of stock everywhere i looked) is a 68 pin QFN
<azonenberg> i could basically just have jtag, uart, quad SPI to the FPGA, and maybe a few debug LEDs
<azonenberg> and have it be a "brain on a stick" hanging off the FPGA
<azonenberg> xilinx's vision for zynq is an arm soc with an fpga accelerator as a peripheral
<azonenberg> my vision is an fpga with a microcontroller as a peripheral :p
<d1b2> <johnsel> Yeah different usecases
<d1b2> <johnsel> I get the industry move towards linux, it gets software people into hardware more easily, but there's definitely a lot of downsides to their current approach
<azonenberg> yeah. and the over-reliance on things like axi and linux makes it difficult to use any other way
<azonenberg> like you basically *have* to use the ip integrator in a zynq design
<d1b2> <johnsel> yeah that's the whole spiel, you get custom hardware in your SoC that you can drive from the fully featured Linux environment
<azonenberg> Yeah
<azonenberg> thats one of the things that bothers me about xilinx's future
<azonenberg> all of their marketing docs are presenting versal as the successor to ultrascale+
<azonenberg> they dont go out and say it, but it's strongly implied
<d1b2> <johnsel> they wouldn't, would they?
<d1b2> <johnsel> I think discrete FPGA will stay
<azonenberg> i.e. i fear that au+ / ku+ may be their last family of fpgas without an arm core you are forced to use to get any work done at all
<d1b2> <johnsel> it's just the AI craze taking hold
<azonenberg> I think it will stay across the industry
<azonenberg> I don't know if it will stay *from xilinx*
<azonenberg> they seem all-in on versal and i dont like it
<d1b2> <johnsel> that would be the stupidest thing ever
<azonenberg> anyway, u+ isn't going away any time soon, even 7 series is going to be supported until at least like 2035 iirc
<azonenberg> So even if there's no next-gen platform afterwards, i have a long ways to go before my projects outgrow a ku5p :p
<azonenberg> Considering right now i'm working on a 7k160t and using a nontrivial amount of it, but nowhere near running out of space (yet)
<d1b2> <johnsel> Yeah for sure, I'm discussing building an overpowered "Analog Discovery" with someone and he asked for Xilinx' latest series (as it would be good for marketing). I said their 7 series are still plenty fast enough for what we want to do.
<d1b2> <johnsel> it's a tough job to fully utilize one of those chips, especially on Kintex Serdes
<d1b2> <johnsel> and 12.8Gbit/s is plenty fast, especially if you have like 8 or 16 of them
<azonenberg> i mean, i have the opposite problem with ethernet lol
<azonenberg> LATENTORANGE is going to use as many serdes as i can find for switching N 10GbE lanes
<azonenberg> and then for the open scope project, i'll need a dozen JESD204B lanes to use the AD9213
<azonenberg> That's going to be my next big hardware project once i have the mini-switch done i think
<azonenberg> although it will be a multi step project, i need to do more work on the frontend (might borrow ideas from the thunderscope but i have my own frontend design i wanted to play more with too)
<d1b2> <johnsel> anyway, to recap using your stack: set up the 7 series transceiver wizard configured for 10Gbase-R with the asynchronous 64/66b gearbox; set up the interface and protocols using your stack; probably tinker with the SFP+ module to get it to actually switch on, and maybe sort out some clocking issues (KC705 has a weird clock for SFP+, not sure if you use that or pull a clock from somewhere else); and hope for some wireshark traffic
<azonenberg> Pretty much. There is a full TCPIPStack module that integrates all of the various protocol components if you want to use that for starting out
<d1b2> <johnsel> sound correct to you?
<azonenberg> you just have to instantiate the serdes wizard, the mac/pcs, and the stack and bolt them together
<d1b2> <johnsel> Cool. I'll let you know how far I get, I'm receiving the SFP+ PCIe module tomorrow and some transceivers and fiber
<azonenberg> I also have a 1000base-X core as well BTW
<azonenberg> which you can use with a GTX or GTP in 8b10b mode
<d1b2> <johnsel> Might be a good one to keep in the debugging toolkit if nothing goes as it should
<azonenberg> and then i have GMII and RGMII support of course
<azonenberg> and experimental SGMII. The 1000base-X block should support SGMII over a GTP no problem today (although this has never been tested)
<azonenberg> and it also should in theory work over ISERDES/OSERDES oversampling, but i had hardware problems on my last board that used it
<azonenberg> and the switch has two SGMII PHYs that i plan to use to continue testing this
<azonenberg> I also have QSGMII support using a GTP/GTX, which is broken out to four SGMII lanes with their own MACs. this is very lightly simulation tested but has never been tested in hardware
<azonenberg> hopefully that will begin this weekend once i stuff the other side of this board
<d1b2> <johnsel> cool, I've been seeing incremental progress on your mastodon
<azonenberg> Yep. All of these projects tie into each other
<azonenberg> the whole reason scopehal has so many networking protocol decodes is so that i can do debug and verification on the switch
<azonenberg> and i got into high speed networking so i could better build infrastructure to run high performance data acquisition
<azonenberg> and i got into high speed probing so i could collect waveforms to debug both of the above
<azonenberg> lol
<d1b2> <johnsel> recursive improvement
<d1b2> <johnsel> same-ish story here though. I wanted to do something high-speed. But to do high-speed you need an oscilloscope, thus i'm building a high-speed oscilloscope. Now I am working on the oscilloscope I have need for faster interfaces so I am looking at 10GBase
<d1b2> <johnsel> Although I was starting from 0, you started with some nice measurement capability already. But I like bootstrapping projects
<azonenberg> i mean that was the inspiration for FREESAMPLE
<azonenberg> Which i still want to build at some point
<azonenberg> (open hardware 10 GHz sampling scope)