<jersey99>
leons and david-sawatzke[m]: I saw some issues going back and forth about ping and SRAM in liteeth. Coincidentally, when I try to rebuild 10G on the KC705, ping seems to fail. Either way, can I assume that liteeth master is now safe for building 10G Ethernet?
<leons>
jersey99: Depending on what you're doing, I have quite a bunch of additions to the PHYXGMII TX code which I’m going to PR in a couple of hours
<leons>
Also, the Etherbone core doesn’t yet work with the 64-bit data path because it needs some more buffering before and after the StrideConverter. I can look into this in the next few days if that’s something you want to use.
<jersey99>
I don't need etherbone, but I do need a working UDP port
<jersey99>
So, am I hearing that master should work for standard operation?
<jersey99>
leons: I am guessing your additions to the PHYXGMII are for the case when the lanes are not aligned?
<leons>
Well, you do need alignment on lane 0 as per IEEE 802.3; that’s the first and fifth octet in the 64-bit bus word. But yes, after these changes the TX side maintains its own IFG logic to allow packet transmission to start on the fifth octet. It also optionally implements DIC to maintain an actual effective data rate of 10 Gbit/s.
<leons>
Need to adapt the simulation still and clean up my unit tests for the PHY, but otherwise I’m pretty confident it works
<leons>
All of this is especially important if you’re implementing switching logic, etc.
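The deficit idle count (DIC) idea mentioned above can be sketched roughly as follows. This is a simplified software model, not the LiteEth implementation: the function name `dic_gaps` is hypothetical, frame lengths are assumed to be counted in octets including preamble and FCS, and the 12-octet nominal gap and the 0..3 deficit bound follow the usual reading of IEEE 802.3.

```python
def dic_gaps(frame_lengths, nominal_ifg=12):
    """Model the inter-frame gap chosen after each frame when the next
    START must land on a 4-octet boundary (lane 0 or lane 4 of a 64-bit word).

    Returns the list of gaps and the final deficit count.
    """
    deficit = 0  # idle octets "borrowed" so far, always kept in 0..3
    gaps = []
    for length in frame_lengths:
        r = length % 4  # how far the frame end is past a 4-octet boundary
        if r == 0:
            gap = nominal_ifg            # already aligned, nominal gap
        elif deficit + r <= 3:
            gap = nominal_ifg - r        # shorten the gap, remember the debt
            deficit += r
        else:
            pad = 4 - r
            gap = nominal_ifg + pad      # lengthen the gap, pay back the debt
            deficit -= pad
        gaps.append(gap)
    return gaps, deficit
```

For a run of frames that are all one octet past a boundary, this model alternates three shortened gaps with one lengthened gap, so the average stays at 12 octets instead of always padding up to 15.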
<jersey99>
Awesome, that is good stuff. For now I will spend some time looking at signals to see why ping fails on master.
<leons>
jersey99: if I may ask, which FPGAs do you use? Xilinx? And if so which generation?
<jersey99>
My 10G core works with VUP
<jersey99>
and to have a uniform test platform, I set up a KC705 with some liteeth collaborators
<jersey99>
all Xilinx
<leons>
Okay. I’m working with a Kintex UltraScale+ and a 7-Series Virtex. How are you instantiating the XGMII-compatible interface? Xilinx IP core or interface to the transceiver somehow?
<jersey99>
for now a Xilinx IP core with an XGMII port
<jersey99>
which V7 board are you using?
<jersey99>
I do have a version somewhere with the transceiver interface, which I haven't touched in 2 years. Let me know if you are going that route.
<leons>
My V7 board is the NetFPGA-SUME
<jersey99>
I remember setting it up, and looking at some signals, but I didn't see it through. At that point I really only cared about TX
<jersey99>
ok. I have an HTG-703 on me, and it's a V7 with GTH transceivers
<jersey99>
how are you interfacing the XGMII?
<leons>
I think we should really stick to XGMII, that’s already sufficiently complex and at least cross-vendor. But it might be worthwhile to have a Xilinx transceiver to XGMII bridge in Migen as well
<jersey99>
Well, you know, if there is a way to avoid a blackbox, we will eventually ...
<jersey99>
For now, let's stick to XGMII
<leons>
For the 7-Series I’m using the PCS/PMA core by Xilinx, for USP I’m borrowing the Transceiver to XGMII logic from the verilog-ethernet project
<jersey99>
Ah I see, why not use the PCS/PMA core for that as well?
<leons>
Isn’t compatible with USP
<jersey99>
Is that true? I managed to synthesize and run the XGMII interface on a Virtex USP.
<leons>
Huh, that’s interesting… maybe my Vivado is too old. I find Xilinx IP core versioning and documentation really confusing. But hey, now we know it works with two XGMII “PHY”s
<leons>
I’m not sure I understand your point about blackboxes though. I don’t think it’s reasonable or worthwhile to get rid of the transceivers wizard core, but translating that to XGMII in a separate module (such that the XGMII to stream adapter is still separate) would be really cool
<jersey99>
Well, I haven't tried the latest XGMII phy from you yet. But I assume it will just work. I have the old xgmii.py working with 7-series Kintex, and US and USP Virtex
<leons>
Yeah, I was worried that I might be relying on some implementation-specific behavior of the XGMII, but I don’t think so
<jersey99>
I just meant that the Xilinx IP core that generates an XGMII interface is something that we could get rid of.
<jersey99>
Cool. Also, out of curiosity, how are you testing line rate? :)
<leons>
jersey99: that's an excellent question. My tests have primarily been creating a crossover connection between two SFP+s at the stream-interface level, and then running regular traffic through the FPGA between two machines with NICs from different vendors (Aquantia, Intel, Solarflare)
<leons>
I've also built a really stupid PacketStreamer module which creates packets with scapy, puts them into a read-only memory and the FPGA just spews them out as fast as possible
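A rough sketch of the "packets in a read-only memory" idea, assuming a 64-bit data path with the first octet in the low byte. The helper name `frame_to_words` and the `(data, valid_bytes, last)` layout are hypothetical and are not taken from the PacketStreamer module; the frame bytes could come from scapy via `bytes(packet)`, but a hand-written byte string is used here to keep the example self-contained.

```python
def frame_to_words(frame: bytes, word_bytes: int = 8):
    """Split a frame into ROM entries of (data, valid_bytes, last).

    data        -- up to word_bytes octets, first octet in the low byte
    valid_bytes -- number of meaningful octets in this word
    last        -- True for the final word of the frame
    """
    words = []
    for off in range(0, len(frame), word_bytes):
        chunk = frame[off:off + word_bytes]
        data = int.from_bytes(chunk, "little")  # a short final chunk is zero-padded
        last = off + word_bytes >= len(frame)
        words.append((data, len(chunk), last))
    return words


# Illustrative 10-octet "frame"; a real one would be a full Ethernet frame.
rom = frame_to_words(bytes(range(10)))
```

The resulting list could then be used as the initial contents of a memory that the FPGA replays back to back.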
<leons>
For development I've written some synthetic test cases, but it turns out that when you control both the implementation and the test cases, it's really trivial to implement it in a non-compliant way
<jersey99>
haha, indeed
<leons>
So for quicker development cycles I've built a crossover-connection between the XGMII interfaces of the two transceivers, captured a bunch of bus data using the ILA, exported as CSV and built an XGMIICSVInjector 😀
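A minimal sketch of reading such a capture back in, for illustration only. The column names `xgmii_rxd` and `xgmii_rxc`, the hex encoding, and the absence of extra header rows are assumptions; a real ILA export will likely need its own column mapping. The control codes used (START `0xFB`, idle `0x07`) are the standard XGMII ones.

```python
import csv
import io


def load_xgmii_capture(text, data_col="xgmii_rxd", ctl_col="xgmii_rxc"):
    """Parse CSV text into a list of (data, ctl) integers, one per bus cycle."""
    reader = csv.DictReader(io.StringIO(text))
    return [(int(row[data_col], 16), int(row[ctl_col], 16)) for row in reader]


def count_starts(words):
    """Count START control characters (0xFB) seen in lane 0 or lane 4."""
    starts = 0
    for data, ctl in words:
        for lane in (0, 4):
            is_ctl = (ctl >> lane) & 1
            octet = (data >> (8 * lane)) & 0xFF
            if is_ctl and octet == 0xFB:
                starts += 1
    return starts


# Hypothetical capture: idle, START in lane 0, payload, START in lane 4.
SAMPLE = """xgmii_rxd,xgmii_rxc
0707070707070707,ff
d5555555555555fb,01
0011223344556677,00
555555fb07070707,1f
"""
capture = load_xgmii_capture(SAMPLE)
```

A list like `capture` can be replayed cycle by cycle into a simulated receiver, which is what makes recorded traffic useful as a test stimulus.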
<leons>
It's still very broken code though, I need to rework the tests. But it's been sooo much better than synthetic tests or straight up building hardware. 10 Gbit/s is way too fast for typical debugging approaches 😕
<jersey99>
Haha .. while building, I found that Verilator testbench [xgmii_ethernet.c] helped the most.
<jersey99>
for line rate, it was practically impossible to test. So I did some persistence on the scope, with a blip that emitted packets, and could see that valid signals pretty much took up the whole duty cycle. Then I used some checksums and counters in the packets to see I wasn't dropping anything, and dumped chunks of network traffic, to assure myself.
<jersey99>
None of these is concrete mind you
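The counters-and-checksums check described above could look roughly like this on the receiving host. It is a sketch under assumptions: the 12-octet header layout (64-bit sequence number plus CRC-32 of the data) and the function names are hypothetical, not the format jersey99 actually used.

```python
import struct
import zlib

# Hypothetical payload header: 64-bit sequence counter, 32-bit CRC of the data.
HEADER = struct.Struct("!QI")


def make_payload(seq, data):
    """Build a payload as the sender would: header followed by data."""
    return HEADER.pack(seq, zlib.crc32(data)) + data


def check_stream(payloads):
    """Return (missing, corrupt) for payloads received in order.

    missing -- sequence numbers skipped between intact payloads
    corrupt -- payloads whose CRC does not match their data
    """
    expected = None
    missing = corrupt = 0
    for payload in payloads:
        seq, crc = HEADER.unpack_from(payload)
        data = payload[HEADER.size:]
        if zlib.crc32(data) != crc:
            corrupt += 1      # header cannot be trusted, skip this payload
            continue
        if expected is not None and seq > expected:
            missing += seq - expected
        expected = seq + 1
    return missing, corrupt
```

Note that a corrupted payload followed by an intact one also shows up as a gap in `missing`, since its sequence number is never accepted.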
<leons>
Yes, that's been a blessing. However, when you get into the really weird stuff like shifted transmissions due to START alignment on the fifth byte, or implementing DIC, it didn't help me any more, because I'd missed parts of the IEEE 802.3 standard. So xgmii_ethernet.c didn't help much for this.
<jersey99>
What you have is definitely sophisticated
<leons>
That's probably why the IFG and DIC were a nightmare to build
<jersey99>
yea, what is your application? A router?
<leons>
I hope the documentation will spare others from making the same mistakes I did 🙂
<leons>
Currently just streaming data at approximately line rate, but after my current project I'm very much looking forward to building a switch/router type of thing
<jersey99>
cool
<leons>
That's why I wanted to pave the way for this already. Also it feels better to have a foundation which works reliably and won't be a bottleneck later on
<leons>
What are you trying to build, if I may ask? (at risk of spamming this channel :))
<jersey99>
definitely. btw, I just ran ping on the PR 88 branch, and it is intermittent as of now. Let me look closer
<jersey99>
My goal is to evacuate as much data as possible from the FPGA
<jersey99>
all data coming from ADCs
<leons>
re ping: It might well be a problem with the Packetizer/Depacketizer. Feel free to open that Pandora's box, but be warned: there lies madness.
<jersey99>
lol
<jersey99>
I am well aware .. I spent countless hours with packet.py 2 years ago
<leons>
I'm at >3 weeks full time last time I checked
<jersey99>
I spent a whole month with the 64-bit data path and XGMII. And I was happy to have something to take home after that. You are doing great!
<_florent_>
leons: Don't worry about spamming the channel with use-cases/applications, that's generally a lot more interesting than issues/implementation details :)
<_florent_>
That's great otherwise to see you all improving/collaborating on LiteEth!
<_florent_>
leons: I see #88 is still a draft, but it looks fine and could be merged if you want.
<leons>
ah, yes. For one I wanted to update the description to something describing the issue at hand a bit better, and I was looking for feedback precisely like the one from jersey99
<leons>
AFAIK the problem was easily reproducible using ping on the 32-bit data path of david-sawatzke. Maybe he can confirm that it works on the proper hardware as well?
<_florent_>
Sure, let's wait a bit then
<jersey99>
leons: I can confirm the problem still exists. I set a stream of data to flow out from the UDP port to a known destination IP/port. This is currently failing, as the core tries to send ARP requests to get the destination MAC. Traditionally, a quick patch for this has been to populate the ARP cache with the destination IP/MAC tuple inside arp.py. I can confirm that even patching that doesn't satisfy the core. The problem is deeper than where I looked. Also, I do see that different parts of the core now use different versions of packet (litex.soc.interconnect.packet or liteeth.packet), maybe it's related to that? idk
<leons>
ah, yes, I think I've stumbled over that exact same issue!
<leons>
I do not yet have a solution for that. I suspect it's an issue with the ARP implementation though. I did look at the Packetizers/Depacketizers involved for doing the ARP logic and these looked as if they were behaving correctly.
<leons>
Where do you see `litex.soc.interconnect.packet` being used? I think the Ethernet core should use only the `liteeth.packet` one.
<jersey99>
liteeth/mac/common.py
<leons>
Oh, I think imports from `litex.soc.interconnect.packet` are fine, as long as they are not using the Packetizer/Depacketizer from there.
<jersey99>
Never mind, my bad. All the Packetizers are from liteeth.packet
<leons>
No worries! It's great to have someone reproduce that issue.
<jersey99>
Another question for you. On the 7-series pcs/pma core, what clock do you use for the XGMII? coreclk? leons
<jersey99>
That is the recommended TX clock
<leons>
Yes, the coreclk provided by the pcs/pma core. As far as I understand, this should be generated from a PLL driven by the mgtrefclk input, right?
<leons>
I've got to admit: for the transceivers I'm just piecing together things I find on the web until it works. I was really happy when trial and error finally led to a working config on my KUP. And the 7-series config is taken verbatim from verilog-ethernet.