<shorne>
to reproduce I need to run linux for like 10 minutes, trying to figure out what instruction sequence is causing the failure
<shorne>
maybe time for an instruction trace cache?
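(For reference, a trace buffer like that can be tiny in gateware. Below is a minimal, hypothetical Migen sketch of an instruction-trace ring buffer that records the last `depth` retired PCs; the `retire_valid`/`retire_pc` signals are assumptions about what the CPU exposes, not actual CPU or LiteX port names.)

```python
from migen import *

class InsnTraceBuffer(Module):
    # Hypothetical sketch: record the last `depth` retired PCs in a ring
    # buffer so the instruction sequence leading up to a failure can be
    # read back afterwards (e.g. over the CSR bus or a debug bridge).
    def __init__(self, depth=1024, pc_width=32):
        self.retire_valid = Signal()          # assumed: CPU retire strobe
        self.retire_pc    = Signal(pc_width)  # assumed: PC of retired insn

        mem  = Memory(pc_width, depth)
        port = mem.get_port(write_capable=True)
        self.specials += mem, port

        wrptr = Signal(max=depth)
        self.comb += [
            port.adr.eq(wrptr),
            port.dat_w.eq(self.retire_pc),
            port.we.eq(self.retire_valid),
        ]
        # Wrap-around write pointer: the oldest entries get overwritten.
        self.sync += If(self.retire_valid, wrptr.eq(wrptr + 1))
```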
<_florent_>
tnt: sorry, I was focusing on other things. Yes, PCIe will automatically downgrade to X1. On the ZCU106, I'm configuring the link for Gen3 X4 and it is downgraded to Gen3 X1 when using the PCIe riser.
<_florent_>
tnt: I'll look at your notes.
<_florent_>
cr1901: The improvements mostly come from automatically regrouping wishbone accesses into bursts when possible. This avoids paying the initial "latency tax" for each access :)
<_florent_>
cr1901: On the Titanium, the CPU can also be clocked at quite a high frequency, so this helps with this simple generic core (makes it less mandatory to have DDR IOs)
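(To make the "latency tax" argument concrete, here's a toy back-of-the-envelope model — plain Python, not LiteX code — showing why regrouping contiguous accesses into bursts pays off. The latency figure is an assumed, illustrative number, not a measured one for any particular part.)

```python
# Toy model: a hyperram-style memory that charges LATENCY cycles to open
# a transaction, plus 1 cycle per sequential beat.
LATENCY = 12  # assumed initial-access cost in cycles

def cycles_one_per_access(n):
    # one transaction per wishbone access: pay the latency tax every time
    return n * (LATENCY + 1)

def cycles_regrouped(n):
    # n contiguous accesses regrouped into a single burst: pay it once
    return LATENCY + n

for n in (1, 8, 32):
    single, burst = cycles_one_per_access(n), cycles_regrouped(n)
    print(f"{n:3} accesses: {single:4} vs {burst:3} cycles "
          f"(~{single / burst:.1f}x)")
```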
<_florent_>
tnt: I'm not sure I remember the PCIe configuration of the bitstream, is it Gen3 X4?
<tnt>
_florent_: yes, gen3 x4
<tnt>
Although behavior is identical with gen3 x8
<_florent_>
ok, so it seems the case I was testing on the ZCU106 was the one that is also working on your side. I'll try to do a test with the ZCU106 without the PCIe riser. (but Gen3 X4/X8 has been validated on the VCU1525 and FK33)
<tnt>
_florent_: yeah, and I had an x4 gen3 link up on the previous setup. Unfortunately, as I suspected, that setup is not available anymore; it was the card they sent me.
<tnt>
If it was a signal integrity issue, I would expect the test with the dodgy USB-cable extender thing to be worse, and yet it's the only case that works (when in the PCIe3 slot).
<tnt>
If it was the link width, then I'd expect that extender thing to work in either slot, but it only works in the PCIe3 slot and not the PCIe1 one.
<tnt>
And if it only mattered which PCIe slot it's plugged into, then I'd expect the PCIe extender test to work in the PCIe3 slot, but in that case it never works.
<tnt>
I have a small Celeron SoC board that also has a single x1 PCIe lane, I'll try to test in there and see how it goes.
<acathla>
Does anyone have an alternative to the ice40up5k-sg48i? It's zero stock everywhere... I need a physically small FPGA, some RAM (128KB was nice), some IOs. Lattice's newer FPGAs seem out of stock too.
<tnt>
Well, if you don't need many IOs, the UWG30 package is easily available.
<acathla>
Easily for how long? Stocks seem quite limited too. I'll check if I can live with so few IO...
<tnt>
ATM _all_ parts can fall out of stock from one day to the next ... so the only viable strategy is to buy all you need first, then design for it.
<cr1901>
> The improvements mostly come from automatically regrouping wishbone accesses into bursts when possible.
<cr1901>
_florent_: Is there something new sitting between the wishbone interface and the CPU (such as posted writes)?
<cr1901>
Wishbone is normally 3 clock ticks per transaction. I don't see how you get a 6x speedup by using the burst interface :P
<cr1901>
(where each xfer takes 1 clock tick)
<tnt>
well if previously each access resulted in one hyperbus transaction and now it continues the transaction if it's contiguous, this would definitely speed things up.
<tnt>
because you don't have to stop and restart a hyperbus transaction each time; you can just stop feeding the clk to the hyperram, and if the next wishbone access happens to be a continuation, just continue the current burst.
<cr1901>
Oh...
<tnt>
(just a theory, I didn't actually look at what the core used to do / is doing now)
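(Sketching that theory in a few lines of plain Python — hedged, since as tnt says it may not be what the core actually does: the controller only pays the reopen cost when the next access is not a continuation of the open burst.)

```python
# Hedged model of the burst-continuation idea, not the real core logic:
# keep the hyperram transaction open between wishbone accesses and only
# restart it when the next address doesn't continue the current burst.
class BurstTracker:
    def __init__(self):
        self.expected = None  # address that would continue the open burst

    def access(self, adr):
        """Return "continue" if adr extends the open burst, else "restart"."""
        action = "continue" if adr == self.expected else "restart"
        self.expected = adr + 1
        return action

t = BurstTracker()
# a contiguous run only restarts once, then streams:
print([t.access(a) for a in (0x100, 0x101, 0x102, 0x200)])
# ['restart', 'continue', 'continue', 'restart']
```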
<cr1901>
Well, it's certainly logical. I probably could've used some critical thinking skills there
<cr1901>
But I don't feel like thinking lately :(
<somlo>
_florent_, gatecat: after a few iterations, I managed to get dual-core rocket going on ecpix5, with yosys/trellis/nextpnr as snapshotted from yosyshq github yesterday: https://pastebin.com/cAwGD0t5
<tpb>
Title: [ 0.000000] Linux version 5.16.0-rc3-00291-g6fc4b6533e6c (somlo@glsvmlin.ini. - Pastebin.com (at pastebin.com)
<somlo>
most often it hangs during boot, but with just the right random seed and a bit of luck, it can be done
<somlo>
no FPU, and utilization on the 85k ecp5 is at 95% (ish)
<geertu>
somlo: Meh, my orange-crab is too small...
<somlo>
geertu: does it come with a 45k ecp5?
<somlo>
also, while single-core is pretty solid on ecp5, dual-core is still a bit "wobbly" -- got lucky and linux finished booting, and I could run `cat /proc/cpuinfo`, but mounting the sdcard and running md5sum on a file there crashed it :)
<geertu>
somlo: Mine has 25k, but some have 85k
<somlo>
that won't fit any kind of rocket chip at all, as far as I can tell (single-core rocket has 45% utilization on the 85k ecp5 variant)
<geertu>
Hence I'm stuck in the 32-bit era, for now ;-)
<somlo>
but anyhow, getting dual-core to even finish booting is further than I ever got before, so the toolchain must be improving, is all I was trying to say :)
<somlo>
if only ecp5 came in something like a "165k" version that could fit a gateware FPU... :)
<geertu>
somlo: external FPU, like in the old days?
<somlo>
custom fpga dev board, like real engineers would do :D
<yfl>
hi!
<yfl>
liteeth question here ...
<yfl>
LiteEthMACCore seems to assume there are always 2 clock domains in the PHY, eth_tx and eth_rx
<yfl>
what would be the best thing to do if my PHY (based on the Xilinx pcs_pma IP core) only has a single one?
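(One common pattern in LiteX PHY wrappers — a hedged sketch, since details of the pcs_pma core vary — is to create both eth_tx and eth_rx clock domains in the PHY and drive them from the single user clock, so LiteEthMACCore sees the two domains it expects; `userclk`/`userrst` here are hypothetical names for whatever the IP core outputs.)

```python
from migen import *

class SingleClockPHYWrapper(Module):
    # Hedged sketch: LiteEthMACCore expects eth_tx and eth_rx clock
    # domains, so when the PHY only produces one clock, both domains
    # can simply be driven from it. `userclk`/`userrst` are assumed
    # outputs of the pcs_pma instance, not actual port names.
    def __init__(self, userclk, userrst):
        self.clock_domains.cd_eth_tx = ClockDomain()
        self.clock_domains.cd_eth_rx = ClockDomain()
        self.comb += [
            self.cd_eth_tx.clk.eq(userclk),
            self.cd_eth_rx.clk.eq(userclk),
            self.cd_eth_tx.rst.eq(userrst),
            self.cd_eth_rx.rst.eq(userrst),
        ]
```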
<Degi>
Can't you use the DSP blocks for an FPU? The few leftover slices might be just enough for the interconnect