_florent_ changed the topic of #litex to: LiteX FPGA SoC builder and Cores / Github : https://github.com/enjoy-digital, https://github.com/litex-hub / Logs: https://libera.irclog.whitequark.org/litex
tpb has quit [Remote host closed the connection]
tpb has joined #litex
Wolf0 has quit [Ping timeout: 252 seconds]
Wolf0 has joined #litex
Degi_ has joined #litex
Degi has quit [Ping timeout: 272 seconds]
Degi_ is now known as Degi
pftbest has joined #litex
pftbest_ has joined #litex
pftbest has quit [Read error: Connection reset by peer]
pftbest_ has quit [Remote host closed the connection]
pftbest has joined #litex
FabM has joined #litex
TMM_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM_ has joined #litex
<Melkhior> @_florent_ For the issue I have (the 'stream' benchmark not validating), it seems I have found a 'deterministic' way of creating the problem (with my bitstream/binary combo anyway)
<Melkhior> it's not temperature, it's power-cycling
<_florent_> ah ok
<Melkhior> if I power cycle (unplug/replug), during this first boot the benchmark is fine
<Melkhior> if I then type 'halt' in Linux, wait for Linux to halt, then hit the 'reset' button on the board, this particular variant of the benchmark won't validate
<Melkhior> most of the system works fine (I have seen very occasional random crashes in the past that I cannot confirm have the same origin)
<Melkhior> the benchmark relies on FP64 computations, so it could be the FPU, or the memory
<Melkhior> or coherency (it uses OpenMP over 4 cores)
<Melkhior> weird, I would have thought that using the reset button and power-cycling would be similar/identical...
<Melkhior> power-cycling is an OK workaround, but it seems something doesn't reset properly in my soc
<_florent_> interesting, in the CRG of your SoC, do you have a self.rst Signal connected to the PLL?
<_florent_> If yes, the whole SoC will be reset when SoCController.reset is written
<_florent_> If not, only the CPU will be reset
<_florent_> If you have it, it could be useful to remove it to see if it impacts the behavior
<Melkhior> I have:
<Melkhior> self.comb += pll.reset.eq(~plls_reset | self.rst)
<Melkhior> for all 4 S7MMCM (system, idelay, video, usb)
<Melkhior> plls_reset = platform.request("cpu_reset")
<Melkhior> "cpu_reset" is the reset button I think
<_florent_> ok, cpu_reset is the reset button yes
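[editor's note] The reset expression quoted above can be checked with a small pure-Python model. This is only a sketch, not the actual Migen code: the function name pll_reset is hypothetical, and it assumes the "cpu_reset" pad idles at 1 and reads 0 while the button is pressed, which is what the inversion in the quoted line suggests.

```python
def pll_reset(cpu_reset_pin: int, soc_rst: int) -> int:
    """1-bit model of: pll.reset.eq(~plls_reset | self.rst)."""
    # Mask to one bit, since Python's ~ works on unbounded integers.
    return (~cpu_reset_pin | soc_rst) & 1


# Assumption: the button pin idles at 1 and reads 0 while pressed.
for pin, rst in [(1, 0), (0, 0), (1, 1)]:
    print(f"cpu_reset={pin} self.rst={rst} -> pll.reset={pll_reset(pin, rst)}")
```

Under that assumption, both a button press and an asserted self.rst drive the PLL reset high, which is why the two paths are expected to behave alike.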
<_florent_> instead of halt, could you do a reboot in Linux?
<_florent_> this should write the SoCController's reset register to reboot the CPU
<_florent_> it would be interesting to see if the behavior is similar to a manual reset
<Melkhior> 'reboot' worked
<Melkhior> and it seems the benchmark is fine after 'reboot'
<Melkhior> 'halt+reset' -> broken again
<_florent_> ok
<Melkhior> 'reboot' -> can't find the sdcard ...
<_florent_> just after plugging the board
<_florent_> can you interrupt the LiteX BIOS boot
<_florent_> do a reboot command
<_florent_> and then let it boot and see if the test is passing?
<Melkhior> 'reset' (from the missing sdcard) -> seems OK
<Melkhior> will try
<Melkhior> not sure if it's related, but the 'plug, force reboot from BIOS' sequence didn't do what we expected:
<Melkhior> (root)buildroot:~# [ 41.772794] Unable to handle kernel access to user memory without uaccess routines at virtual address 00000048
<Melkhior> [ 41.779362] Oops [#1]
<Melkhior> [ 41.780185] CPU: 0 PID: 159 Comm: stream_unrolled Not tainted 5.13.0-rc2-173551-g9ed90ba50da0-dirty #14
<Melkhior> [ 41.784120] epc : handle_mm_fault+0x228/0x990
<Melkhior> [ 41.785990] ra : do_page_fault+0xd4/0x2ce
<Melkhior> [ 41.788170] epc : c00b3fa4 ra : c000586e sp : c2843ea0
<Melkhior> [ 41.790646] gp : c0660958 tp : c1433600 t0 : c000579a
<Melkhior> [ 41.793199] t1 : c04cc0f8 t2 : c04cc138 s0 : c2843f20
<Melkhior> [ 41.795072] s1 : 9bbd4000 a0 : c0b7e600 a1 : c161e000
<Melkhior> [ 41.797699] a2 : 00000cc0 a3 : 80000000 a4 : 42836000
<Melkhior> [ 41.800291] a5 : 00000000 a6 : 9d58f000 a7 : 000001a6
<Melkhior> [ 41.802785] s2 : 00000255 s3 : c2843f70 s4 : c0662000
<Melkhior> [ 41.805377] s5 : c068df68 s6 : 00000255 s7 : 00000000
<Melkhior> [ 41.807981] s8 : 0000000f s9 : 0000000d s10: c15b7c48
<Melkhior> [ 41.810579] s11: 0000000f t3 : 000993c9 t4 : 003d0900
<Melkhior> [ 41.812647] t5 : 00000000 t6 : 00000001
<Melkhior> [ 41.814088] status: 00000120 badaddr: 00000048 cause: 0000000d
<Melkhior> [ 41.816980] Call Trace:
<Melkhior> [ 41.817819] [<c00b3fa4>] handle_mm_fault+0x228/0x990
<Melkhior> [ 41.819586] [<c000586e>] do_page_fault+0xd4/0x2ce
<Melkhior> [ 41.821179] [<c000208a>] ret_from_exception+0x0/0xc
<Melkhior> [ 41.825777] ---[ end trace 1b357620da9de914 ]---
<Melkhior> let's try again
<Melkhior> 'plug, force reboot from BIOS' -> seems OK
<Melkhior> 'halt, reset' -> broken
<Melkhior> I'll try to do more tests over the week-end to confirm that 'reboot' is better than 'halt;hard-reset'
<Melkhior> @_florent_ question: does 'reboot' in the BIOS also write to said "SoCController's reset register" ?
<Melkhior> I suppose so...
<_florent_> Yes, it's rebooting by writing to the reset register
<_florent_> which should be very similar to a button reset since this is also resetting the PLL (since the self.rst Signal is present)
<Melkhior> I will have to do more tests; if 'reboot' from Linux succeeds (i.e. no issue with the sd-card) it seems the benchmark will work
<Melkhior> however I've seen it fail after a 'reboot' from BIOS I think
<Melkhior> off to lunch, thx for the explanations
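[editor's note] For the follow-up tests mentioned above, the outcomes reported so far could be tallied per boot sequence with a few lines of Python. This is a sketch only: the sequence labels and the tally helper are hypothetical names, and the entries are transcribed from the messages above rather than from any new measurement.

```python
from collections import Counter, defaultdict

# (boot sequence, reported outcome), in the order reported in the log.
observations = [
    ("power-cycle", "ok"),
    ("halt+reset", "broken"),
    ("reboot (Linux)", "ok"),
    ("halt+reset", "broken"),
    ("reboot (Linux)", "sdcard not found"),
    ("reset (after missing sdcard)", "ok"),
    ("power-cycle, reboot (BIOS)", "kernel oops"),
    ("power-cycle, reboot (BIOS)", "ok"),
    ("halt+reset", "broken"),
]


def tally(pairs):
    """Count outcomes per boot sequence."""
    counts = defaultdict(Counter)
    for sequence, outcome in pairs:
        counts[sequence][outcome] += 1
    return counts


results = tally(observations)
for sequence, outcomes in results.items():
    print(f"{sequence}: {dict(outcomes)}")
```

Appending one tuple per new run keeps the comparison between 'reboot' and 'halt;hard-reset' easy to read as more data comes in.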
Degi has quit [Remote host closed the connection]
Degi has joined #litex
pftbest has quit [Remote host closed the connection]
pftbest has joined #litex
FabM has quit [Ping timeout: 265 seconds]
geertu has quit [Quit: Changing server]
geertu has joined #litex
FabM has joined #litex
pftbest has quit [Remote host closed the connection]
pftbest has joined #litex
futarisIRCcloud has joined #litex
FabM has quit [Ping timeout: 272 seconds]
FabM has joined #litex
C-Man has joined #litex
C-Man has left #litex [#litex]
C-Man has joined #litex
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
FabM has quit [Quit: Leaving]
jryans has quit [Quit: node-irc says goodbye]
shoragan[m] has quit [Quit: node-irc says goodbye]
dcallagh has quit [Quit: node-irc says goodbye]
sajattack[m] has quit [Quit: node-irc says goodbye]
Leon[m] has quit [Quit: node-irc says goodbye]
jryans has joined #litex
shoragan[m] has joined #litex
Leon[m] has joined #litex
dcallagh has joined #litex
sajattack[m] has joined #litex
pftbest_ has joined #litex
pftbest has quit [Ping timeout: 272 seconds]
TMM_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM_ has joined #litex
pftbest_ has quit [Remote host closed the connection]
pftbest has joined #litex
pftbest has quit [Remote host closed the connection]
pftbest has joined #litex
pftbest has quit [Remote host closed the connection]
pftbest has joined #litex
pftbest has quit [Remote host closed the connection]
pftbest has joined #litex
pftbest has quit [Remote host closed the connection]
pftbest has joined #litex
pftbest has quit [Ping timeout: 268 seconds]
tcal has joined #litex
pftbest has joined #litex
pftbest has quit [Ping timeout: 264 seconds]
pftbest has joined #litex
pftbest has quit [Ping timeout: 268 seconds]
pftbest has joined #litex
pftbest has quit [Remote host closed the connection]
pftbest has joined #litex
pftbest has quit [Ping timeout: 264 seconds]