00:00
tpb has quit [Remote host closed the connection]
00:00
tpb has joined #litex
02:33
Wolf0 has quit [Ping timeout: 252 seconds]
02:45
Wolf0 has joined #litex
03:14
Degi_ has joined #litex
03:15
Degi has quit [Ping timeout: 272 seconds]
03:15
Degi_ is now known as Degi
05:44
pftbest has joined #litex
05:53
pftbest_ has joined #litex
05:54
pftbest has quit [Read error: Connection reset by peer]
05:54
pftbest_ has quit [Remote host closed the connection]
05:55
pftbest has joined #litex
06:25
FabM has joined #litex
09:18
TMM_ has joined #litex
09:44
<
Melkhior >
@_florent_ For the issue I have (the 'stream' benchmark not validating), it seems I have found a 'deterministic' way of creating the problem (with my bitstream/binary combo anyway)
09:44
<
Melkhior >
it's not temperature, it's power-cycling
09:44
<
Melkhior >
if I power cycle (unplug/replug), during this first boot the benchmark is fine
09:45
<
Melkhior >
if I then type 'halt' in Linux, wait for Linux to halt, then hit the 'reset' button on the board, this particular variant of the benchmark won't validate
09:46
<
Melkhior >
most of the system works fine (I have seen very occasional random crashes in the past that I cannot confirm have the same origin)
09:46
<
Melkhior >
the benchmark relies on FP64 computations, so it could be the FPU, or the memory
09:47
<
Melkhior >
or coherency (it uses OpenMP over 4 cores)
09:47
<
Melkhior >
weird, I would have thought that using the reset button and power-cycling would be similar/identical...
09:49
<
Melkhior >
power-cycling is an OK workaround, but it seems something doesn't reset properly in my soc
09:50
<
_florent_ >
interesting, in the CRG of your SoC, do you have a self.rst Signal connected to the PLL?
09:51
<
_florent_ >
If yes, the whole SoC will be reset when SoCController.reset is written
09:52
<
_florent_ >
If not, only the CPU will be reset
09:52
<
_florent_ >
If you have it, it could be useful to remove it to see if that changes the behavior
09:55
<
Melkhior >
self.comb += pll.reset.eq(~plls_reset | self.rst)
09:55
<
Melkhior >
for all 4 S7MMCM (system, idelay, video, usb)
09:56
<
Melkhior >
plls_reset = platform.request("cpu_reset")
09:56
<
Melkhior >
"cpu_reset" is the reset button I think
09:56
<
_florent_ >
ok, cpu_reset is the reset button yes
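[editor's note] The reset expression quoted above, `pll.reset.eq(~plls_reset | self.rst)`, can be read as a two-input OR. Below is a minimal plain-Python sketch of that logic, not LiteX/Migen code. It assumes the `cpu_reset` pin is active-low (it reads 0 while the button is pressed), which the `~` in the quoted line suggests but the log does not state outright. The function name `pll_reset` and its parameter names are hypothetical, introduced only for illustration.

```python
from itertools import product


def pll_reset(cpu_reset_pin: bool, soc_rst: bool) -> bool:
    """Model of `pll.reset = ~plls_reset | self.rst` from the log.

    cpu_reset_pin: level read on the board's "cpu_reset" pin
                   (ASSUMPTION: active-low, so False means "button pressed").
    soc_rst:       the CRG's self.rst signal, which per the discussion is
                   asserted when the SoCController reset register is written.
    """
    return (not cpu_reset_pin) or soc_rst


if __name__ == "__main__":
    # Print the full truth table for the two reset sources.
    for pin, rst in product((True, False), repeat=2):
        print(
            f"cpu_reset pin={int(pin)} self.rst={int(rst)} "
            f"-> pll.reset={int(pll_reset(pin, rst))}"
        )
```

In this model, pressing the button and writing the reset register both assert the PLL reset, which is why the two paths would be expected to behave alike.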
09:57
<
_florent_ >
instead of halt, could you do a reboot in Linux?
09:57
<
_florent_ >
this should write the SoCController's reset register to reboot the CPU
09:58
<
_florent_ >
it would be interesting to see if the behavior is similar to that with a manual reset
09:59
<
Melkhior >
'reboot' worked
09:59
<
Melkhior >
and it seems the benchmark is fine after 'reboot'
10:02
<
Melkhior >
'halt+reset' -> broken again
10:03
<
Melkhior >
'reboot' -> can't find the sdcard ...
10:03
<
_florent_ >
just after plugging the board
10:03
<
_florent_ >
can you interrupt the LiteX BIOS boot
10:03
<
_florent_ >
do a reboot command
10:03
<
_florent_ >
and then let it boot and see if the test is passing?
10:04
<
Melkhior >
'reset' (from the missing sdcard) -> seems OK
10:04
<
Melkhior >
will try
10:06
<
Melkhior >
not sure if it's related, but the 'plug, force reboot from BIOS' sequence didn't do what we expected:
10:06
<
Melkhior >
(root)buildroot:~# [ 41.772794] Unable to handle kernel access to user memory without uaccess routines at virtual address 00000048
10:06
<
Melkhior >
[ 41.779362] Oops [#1]
10:06
<
Melkhior >
[ 41.780185] CPU: 0 PID: 159 Comm: stream_unrolled Not tainted 5.13.0-rc2-173551-g9ed90ba50da0-dirty #14
10:06
<
Melkhior >
[ 41.784120] epc : handle_mm_fault+0x228/0x990
10:06
<
Melkhior >
[ 41.785990] ra : do_page_fault+0xd4/0x2ce
10:06
<
Melkhior >
[ 41.788170] epc : c00b3fa4 ra : c000586e sp : c2843ea0
10:07
<
Melkhior >
[ 41.790646] gp : c0660958 tp : c1433600 t0 : c000579a
10:07
<
Melkhior >
[ 41.793199] t1 : c04cc0f8 t2 : c04cc138 s0 : c2843f20
10:07
<
Melkhior >
[ 41.795072] s1 : 9bbd4000 a0 : c0b7e600 a1 : c161e000
10:07
<
Melkhior >
[ 41.797699] a2 : 00000cc0 a3 : 80000000 a4 : 42836000
10:07
<
Melkhior >
[ 41.800291] a5 : 00000000 a6 : 9d58f000 a7 : 000001a6
10:07
<
Melkhior >
[ 41.802785] s2 : 00000255 s3 : c2843f70 s4 : c0662000
10:07
<
Melkhior >
[ 41.805377] s5 : c068df68 s6 : 00000255 s7 : 00000000
10:07
<
Melkhior >
[ 41.807981] s8 : 0000000f s9 : 0000000d s10: c15b7c48
10:07
<
Melkhior >
[ 41.810579] s11: 0000000f t3 : 000993c9 t4 : 003d0900
10:07
<
Melkhior >
[ 41.812647] t5 : 00000000 t6 : 00000001
10:07
<
Melkhior >
[ 41.814088] status: 00000120 badaddr: 00000048 cause: 0000000d
10:07
<
Melkhior >
[ 41.816980] Call Trace:
10:07
<
Melkhior >
[ 41.817819] [<c00b3fa4>] handle_mm_fault+0x228/0x990
10:07
<
Melkhior >
[ 41.819586] [<c000586e>] do_page_fault+0xd4/0x2ce
10:07
<
Melkhior >
[ 41.821179] [<c000208a>] ret_from_exception+0x0/0xc
10:07
<
Melkhior >
[ 41.825777] ---[ end trace 1b357620da9de914 ]---
10:07
<
Melkhior >
let's try again
10:10
<
Melkhior >
'plug, force reboot from BIOS' -> seems OK
10:13
<
Melkhior >
'halt, reset' -> broken
10:17
<
Melkhior >
I'll try to do more tests over the weekend to confirm that 'reboot' is better than 'halt;hard-reset'
10:18
<
Melkhior >
@_florent_ question: does 'reboot' in the BIOS also write to said "SoCController's reset register"?
10:20
<
Melkhior >
I suppose so...
10:20
<
_florent_ >
Yes, it reboots by writing to the reset register
10:20
<
_florent_ >
which should be very similar to a button reset since this also resets the PLL (since the self.rst Signal is present)
10:24
<
Melkhior >
I will have to do more tests; if 'reboot' from Linux succeeds (i.e. no issue with the sd-card) it seems the benchmark will work
10:24
<
Melkhior >
however I think I've seen it fail after a 'reboot' from BIOS
10:27
<
Melkhior >
off to lunch, thx for the explanations
11:11
Degi has quit [Remote host closed the connection]
11:16
Degi has joined #litex
11:20
pftbest has quit [Remote host closed the connection]
11:20
pftbest has joined #litex
11:31
FabM has quit [Ping timeout: 265 seconds]
11:44
geertu has quit [Quit: Changing server]
11:44
geertu has joined #litex
11:48
FabM has joined #litex
12:30
pftbest has quit [Remote host closed the connection]
12:43
pftbest has joined #litex
12:45
futarisIRCcloud has joined #litex
13:03
FabM has quit [Ping timeout: 272 seconds]
13:11
FabM has joined #litex
13:48
C-Man has joined #litex
13:48
C-Man has left #litex [#litex]
13:49
C-Man has joined #litex
14:55
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
16:19
FabM has quit [Quit: Leaving]
16:36
jryans has quit [Quit: node-irc says goodbye]
16:36
shoragan[m] has quit [Quit: node-irc says goodbye]
16:36
dcallagh has quit [Quit: node-irc says goodbye]
16:36
sajattack[m] has quit [Quit: node-irc says goodbye]
16:36
Leon[m] has quit [Quit: node-irc says goodbye]
16:37
jryans has joined #litex
16:39
shoragan[m] has joined #litex
16:39
Leon[m] has joined #litex
16:39
dcallagh has joined #litex
16:39
sajattack[m] has joined #litex
16:53
pftbest_ has joined #litex
16:57
pftbest has quit [Ping timeout: 272 seconds]
17:28
TMM_ has joined #litex
18:52
pftbest_ has quit [Remote host closed the connection]
18:53
pftbest has joined #litex
19:00
pftbest has quit [Remote host closed the connection]
19:00
pftbest has joined #litex
19:21
pftbest has quit [Remote host closed the connection]
19:25
pftbest has joined #litex
19:31
pftbest has quit [Remote host closed the connection]
19:33
pftbest has joined #litex
19:36
pftbest has quit [Remote host closed the connection]
19:56
pftbest has joined #litex
20:02
pftbest has quit [Ping timeout: 268 seconds]
20:07
tcal has joined #litex
20:38
pftbest has joined #litex
20:42
pftbest has quit [Ping timeout: 264 seconds]
21:35
pftbest has joined #litex
21:40
pftbest has quit [Ping timeout: 268 seconds]
22:09
pftbest has joined #litex
23:09
pftbest has quit [Remote host closed the connection]
23:29
pftbest has joined #litex
23:34
pftbest has quit [Ping timeout: 264 seconds]