Forty-Bot has quit [Read error: Connection reset by peer]
damian101_ has joined #riscv
Forty-Bot has joined #riscv
damian101 has quit [Ping timeout: 264 seconds]
damian101_ has quit [Ping timeout: 255 seconds]
damian101 has joined #riscv
damian101 has quit [Ping timeout: 240 seconds]
damian101 has joined #riscv
naoki has joined #riscv
naoki has quit [Client Quit]
s1b1 has joined #riscv
<clever>
/build/openssl-3.0.13/crypto/threads_pthread.c:272:(.text+0x33c): undefined reference to `__atomic_load_8'
<clever>
the a in rv32ima is for atomic opcodes right? so what would result in __atomic_load_8 being missing?
<clever>
i would expect libgcc.a to be providing a symbol like that
<wbx>
clever: i would say you need to link -latomic
<clever>
hmmm, i do have a libatomic.a in my musl toolchain
<clever>
but why would openssl be missing something so simple?
<clever>
readelf does also find that symbol in there
<clever>
-latomic does seem to fix things, its progressing now
<clever>
[ 0.195737] Run /init as init process
<clever>
[ 0.203968] init[15]: unhandled signal 11 code 0x2 at 0x81549934
<clever>
wbx: that just leaves trying to figure out why init dies like this....
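The kernel lines pasted above follow a fixed pattern: signal number, si_code, then the faulting address. A minimal sketch of decoding one in C, assuming Linux values for the signal constants; `struct fault_line`, `parse_fault` and `segv_code_name` are hypothetical helper names for illustration, not kernel or libc functions:

```c
#define _POSIX_C_SOURCE 200809L   /* expose SEGV_MAPERR / SEGV_ACCERR */
#include <assert.h>
#include <signal.h>
#include <stdio.h>

/* Fields carried by an "unhandled signal" line (hypothetical struct). */
struct fault_line {
    int signo;            /* signal number, e.g. 11 */
    unsigned int code;    /* si_code, e.g. 0x2 */
    unsigned long addr;   /* faulting address */
};

/* Parse the text after the "comm[pid]: " prefix. Returns 1 on success. */
int parse_fault(const char *msg, struct fault_line *out)
{
    assert(msg && out);
    return sscanf(msg, "unhandled signal %d code 0x%x at 0x%lx",
                  &out->signo, &out->code, &out->addr) == 3;
}

/* Name the si_code of a SIGSEGV. */
const char *segv_code_name(unsigned int code)
{
    switch (code) {
    case SEGV_MAPERR: return "SEGV_MAPERR (address not mapped)";
    case SEGV_ACCERR: return "SEGV_ACCERR (access not permitted)";
    default:          return "unknown";
    }
}
```

Given the first line from the log, this reads as signal 11 (SIGSEGV) with code 2 (SEGV_ACCERR) at 0x81549934, assuming the usual Linux numbering.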
<wbx>
clever: something for dalias in #musl?
<clever>
had one more idea, then i can try there
<clever>
wbx: aha, progress, rdinit=/bin/sh gives a "working" shell, and it only dies upon trying to run "cat /init", so i can at least inspect things a little bit
BootLayer has joined #riscv
<sorear>
clever: __atomic_load_8 is an 8 byte load, rv32ima only provides 1, 2, 4 byte loads natively
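A minimal sketch of the case sorear describes, assuming GCC's `__atomic` builtins; `load_u32` and `load_u64` are hypothetical names. On a 64-bit host both loads compile inline. On rv32ima the 8-byte one is expected to turn into a call to `__atomic_load_8`, which is why linking with `-latomic` resolved the undefined reference:

```c
#include <assert.h>
#include <stdint.h>

/* The width in question: rv32ima has no single 8-byte atomic load. */
static_assert(sizeof(uint64_t) == 8, "uint64_t must be 8 bytes");

/* 4-byte atomic load: within what rv32ima handles natively. */
uint32_t load_u32(uint32_t *p)
{
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}

/* 8-byte atomic load: on a 32-bit target without native 64-bit
 * atomics the compiler falls back to a library call
 * (__atomic_load_8), so the final link needs libatomic. */
uint64_t load_u64(uint64_t *p)
{
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}
```

The source is the same on every target; only the code the compiler emits for `load_u64` differs, so the missing symbol shows up at link time rather than at compile time.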
<sorear>
clever: did you get a register dump as part of the kernel signal handling? you should
<sorear>
one nice thing about qemu is it has a built-in gdb stub, you can add "-s" to the qemu command and then "target remote :1234" in gdb
<clever>
sorear: yep i did
<clever>
when i run my linux build in gdb, i dont get any serial output
<clever>
so we can see that s5, is just a load from a5
<clever>
which itself, was a constant from elsewhere in ram
<clever>
ghidra shows its a few layers of pointers to pointers, ending in a null
<wbx>
clever: IIRC ash needs an MMU. hush is for noMMU.
<clever>
i was thinking the same thing, when i noticed ash was present
<clever>
let me see if i can force it back to hush
<clever>
busybox-riscv32-linux-musl> Force NOMMU build (NOMMU) [N/y/?] n
<clever>
wbx: that one looks important!
jacklsw has joined #riscv
<wbx>
sorear: with my regular testing I found one problem with riscv64 noMMU FLAT. clone test is segfaulting. riscv64 MMU is fine. riscv64 noMMU ELF is fine, too. any idea why clone is failing in uClibc-ng-test?
<wbx>
clever: yes.
<clever>
but it still produces an ash binary
<clever>
busybox-riscv32-linux-musl> .config:1172:warning: trying to assign nonexistent symbol NUMMU
<clever>
because i cant spell, lol
<clever>
BusyBox v1.36.1 () hush - the humble shell
<clever>
# cat /init
<clever>
[ 3.176902] sh[15]: unhandled signal 11 code 0x2 at 0x8174b9c0
<clever>
and it fails in basically the identical way
<sorear>
wbx: we did hit an argument order issue a while ago but I can't see any reason that would care about binfmt
<wbx>
sorear: but it should be fixed in 1.0.48
<wbx>
sorear: i also have verified that the pending kernel patch is applied.
<sorear>
clever: giant execution trace time? does a reasonable value of s5 get loaded and then inappropriately overwritten, or never loaded?
jacklsw has quit [Ping timeout: 268 seconds]
<clever>
sorear: not sure, i havent done much low level debug like this on rv32
damian101 has quit [Remote host closed the connection]
damian101 has joined #riscv
naoki has joined #riscv
davidlt has quit [Ping timeout: 268 seconds]
jacklsw has quit [Ping timeout: 264 seconds]
naoki has quit [Client Quit]
theruran has joined #riscv
damian101 has quit [Ping timeout: 252 seconds]
BootLayer has quit [Quit: Leaving]
fuel has joined #riscv
<fuel>
hey, i'm currently working on my own risc-v emulator, and i was wondering what exactly i should use for my first test programs
<sorear>
probably depends on the scope of the emulator
<sorear>
is it intended to run linux programs? short snippets of code written by students? simulating a realistic computer running a realistic OS? simulating a realistic microcontroller? VM for universal computing gadget applications?
<sorear>
there are several test suites covering different parts of that space but if you want a "hello world" it probably makes sense to design one yourself
<fuel>
sorear, i want to emulate a semi-realistic system that runs real oses
davidlt has joined #riscv
<fuel>
something similar to what 86box and pcem do for x86, except without emulating real machines and firmware lol
<fuel>
i usually test with real programs when first starting on writing a new emulator so :p
davidlt has quit [Ping timeout: 264 seconds]
<clever>
wbx: found the issue over in the #musl channel, musl lacked a rv32 vfork(), so it just ran fork(), and rv32nommu linux doesnt block that!
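For context on why the vfork() fallback matters: without an MMU the kernel cannot hand a child its own copy-on-write address space, so a nommu program has to spawn children with vfork(), where the child borrows the parent's memory until it calls exec or _exit. A minimal sketch of that pattern in C; `run_child` is a hypothetical helper, not code from musl or busybox:

```c
#define _DEFAULT_SOURCE   /* make sure unistd.h declares vfork() */
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] via vfork()+execv().
 * Returns the child's exit status, or -1 on failure. */
int run_child(char *const argv[])
{
    assert(argv && argv[0]);

    pid_t pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: shares the parent's memory, so it may only
         * exec or _exit here. */
        execv(argv[0], argv);
        _exit(127);   /* exec failed */
    }

    /* Parent: resumes once the child has exec'd or exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If libc quietly implements vfork() as fork(), this code still compiles and links, and the difference only surfaces at run time on a nommu kernel, which matches the crash on running "cat /init" seen earlier.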
mlw has quit [Read error: Connection reset by peer]
mlw has joined #riscv
smaeul has quit [Ping timeout: 268 seconds]
davidlt has joined #riscv
fuel has quit [Remote host closed the connection]
fuel has joined #riscv
alexghiti has joined #riscv
jacklsw has joined #riscv
damian101 has joined #riscv
jacklsw has quit [Ping timeout: 268 seconds]
jacklsw has joined #riscv
damian101 has quit [Ping timeout: 256 seconds]
jacklsw has quit [Ping timeout: 260 seconds]
jacklsw has joined #riscv
theruran has quit [Quit: Connection closed for inactivity]
jacklsw has quit [Ping timeout: 256 seconds]
jacklsw has joined #riscv
fossdd has quit [Remote host closed the connection]
fossdd_ has joined #riscv
jacklsw has quit [Ping timeout: 256 seconds]
jacklsw has joined #riscv
<Esmil>
geertu: oh, interesting. i'll try on my starlight
<geertu>
Esmil: I tried bisecting it, but it was a bit hard because several commits don't build.
<geertu>
After re-ordering, I arrived at "riscv: dts: starfive: Add JH7100 USB node", "usb: cdns3: starfive: Initialize JH7100 host mode", and "riscv: dts: Add full JH7100, Starlight and VisionFive support" (they need to be applied together to build)
mlw has quit [Ping timeout: 268 seconds]
<geertu>
If I find some time, I'll try to discover which DTS node causes the issue
mlw has joined #riscv
<mps>
Esmil: iiuc this should work on visionfive V1
<mps>
I'll had some time tomorrow to test
<mps>
s/had/have/
<Esmil>
mps: yeah, i booted my visionfive branch on the vf1, but i'll try the starlight board now
<mps>
aha, ok. I don't have starlights, only starfives
<geertu>
I am not using the defconfig, so I may have enabled something special that triggers the issue
<mps>
Esmil: sorry, I don't have any bluetooth peripheral
<Esmil>
there should be a wifi/bluetooth module on all VF1s
<mps>
maybe I could try with my son mouse for macs if it is compatible with linux
<mps>
I can do only basic tests maybe
<Esmil>
ah, but just booting and looking for either the 'Bluetooth: hci0: command 0x1001 tx timeout' or 'Bluetooth: hci0: BCM43430A1 'brcm/BCM43430A1.hcd' Patch' line should be fine
<mps>
aha, ok. then it is easier. maybe I will find time this evening
<Esmil>
thanks
<mps>
np
paddymahoney has quit [Ping timeout: 240 seconds]
paddymahoney has joined #riscv
jacklsw has quit [Quit: Back to the real world]
fossdd_ has quit [Remote host closed the connection]
damian101 has joined #riscv
<geertu>
Esmil: My starlight boots again when removing the &spi2 { ... } from jh7100-common.dtsi
davidlt has quit [Remote host closed the connection]
davidlt has joined #riscv
fossdd_ has joined #riscv
knielsen_ is now known as knielsen
<Esmil>
geertu: ..and when it fails it's still a NULL pointer dereference in the plic driver like you pasted above?
<Esmil>
because the visionfive_defconfig does have CONFIG_SPI_DW_MMIO=y so should probe the driver enabled in that node
<geertu>
Yeah, I noticed that, too
<geertu>
defconfig fails in the same way
<geertu>
trying visionfive_defconfig
<geertu>
Note that the bad kernel booted fine once, so perhaps there is some race condition
<geertu>
WARN_ON_ONCE(!handler->present);
<geertu>
Perhaps the plic_handlers per_cpu handling is racey?
<geertu>
Esmil: visionfive_defconfig does better, but spews lots of "device non-coherent but no non-coherent operations supported" warnings, and fails to mount nfsroot
<geertu>
My "bad" kernel+config booted fine again. So this is an intermittent problem.
<geertu>
On success, the kernel prints
<geertu>
riscv-plic c000000.interrupt-controller: mapped 133 interrupts with 2 handlers
<geertu>
for 4 contexts.
<geertu>
On failure, it prints
<geertu>
WARNING: CPU: 0 PID: 1 at drivers/irqchip/irq-sifive-plic.c:373 plic_handle_irq+0xf2/0xf6
<geertu>
followed by a NULL-pointer deref later
joev_ is now known as joev
<geertu>
So it crashes when an early interrupt happens in plic_probe
mlw has quit [Ping timeout: 240 seconds]
mlw has joined #riscv
<Esmil>
geertu: there is 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver") that moves the probing of the plic driver later, but that was merged for v6.8 already
<Esmil>
..but it does sound like the plic probe now races with peripherals that need their interrupts
<geertu>
That was in v6.9, and it did cause issues, that were fixed
<Esmil>
yes, sorry 6.9
guerby_ is now known as guerby
psydroid has joined #riscv
<Esmil>
geertu: i'm curious. does my visionfive branch + "git revert a7fb69ffd7ce abb720579490 956521064780 a15587277a24 6c725f33d67b b68d0ff529a9 25d862e183d4 8ec99b033147" work for you?