sorear changed the topic of #riscv to: RISC-V instruction set architecture | https://riscv.org | Logs: https://libera.irclog.whitequark.org/riscv | Matrix: #riscv:catircservices.org
alexghiti has quit [Ping timeout: 264 seconds]
KREYREN has quit [Remote host closed the connection]
theruran has quit [Quit: Connection closed for inactivity]
damian101 has joined #riscv
damian101__ has quit [Ping timeout: 260 seconds]
s1b1 has quit [Quit: ZNC 1.8.2+deb3.1 - https://znc.in]
Forty-Bot has quit [Read error: Connection reset by peer]
damian101_ has joined #riscv
Forty-Bot has joined #riscv
damian101 has quit [Ping timeout: 264 seconds]
damian101_ has quit [Ping timeout: 255 seconds]
damian101 has joined #riscv
damian101 has quit [Ping timeout: 240 seconds]
damian101 has joined #riscv
naoki has joined #riscv
naoki has quit [Client Quit]
s1b1 has joined #riscv
<clever> /build/openssl-3.0.13/crypto/threads_pthread.c:272:(.text+0x33c): undefined reference to `__atomic_load_8'
<clever> the a in rv32ima is for atomic opcodes right? so what would result in __atomic_load_8 being missing?
<clever> i would expect libgcc.a to be providing a symbol like that
<wbx> clever: i would say you need to link -latomic
<clever> hmmm, i do have a libatomic.a in my musl toolchain
<clever> but why would openssl be missing something so simple?
<clever> readelf does also find that symbol in there
<clever> -latomic does seem to fix things, its progressing now
<clever> [ 0.195737] Run /init as init process
<clever> [ 0.203968] init[15]: unhandled signal 11 code 0x2 at 0x81549934
<clever> wbx: that just leaves trying to figure out why init dies like this....
<wbx> clever: something for dalias in #musl?
<clever> had one more idea, then i can try there
<clever> wbx: aha, progress, rdinit=/bin/sh gives a "working" shell, and it only dies upon trying to run "cat /init", so i can at least inspect things a little bit
BootLayer has joined #riscv
<sorear> clever: __atomic_load_8 is an 8 byte load, rv32ima only provides 1, 2, 4 byte loads natively
<sorear> clever: did you get a register dump as part of the kernel signal handling? you should
<sorear> one nice thing about qemu is it has a built-in gdb stub, you can add "-s" to the qemu command and then "target remote :1234" in gdb
<clever> sorear: yep i did
<clever> when i run my linux build in gdb, i dont get any serial output
<sorear> do you mean in qemu?
<clever> oops, yeah
<clever> lines 8-23, i believe is the child crashing, due to system() trying to do /bin/sh -c "cat /init"
<clever> and then lines 24-39 is the parent dying as well
<clever> oh, i also just noticed, its ash instead of hush
<sorear> what do you have in the dump at 81549934 and 816687e8 ?
<sorear> both of them are userspace null pointer dereferences if status/badaddr/cause can be trusted
<clever> 81549934: 038aa783 lw a5,56(s5)
<clever> 816687e8: 00092783 lw a5,0(s2)
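The two faulting words quoted above can be checked by hand against the RV32I load encoding (I-type: imm[11:0], rs1, funct3, rd, opcode). The following is a minimal sketch, not a full disassembler: it only handles base-ISA loads and stores, and the names `ABI`, `LOADS`, `STORES` and `decode` are made up for illustration.

```python
# ABI names for the integer registers x0..x31.
ABI = (["zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2", "s0", "s1"]
       + [f"a{i}" for i in range(8)]
       + [f"s{i}" for i in range(2, 12)]
       + [f"t{i}" for i in range(3, 7)])

LOADS = {0: "lb", 1: "lh", 2: "lw", 4: "lbu", 5: "lhu"}   # opcode 0x03
STORES = {0: "sb", 1: "sh", 2: "sw"}                      # opcode 0x23


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def decode(word):
    """Decode a 32-bit RV32I load or store word into assembly text."""
    opcode = word & 0x7F
    rd = (word >> 7) & 0x1F
    funct3 = (word >> 12) & 0x7
    rs1 = (word >> 15) & 0x1F
    rs2 = (word >> 20) & 0x1F
    if opcode == 0x03 and funct3 in LOADS:
        # I-type: the 12-bit immediate sits in bits 31:20.
        imm = sign_extend(word >> 20, 12)
        return f"{LOADS[funct3]} {ABI[rd]},{imm}({ABI[rs1]})"
    if opcode == 0x23 and funct3 in STORES:
        # S-type: immediate is split across bits 31:25 and 11:7.
        imm = sign_extend(((word >> 25) << 5) | rd, 12)
        return f"{STORES[funct3]} {ABI[rs2]},{imm}({ABI[rs1]})"
    raise ValueError(f"not a base-ISA load/store: {word:#010x}")


print(decode(0x038AA783))  # lw a5,56(s5)
print(decode(0x00092783))  # lw a5,0(s2)
```

Both words decode as loads whose base register is s5 or s2, which is why the base register, not the destination a5, is the value worth tracing.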
<clever> sorear: if i load the entire ram into ghidra, then 81549934 is inside this function i believe
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #riscv
<sorear> what is s5? where does it come from? building with -O0 -g may help
<clever> the gist does list a5 as being null in that first failure
<sorear> s5, not a5
<clever> oh, you mean where it came from
<sorear> where it came from, what it means, etc
<clever> yeah
<clever> sorear: https://gist.github.com/cleverca22/6ca15fdc6fb014cc8e7094c2b0979ea1#file-gistfile2-txt i believe this is the entry of the function, up to the failure
<clever> so we can see that s5, is just a load from a5
<clever> which itself, was a constant from elsewhere in ram
<clever> ghidra shows its a few layers of pointers to pointers, ending in a null
<wbx> clever: IIRC ash needs an MMU. hush is for noMMU.
<clever> i was thinking the same thing, when i noticed ash was present
<clever> let me see if i can force it back to hush
<clever> busybox-riscv32-linux-musl> Force NOMMU build (NOMMU) [N/y/?] n
<clever> wbx: that one looks important!
jacklsw has joined #riscv
<wbx> sorear: with my regular testing I found one problem with riscv64 noMMU FLAT. clone test is segfaulting. riscv64 MMU is fine. riscv64 noMMU ELF is fine, too. any idea why clone is failing in uClibc-ng-test?
<wbx> clever: yes.
<clever> but it still produces an ash binary
<clever> busybox-riscv32-linux-musl> .config:1172:warning: trying to assign nonexistent symbol NUMMU
<clever> because i cant spell, lol
<clever> BusyBox v1.36.1 () hush - the humble shell
<clever> # cat /init
<clever> [ 3.176902] sh[15]: unhandled signal 11 code 0x2 at 0x8174b9c0
<clever> and it fails in basically the identical way
<sorear> wbx: we did hit an argument order issue a while ago but I can't see any reason that would care about binfmt
<wbx> sorear: but it should be fixed in 1.0.48
<wbx> sorear: i also have verified that the pending kernel patch is applied.
<sorear> clever: giant execution trace time? does a reasonable value of s5 get loaded and then inappropriately overwritten, or never loaded?
jacklsw has quit [Ping timeout: 268 seconds]
<clever> sorear: not sure, i havent done much low level debug like this on rv32
jacklsw has joined #riscv
<clever> this error msg appears in the faulting function around 0x8174b9c0, inlining probably did that
<clever> it feels like i'm in this area of the code
<clever> yep, its all lining up
<clever> 81549850 in ram, is b590 in the busybox binary, now that i have the load offset, its easy
<clever> b9b4: e30f80ef jal 3fe4 <xpipe>
<clever> b9b8: cc8f70ef jal 2e80 <vfork@plt>
<clever> b9bc: 00412703 lw a4,4(sp)
<clever> b9c0: 00a72023 sw a0,0(a4)
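The "load offset" trick mentioned above (81549850 in RAM being b590 in the busybox binary) is plain subtraction. A small sketch follows, assuming the binary is loaded as one contiguous image so a single offset applies everywhere, and assuming the fault address 0x81549934 from the first crash belongs to the same build as that address pair; the function names are illustrative only.

```python
def load_base(ram_addr, file_addr):
    """Offset between a RAM address and the matching address in the binary."""
    return ram_addr - file_addr


def ram_to_file(ram_addr, base):
    """Translate a runtime address back to an address in the binary."""
    return ram_addr - base


# Known pair from the chat: 0x81549850 in RAM is 0xb590 in the binary.
base = load_base(0x81549850, 0xB590)
print(hex(base))                            # 0x8153e2c0

# First fault address reported by the kernel, mapped into the binary.
print(hex(ram_to_file(0x81549934, base)))   # 0xb674
```

With the offset known, any address from a kernel fault message can be looked up directly in an objdump listing of the unloaded binary.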
davidlt has joined #riscv
damian101 has quit [Remote host closed the connection]
damian101 has joined #riscv
naoki has joined #riscv
davidlt has quit [Ping timeout: 268 seconds]
jacklsw has quit [Ping timeout: 264 seconds]
naoki has quit [Client Quit]
theruran has joined #riscv
damian101 has quit [Ping timeout: 252 seconds]
BootLayer has quit [Quit: Leaving]
fuel has joined #riscv
<fuel> hey, i'm currently working on my own risc-v emulator, and i was wondering what exactly i should use for my first test programs
<sorear> probably depends on the scope of the emulator
<sorear> is it intended to run linux programs? short snippets of code written by students? simulating a realistic computer running a realistic OS? simulating a realistic microcontroller? VM for universal computing gadget applications?
<sorear> there are several test suites covering different parts of that space but if you want a "hello world" it probably makes sense to design one yourself
<fuel> sorear, i want to emulate a semi-realistic system that runs real oses
davidlt has joined #riscv
<fuel> something similar to what 86box and pcem do for x86, except without emulating real machines and firmware lol
<fuel> i usually test with real programs when first starting on writing a new emulator so :p
davidlt has quit [Ping timeout: 264 seconds]
<clever> wbx: found the issue over in the #musl channel, musl lacked a rv32 vfork(), so it just ran fork(), and rv32nommu linux doesnt block that!
mlw has quit [Read error: Connection reset by peer]
mlw has joined #riscv
smaeul has quit [Ping timeout: 268 seconds]
davidlt has joined #riscv
fuel has quit [Remote host closed the connection]
fuel has joined #riscv
alexghiti has joined #riscv
jacklsw has joined #riscv
damian101 has joined #riscv
jacklsw has quit [Ping timeout: 268 seconds]
jacklsw has joined #riscv
damian101 has quit [Ping timeout: 256 seconds]
jacklsw has quit [Ping timeout: 260 seconds]
jacklsw has joined #riscv
theruran has quit [Quit: Connection closed for inactivity]
jacklsw has quit [Ping timeout: 256 seconds]
jacklsw has joined #riscv
fossdd has quit [Remote host closed the connection]
fossdd_ has joined #riscv
jacklsw has quit [Ping timeout: 256 seconds]
jacklsw has joined #riscv
<Esmil> geertu: oh, interesting. i'll try on my starlight
<geertu> Esmil: I tried bisecting it, but it was a bit hard because several commits don't build.
<geertu> After re-ordering, I arrived at "riscv: dts: starfive: Add JH7100 USB node", "usb: cdns3: starfive: Initialize JH7100 host mode", and "riscv: dts: Add full JH7100, Starlight and VisionFive support" (they need to be applied together to build)
mlw has quit [Ping timeout: 268 seconds]
<geertu> If I find some time, I'll try to discover which DTS node causes the issue
mlw has joined #riscv
<mps> Esmil: iiuc this should work on visionfive V1
<mps> I'll have some time tomorrow to test
<Esmil> mps: yeah, i booted my visionfive branch on the vf1, but i'll try the starlight board now
<mps> aha, ok. I don't have starlights, only starfives
<geertu> I am not using the defconfig, so I may have enabled something special that triggers the issue
<Esmil> geertu: yes, i was going to say it works with this config https://termbin.com/dg672
<Esmil> ..but i'm compiling the visionfive_defconfig now
pabs3 has quit [Read error: Connection reset by peer]
pabs3 has joined #riscv
pecastro has joined #riscv
<Esmil> geertu: yeah, the visionfive_defconfig also boots on the starlight board for me
smaeul has joined #riscv
alperak has joined #riscv
_whitelogger has joined #riscv
<geertu> Esmil: thanks for checking!
<Esmil> geertu: np. i wonder which config option will break it now
<Esmil> mps: if you could test if bluetooth works with my visionfive branch that would be great
<mps> Esmil: sorry, I don't have any bluetooth peripheral
<Esmil> there should be a wifi/bluetooth module on all VF1s
<mps> maybe I could try with my son's mouse for macs if it is compatible with linux
<mps> I can do only basic tests maybe
<Esmil> ah, but just booting and looking for either the 'Bluetooth: hci0: command 0x1001 tx timeout' or 'Bluetooth: hci0: BCM43430A1 'brcm/BCM43430A1.hcd' Patch' line should be fine
<mps> aha, ok. then it is easier. maybe I will find time this evening
<Esmil> thanks
<mps> np
paddymahoney has quit [Ping timeout: 240 seconds]
paddymahoney has joined #riscv
jacklsw has quit [Quit: Back to the real world]
fossdd_ has quit [Remote host closed the connection]
damian101 has joined #riscv
<geertu> Esmil: My starlight boots again when removing the &spi2 { ... } from jh7100-common.dtsi
davidlt has quit [Remote host closed the connection]
davidlt has joined #riscv
fossdd_ has joined #riscv
knielsen_ is now known as knielsen
<Esmil> geertu: ..and when it fails it's still a NULL pointer dereference in the plic driver like you pasted above?
<Esmil> because the visionfive_defconfig does have CONFIG_SPI_DW_MMIO=y so should probe the driver enabled in that node
<geertu> Yeah, I noticed that, too
<geertu> defconfig fails in the same way
<geertu> trying visionfive_defconfig
<geertu> Note that the bad kernel booted fine once, so perhaps there is some race condition
<geertu> WARN_ON_ONCE(!handler->present);
<geertu> Perhaps the plic_handlers per_cpu handling is racy?
<geertu> Esmil: visionfive_defconfig does better, but spews lots of "device non-coherent but no non-coherent operations supported" warnings, and fails to mount nfsroot
<geertu> My "bad" kernel+config booted fine again. So this is an intermittent problem.
<geertu> On success, the kernel prints
<geertu> riscv-plic c000000.interrupt-controller: mapped 133 interrupts with 2 handlers
<geertu> for 4 contexts.
<geertu> On failure, it prints
<geertu> WARNING: CPU: 0 PID: 1 at drivers/irqchip/irq-sifive-plic.c:373 plic_handle_irq+0xf2/0xf6
<geertu> followed by a NULL-pointer deref later
joev_ is now known as joev
<geertu> So it crashes when an early interrupt happens in plic_probe
mlw has quit [Ping timeout: 240 seconds]
mlw has joined #riscv
<Esmil> geertu: there is 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver") that moves the probing of the plic driver later, but that was merged for v6.8 already
<Esmil> ..but it does sound like the plic probe now races with peripherals that need their interrupts
<geertu> That was in v6.9, and it did cause issues, that were fixed
<Esmil> yes, sorry 6.9
guerby_ is now known as guerby
psydroid has joined #riscv
<Esmil> geertu: i'm curious. does my visionfive branch + "git revert a7fb69ffd7ce abb720579490 956521064780 a15587277a24 6c725f33d67b b68d0ff529a9 25d862e183d4 8ec99b033147" work for you?
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #riscv
paddymahoney has quit [Ping timeout: 240 seconds]
damian101 has quit [Ping timeout: 268 seconds]
Amanieu has joined #riscv
<geertu> Esmil: After reverting these, the kernel boots fine 5/5
damian101 has joined #riscv
<geertu> Before the revert, 1/5 (1 in 5 tries)
<Esmil> geertu: aha! so it *is* 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver")
paddymahoney has joined #riscv
<geertu> Esmil: Probably. After the revert, plic is initialized earlier
<Esmil> exactly
<conchuod> I think that patch highlights just how bad kernel testing is on riscv.
Andre_Z has joined #riscv
Amanieu has quit [Quit: Amanieu]
Amanieu has joined #riscv
damian101 has quit [Remote host closed the connection]
Andre_Z has quit [Quit: Leaving.]
fossdd_ is now known as fossdd
Stat_headcrabed has joined #riscv
seds has quit [Quit: Connection closed for inactivity]
Amanieu has quit [Quit: Amanieu]
cp- has quit [Ping timeout: 272 seconds]
cp- has joined #riscv
cp- has quit [Ping timeout: 256 seconds]
Stat_headcrabed has quit [Quit: Stat_headcrabed]
cp- has joined #riscv
andyc has joined #riscv
JanC has quit [Read error: Connection reset by peer]
JanC has joined #riscv
BootLayer has joined #riscv
hbx has joined #riscv
luca_ has joined #riscv
luca_ is now known as OwlWizard
damian101 has joined #riscv
jfsimon1981 has joined #riscv
Amanieu has joined #riscv
andyc has quit [Quit: Connection closed for inactivity]
fuel is now known as BootyWarrior
BootyWarrior is now known as fuel
damian101 has quit [Ping timeout: 255 seconds]
stefanct has quit [Ping timeout: 268 seconds]
stefanct has joined #riscv
fuel is now known as Fidel-Castro
Fidel-Castro is now known as JoeBiden
JoeBiden is now known as fuel
fuel is now known as BidenHisTime
BidenHisTime is now known as fuel
BootLayer has quit [Quit: Leaving]
JanC has quit [Remote host closed the connection]
JanC has joined #riscv
SpaceCoaster has quit [Quit: Bye]
SpaceCoaster has joined #riscv
tlwoerner has quit [Remote host closed the connection]
tlwoerner has joined #riscv
OwlWizard has quit [Quit: OwlWizard]
alexghiti has quit [Ping timeout: 268 seconds]
alexghiti has joined #riscv
theruran has joined #riscv
davidlt has quit [Read error: Connection reset by peer]
davidlt_ has joined #riscv
<mps> Esmil: I've got just this https://tpaste.us/z509
<mps> do I have to enable some drivers for JH7100
<Esmil> mps: CONFIG_BT_HCIUART_BCM=y
<mps> oh
<mps> will rebuild now
<mps> (it is slow with qemu-user)
vagrantc has joined #riscv
<mps> Esmil: all I've got is https://tpaste.us/Xyk8
<mps> list of related modules loaded https://tpaste.us/LRKa
<Esmil> ah, sorry. do you have CONFIG_SERIAL_DEV_BUS=y and CONFIG_SERIAL_DEV_CTRL_TTYPORT=y ?
<Esmil> mps: ^
<mps> I have CONFIG_SERIAL_DEV_BUS=y but not sure about CONFIG_SERIAL_DEV_CTRL_TTYPORT. let me check
<mps> oh, I had CONFIG_SERIAL_DEV_BUS=m and see now it must be =y
<Esmil> yeah, i seem to remember i ran into something like that too
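The =m versus =y mix-up above is easy to catch mechanically before a 15-20 minute rebuild. The sketch below scans the text of a kernel .config for options that are not built in. The list contains only the three options named in this conversation; whether they are sufficient for Bluetooth on the VF1 is not established here, and the helper names are hypothetical.

```python
# Options named in the chat as needing to be built in (=y).
REQUIRED_BUILTIN = [
    "CONFIG_BT_HCIUART_BCM",
    "CONFIG_SERIAL_DEV_BUS",
    "CONFIG_SERIAL_DEV_CTRL_TTYPORT",
]


def parse_config(text):
    """Return {option: value} for every CONFIG_x=value line in a .config."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # "# CONFIG_FOO is not set" lines start with '#' and are skipped.
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def not_builtin(text, required=REQUIRED_BUILTIN):
    """Map each required option that is not =y to its current value."""
    options = parse_config(text)
    return {name: options.get(name, "unset")
            for name in required if options.get(name) != "y"}


sample = """\
CONFIG_BT_HCIUART_BCM=y
CONFIG_SERIAL_DEV_BUS=m
# CONFIG_SERIAL_DEV_CTRL_TTYPORT is not set
"""
print(not_builtin(sample))
```

For a real tree, the same function can be fed the contents of the build directory's .config file, for example `not_builtin(open(".config").read())`.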
<mps> (rebuilding again, will be back 15-20 minutes)
<mps> Esmil: now I've got https://tpaste.us/Z5wa
<Esmil> ugh, yeah. so apparently my vf1 is special in that it works :(
<mps> for me it is not important, I haven't used bluetooth for about 15 years
alperak has quit [Quit: Connection closed for inactivity]
davidlt_ has quit [Ping timeout: 255 seconds]
mlw has quit [Ping timeout: 256 seconds]
jfsimon1981 has quit [Remote host closed the connection]
danilogondolfo has quit [Remote host closed the connection]
psydroid has quit [Quit: KVIrc 5.0.0 Aria http://www.kvirc.net/]
hightower3 has joined #riscv
hightower4 has quit [Ping timeout: 264 seconds]
alexghiti has quit [Ping timeout: 256 seconds]
wingsorc has quit [Quit: Leaving]
wingsorc has joined #riscv
vagrantc has quit [Quit: leaving]
pecastro has quit [Ping timeout: 272 seconds]
DesRoin has quit [Ping timeout: 255 seconds]
DesRoin has joined #riscv