dgilmore changed the topic of #fedora-riscv to: Fedora on RISC-V https://fedoraproject.org/wiki/Architectures/RISC-V || Logs: https://libera.irclog.whitequark.org/fedora-riscv || Alt Arch discussions are welcome in #fedora-alt-arches
<somlo> davidlt[m]: when I try booting on LiteX, I get a store/AMO access fault very early during kernel initialization (in paging_init, apparently): https://pastebin.com/EXXidSdn
<somlo> ever seen an error like this before? :)
zsun has joined #fedora-riscv
zsun has quit [Quit: Leaving.]
zsun has joined #fedora-riscv
zsun has quit [Quit: Leaving.]
zsun has joined #fedora-riscv
ahs3[m] has quit [Quit: You have been kicked for being idle]
zsun has quit [Quit: Leaving.]
esv has quit [Ping timeout: 246 seconds]
zsun has joined #fedora-riscv
<somlo> davidlt[m]: I am wondering if something in the newer kernel config is causing the kernel to fail to boot on litex. So using a recent 6.2.0-rc1+ tree, I did the following (using the old f33 5.18-8-200.0 config, which results in a kernel that boots fine, and the new f37 6.0.10-300.0 config, which crashes as per the link above):
<somlo> cp /boot/config-<version> .config
<somlo> make olddefconfig
<somlo> make savedefconfig
<somlo> cp defconfig ../defconfig-<version>
<somlo> the diff between the two defconfigs is here: http://mirror.ini.cmu.edu/litex/defconfig.diff
<somlo> Do any of the added (or removed) config options sound like a plausible explanation? I'll go over them and see if anything immediately sticks out to me, but each actual hypothesis will take me a few days to test :)
<somlo> Oh, and Happy New Year! :)
<somlo> so I narrowed it down to only the options that are set to "y" (i.e. built in unconditionally; modules, set to "m", are less likely to be an issue):
<somlo> $ grep ^+ /tmp/defconfig.diff | grep y$
<somlo> +CONFIG_ERRATA_THEAD=y
<somlo> +CONFIG_SOC_STARFIVE=y
<somlo> +CONFIG_PRINTK_INDEX=y
<somlo> +CONFIG_KEXEC_FILE=y
<somlo> +CONFIG_COMPAT_32BIT_TIME=y
<somlo> +CONFIG_MODULE_UNLOAD_TAINT_TRACKING=y
<somlo> +CONFIG_NF_FLOW_TABLE_PROCFS=y
<somlo> +CONFIG_FW_LOADER_COMPRESS_ZSTD=y
<somlo> +CONFIG_EFI_COCO_SECRET=y
<somlo> +CONFIG_NVME_AUTH=y
<somlo> +CONFIG_NVME_TARGET_AUTH=y
<somlo> +CONFIG_SCSI_FLASHPOINT=y
<somlo> +CONFIG_FB_EFI=y
<somlo> +CONFIG_IMA_KEXEC=y
<somlo> +CONFIG_CRYPTO_ECDH=y
<somlo> +CONFIG_SYSTEM_BLACKLIST_AUTH_UPDATE=y
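The filter above can be tightened a little. This is a minimal sketch, assuming the input is a unified diff of two defconfig files; the function name `added_builtin_options` is hypothetical and not part of any kernel tooling. It matches only lines of the form `+CONFIG_<NAME>=y`, so diff headers such as `+++ b/...` are never picked up, and it prints bare option names.

```shell
# Hypothetical helper (not a kernel tool): list the options that a
# unified diff of two defconfigs newly sets to "y".
added_builtin_options() {
    # $1: path to the unified diff
    # Keep only added lines of the exact form +CONFIG_<NAME>=y,
    # then strip the leading "+" and the trailing "=y".
    grep -E '^\+CONFIG_[A-Za-z0-9_]+=y$' "$1" | sed -e 's/^+//' -e 's/=y$//'
}

# Usage (path as in the log above):
#   added_builtin_options /tmp/defconfig.diff
```

The output is one option name per line, which is convenient to feed into a loop that toggles each option in turn.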
esv has joined #fedora-riscv
zsun has quit [Quit: Leaving.]
fuwei has joined #fedora-riscv
<somlo> as a first pass at a test, I'll turn off all of the above in the stock f37 config and re-build the 6.2.0-rc1+ kernel, to see if it makes a difference to litex
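Turning off a list of options in a config file can be sketched as below. This is an illustration under stated assumptions, not necessarily how the test was actually done: `disable_options` is a hypothetical helper that rewrites each `CONFIG_X=...` line into the `# CONFIG_X is not set` form that kernel config files use. The kernel tree also ships `scripts/config`, whose `--disable` option serves the same purpose.

```shell
# Hypothetical helper: mark the given options as "not set" in a config file.
disable_options() {
    # $1: config file to edit; remaining arguments: option names
    cfg=$1
    shift
    for opt in "$@"; do
        # The ^ anchor and the "=" keep CONFIG_NVME_AUTH from also
        # matching CONFIG_NVME_TARGET_AUTH.
        sed "s/^${opt}=.*/# ${opt} is not set/" "$cfg" > "$cfg.tmp" &&
            mv "$cfg.tmp" "$cfg"
    done
}

# Usage, from the top of a kernel tree:
#   disable_options .config CONFIG_ERRATA_THEAD CONFIG_SOC_STARFIVE
#   make olddefconfig
```

Running `make olddefconfig` afterwards lets Kconfig resolve dependencies. An option that is selected by another enabled option may come back as "y", so it is worth checking the resulting `.config` before building.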
<somlo> we should know at some point in early January 2023 ;)
<davidlt[m]> somlo: so you are sure that this is related to a CONFIG_* option?
<davidlt[m]> I have seen store/AMO issues before (multiple times), and each time the cause was different.
<davidlt[m]> Sadly that means I don't have an exact solution for you. That is, it needs debugging (which you are already doing).
ahs3[m] has joined #fedora-riscv
<somlo> yeah, I can say that 6.2.0-rc1+ with `make olddefconfig` using the f33 config works; same process for f37 does not, so I'm suspecting CONFIG_* -- I'll report back as soon as I know more...
fuwei has quit [Ping timeout: 246 seconds]
fuwei has joined #fedora-riscv
nirik has quit [Ping timeout: 252 seconds]