dgilmore changed the topic of #fedora-riscv to: Fedora on RISC-V https://fedoraproject.org/wiki/Architectures/RISC-V || Logs: https://libera.irclog.whitequark.org/fedora-riscv || Alt Arch discussions are welcome in #fedora-alt-arches
JasenChao has joined #fedora-riscv
JasenChao has quit [Quit: Client closed]
JasenChao has joined #fedora-riscv
davidlt has joined #fedora-riscv
davidlt has quit [Remote host closed the connection]
davidlt has joined #fedora-riscv
<TelegramRelayBot> jasenchao joined the group via invite link.
JasenChao has quit [Quit: Client closed]
<davidlt> rwmjones, could we get NVR (rawhide and f40) for libunwind: https://src.fedoraproject.org/rpms/libunwind/commits/rawhide
<davidlt> This is just ExclusiveArch change
<davidlt> rwmjones, guile22, switch to use --disable-rpath instead of sed, http://fedora.riscv.rocks:3000/rpms/guile22/commit/6ba6be5a04d440c2cacba81ffc1517e77586ce70
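The guile22 change swaps a post-hoc sed for configure's own switch. A sketch of what such a spec change can look like (illustrative fragment only, not the actual guile22.spec):

```spec
# Illustrative fragment, not the actual guile22.spec.
# Before: rpath entries were scrubbed after %%configure with a sed hack.
# After: configure is asked not to embed rpaths in the first place.
%build
%configure --disable-rpath
%make_build
```

Letting configure handle it is less fragile than patching libtool output after the fact.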
<davidlt> rwmjones, cmake, update timing out tests for riscv64 to incl. one more (Qt6Autogen.MocIncludeSymlin): http://fedora.riscv.rocks:3000/rpms/cmake/commit/0510331e78061e4f1ca99bfb269d80740f411377
<davidlt> Do not disable test in general, this was just needed to double check the list
<davidlt> rwmjones, libssh, fix provides for riscv64 (same as on other arches): http://fedora.riscv.rocks:3000/rpms/libssh/commit/a2378456fe5c98edacd80694810d1db45cb623ba
<davidlt> rwmjones, kernel-srpm-macros, add riscv64 to kernel_arches macro: http://fedora.riscv.rocks:3000/rpms/kernel-srpm-macros/commit/208ecde39e98d412abf9ad17fcdf2a1c96df1180
<davidlt> Do git grep to double check if we need to do anything else in this package. Most likely not, but double check.
<davidlt> I am not sure if this is needed or not, as it exists for backward compatibility.

<davidlt> Double check with the maintainer if we need it.
<davidlt> Maybe some folks still expect it.
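The double-check davidlt describes amounts to grepping the package checkout for any remaining hard-coded arch lists that might also need riscv64. A self-contained sketch (a stand-in macros file is created here so the commands run anywhere; against the real package you would grep the cloned dist-git repo instead):

```shell
# Stand-in for the kernel-srpm-macros checkout; file name and contents
# are assumptions for illustration, not the package's actual macro file.
mkdir -p kernel-srpm-macros
cat > kernel-srpm-macros/macros.kernel-srpm <<'EOF'
%kernel_arches x86_64 aarch64 ppc64le s390x riscv64
EOF

# Look for any arch list that mentions the other 64-bit arches but
# might still be missing riscv64.
grep -rnE 'x86_64|aarch64|ppc64le|s390x' kernel-srpm-macros/
```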
<rwmjones> morning, will do shortly
<rwmjones> davidlt: https://koji.fedoraproject.org/koji/taskinfo?taskID=114550920 libunwind-1.8.0-3.fc41
<rwmjones> davidlt: https://koji.fedoraproject.org/koji/taskinfo?taskID=114551094 kernel-srpm-macros-1.0-23.fc41
<rwmjones> I just updated kernel-srpm-macros without a PR as there is nothing to review
<rwmjones> there were lots of other changes in kernel-srpm-macros but none appeared related to riscv64
<rwmjones> %ghc_arches is not used anywhere, but better to have it IMHO
<davidlt> rwmjones, just reminder to make fc40 NVRs too
<rwmjones> f40, not f41?
<davidlt> Well we are building f40, not f41 yet
<rwmjones> that could be more complicated, let me see ...
<rwmjones> so there's also something to tell you about f40 ... the fork of c10s from f40 has completed, and I'm told they won't be updating / synchronizing c10s packages from f40 at all in future
<rwmjones> that means i'll need to pull all our f41 changes into c10s (merging not open yet)
<rwmjones> it's not really a problem but it's not what I expected
<davidlt> rwmjones, well, I expected this
<davidlt> as soon as they did the final sync (after mass rebuild) it's a separate thing
<davidlt> at least until RHEL X.0 happens, and then it's open to changes as long as that doesn't go against RHEL policy
<davidlt> Thanks
<rwmjones> later today I'm going to go through all the existing PRs and merge all the valgrind_arches ones if maintainers haven't done so already
<rwmjones> these are not controversial changes IMHO
<rwmjones> davidlt: I don't understand the cmake change (riscv64/main commit 0510331e78061e4), it seems to both disable the tests and filter out a failing test?
<rwmjones> I will try compiling it with just the filter bit
<davidlt> rwmjones, check my message, it explained it
<davidlt> update timing out tests for riscv64 to incl. one more (Qt6Autogen.MocIncludeSymlin)
<davidlt> Do not disable test in general, this was just needed to double check the list
<rwmjones> oh I see, alright let me try it
<davidlt> it's just about Qt6Autogen.MocIncludeSymlink part
<davidlt> so tiny change
<rwmjones> got it
<rwmjones> is it really "Qt6Autogen.MocIncludeSymlin" or "Qt6Autogen.MocIncludeSymlink" ?
<davidlt> check commit
<davidlt> ah
<davidlt> checking
<davidlt> rwmjones,
<davidlt> Qt6Autogen.MocIncludeSymlin
<rwmjones> I wonder if it does prefix matching and so it just worked anyway
<davidlt> The following tests FAILED:
<davidlt> 661 - Qt6Autogen.MocIncludeSymlink (Timeout)
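rwmjones's guess is right: ctest's `-R`/`-E` options treat the pattern as an unanchored regular expression, so the truncated name matches the full test name as a substring (the `.` even matches the literal dot). A small sketch using grep -E to mimic that matching behavior:

```shell
# ctest -E treats its argument as an unanchored regex, so the
# truncated "Qt6Autogen.MocIncludeSymlin" still excludes the real
# test "Qt6Autogen.MocIncludeSymlink". grep -E mimics the match.
pattern='Qt6Autogen.MocIncludeSymlin'
full_name='Qt6Autogen.MocIncludeSymlink'
if printf '%s\n' "$full_name" | grep -Eq "$pattern"; then
  echo "excluded: $full_name"
fi
# -> excluded: Qt6Autogen.MocIncludeSymlink
```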
<rwmjones> I'm actually doing a local build to see if I can reproduce the problem, but the build takes forever :-(
<rwmjones> you may have noticed I'm going through our PR backlog and pushing things ...
<rwmjones> I'll add comments to the relevant PRs
<davidlt> rwmjones, I am cooking lunch right now ;)
<rwmjones> no problem :-)
<davidlt> I always said, this is 24/7 :) Even in the kitchen.
kalev has joined #fedora-riscv
zsun has joined #fedora-riscv
<davidlt> rwmjones, I tried building mold, two tests failed: http://fedora.riscv.rocks/koji/taskinfo?taskID=1635825
<rwmjones> will look at mold a bit later, I'm currently going through existing PRs
<davidlt> it's part of 2.39 from what I can see
<davidlt> I see LLVM 18 is making its way towards F40 too
zsun has quit [Quit: Leaving.]
<rwmjones> davidlt: re mold, yes it's plausible ...
<rwmjones> the gcc patch is old enough that surely we have that in gcc 14 already
<rwmjones> the glibc patch is more recent though
<davidlt> but it's in a release
<rwmjones> in Fedora already?
<davidlt> I checked commit, but I don't know if all this is related to mold anyway
<davidlt> yeah, it's part of glibc 2.39. It landed weeks ago
<rwmjones> hard to say, I'm just doing a local build of mold from rawhide here to see
<rwmjones> maybe there'll be more details in the log file
<davidlt> ah, 12 days ago: [PATCH] RISC-V: Fix the static-PIE non-relocated object check
<davidlt> Reported-by: Andreas Schwab <schwab@suse.de>
<davidlt> Closes: BZ #31317
<davidlt> Fixes: e0590f41fe ("RISC-V: Enable static-pie.")
<davidlt> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
<rwmjones> that's in mold?
<davidlt> Yes
<rwmjones> ok let me see if that can be backported then
<davidlt> I don't think it landed yet
<rwmjones> so if a fix is needed in glibc, we could either temporarily skip the failing mold tests with a link to the glibc fix in mold.spec
<rwmjones> or just wait a bit
<rwmjones> it's not really clear to me if mold is actually needed anywhere, it seems to be used only for optimization in projects like ceph, and optionally there
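The first option rwmjones mentions could look roughly like this in mold.spec; the test names below are placeholders, since the log doesn't name the two failing tests:

```spec
# Illustrative mold.spec fragment; test names are placeholders.
# Temporarily skip the riscv64 failures until the glibc static-PIE
# fix ("RISC-V: Fix the static-PIE non-relocated object check",
# BZ #31317) reaches the buildroot.
%ifarch riscv64
%global skipped_tests 'placeholder-test-one|placeholder-test-two'
%endif
```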
<rwmjones> I got through quite a lot of the PRs today, the easy ones I merged and did f40 & f41 builds
<rwmjones> still waiting on cmake build to finish
<davidlt> cmake build is kinda slow
<rwmjones> it's doing tests, but also competing with another build on my vf2
<rwmjones> https://kojipkgs.fedoraproject.org//work/tasks/215/114560215/build.log <- I think this is what happens if you completely ignore warnings in your C++ project
davidlt has quit [Ping timeout: 268 seconds]
davidlt has joined #fedora-riscv
davidlt has quit [Remote host closed the connection]
davidlt has joined #fedora-riscv
cyberpear has joined #fedora-riscv
<davidlt> rwmjones, a reminder, join Matrix :)
<conchuod> davidlt: I ordered one of these k230 boards, is there an eta on this x60 thing you mentioned?
<davidlt> conchuod, it was never announced
<davidlt> I mean the date and the price
<davidlt> they are working on the wiki and youtube videos, shouldn't be too long
<davidlt> fun thing, they added more content and this time it's called K1X. Sometimes they call it K1.
<davidlt> It seems that the two clusters might be a bit different.
<davidlt> "AI" stuff is only on cluster 0, and it also has an extra 512KB "TCM" next to the L2 cache
<davidlt> AI stuff is "X60TM extends 16 AI instructions, including matrix multiplication and sliding window calculation."
<davidlt> I would be surprised to see it on only one cluster.
<davidlt> Anyways, that's something Linux will not care about anyway.
<davidlt> It has section "Easy to buy", but it's still empty.
<davidlt> I don't see anything on Aliexpress either.
<davidlt> I have no idea why it needs Mini PCIe slot.
<davidlt> The SPI chip is 4MB.
<davidlt> There is 2Kbit EEPROM too.
<davidlt> 16GB eMMC
<davidlt> I am more worried that 8GB of RAM is mentioned more often than 16GB.
<sorear> fixed TCM/L2 split? ew
<davidlt> and in some cases it's 4GB.
<davidlt> I don't know what TCM is.
<davidlt> But it's 512K L2 + 512K TCM on cluster 0.
<sorear> tightly-coupled memory, SRAM in the package with fixed-cycle access latency
<davidlt> cluster 1 has 512K L2.
<sorear> and fixed addresses
<sorear> linux won't ever use it but if you're doing bare metal you can put data in the TCM with a linker script
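sorear's bare-metal point can be sketched as a GNU ld linker-script fragment; the TCM base address and region sizes here are made-up placeholders, not the K1's actual memory map:

```
/* Hypothetical memory map; ORIGIN/LENGTH values are placeholders. */
MEMORY
{
  RAM (rwx) : ORIGIN = 0x80000000, LENGTH = 64M
  TCM (rw)  : ORIGIN = 0x08000000, LENGTH = 512K
}
SECTIONS
{
  /* Anything placed in the .tcm_data input section lands in the
     fixed-latency SRAM instead of cached DRAM. */
  .tcm_data : { *(.tcm_data) } > TCM
}
```

In C, a buffer can then be pinned there with `__attribute__((section(".tcm_data")))`.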
<davidlt> Yeah, sounds cool after reading a bit on Google search results
<davidlt> ARM TCM (Tightly-Coupled Memory) handling in Linux
<sorear> u54/u74 have a single pool of memory which can be configurably split into an L2 cache and a L2 TCM (sifive calls it a "loosely coupled memory" to distinguish it from L1 TCM but that's not standard terminology)
<davidlt> Quote: Notice that this is not a MMU table: you actually move the physical location of the TCM around. At the place you put it, it will mask any underlying RAM from the CPU so it is usually wise not to overlap any physical RAM with the TCM.
<davidlt> Quote: To avoid confusion the current Linux implementation will map the TCM 1 to 1 from physical to virtual memory in the location specified by the kernel. Currently Linux will map ITCM to 0xfffe0000 and on, and DTCM to 0xfffe8000 and on, supporting a maximum of 32KiB of ITCM and 32KiB of DTCM.
<sorear> there's no guarantee spacemit will let you configure the TCM physical address at runtime, sifive doesn't
<davidlt> I never knew you could do this, so this is new and cool to me :)
<sorear> andes has a TCM with a fixed *virtual* address that overlays the default mapping address for position-dependent executables, this is blatantly in violation of the privileged architecture but pointing it out doesn't fix anything
<sorear> or rather Renesas does in the RZ/Five; I don't know how much of that is directly caused by Andes TCM limitations
<davidlt> oh
<davidlt> Linux commit: csky: Tightly-Coupled Memory or Sram support
<conchuod> sorear: We use that TCM on polarfire to run the firmware out of.
<conchuod> I think the sifive term is "lim"
<conchuod> because why reuse the standard term
<davidlt> because you cannot market it otherwise :)
<davidlt> I get the impression that 512K of TCM is a lot, from reading about the ARM stuff
<sorear> it's more common for a TCM to be connected at the level of the L1 caches
<sorear> with size and access latency to match
<davidlt> wouldn't that have a bigger impact on core layout?
<davidlt> Especially 512K, that's probably a large area compared to those X60 cores (1.3x perf of A55)
<sorear> yes, which is why they're normally smaller than that
<davidlt> Looking at various material about it, I got the impression this is designed for AI/ML data processing.
<conchuod> davidlt: "it was never announced" but you often seem to know more than you should!
<davidlt> conchuod, I know nothing
<davidlt> Technically it's all public, and other folks found bits online too
<conchuod> hows your sg2042?
<davidlt> Short answer? Annoying.
<davidlt> I keep corrupting NVMe even with my lower Koji load settings.
<davidlt> Like it's fine building GCC and LLVM, but once I go above maxjobs=1 it's a risk.
<davidlt> I am not sure why. I will be doing a new LLVM (18) and GCC 14 builds, maybe it will break this time.
<conchuod> davidlt: Did you see the starfive pci issue?
<davidlt> I might talk with SOPHGO about it, but not sure. Still waiting for a contact, I guess.
<davidlt> conchuod, I did, but I don't think we truly know what the issue is.
<davidlt> I would like to see a proper errata doc with details.
<conchuod> The lads at work were saying to me they shoulda hid that shit in opensbi and never mentioned it on lkm
<conchuod> lkml*
<davidlt> It seems all the boards are broken in one way or another :)
<davidlt> We can start guessing what's broken on SpacemiT K1 :)
<conchuod> It's a custom CPU, so all bets are off.
<davidlt> Or JH8100 :)
<conchuod> I have high hopes for 8100 actually.
<davidlt> Well, they said it was properly tested, but that's kinda it.
<sorear> if the issue is actually "writes issued by devices are reordered" that doesn't sound like an issue opensbi can fix
<conchuod> Aye, but the kernel isn't fixing it either I don't think.
<conchuod> But you do that in your vendor opensbi, noone asks questions and your driver gets merged..
<sorear> _can_ the kernel do anything about it except on a driver by driver basis?
<rwmjones> sorear: so is TCM accessed through the cache hierarchy or is it fast enough to serve directly to the core?
<sorear> rwmjones: on u54 the L1 TCM goes directly to the core, the L2/"LIM" physically goes through the L1 caches but has an uncacheable PMA
<sorear> I don't have whatever davidlt is looking at
<rwmjones> I see
<davidlt> there is no extra details
<davidlt> it's literally written 512K L2 CACHE + 512K TCM
<davidlt> cluster 0
<davidlt> I haven't seen the SoC datasheet or anything like it (yet)
<sorear> love to see people internalize "not talking about safety and correctness problems makes them go away" right as STS-51-L passes out of living memory of the people now entering management positions
<conchuod> sorear: I'm just surprised they didn't try to hide it, I'm not saying that hiding it is what I want them to do.
<davidlt> Just be happy we don't get to see what's broken in Intel and AMD CPUs :)
<conchuod> davidlt: Do you know if the jh7110 a "real" device for them with non sbc customers or are they kinda just tiding themselves over til something more powerful?
<davidlt> conchuod, I don't know, but my guess is that it has no real customer that would cover the development, etc.
<davidlt> I would say the same is with JH8100.
<davidlt> Don't get me wrong, I bet these things still will be sold/used in China.
<conchuod> I wonder how any of these companies actually fund the chips they make, but the answer probably is that they don't.
<davidlt> Government helps with funding, plus push to use local.
<davidlt> ByteDance is now funding StarFive too.
<davidlt> (I think)
<conchuod> Which I guess makes sense, better starfive than give alibaba t-head money
<davidlt> 150 million USD is a good start :)
<davidlt> I think Baidu and Alibaba are competitors in China
<davidlt> There are several large companies (massive ones) in China that could have enough money to cook something custom
<conchuod> yah
<conchuod> I suppose they probably also don't do as much validation as more established places either.
<davidlt> Well it costs tons of money and time
<davidlt> Instead you could move fast, like SpaceX :)
<davidlt> When I worked with Huawei they could spin SoCs very fast.
<davidlt> Like I've never seen such a fast-moving silicon company. It's like every shuttle in the fab had to have a new improved design.
<davidlt> The only sad thing is that you are left with tons of non-production hardware.
cyberpear has quit [Quit: Connection closed for inactivity]
<conchuod> We are omega slow, which I guess makes sense given how much variance we have to try to validate.
<davidlt> Yeah, but you have established market/customers. You don't want to have mistakes.
<conchuod> I mean, it's just a completely different world to the "you must iterate every year" phone SoC companies etc
<davidlt> I recall ARM having multiple teams (3?) for Cortex-A era stuff to deliver a new design every year
<sorear> there's a joke about netburst and merced nearly killing intel because intel was no good at handling pipeline mispredictions
<rwmjones> davidlt: did you see this one? https://github.com/rhboot/shim/pull/420
<rwmjones> it's from canonical and looks fairly sensible to me
<davidlt> rwmjones, yes, but it was blocked by pjones
<rwmjones> blocked as in he actively blocked it, or he just needs to review it & didn't?
<davidlt> rwmjones, IIRC he wanted binutils changes, which all landed in binutils 2.42
<davidlt> rwmjones, quote from him (looking at emails):
<davidlt> This is one of those places where RISC-V seems to want to make every
<davidlt> single mistake ARM made with AArch64, and I have to push back. There's
<davidlt> no way we want to have more arches that are build with "ld -O binary".
<davidlt> Binutils needs to support our binary targets.
<davidlt> All what is needed landed in binutils 2.42.
<davidlt> IIRC shim also depends on gnu-efi, and that needs a rebase
<davidlt> Unless someone starts looking into updating/rebasing/reviewing/etc. riscv64 stuff under "rhboot" I am not planning on shipping GRUB2 (x86_64/aarch64-like bootflow).
<davidlt> systemd-boot for the win, especially as it no longer requires gnu-efi :)
<rwmjones> sure
<davidlt> We could have had the same bootflow a long time ago
<davidlt> Incl. shim
<rwmjones> does systemd-boot use shim?
<davidlt> shim is not strictly needed, so we just don't use it
<davidlt> I disabled it in Pungi compose, or in some places (templates?), but that's annoying
<rwmjones> from RH point of view, some kind of secure boot for RHEL is essential
<davidlt> Yeah, but in that case rebase gnu-efi, rebase GRUB2, and get shim support in.
<davidlt> Canonical sent initial patches 2-3 years ago
<davidlt> I am not willing to look at it on my own, just way too many out-of-tree patches (gnu-efi is the easiest here).
<davidlt> GRUB2 is nonsense. It's like 350 or more patches. I am not looking into that.
<rwmjones> agreed
<rwmjones> I just asked peter jones what's going on and if he'll merge the shim PR
<davidlt> brianredbeard refreshed a PR two weeks ago: https://github.com/rhboot/shim/pull/641
<rwmjones> yeah we're discussing this on a private (grr) thread
<rwmjones> let's see what peter says
<davidlt> rwmjones, I think you still need gnu-efi update
<davidlt> there is one more problem
<davidlt> IIRC shim is hardcoded to boot grub next, thus whatever next stage is it must be renamed to grubefi or whatever
<davidlt> I guess systemd-boot switch does that, but I don't recall.
davidlt has quit [Ping timeout: 264 seconds]