dgilmore changed the topic of #fedora-riscv to: Fedora on RISC-V https://fedoraproject.org/wiki/Architectures/RISC-V || Logs: https://libera.irclog.whitequark.org/fedora-riscv || Alt Arch discussions are welcome in #fedora-alt-arches
exFATmatt[m] has joined #fedora-riscv
drewfustini has quit [Server closed connection]
drewfustini has joined #fedora-riscv
<Entei[m]> `binutils` rebuild failed after almost 2 days of tests...... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/ded70072ba118feb963a0366fa402222546504b4>)
<Entei[m]> Not sure where complete error log is supposed to be, BUILDROOT directory is empty
<Entei[m]> The tests run on single core, they are damn slow on VM. Are there any consequences of disabling them?
<Entei[m]> Entei[m]: this is probably what caused the build to fail I am guessing
<davidlt[m]> VMs are really slow on a single core.
<Entei[m]> LFS says the testsuite for binutils is considered critical
* davidlt[m] getting ☕️
<Entei[m]> I don't know at this point. I commented out all the tests and even the function call run_tests. Let's see
<davidlt[m]> The testsuite never passes. Record the list of failed tests, and cross-check with existing build.
<Entei[m]> davidlt[m]: > * <@davidlt:matrix.org> getting ☕️
<Entei[m]> Oh well good morning :)
<davidlt[m]> Ah, ld never had any failures.
<davidlt[m]> So that's interesting.
<Entei[m]> davidlt[m]: It stopped my build process, exactly what TEST2 comment says.
<davidlt[m]> Just for the interest, what tests failed?
<Entei[m]> davidlt[m]: Well, I unfortunately rebooted the VM, so the terminal messages are gone... I couldn't find any logs either, the BUILDROOT directory was empty
<davidlt[m]> BUILDROOT should be empty, that's where things get installed before packaging
<davidlt[m]> BUILD directory is where the things happen
<Entei[m]> <davidlt[m]> "BUILD directory is where the..." <- I am dumb... I deleted that
<Entei[m]> What would you suggest? Continue with the build (with tests commented out) or cancel?
<davidlt[m]> I would record failing tests, save build directory.
<davidlt[m]> Disable tests and continue, while investigating what happened to the tests.
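The "record failing tests and cross-check with an existing build" advice can be sketched roughly as below. This is a minimal sketch, assuming the testsuite leaves DejaGnu `.sum` files in the BUILD tree with result lines of the form `FAIL: <test name>`. The test names and the choice of which statuses count as unexpected are assumptions made for illustration, not taken from the failed build.

```python
import re

# DejaGnu .sum result lines look like "FAIL: <test name>".
# Treating these four statuses as "unexpected" is an assumption of this sketch.
UNEXPECTED = re.compile(r"^(FAIL|XPASS|UNRESOLVED|ERROR): (.+)$")


def failed_tests(sum_text):
    """Return the set of test names that have an unexpected result."""
    names = set()
    for line in sum_text.splitlines():
        match = UNEXPECTED.match(line.strip())
        if match:
            names.add(match.group(2))
    return names


def new_failures(current_sum, baseline_sum):
    """Tests failing in the current build but not in the baseline build."""
    return sorted(failed_tests(current_sum) - failed_tests(baseline_sum))


# Hypothetical excerpts; real input would be the contents of a .sum file.
BASELINE = """\
PASS: example test one
FAIL: example test two
"""
CURRENT = """\
FAIL: example test one
FAIL: example test two
XFAIL: example test three
"""

print(new_failures(CURRENT, BASELINE))
```

Saving the `.sum` files somewhere outside the BUILD directory before cleaning up keeps the comparison possible even after a reboot.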
<Entei[m]> Are there any readily available RISCV boards for development supported by Fedora Rawhide? All these tests on single core are taking too long for every other build, especially for small packages.
<davidlt[m]> I remembered this morning how to join IRC channels from Matrix. I am back on #riscv:libera.chat on Libera.Chat.
<davidlt[m]> JH7110 if you want upstream support, but it is lacking on the RAM side.
<davidlt[m]> Alternative is TH1520 board from Sipeed.
<davidlt[m]> They are currently selling Beta version of PCB (non-GA revision), but it has slightly stronger cores, all T-HEAD extensions (that are not used anywhere) and it goes up to 16GB.
<davidlt[m]> Now the downside of going with that is the lack of proper upstream support, and the lack of high-speed IO.
<davidlt[m]> The best you can do is to run a USB-A to NVMe adapter, and that will be way better than any microSD or eMMC storage.
<davidlt[m]> In both bandwidth and IOPS.
<davidlt[m]> 6.5 kernel will get basic support for this board, but basic means it boots and has console (UART), nothing really impressive.
<davidlt[m]> It usually takes 6-18 months to get decent upstream support.
<davidlt[m]> They also aren't selling the 16GB variant, as that will come with the final board, which is/was supposed to show up this month.
<davidlt[m]> Delays are a common thing, so let's say it shows up sometime this summer.
<davidlt[m]> Now we all are dreaming about Pioneer board with 64-cores and DDR4 DIMMs support.
<Entei[m]> Man development is such a slow process. How do people even do it?
<davidlt[m]> That's incoming. There is CrowdSupply page. We have seen demos with Fedora 38, there is V1.0, and V1.1 PCBs, none of which are final GA product.
<davidlt[m]> We have been doing it since 2016 IIRC, we are used to it ;)
<davidlt[m]> My 1st GCC builds on ARM64 simulator took 2 weeks IIRC :)
<davidlt[m]> Mistakes are really expensive time-wise, thus you need to question a lot of your moves.
<Entei[m]> davidlt[m]: Wow. Such patience.
<davidlt[m]> And you need to do tons of things in parallel.
<thefossguy> davidlt[m]: Is there any "insider info" about its ETA? wink wink nudge nudge
<Entei[m]> Yeah, failed builds are what give me anxiety. A build taking 2 days is fine, but when it fails, it just makes me sad
<davidlt[m]> I am king of browser tabs, and terminal windows. I usually work on multiple issues at once and thus forget about them within a few hours.
<davidlt[m]> Pratham Patel: I was told (yesterday) that there are still some issues.
<davidlt[m]> I assume that means that V1.1 is not ready for mass production.
<thefossguy> Sad.
<thefossguy> But I'm willing to wait
<davidlt[m]> But I am not impressed with STREAM results (if we can believe them).
<thefossguy> I ~need~ want this :D
<thefossguy> s/~/~~/, s/~/~~/
<thefossguy> (what, strikethrough doesn't work here?)
<davidlt[m]> We just need to be careful with our expectations here.
<thefossguy> davidlt[m]: Isn't that because of the memory channels' limitations?
<davidlt[m]> I assume it will be a great improvement for development, but a terrible SoC in general (especially for HPC).
<thefossguy> I read 4 slots. Even if that's quad channel, for 64 cores, that's too low.
<thefossguy> davidlt[m]: Yeah, agreed.
<davidlt[m]> Not really. There are tons of things that could go wrong, but from my limited experience testing prototype SoC the interconnect is the hardest part.
<davidlt[m]> And that basically could kill any scaling on MT workloads.
<thefossguy> I have a hunch we will limit the cores to less than 24-32 when compiling something in parallel.
<davidlt[m]> Have you noticed that no other strong RISCV startup made a large core SoC yet? :)
<Entei[m]> I am learning openssl and how to setup koji instance while things are building. But all of this would be for nothing if these core packages keep failing.
<davidlt[m]> Huawei took multiple generations of their pre-production ARM64 server chips to resolve issues in their interconnect.
<Entei[m]> davidlt[m]: Yeah I was expecting SiFive to be the first ones to get such development machines
<davidlt[m]> it's not that easy.
<davidlt[m]> and SiFive's core business is not making those SBCs/boards.
<thefossguy> davidlt[m]: This is a personal viewpoint and a rather narrow one: scaling with multiple nodes is what datacentre wants and this method doesn't hurt HPC much either.
<thefossguy> I'd say 64-128 cores is where I expect a single machine to top out at.
<davidlt[m]> Ventana 1st gen tops out at 192.
<davidlt[m]> Datacenter is interesting. Density and power consumption are highly important.
<thefossguy> I heard Ampere "One" has the same 192 core max limit
<thefossguy> One? I can't recall what it's called
<thefossguy> *pratham goes afk
<davidlt[m]> There are rumours that AMD Zen6 will go up to 256 cores.
<davidlt[m]> Entei: If you are new to this it will be hard, there is no way around it.
<davidlt[m]> But joining the party so late means a lot of heavy lifting was already done ;)
<davidlt[m]> Your life would be a lot easier if you had C extension supported on whatever you/your company is working on.
<Entei[m]> Yeah I get both the pros and cons. I get the foundation laid out with OE or prebuilt Fedora images, so it's somewhat easy I feel.
<Entei[m]> davidlt[m]: Yeah in that case I'd just be working on DTS and probably testing. Since rv64gc is like the standard.
<davidlt[m]> The standard will change, "RV64GC" is legacy from my point of view.
<davidlt[m]> Once Platform spec is ratified we will know what is our true starting base.
<davidlt[m]> I think, so far, that's RVA22 (which is minor profile).
<davidlt[m]> RVA23 is supposed to be a new major profile, but it also might be RVA24.
jcajka has joined #fedora-riscv
<Entei[m]> davidlt[m]: Lol yeah. More extensions coming every day like B, H, V. But I am talking from the point of software. Every piece of program comes with rv64gc as the default.
<davidlt[m]> Today kinda yes, but we are already having discussions about supporting newer profiles.
<davidlt[m]> Maybe not even supporting RVA20 (RV64GC) in CentOS Stream.
<davidlt[m]> Running RVA20 binaries on high-perf platform would be a significant performance loss.
<davidlt[m]> rwmjones: I might be looking into GHC (very slowly). I am building GHC 9.2.8 + LLVM 13 for some experimentation.
<davidlt[m]> Not having pandoc might cause me problems trying to rebuild R.
sajcho has joined #fedora-riscv
sajcho has quit [Client Quit]
<davidlt[m]> Out of the existing R packages in Koji, about 10% were built with R 4.2. That's not bad.
sajcho has joined #fedora-riscv
sajcho has quit [Client Quit]
<rwmjones> ok
<davidlt[m]> I am surprised that we already have a compliance test suite, and that there is an announcement about RVI20 compliance, but no RVA20.
<davidlt[m]> palmer: did anyone attempt to do that on QEMU? :)
<davidlt[m]> At least this confirms (?) that it's RV64IMAFDCSUZicsr_Zifencei_Zba_Zbb
<davidlt[m]> Oh, there is C908 from T-HEAD
<davidlt[m]> ISA RV64IMAFDCVSUZicsr_Zifencei_Zihintpause_Zfh_Zba_Zbb_Zbc_Zbs_Xthead
<davidlt[m]> Is this a profile check, or RVI standard for I extension?
<davidlt[m]> I think it's only for I extension.
<davidlt[m]> But no, there are tests for other extensions too.
<davidlt[m]> But the report doesn't mention RVI, or RVI20U64, or anything like that.
<davidlt[m]> While it shows ISA with Zba and Zbb, that's not tested.
<davidlt[m]> Ah, sorry, there are some tests, in the B directory.
<davidlt[m]> They just don't use Zba, Zbb, etc. naming.
<conchuod> davidlt[m]: They confirmed before I took the dts that it does rv64gc_zba_zbb ;)
<conchuod> they being starfive
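The two spellings in this exchange, the long string from the report and `rv64gc_zba_zbb`, can be compared mechanically. The following is a minimal sketch, assuming the usual convention that `G` is shorthand for IMAFD plus Zicsr and Zifencei. The parser is a simplification written for illustration: it ignores version suffixes and treats only `z` and `x` as starting a multi-letter name inside the first chunk. The function name `parse_isa` is hypothetical.

```python
import re

# "G" is shorthand for these extensions.
G_EXPANSION = {"i", "m", "a", "f", "d", "zicsr", "zifencei"}


def parse_isa(isa):
    """Split an ISA string into (base, set of lowercase extension names)."""
    match = re.fullmatch(r"(rv(?:32|64))([a-z_]+)", isa.lower())
    if not match:
        raise ValueError("unsupported ISA string: " + isa)
    base, rest = match.groups()
    chunks = [chunk for chunk in rest.split("_") if chunk]
    if not chunks:
        raise ValueError("no extensions in ISA string: " + isa)
    first, others = chunks[0], chunks[1:]
    # Single letters run until the first multi-letter name (z... or x...).
    multi = re.search(r"[zx]", first)
    singles = first[:multi.start()] if multi else first
    exts = set(singles)
    if multi:
        exts.add(first[multi.start():])
    exts.update(others)
    if "g" in exts:
        exts.remove("g")
        exts |= G_EXPANSION
    return base, exts


report = parse_isa("RV64IMAFDCSUZicsr_Zifencei_Zba_Zbb")
short = parse_isa("rv64gc_zba_zbb")
# Letters present in the report string but not implied by rv64gc_zba_zbb.
print(sorted(report[1] - short[1]))
```

Under these assumptions the only difference is the `S` and `U` letters, which name privilege modes rather than unprivileged extensions.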
<conchuod> davidlt[m]: Also, since the 1520 seems to be filled with DW IP, it may not take nearly as long to get the various peripherals supported as you might expect.
<davidlt[m]> Yeah, but the more interesting bits like networking, and something else, are modified or custom DW IP IIRC.
<davidlt[m]> The last time I looked at it was some weeks ago.
<davidlt[m]> Network, Storage (MMC) and USB, I think, have vendor (T-HEAD) compat strings. I didn't look in to the code for what.
<davidlt[m]> All other bits like *SPI, WDT, etc. are all already supported IIRC.
<davidlt[m]> The fact that basic support was merged for 6.5 is already impressive, and might mean we could use it sooner than later.
<conchuod> davidlt[m]: Part of my motivation was to settle down the MAINTAINERS messing involved.
<conchuod> Also, since the vendor is not involved, cutting out what needs new drivers is essential :/
<conchuod> s/cutting out/cutting down/
<davidlt[m]> Well, at least one person from T-HEAD is in MAINTAINERS ;)
<conchuod> I do not expect Guo to send any patches for drivers.
<davidlt[m]> Yet he typically doesn't work on SoC enablement part.
<davidlt[m]> I think it's gonna be mainly a one-person show.
<conchuod> It's good to have him anyway, so that people that do not intimately know the who of linux-riscv can mail suitable people.
<davidlt[m]> Yeah, and he knows the details from inside :)
<davidlt[m]> If the issue pops up he can clarify things as I don't exactly know in what state documentation is.
<davidlt[m]> and, btw, welcome to the channel 🎆
<conchuod> In a language that I speak, pretty much non existent.
<conchuod> davidlt[m]: Oh ye, I was trying the most recent createAppliance image for my nfs root - I had some issues on the Nezha where systemd-resolved.service would never start during boot. Was fine on the other boards I tried. Haven't looked into it as I dunno what the state of those images are.
pbrobinson has quit [Server closed connection]
pbrobinson has joined #fedora-riscv
<davidlt[m]> That one at least has some testing
<conchuod> davidlt[m]: Is that not the same thing?
<davidlt[m]> For rootfs yes, but not every image is tested.
<conchuod> Ye. I don't use the fedora kernel or anything, just the rootfs.
<davidlt[m]> Yeah, but it's about only the kernel.
<davidlt[m]> We had a bug where binutils was generating TEXTRELs bit in ELF header, but no relocations.
<davidlt[m]> Combine that with SELinux rules and things didn't work properly.
<davidlt[m]> Actually, I think resolved was one of the affected things.
<davidlt[m]> But we don't use it. NetworkManager should handle that part.
<davidlt[m]> Which was also affected by TEXTREL issues.
<conchuod> Interesting that it only didn't work for me on the Nezha.
<davidlt[m]> Check journal, and try booting with SELinux disabled, just in case.
<conchuod> Are those issues fixed in what you linked, compared to http://fedora.riscv.rocks/koji/taskinfo?taskID=1421549 ?
<davidlt[m]> If it's 20230519.n.0 then it's fixed in the disk image.
<davidlt[m]> The previous one definitely has this issue.
<Esmil> davidlt[m]: do you have a link to how that is fixed? we just talked about that in #archlinuxriscv the other day
<Esmil> the TEXTREL issues that is
<davidlt[m]> Give me a minute.
<Esmil> Cool, thank you
<davidlt[m]> Related, but not required IIRC: https://sourceware.org/pipermail/binutils/2023-May/127653.html
<davidlt[m]> We had <200 affected packages IIRC.
<Esmil> <3
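Finding affected binaries can be sketched by scanning the dynamic section listing that `readelf -d` prints. This is a minimal sketch, assuming the flag shows up either as a `(TEXTREL)` entry or as `TEXTREL` in the `(FLAGS)` entry. It only detects the flag and does not verify whether any text relocations actually exist. The sample listing is hypothetical, and the function name is made up for illustration.

```python
import re

# One dynamic-section entry per line: " 0x... (TAG)   value"
ENTRY = re.compile(r"^\s*0x[0-9a-f]+\s+\((\w+)\)\s*(.*)$")


def has_textrel(readelf_dynamic_text):
    """True if the `readelf -d` listing marks the object as TEXTREL."""
    for line in readelf_dynamic_text.splitlines():
        match = ENTRY.match(line)
        if not match:
            continue
        tag, value = match.groups()
        if tag == "TEXTREL":
            return True
        if tag == "FLAGS" and "TEXTREL" in value.split():
            return True
    return False


# Hypothetical listing, shaped like `readelf -d` output.
SAMPLE = """\
Dynamic section at offset 0x2e00 contains 3 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000016 (TEXTREL)            0x0
 0x000000000000001e (FLAGS)              TEXTREL
"""

print(has_textrel(SAMPLE))
```

Feeding it the captured output for each installed ELF file would give a rough list of packages to rebuild.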
esv_ has joined #fedora-riscv
esv_ has quit [Quit: Leaving]
aurel32 has quit [Server closed connection]
aurel32 has joined #fedora-riscv
<Entei[m]> binutils built (with disabled tests), and I still get those compressed instructions in my final binary on compiling
<davidlt[m]> Check from where they are coming.
<Entei[m]> davidlt[m]: Building the relocatable object file with `gcc -c main.c` and inspecting, there doesn't seem to be any compressed instructions. So probably libc now? I do have a printf call
<davidlt[m]> If that's inlined, true.
<davidlt[m]> So if you take the executable, objdump the whole thing, and scan for C instructions in the code, where are they?
<Entei[m]> Wait a min, I'll post the objdump.
<Entei[m]> Not the complete objdump, I used cat on terminal. If more is needed, let me know I'll scp it out and paste full dump
<davidlt[m]> If I look at .text section (but don't trust me 100%), this is from glibc.
<davidlt[m]> _start and load_gp
<Entei[m]> Should I proceed to rebuild glibc with this hacked up binutils? Or do I build again with tests enabled, since LFS guide deems it so critical?
<davidlt[m]> Do glibc now, will take a few hours.
<davidlt[m]> But in general you will need to do a few mass rebuilds anyways.
<davidlt[m]> The goal right now is just to verify that all looks OK before committing more time.
<davidlt[m]> Having C and non-C mixed right now is somewhat OK, as you rebuild more and more in phases it should vanish.
<davidlt[m]> So each iteration (phase) of mass rebuild should further reduce the number of C instructions popping up.
<davidlt[m]> Also add "-M no-aliases" to objdump.
<davidlt[m]> In that case C instructions should get a "c." prefix.
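Tracking where the compressed instructions come from after each rebuild phase can be sketched as a per-symbol count over the disassembly. This is a minimal sketch, assuming the text comes from `objdump -d -M no-aliases` and that compressed mnemonics therefore start with `c.`. The sample disassembly and the function name are hypothetical.

```python
import re
from collections import Counter

# Symbol header, e.g. "0000000000010430 <_start>:"
SYMBOL = re.compile(r"^[0-9a-f]+ <([^>]+)>:$")
# Instruction line: "   10434:  87aa   c.mv  a5,a0" (address, encoding, mnemonic)
INSN = re.compile(r"^\s*[0-9a-f]+:\s+[0-9a-f]+\s+(\S+)")


def compressed_by_symbol(disassembly):
    """Count instructions whose mnemonic starts with "c." under each symbol."""
    counts = Counter()
    symbol = None
    for raw in disassembly.splitlines():
        line = raw.rstrip()
        header = SYMBOL.match(line)
        if header:
            symbol = header.group(1)
            continue
        insn = INSN.match(line)
        if insn and symbol and insn.group(1).startswith("c."):
            counts[symbol] += 1
    return dict(counts)


# Hypothetical excerpt, shaped like objdump output.
SAMPLE = """\
0000000000010430 <_start>:
   10430:   022000ef            jal     ra,10452 <load_gp>
   10434:   87aa                c.mv    a5,a0

0000000000010600 <main>:
   10600:   ff010113            addi    sp,sp,-16
"""

print(compressed_by_symbol(SAMPLE))
```

Symbols that still show up after a rebuild phase point at the objects or libraries that have not been rebuilt without C yet.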
<davidlt[m]> palmer: is it just me, or were the riscv disassembler options (objdump) never added to the man page?
<davidlt[m]> I don't see a section for riscv talking about numeric and no-aliases.
fuwei has joined #fedora-riscv
zsun has joined #fedora-riscv
fuwei has quit [Ping timeout: 240 seconds]
zsun has quit [Ping timeout: 240 seconds]
zsun has joined #fedora-riscv
tg has joined #fedora-riscv
<palmer> davidlt[m]: probably we forgot
<palmer> can you file a bug?
<davidlt[m]> Actually there is a ticket for that already.
<davidlt[m]> This one was created and closed by Nelson https://sourceware.org/bugzilla/show_bug.cgi?id=27809
<davidlt[m]> But it actually doesn't touch documentation.
<palmer> odd. I bugged Nelson already, but he might be on PTO for a week
esv has quit [Ping timeout: 258 seconds]
esv has joined #fedora-riscv
jcajka has quit [Quit: Leaving]
zsun has quit [Quit: Leaving.]
somlo__ is now known as somlo
zsun has joined #fedora-riscv
tg_ has joined #fedora-riscv
tg has quit [Ping timeout: 240 seconds]
tg_ has quit [Quit: tg_]
tg has joined #fedora-riscv
zsun has quit [Ping timeout: 250 seconds]
zsun has joined #fedora-riscv