dgilmore changed the topic of #fedora-riscv to: Fedora on RISC-V https://fedoraproject.org/wiki/Architectures/RISC-V || Logs: https://libera.irclog.whitequark.org/fedora-riscv || Alt Arch discussions are welcome in #fedora-alt-arches
pjw has joined #fedora-riscv
zsun has joined #fedora-riscv
zsun has quit [Quit: Leaving.]
<davidlt[m]> djdelorie: Koji killed GCC (timeout was reached)
<davidlt[m]> djdelorie: could you reboot the board, and I would restart the GCC build
<davidlt[m]> LLVM 15 is incoming too
jcajka has joined #fedora-riscv
zsun has joined #fedora-riscv
zsun has quit [Remote host closed the connection]
<davidlt[m]> I will continue with the Perl bootstrap, but it's getting close to the point where I will disable the perl_bootstrap macros and start rebuilding again
<davidlt[m]> Majority of direct perl packages (perl-*) are already in.
esv_ is now known as esv
zsun has joined #fedora-riscv
masami has joined #fedora-riscv
<zsun> davidlt[m]: hi, I see you already have the basic riscv kernel config in your gitea. Do you have a plan for when to submit it to the Fedora kernel-ark?
<zsun> I am thinking that submitting to the Fedora kernel-ark will make it easier for people to collaborate
<davidlt[m]> It's on the TODO list, all the bits are kinda in place (just some minor updates)
<zsun> that's great
<davidlt[m]> There is a bugzilla ticket for that, and I did chat with the kernel maintainer some time ago.
<davidlt[m]> Is that something you would need sooner rather than later?
<davidlt[m]> If there is a need I could prioritize that maybe next week.
<zsun> davidlt[m]: not in a hurry. I am helping tekkamannijia generate the config files to be added to kernel-ark and just realized you already have most of them
<zsun> as this is already on your plan, I'll do my work on top of yours, which is much easier I believe
<davidlt[m]> Cool, OK. I will try to do it sooner rather than later. I'm hoping to finish the Perl bootstrap this week.
<zsun> cool, thanks
masami has quit [Quit: Leaving]
jcajka has quit [Quit: Leaving]
zsun has quit [Quit: Leaving.]
<djdelorie> davidlt[m]: rebooted
<davidlt[m]> djdelorie: understood. resubmitting GCC
<djdelorie> note it was busy building srpms at the time ;-)
<davidlt[m]> djdelorie: not a problem, I will force a build on it :)
<djdelorie> I figured you'd disable general job-getting so it was only running one
<davidlt[m]> Nah, the machine capacity is enough to lock it at one build
<djdelorie> oh good
<davidlt[m]> So I just need to get it to build GCC and no other job will attempt to land.
<davidlt[m]> I believe the capacity for the node is 2.0 and GCC build weight is 6.0
<davidlt[m]> I just use my god mode in Koji to shuffle things manually a bit ;)
<nirik> assign-task --force is fun. ;)
<davidlt[m]> An alternative would be to configure different channels and some logic in the hub, but that's not needed (for now).
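The capacity trick described above can be sketched as follows. This is a hypothetical re-implementation of the scheduler's load check, not Koji's actual kojid code: a host only accepts new tasks while its current load is below its capacity, so one GCC build (weight 6.0) on a capacity-2.0 host keeps anything else from landing.

```shell
# Hypothetical sketch of Koji's host load check (not the real kojid code):
# a host accepts new tasks only while its current task load is below its
# capacity, so a single heavy task effectively "locks" a small host.
host_can_take_task() {
  # usage: host_can_take_task <capacity> <current_load>
  awk -v cap="$1" -v load="$2" 'BEGIN { exit !(load < cap) }'
}

# Idle host (capacity 2.0, load 0.0) accepts the GCC task despite its 6.0 weight...
host_can_take_task 2.0 0.0 && echo "GCC build lands"
# ...but while GCC runs (load 6.0 > capacity 2.0), no other job gets in.
host_can_take_task 2.0 6.0 || echo "no other job lands"
```

The manual "god mode" shuffle mentioned in the log would then be the admin-side `koji assign-task --force` that nirik names below it.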
<davidlt[m]> nirik: it is :)
<davidlt[m]> nirik: your board has processed ~650 tasks already
<nirik> I had a short network offline time yesterday, but I don't think it affected the builder here much
<nirik> excellent. ;)
<nirik> temp is staying pretty good.
<nirik> CPU Temp: +39.8°C
<davidlt[m]> That's very good
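A reading like the one nirik pasted is just the kernel's hwmon value formatted: sysfs exposes temperatures in millidegrees Celsius (the exact path varies per board). A minimal sketch of the formatting:

```shell
# Sketch: lm_sensors-style formatting of a sysfs hwmon reading.
# hwmon exposes millidegrees Celsius, e.g. in
# /sys/class/hwmon/hwmon0/temp1_input (path varies per board).
format_temp() {
  # usage: format_temp <millidegrees>
  awk -v t="$1" 'BEGIN { printf "+%.1f°C\n", t / 1000 }'
}

format_temp 39800   # prints "+39.8°C"
```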
<davidlt[m]> nirik: your board produced 579 RPMs so far
<nirik> great. Glad it's getting use... sorry I took so long to finish getting it setup
<davidlt[m]> nirik: we will need to rebuild our Koji server and slowly bring the infrastructure closer to what upstream Fedora does. Would you be willing to look into that, maybe take the lead and/or at least define the phases of what needs to be done?
<davidlt[m]> This way I could take myself a bit out of the loop on all the things.
<davidlt[m]> Otherwise I will just do whatever I do :) But I prefer for things to happen faster, thus having more people take part would be nice.
<davidlt[m]> neil showed an interest in a few parts like pungi and disk image generation (which we don't do the way Fedora does).
<nirik> davidlt[m]: possibly. I'm pretty busy, but I can see... I can at least write up what plans I think make sense and try to work on it?
<nirik> where's the best place to discuss? mailing list?
<davidlt[m]> Whatever you prefer. Mailing list might be good as we haven't sent too many emails there :)
<neil> i need to make my fedoraproject.org email alias work some day...
<davidlt[m]> We could also use: https://discussion.fedoraproject.org/tag/risc-v
<nirik> neil: are you in more than one group?
<nirik> davidlt[m]: I can arrange that.
<davidlt[m]> If you want I could describe what we have, what we considered, etc.
<davidlt[m]> Realistically our biggest issue is storage. We have ~20TB of fast flash storage, not used in an efficient way. Like we even keep backups on the same drives. We have 100+TB of HDDs that don't have a physical server (never got funding to get it going, but the drives exist).
<davidlt[m]> We are running out of /mnt/koji space. A quick solution would be to pool all 3 NVMes (~20TB of flash) with no redundancy and keep going for 2-3 years. We'd depend on backups, but eventually build a local physical server.
<nirik> my thought was to spin up a hub in aws... and import f38 once the current koji builds it... then we only have 38+ in there and can try and start keeping up with mainline...
<davidlt[m]> Our Koji is located in SF, but the majority of the boards will be at my place. Koji is a data movement challenge. We moved over a petabyte of data. I am in Lithuania, thus that's a long way to ship data without a local cache, but I do have fiber.
<nirik> but that doesn't take much advantage of your current hw
<davidlt[m]> We will run out of storage before we can get to full Rawhide.
<davidlt[m]> Oh yeah, our flash storage was expensive, but I could do repos very fast ;)
* djdelorie wonders if "move to SF" is one of davidlt[m]'s options ;-)
<davidlt[m]> And it had no problems feeding ~170 QEMU builders.
<davidlt[m]> Yeah, but I am not living alone thus not my own personal decision :)
<nirik> the current hub... what space does it have for storage expansion?
<neil> nirik: not yet
<davidlt[m]> The main drive is too small, 256G. <20TB of PCIe x8 NVMe, all slots filled. There is support for 8 SAS/SATA drives IIRC.
<nirik> neil: ok, that's needed for the alias to kick in. I can add you to some group if you like...
<davidlt[m]> Thus I can get it to limp towards full Rawhide by pooling all NVMes to ~20TB (no redundancy, that's fine with me). Update the main drive to a 1-2T NVMe (M.2). Use the same M.2 for the Postgres DB.
<davidlt[m]> Then have a small machine with those 100+TB raw HDD storage with some RAID6 or something and depend on that.
<davidlt[m]> Currently backup repository sits here locally, but that's running out of space too.
<davidlt[m]> I do have a replacement for the external drive: a NAS (50TB).
<davidlt[m]> And at some point I might switch to 2Gbps fiber.
<davidlt[m]> I have 16 Unmatched boards that will be connected to Koji. Still need to buy a few parts.
<davidlt[m]> Our Koji runs on Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
<davidlt[m]> That's 2S, 20C, 40T.
<davidlt[m]> 128G of RAM
<nirik> cool.
<nirik> so it sounds like: limp along to rawhide parity or close... then discuss and figure out plan after that?
<davidlt[m]> Current config is RAID1 with 2 NVMe for /mnt/koji.
<davidlt[m]> and one NVMe is used for backups and the Postgres DB.
<neil> i suspected I'd be a member of at least one group due to signing the contributor agreement, but I think maybe the signed_fpca group isn't working or needed anymore
<nirik> and do you need $'s for the first part now (disclaimer: I don't have any, but I can talk to mgmt)
<nirik> neil: yeah, it has to be one in addition to that.
<davidlt[m]> I can work on the funding.
<neil> I too can inquire about funding via $dayjob and/or Rocky Enterprise Software Foundation
<davidlt[m]> I have some secured for buying the missing parts for Unmatched to get 16 boards connected.
<neil> very nice :)
<davidlt[m]> The rest depends on how you would like to handle our Koji infra :)
<davidlt[m]> So if we decide to limp along for some time until we fully catch up, that's fine. In that case minimal to no investment is needed, probably.
<davidlt[m]> But we still need a plan for afterwards.
<neil> and "wing it" doesn't count as a plan I guess?
davidlt has joined #fedora-riscv
<davidlt[m]> You might want to ping Al Stone at RH about this.
<davidlt[m]> Well, the current stuff only gives us the ability to produce RPMs and disk images, but not in a proper way.
<davidlt[m]> Proper content, but cooked in the old ARMv7 way.
<davidlt[m]> No pungi, koschei, koji-shadow, modularity, RPM sign infra, CI gating, etc.
<davidlt[m]> If I do that then I don't have to look into packages :)
<neil> :)
<nirik> I think a lot of that could come after mainline/primary and isn't so important for secondary.
<nirik> pungi/composes might be good tho.
<davidlt[m]> That's not gonna happen for quite some time.
<davidlt[m]> We have been discussing this for years now. Until the proper standards based hardware arrives riscv64 will not be in the official Fedora koji instance.
<davidlt[m]> So it's always gonna be a secondary arch with a separate koji infra.
<nirik> right, it has to be able to reasonably keep up and have hw that doesn't need a lot of handholding
<djdelorie> "keep up" is a separate problem
<djdelorie> "server grade" is more like remote management etc
<davidlt[m]> Well, if SiFive / Intel P550 with Intel 7nm happens at some point that shouldn't be an issue I guess :)
<nirik> right.
<davidlt[m]> That is/was supposed to be released in 2022.
<nirik> year's not over yet. ;)
<davidlt[m]> StarFive Tech JH7110 will give a nice boost (but 8GB of RAM, way cheaper too).
<davidlt[m]> Servers are some years out, but work is WIP.
<davidlt[m]> Specs are going forward. Ventana is upstreaming their stuff into GNU toolchain.
<davidlt[m]> Until then we support e-waste hardware (not built based on standards) like good old armv7hl :)
<nirik> right, so it might be that we have 1 gen of secondary infra we need to run before mainline... anyhow, I can start a thread on this on the list and we can see if there's consensus
<nirik> our first 32 bit arm hw was the lovely calxeda... 24 (I think) armv7 boards in a 4U chassis
<davidlt[m]> Note, we will run out of storage sooner rather than later, thus some rebuilding will need to happen :)
<davidlt[m]> I actually already ran out of space one night, but I found an extra ~140G I could delete :)
<davidlt[m]> I never touched that. I was an ARMv8/aarch64 boy. All my toys were from other brands :)
<nirik> yep. step1: more storage to limp along more. step2: new hub/storage setup with more room to grow/other services. step3: mainline
<nirik> (at least in my mind)
<davidlt[m]> So we might need a "temporary plan" before anything else.
<davidlt[m]> Basically how to keep it afloat until we catch up on RPMs side.
<nirik> yep
<davidlt[m]> My suggestion: pool all the PCIe x8 NVMes to form a ~20TB /mnt/koji (no redundancy). Update the main drive (main OS + Postgres DB). And/or look into SAS drives for the Postgres DB.
<nirik> that sounds fine to me. Does that mean you will have to reformat and sync data back from backup?
<davidlt[m]> Use those 100+TB RAW HDDs with high redundancy for local backups (we use restic for that).
<davidlt[m]> Would require buying a server for that.
<davidlt[m]> I would consider switching to Btrfs for /mnt/koji, for snapshots.
<davidlt[m]> Yes, either local or remote (directly from home).
<davidlt[m]> It's probably ~6TB.
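The plan being discussed (pool the NVMes with no redundancy, Btrfs for snapshots, restic to the HDD box) could look roughly like the sketch below. The device names and repository path are made up; mdadm RAID0 is just one way to pool drives (LVM striping would work too), so treat this as an illustration of the idea, not a procedure.

```shell
# DESTRUCTIVE sketch, do not run as-is. Device names and paths are examples.

# 1. Pool the three NVMes into one ~20TB stripe (no redundancy, as discussed).
mdadm --create /dev/md0 --level=0 --raid-devices=3 \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1

# 2. Btrfs on top, so /mnt/koji can be snapshotted cheaply.
mkfs.btrfs -L koji /dev/md0
mount /dev/md0 /mnt/koji

# 3. Read-only snapshot, then restic backup of the snapshot to the HDD server.
btrfs subvolume snapshot -r /mnt/koji "/mnt/koji/.snap-$(date +%F)"
restic -r /srv/backup/koji backup "/mnt/koji/.snap-$(date +%F)"
```

Backing up the read-only snapshot rather than the live tree gives restic a consistent view of /mnt/koji even while builds are writing to it.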
<nirik> so, perhaps we should look for funding for a new server (with no or limited drives, since you have a bunch)
<davidlt[m]> This is what we have right now in Koji hub:
<davidlt[m]> Node SN Model Namespace Usage Format FW Rev... (full message at https://libera.ems.host/_matrix/media/r0/download/libera.chat/124131ea766c70e8ce6bb53153a2c945a27f2164)
<davidlt[m]> Yeah, server also needs a new home. The current Colo in Fremont should be replaced.
<davidlt[m]> Alternative would be to build a local cheaper server at my place as that's where majority of boards will be.
<nirik> or aws... ;) (could be in a region near your boards)
<davidlt[m]> I am not sure moving such amounts of data over such distance makes sense.
<nirik> we don't really have a place for a machine right now. I would have said our community cage in RDU, but it's supposedly moving at some point before too long... but I guess it's possible to have something there.
<nirik> hum.
<neil> i'll check w/ some contacts to see if I can scrounge some hw. colo space is a bit harder. presumably something lower power would be nice if it's gonna be in your home
* nirik just remembered a blade center he was trying to find a good use for.
<davidlt[m]> The thing about AWS is that it's not NVMe and moving data out is expensive :)
<nirik> it can be nvme. You just need to specify... :) and amazon is comping our fedora account right now at least...
<neil> 👆
<neil> :)
<neil> i'll also throw out that Rocky has a new build system that might be useful, at least insofar as it doesn't require NFS at all--just object storage
<neil> obviously it needs to go into koji at the end of the day, though
<davidlt[m]> Well as I said before I can leave the best course of action for you both to figure out :)
<davidlt[m]> Of course we could move it to AWS ASAP and just forget about it, or Rocky infra.
<davidlt[m]> djdelorie: check your board
<davidlt[m]> djdelorie: I don't see a ping on Koji side
<djdelorie> right, the usual "won't start kojid on boot" problem
<davidlt[m]> you can modify service file ;)
<davidlt[m]> It's back
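The usual fix for a "won't start on boot" service is a systemd drop-in override. A sketch, assuming the failure is kojid racing the network at boot (the actual cause on this board may differ, and the directive values are illustrative):

```shell
# Sketch: make kojid wait for the network and retry on failure.
# Assumes the boot failure is a network race; adjust if the cause differs.
sudo mkdir -p /etc/systemd/system/kojid.service.d
sudo tee /etc/systemd/system/kojid.service.d/override.conf <<'EOF'
[Unit]
After=network-online.target
Wants=network-online.target

[Service]
Restart=on-failure
RestartSec=30
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now kojid
```

A drop-in keeps the change separate from the packaged unit file, so it survives kojid package updates.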
<neil> nirik want me to start an ether pad or something and we can jot down ideas about steps/plans/phases?
<nirik> neil: if you like. I was gonna start a list thread, but we could use something like that to organize the discussion.
<neil> the openinfra folks infected me with ether pad ☺
<nirik> I need to drop off for a bit... in town today waiting for my car to get repaired and I need to move and find a place with power. ;)
<nirik> back in a bit.
<davidlt[m]> I will be sleeping, but you can leave me any questions in a thread, IRC, or ether pad :)
<neil> :) sounds good. ty
davidlt has quit [Ping timeout: 244 seconds]