<zsun>
I am thinking that submitting to fedora kernel-ark will make it easier for people to collaborate
<davidlt[m]>
It's on the TODO list, all the bits are kinda in place (just some minor updates)
<zsun>
that's great
<davidlt[m]>
There is a Bugzilla ticket for that, and I did chat with the kernel maintainer some time ago.
<davidlt[m]>
Is that something you would need sooner than later?
<davidlt[m]>
If there is a need I could prioritize that maybe next week.
<zsun>
davidlt[m]: not in a hurry. I am helping tekkamannijia generate the config files to be added for kernel-ark and just realized you already have most of them
<zsun>
as this is already on your plan, I'll do my work on top of yours, which is much easier I believe
<davidlt[m]>
Cool, OK. I will try to do it sooner rather than later. I am hoping to finish the Perl bootstrap this week.
<djdelorie>
note it was busy building srpms at the time ;-)
<davidlt[m]>
djdelorie: not a problem, I will force a build on it :)
<djdelorie>
I figured you'd disable general job-getting so it was only running one
<davidlt[m]>
Nah, the machine capacity is enough to lock it at one build
<djdelorie>
oh good
<davidlt[m]>
So I just need to get it to build GCC and no other job will attempt to land.
<davidlt[m]>
I believe the capacity for the node is 2.0 and GCC build weight is 6.0
<davidlt[m]>
I just use my god mode in Koji to shuffle things manually a bit ;)
<nirik>
assign-task --force is fun. ;)
<davidlt[m]>
An alternative would be to configure different channels and some logic in the hub, but that's not needed (now).
<davidlt[m]>
nirik: it is :)
<davidlt[m]>
nirik: your board has processed ~650 tasks already
<nirik>
I had a short network offline time yesterday, but I don't think it affected the builder here much
<nirik>
excellent. ;)
<nirik>
temp is staying pretty good.
<nirik>
CPU Temp: +39.8°C
<davidlt[m]>
That's very good
<davidlt[m]>
nirik: your board produced 579 RPMs so far
<nirik>
great. Glad it's getting use... sorry I took so long to finish getting it setup
<davidlt[m]>
nirik: we will need to rebuild our Koji server and slowly bring the infrastructure closer to what upstream Fedora does. Would you be willing to look into that, maybe take the lead and/or at least define the phases of what needs to be done?
<davidlt[m]>
This way I could take myself a bit out of the loop on all the things.
<davidlt[m]>
Otherwise I will just do whatever I do :) But I would prefer things to happen faster, so it would be nice to have more people taking part.
<davidlt[m]>
neil showed an interest in a few parts like pungi and disk image generation (which we don't do the way Fedora does).
<nirik>
davidlt[m]: possibly. I'm pretty busy, but I can see... I can at least write up what plans I think make sense and try and work on it?
<nirik>
where's the best place to discuss? mailing list?
<davidlt[m]>
Whatever you prefer. Mailing list might be good as we haven't sent too many emails there :)
<neil>
i need to make my fedoraproject.org email alias work some day...
<davidlt[m]>
If you want I could describe what we have, what we considered, etc.
<davidlt[m]>
Realistically our biggest issue is storage. We have ~20TB of fast flash storage that is not used in an efficient way. Like, we even keep backups on the same drives. We have 100+TB of HDDs that don't have a physical server (never got funding to get it going, but the drives exist).
<davidlt[m]>
We are running out of /mnt/koji space. A quick solution would be to pool all 3 NVMes (~20TB of flash) with no redundancy and keep going for 2-3 years. Depend on backups, but actually build a local physical server.
<nirik>
my thought was to spin up a hub in aws... and import f38 once the current koji builds it... then we only have 38+ in there and can try and start keeping up with mainline...
<davidlt[m]>
Our Koji is located in SF, but the majority of the boards will be at my place. Koji is a data movement challenge. We moved over a petabyte of data. I am in Lithuania, thus that's a long way to ship data without a local cache, but I do have fiber.
<nirik>
but that doesn't take much advantage of your current hw
<davidlt[m]>
We will run out of storage before we can get to full Rawhide.
<davidlt[m]>
Oh yeah, our flash storage was expensive, but I could do repos very fast ;)
* djdelorie
wonders if "move to SF" is one of davidlt[m]'s options ;-)
<davidlt[m]>
And it had no problems feeding ~170 QEMU builders.
<davidlt[m]>
Yeah, but I am not living alone thus not my own personal decision :)
<nirik>
the current hub... what space does it have for storage expansion?
<neil>
nirik: not yet
<davidlt[m]>
Too small a main drive, 256G. <20TB PCIe x8 NVMe. All slots filled. There is support for 8 SAS/SATA drives IIRC.
<nirik>
neil: ok, that's needed for the alias to kick in. I can add you to some group if you like...
<davidlt[m]>
Thus I can get it to limp towards full Rawhide by pooling all NVMe to ~20TB (no redundancy, that's fine with me). Updating the main drive to 1-2T NVMe (M.2). Use the same M.2 for postgre db.
<davidlt[m]>
Then have a small machine with that 100+TB of raw HDD storage in RAID6 or something and depend on that.
<davidlt[m]>
Currently backup repository sits here locally, but that's running out of space too.
<davidlt[m]>
I do have a replacement for the external drive: a NAS (50TB).
<davidlt[m]>
And at some point I might switch to 2Gbps fiber.
<davidlt[m]>
I have 16 Unmatched boards that will be connected to Koji. Still need to buy a few parts.
<davidlt[m]>
Our Koji runs on Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
<davidlt[m]>
That's 2S, 20C, 40T.
<davidlt[m]>
128G of RAM
<nirik>
cool.
<nirik>
so it sounds like: limp along to rawhide parity or close... then discuss and figure out plan after that?
<davidlt[m]>
Current config is RAID1 with 2 NVMe for /mnt/koji.
<davidlt[m]>
and one NVMe is used for backups and postgredb.
<neil>
i suspected I'd be a member of at least one group due to signing the contributor agreement but I think maybe the signed_fpca group isn't working or needed anymore maybe
<nirik>
and do you need $'s for the first part now (disclaimer: I don't have any, but I can talk to mgmt)
<nirik>
neil: yeah, it has to be one in addition to that.
<davidlt[m]>
I can work on the funding.
<neil>
I too can inquire about funding via $dayjob and/or Rocky Enterprise Software Foundation
<davidlt[m]>
I have some secured for buying the missing parts for Unmatched to get 16 boards connected.
<neil>
very nice :)
<davidlt[m]>
The rest depends on how you would like to handle our Koji infra :)
<davidlt[m]>
So if we decide to limp for some time until we fully catch up, that's fine. In that case minimal to no investment is needed, probably.
<davidlt[m]>
But we still need a plan for afterwards.
<neil>
and "wing it" doesn't count as a plan I guess?
davidlt has joined #fedora-riscv
<davidlt[m]>
You might want to ping Al Stone at RH about this.
<davidlt[m]>
Well, the current stuff only gives us ability to produce RPMs and disk images, but not in a proper way.
<davidlt[m]>
Proper content, but cooked in the old ARMv7 way.
<davidlt[m]>
No pungi, koschei, koji-shadow, modularity, RPM sign infra, CI gating, etc.
<davidlt[m]>
If I do that then I don't have to look into packages :)
<neil>
:)
<nirik>
I think a lot of that could come after mainline/primary and isn't so important for secondary.
<nirik>
pungi/composes might be good tho.
<davidlt[m]>
That's not gonna happen for quite some time.
<davidlt[m]>
We have been discussing this for years now. Until the proper standards based hardware arrives riscv64 will not be in the official Fedora koji instance.
<davidlt[m]>
So it's always gonna be a secondary arch with a separate koji infra.
<nirik>
right, it has to be able to reasonably keep up and have hw that doesn't need a lot of handholding
<djdelorie>
"keep up" is a separate problem
<djdelorie>
"server grade" is more like remote management etc
<davidlt[m]>
Well, if SiFive / Intel P550 with Intel 7nm happens at some point that shouldn't be an issue I guess :)
<nirik>
right.
<davidlt[m]>
That is/was supposed to be released in 2022.
<nirik>
year's not over yet. ;)
<davidlt[m]>
StarFive Tech JH7110 will give a nice boost (but 8GB of RAM, way cheaper too).
<davidlt[m]>
Servers are some years out, but work is WIP.
<davidlt[m]>
Specs are going forward. Ventana is upstreaming their stuff into GNU toolchain.
<davidlt[m]>
Until then we support e-waste hardware (not built based on standards) like good old armv7hl :)
<nirik>
right, so it might be that we have 1 gen of secondary infra we need to run before mainline... anyhow, I can start a thread on this on the list and we can see if there's consensus
<nirik>
our first 32 bit arm hw was the lovely calxeda... 24 (I think) armv7 boards in a 4U chassis
<davidlt[m]>
Note, we will run out of storage sooner rather than later, thus some rebuilding will need to happen :)
<davidlt[m]>
I actually already ran out of space one night, but I found an extra ~140G I could delete :)
<davidlt[m]>
I never touched that. I was an ARMv8/aarch64 boy. All my toys were from other brands :)
<nirik>
yep. step1: more storage to limp along more. step2: new hub/storage setup with more room to grow/other services. step3: mainline
<nirik>
(at least in my mind)
<davidlt[m]>
So we might need to have a "temporary plan" before anything else.
<davidlt[m]>
Basically how to keep it afloat until we catch up on the RPMs side.
<nirik>
yep
<davidlt[m]>
My suggestion: pool all the PCIe x8 NVMes to form a ~20TB /mnt/koji (no redundancy). Update the main drive (main OS + postgredb). Or also look into SAS drives for postgredb.
<nirik>
that sounds fine to me. Does that mean you will have to reformat and sync data back from backup?
<davidlt[m]>
Use those 100+TB of raw HDDs with high redundancy for local backups (we use restic for that).
<davidlt[m]>
Would require buying a server for that.
<davidlt[m]>
I would consider switching to Btrfs for /mnt/koji for snapshots.
<davidlt[m]>
Yes, either local or remote (directly from home).
<davidlt[m]>
It's probably ~6TB.
<nirik>
so, perhaps we should look for funding for a new server (with no or limited drives, since you have a bunch)
<davidlt[m]>
This is what we have right now in Koji hub:
<davidlt[m]>
Yeah, server also needs a new home. The current Colo in Fremont should be replaced.
<davidlt[m]>
An alternative would be to build a cheaper local server at my place, as that's where the majority of boards will be.
<nirik>
or aws... ;) (could be in a region near your boards)
<davidlt[m]>
I am not sure moving such amounts of data over such distance makes sense.
<nirik>
we don't really have a place for a machine right now, I would have said our community cage in rdu, but it's supposedly moving at some point before too long... but I guess it's possible to have something there.
<nirik>
hum.
<neil>
i'll check w/ some contacts to see if I can scrounge some hw. colo space is a bit harder. assumedly something lower power would be nice if it's gonna be in your home
* nirik
just remembered a blade center he was trying to find a good use for.
<davidlt[m]>
The thing about AWS is that it's not NVMe and moving data out is expensive :)
<nirik>
it can be nvme. You just need to specify... :) and amazon is comping our fedora account right now at least...
<neil>
👆
<neil>
:)
<neil>
i'll also throw out that Rocky has a new build system that might be useful at least insofar as it doesn't require NFS at all--just an object storage
<neil>
obviously it needs to go into koji at the end of the day, though
<davidlt[m]>
Well as I said before I can leave the best course of action for you both to figure out :)
<davidlt[m]>
Of course we could move it to AWS ASAP and just forget about it, or Rocky infra.
<davidlt[m]>
djdelorie: check your board
<davidlt[m]>
djdelorie: I don't see a ping on Koji side
<djdelorie>
right, the usual "won't start kojid on boot" problem
<davidlt[m]>
you can modify the service file ;)
<davidlt[m]>
It's back
<neil>
nirik want me to start an ether pad or something and we can jot down ideas about steps/plans/phases?
<nirik>
neil: if you like. I was gonna start a list thread, but we could use something like that to organize the discussion.
<neil>
the openinfra folks infected me with ether pad ☺
<nirik>
I need to drop off for a bit... in town today waiting for my car to get repaired and I need to move and find a place with power. ;)
<nirik>
back in a bit.
<davidlt[m]>
I will be sleeping, but you can leave me any questions in a thread, IRC, or ether pad :)