dustymabe changed the topic of #fedora-coreos to: Fedora CoreOS :: Find out more at https://getfedora.org/coreos/ :: Logs at https://libera.irclog.whitequark.org/fedora-coreos
mheon has quit [Ping timeout: 252 seconds]
daMaestro has quit [Quit: Leaving]
daMaestro has joined #fedora-coreos
plarsen has quit [Quit: NullPointerException!]
ravanelli has quit [Remote host closed the connection]
vgoyal has quit [Quit: Leaving]
ravanelli has joined #fedora-coreos
ravanelli has quit [Ping timeout: 260 seconds]
bgilbert has quit [Ping timeout: 255 seconds]
Betal has quit [Quit: WeeChat 3.8]
jcajka has joined #fedora-coreos
saschagrunert has joined #fedora-coreos
npcomp has quit [Ping timeout: 255 seconds]
jpn has joined #fedora-coreos
<apollo13> can someone help me understand how fedora-coreos-pool works? it contains qemu-guest-agent from the looks of it, but when trying to add that to my packages it complains because liburing.so.2 is missing
<apollo13> also how can I generate a lockfile per used lockfile-repo?
jpn has quit [Ping timeout: 268 seconds]
npcomp has joined #fedora-coreos
kian[m] has quit [Quit: You have been kicked for being idle]
Turnikov has joined #fedora-coreos
Turnikov has quit [Ping timeout: 252 seconds]
Turnikov has joined #fedora-coreos
ravanelli has joined #fedora-coreos
ravanelli has quit [Ping timeout: 248 seconds]
Turnikov has quit [Ping timeout: 252 seconds]
Turnikov has joined #fedora-coreos
Turnikov has quit [Ping timeout: 255 seconds]
jpn has joined #fedora-coreos
Turnikov has joined #fedora-coreos
Turnikov has quit [Ping timeout: 255 seconds]
Turnikov has joined #fedora-coreos
nalind has joined #fedora-coreos
vgoyal has joined #fedora-coreos
plarsen has joined #fedora-coreos
ravanelli has joined #fedora-coreos
ravanelli has quit [Remote host closed the connection]
ravanelli has joined #fedora-coreos
<apollo13> mhm I am adding packages from fedora/fedora-updates repos and sometimes it complains about failing gpg verification. Any idea why?
mheon has joined #fedora-coreos
jlebon has joined #fedora-coreos
<dustymabe> apollo13: corrupt download? bad mirror? not sure
<dustymabe> is it consistent?
<dustymabe> this should get branched stream unblocked: https://github.com/coreos/fedora-coreos-config/pull/2253
<dustymabe> aaradhak[m]: once that merges you can rebase your branched f38 PR
<aaradhak[m]> I see. I will rebase F38 PR after the merge of 2253
<dustymabe> jlebon: mind a review on https://github.com/coreos/coreos-assembler/pull/3364 ? Hopefully that will get us unblocked and I can roll out the library update and credentials file format update
<apollo13> dustymabe: yeah corrupt download maybe, after removing the cache dir it was okay again
Turnikov has quit [Ping timeout: 255 seconds]
<jlebon> dustymabe: ack will do
<apollo13> btw what is the recommended way to get stable passwd/group ids? Or what ensures that the behavior is deterministic?
<apollo13> ie I added nomad & consul which creates nomad & consul users and I am a bit afraid that the next build might switch those ids etc
<dustymabe> apollo13: did you add those users via Ignition? you can specify the UID/GID for the user
<apollo13> dustymabe: no, I am building my own coreos spin so to say and those are part of rpm package preinstall scripts
<dustymabe> apollo13: there are definitely people that know more about this than me :) - maybe jlebon or travier[m] will chime in once they see this
<apollo13> https://github.com/coreos/rpm-ostree/issues/36 let me read up on that
<dustymabe> ok aaradhak[m] that PR merged.. wait 10m or so until it gets synced over to the branched branch and then rebase your PR
<apollo13> okay there is "preserve-passwd", that said my CI always does new builds, what is the best way to get the previous commit from a docker registry?
<jlebon> but we're trying to migrate to systemd-sysusers
<jlebon> the main thing IIRC is whether the rpm needs to install content (outside of /var) as that user
<apollo13> not really no, it is mainly what happens if the user changes and how that affects state in var which might be unreadable
<apollo13> then
<jlebon> right, the passwd file i linked above is how we ensure that today
<apollo13> right, is there a way to create users before the rpm runs?
<apollo13> I want to fix the user id and have a stable/nice one
<apollo13> the rpm preinstall luckily checks if the user is already there and if yes does nothing
<apollo13> so if I could precreate the users I'd know what to put into the passwd/group file :)
Turnikov has joined #fedora-coreos
jcajka has quit [Quit: Leaving]
Turnikov has quit [Ping timeout: 255 seconds]
<jlebon> i think the way that file was created is the other way actually. a compose is made, and then the file is updated with what came out. then subsequent composes are ensured to match that file.
<jlebon> but honestly, it's been a while since i've looked at this code and there's been changes there recently to prepare for moving over to sysusers. might be worth following up in existing rpm-ostree issues or opening a new one
<apollo13> mhm, it gets confusing now, my manifest includes fedora-coreos-base and I just modified the passwd to see if it fails but it doesn't seem to bother :D
<jlebon> but quickly, yes, you can also use `preserve-passwd` + `check-passwd: previous`. to import the previous build, you can use `ostree container unencapsulate`
<apollo13> fair enough, though my main problem currently is that check-passwd doesn't seem to do anything
<jlebon> hmm, that sounds like a bug. might be worth filing an rpm-ostree issue :)
<apollo13> yeah dunno, is there any way to get some useful debug output or anything I should look for in the logs?
<apollo13> I think I have an idea, preserve-passwd defaults to true nowadays
<apollo13> so it would just copy the file and as such always match, trying with preserve-passwd false now
<apollo13> maybe that results in actually checking it
<apollo13> ok yeah, that is it: error: Validating user entries according to treefile check-passwd: passwd UID changed: chrony (994 to 996)
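The kind of comparison behind that error can be sketched in a few lines of Python. This is only an illustration of the idea, not rpm-ostree's actual implementation: the function names `parse_passwd` and `uid_changes` are hypothetical, and apart from the chrony UIDs (994 and 996, taken from the error above) the sample passwd entries are made up.

```python
def parse_passwd(text):
    """Map user name -> UID from passwd-format text (name:passwd:uid:gid:...)."""
    uids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        if len(fields) < 3:
            continue  # not a valid passwd entry
        uids[fields[0]] = int(fields[2])
    return uids


def uid_changes(previous, current):
    """Return sorted (name, old_uid, new_uid) for users present in both
    files whose UID differs between the previous and the current build."""
    old = parse_passwd(previous)
    new = parse_passwd(current)
    return sorted(
        (name, old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    )


# Illustrative data: only the chrony UIDs come from the log above.
previous = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "chrony:x:994:990::/var/lib/chrony:/sbin/nologin\n"
)
current = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "chrony:x:996:990::/var/lib/chrony:/sbin/nologin\n"
)

for name, old_uid, new_uid in uid_changes(previous, current):
    print(f"passwd UID changed: {name} ({old_uid} to {new_uid})")
```

A check like this only finds something when the current file is generated fresh; if the previous file is simply copied forward, the two inputs are identical and no change can ever be reported, which matches the behavior described above.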
saschagrunert has quit [Quit: Leaving]
<travier[m]> apollo13: I'd recommend using a systemd-sysusers config with a stable UID/GID. If you need to chown things, use tmpfiles.d configs
<travier[m]> You can do all that without an RPM
<travier[m]> apollo13: https://github.com/travier/fedora-coreos-nomad > See this for an example with nomad on FCOS
<apollo13> travier[m]: The thing is I want to install the nomad & consul RPMs, can I put sysusers into that *before* useradd from there runs?
<travier[m]> No need to rebuild FCOS
<travier[m]> Why install the RPMs? Why not use layering?
<apollo13> looking through the butane config: how would that work for nomad updates? nomad-binaries.service only runs once due to the conditions
<apollo13> as for layering: how? you mean ostree-layers in the treefile?
<apollo13> I am not married to the RPM approach but it seemed to be the easiest
ravanelli has quit [Remote host closed the connection]
<travier[m]> CoreOS layering, via a Containerfile: https://github.com/coreos/layering-examples
<travier[m]> To update you need to remove the binaries and reboot, they get re-downloaded on boot.
<travier[m]> This repo is an example, not a full production ready setup. A starting point
<apollo13> mhm good point about the layering, maybe that would be enough
<apollo13> any recommendations on how to add users with the layering examples? just write a sysusers config?
<apollo13> and since you seem to be playing with nomad as well: do you have any recommendations on how to nicely supply butane configurations in a bare metal environment? some aws metadata service emulation or so out there?
ravanelli has joined #fedora-coreos
bgilbert has joined #fedora-coreos
ravanelli has quit [Remote host closed the connection]
gursewak__ has joined #fedora-coreos
daMaestro has quit [Quit: Leaving]
<travier[m]> yes, just write a sysusers config file with fixed UIDs/GIDs and run systemd-sysusers and that should set things up. Otherwise it will be set on boot.
<travier[m]> for bare metal I don't have more specific recommendations than what we have in the docs
ravanelli has joined #fedora-coreos
ravanelli has quit [Remote host closed the connection]
<apollo13[m]> Thanks, will try that tomorrow 🙂
vgoyal has quit [Quit: Leaving]
Betal has joined #fedora-coreos
<apollo13[m]> siosm[m]: if I layer via container files, do you have any recommendations on how to deploy that onto the machines? Currently I have my own qcow file that comes out of the cosa build process. Not sure how to do that with containers
<travier[m]> With a container you rebase to the image after the first boot. We don't have a good story for producing images from containers right now
vgoyal has joined #fedora-coreos
rsalveti has quit [Quit: Connection closed for inactivity]
ravanelli has joined #fedora-coreos
ravanelli has quit [Remote host closed the connection]
ravanelli has joined #fedora-coreos
<Kanibal> and on the same note - is it possible to build an aarch64 image with cosa on an x86_64 machine?
<dustymabe> Kanibal: we don't really support crossbuilding
<dustymabe> jlebon: I'm going to start merging the azure library update PRs (starting with https://github.com/coreos/coreos-assembler/pull/3349 and https://github.com/coreos/fedora-coreos-pipeline/pull/811) and then the backports (will disable the downstream pipeline until everything filters through) - WDYT?
<jlebon> dustymabe: SGTM
<apollo13[m]> Mhm yeah the rebasing after first boot is something I wanted to avoid, guess I'll stick with my suboptimal approach for now and switch over to containers once anaconda has support for directly kickstarting them or so. Nevertheless learned a lot, thanks for all the help!
<jlebon> dustymabe, ravanelli: should we try rolling out the testiso bits tmw?
<dustymabe> jlebon: I think she is half day tomorrow - but she did say she wanted to do it this week
<dustymabe> jlebon: I'm going to amend the commit in https://github.com/coreos/coreos-assembler/pull/3364 to fix the commit ID mentioned in the commit message. That's the only change I'm going to make, though, so maybe you can LGTM and we can merge
<dustymabe> it has currently passed all tests already
<jlebon> dustymabe: from my POV, feel free to comment that and force merge it once you've updated it
<dustymabe> ok jlebon - backports to other branches are up: https://github.com/coreos/coreos-assembler/pulls
<dustymabe> all of them were a clean backport of the 4.12 commit
<dustymabe> which almost never happens
<dustymabe> now let's hope everything compiles :)
<jlebon> :lgtm:
<dustymabe> jlebon: do you think we should wait on openshift CI for those PRs ^^ ?
<jlebon> dustymabe: i think we can safely skip them. CoreOS CI checked it compiles, and neither it nor OpenShift CI does any azure uploading tests
<dustymabe> ok
<dustymabe> bgilbert: jlebon: thoughts on the best path forward for https://github.com/coreos/fedora-coreos-tracker/issues/1423#issuecomment-1438674260 ?
bgilbert has quit [Ping timeout: 246 seconds]
<dustymabe> jlebon: any idea what's going on with CI in https://github.com/coreos/coreos-assembler/pull/3366
<dustymabe> it looks like it builds the COSA container fine but when it starts to launch it to run `make check && make unittest` it has trouble
<dustymabe> it's happened twice now in the same spot
<jlebon> dustymabe: replied in the issue
<jlebon> re. CI: oh, i hadn't noticed until i read your msg that this was a rerun... i had started a new run :)
vgoyal_ has joined #fedora-coreos
<jlebon> might be worth checking the jenkins logs too
<dustymabe> like in the jenkins pod?
vgoyal has quit [Ping timeout: 246 seconds]
<dustymabe> jlebon: failed in the same spot the third time
<dustymabe> just kind of weird that it didn't fail for 4.12, 4.11 or 4.9
<dustymabe> the same command that's failing seems to succeed for me locally: https://paste.centos.org/view/e550fd60
<jlebon> dustymabe: i've restarted it a third time, this time not doing a replay but "Build Now" to see if it makes a difference
<jlebon> did you check the pod logs to see if there were more details there?
<dustymabe> the jenkins pod or the cosa pod ?
<jlebon> the jenkins pod
<dustymabe> i looked at the jenkins pod but nothing stood out... it's really hard IMO to read java logs (there's just constant errors)
<jlebon> that warning looks suspicious, but i see it too in the other PRs that did pass
<dustymabe> if it fails this time just YOLO and merge?
<dustymabe> so we can re-enable the downstream pipeline
bgilbert has joined #fedora-coreos
<jlebon> if `make check && make unittest` passes for you locally on that branch, SGTM
nalind has quit [Quit: bye for now]
<dustymabe> it passed
<bgilbert> jlebon: re https://github.com/coreos/fedora-coreos-tracker/issues/1423#issuecomment-1439116591, I still need to write it up, but we agreed yesterday not to proceed with that
<bgilbert> the problem isn't ignition.firstboot but ignition.platform.id. we've been assuming that karg is always available, and that would break on live systems
<bgilbert> I have branches to make the appropriate adjustments, but it feels brittle compared to just making a docs fix
<jlebon> ahhh, missed that convo. :) so you wrote patches to fix the bits that assume ignition.platform.id? how invasive are they?
<bgilbert> not too invasive. it's one path in Afterburn (network kargs) and a few random units
<bgilbert> it's more a question of whether we want the special case, for something that's essentially user error
<bgilbert> it turns out we generally (but not universally) document that ignition.firstboot should be used in PXE configs
<jlebon> ok, i think i can get behind that. it just wasn't clear to me we should make ignition.platform.id a hard requirement. you could imagine framing it as "we default to `metal` if unspecified"
vgoyal_ has quit [Remote host closed the connection]
vgoyal_ has joined #fedora-coreos
<bgilbert> jlebon: yeah, but then we'd need to remember that in ConditionKernelArgument= lists, etc.
<bgilbert> there are a bunch of independent things that read the karg
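The "default to `metal` if unspecified" idea can be sketched as below. This is a hypothetical illustration, not how Ignition, Afterburn, or any systemd unit actually reads the karg: the helper `get_karg` is made up for this example, and letting the last occurrence of a key win is an assumption.

```python
def get_karg(cmdline, key, default=None):
    """Return the value of `key=value` from a kernel command line string,
    or `default` if the key is absent. Assumption: the last occurrence wins."""
    value = default
    for arg in cmdline.split():
        name, sep, val = arg.partition("=")
        if sep and name == key:
            value = val
    return value


# Karg present: the explicit value is used.
with_id = "BOOT_IMAGE=/vmlinuz ignition.firstboot ignition.platform.id=qemu"
print(get_karg(with_id, "ignition.platform.id", default="metal"))

# Karg missing (e.g. a PXE config that omits it): fall back to the default.
without_id = "BOOT_IMAGE=/vmlinuz ignition.firstboot"
print(get_karg(without_id, "ignition.platform.id", default="metal"))
```

The catch raised above still applies: every independent consumer of the karg would have to repeat the same fallback, and anything that only tests for the argument's presence (such as a ConditionKernelArgument= check) cannot express a default at all.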
vgoyal_ has quit [Client Quit]
<jlebon> having it baked in /etc/cmdline.d would've worked at least for dracut stuff, but yeah there's more stuff
<bgilbert> I don't have a strong opinion, mostly a sense of unease. we can always revisit if needed
<jlebon> anyway, just documenting it for now SGTM! though we should probably check we give a clear error if it's missing
<jlebon> yeah agreed (re. revisit)
<bgilbert> I was thinking of having pxe customize print out a reminder
<bgilbert> oh, also: there is a use case for not running ignition
<bgilbert> namely coreos.inst kargs
<bgilbert> yup
<bgilbert> full circle :-P
<jlebon> nice :)
* dustymabe needs to head out - see you tomorrow!
* bgilbert waves
bgilbert has quit [Ping timeout: 255 seconds]
mheon has quit [Ping timeout: 264 seconds]