#fedora-coreos on 2023-02-21 — irc logs at libera.irclog.whitequark.org

2022-05-11 12:42 dustymabe changed the topic of #fedora-coreos to: Fedora CoreOS :: Find out more at https://getfedora.org/coreos/ :: Logs at https://libera.irclog.whitequark.org/fedora-coreos

00:36 mheon has quit [Ping timeout: 252 seconds]

00:54 daMaestro has quit [Quit: Leaving]

01:19 daMaestro has joined #fedora-coreos

01:23 plarsen has quit [Quit: NullPointerException!]

01:45 ravanelli has quit [Remote host closed the connection]

01:49 vgoyal has quit [Quit: Leaving]

02:24 ravanelli has joined #fedora-coreos

02:28 ravanelli has quit [Ping timeout: 260 seconds]

03:27 bgilbert has quit [Ping timeout: 255 seconds]

06:38 Betal has quit [Quit: WeeChat 3.8]

07:43 jcajka has joined #fedora-coreos

07:45 saschagrunert has joined #fedora-coreos

08:42 npcomp has quit [Ping timeout: 255 seconds]

08:49 jpn has joined #fedora-coreos

08:50 <apollo13> can someone help me understand how fedora-coreos-pool works? it contains qemu-guest-agent from the looks of it, but when trying to add that to my packages it complains because liburing.so.2 is missing

08:51 <apollo13> also how can I generate a lockfile per used lockfile-repo?

08:54 jpn has quit [Ping timeout: 268 seconds]

08:58 npcomp has joined #fedora-coreos

09:00 kian[m] has quit [Quit: You have been kicked for being idle]

10:32 Turnikov has joined #fedora-coreos

10:37 Turnikov has quit [Ping timeout: 252 seconds]

10:38 Turnikov has joined #fedora-coreos

11:04 ravanelli has joined #fedora-coreos

11:08 ravanelli has quit [Ping timeout: 248 seconds]

11:09 Turnikov has quit [Ping timeout: 252 seconds]

11:25 Turnikov has joined #fedora-coreos

11:39 Turnikov has quit [Ping timeout: 255 seconds]

11:56 jpn has joined #fedora-coreos

12:14 Turnikov has joined #fedora-coreos

12:22 Turnikov has quit [Ping timeout: 255 seconds]

12:23 Turnikov has joined #fedora-coreos

12:56 nalind has joined #fedora-coreos

13:03 vgoyal has joined #fedora-coreos

13:27 plarsen has joined #fedora-coreos

13:34 ravanelli has joined #fedora-coreos

13:39 ravanelli has quit [Remote host closed the connection]

13:46 ravanelli has joined #fedora-coreos

13:53 <apollo13> mhm I am adding packages from fedora/fedora-updates repos and sometimes it complains about failing gpg verification. Any idea why?

14:04 mheon has joined #fedora-coreos

14:12 jlebon has joined #fedora-coreos

14:27 <dustymabe> apollo13: corrupt download? bad mirror? not sure

14:27 <dustymabe> is it consistent?

14:36 <dustymabe> this should get branched stream unblocked: https://github.com/coreos/fedora-coreos-config/pull/2253

14:36 <dustymabe> aaradhak[m]: once that merges you can rebase your branched f38 PR

14:38 <aaradhak[m]> I see. I will rebase F38 PR after the merge of 2253

14:39 <dustymabe> jlebon: mind a review on https://github.com/coreos/coreos-assembler/pull/3364 ? Hopefully that will get us unblocked and I can roll out the library update and credentials file format update

14:41 <apollo13> dustymabe: yeah corrupt download maybe, after removing the cache dir it was okay again

14:41 Turnikov has quit [Ping timeout: 255 seconds]

14:44 <jlebon> dustymabe: ack will do

14:44 <apollo13> btw what is the recommended way to get stable passwd/group ids? Or what ensures that the behavior is deterministic?

14:45 <apollo13> ie I added nomad & consul which creates nomad & consul users and I am a bit afraid that the next build might switch those ids etc

14:47 <dustymabe> apollo13: did you add those users via Ignition? you can specify the UID/GID for the user

14:47 <apollo13> dustymabe: no, I am building my own coreos spin so to say and those are part of rpm package preinstall scripts

14:57 <dustymabe> apollo13: there are definitely people that know more about this than me :) - maybe jlebon or travier[m] will chime in once they see this

14:58 <apollo13> https://github.com/coreos/rpm-ostree/issues/36 let me read up on that

14:58 <dustymabe> ok aaradhak[m] that PR merged.. wait 10m or so until it gets synced over to the branched branch and then rebase your PR

15:00 <apollo13> okay there is "preserve-passwd", that said my CI always does new builds, what is the best way to get the previous commit from a docker registry?

15:03 <jlebon> apollo13: the traditional way is to do something like this: https://github.com/coreos/fedora-coreos-config/blob/testing-devel/manifests/passwd and https://github.com/coreos/fedora-coreos-config/blob/66da416fe3f75eaa6c96b4f3c4eee3f45ff13242/manifests/fedora-coreos-base.yaml#L38-L43

15:04 <jlebon> but we're trying to migrate to systemd-sysusers

15:05 <jlebon> the main thing IIRC is whether the rpm needs to install content (outside of /var) as that user

15:06 <jlebon> see https://github.com/coreos/rpm-ostree/issues/49

15:06 <apollo13> not really no, it is mainly what happens if the user changes and how that affects state in var which might be unreadable

15:06 <apollo13> then

15:07 <jlebon> right, the passwd file i linked above is how we ensure that today

15:08 <apollo13> right, is there a way to create users before the rpm runs?

15:08 <apollo13> I want to fix the user id and have a stable/nice one

15:08 <apollo13> the rpm preinstall luckily checks if the user is already there and if yes does nothing

15:09 <apollo13> so if I could precreate the users I'd know what to put into the passwd/group file :)

15:09 Turnikov has joined #fedora-coreos

15:09 jcajka has quit [Quit: Leaving]

15:14 Turnikov has quit [Ping timeout: 255 seconds]

15:14 <jlebon> i think the way that file was created is the other way actually. a compose is made, and then the file is updated with what came out. then subsequent composes are ensured to match that file.

15:15 <jlebon> but honestly, it's been a while since i've looked at this code and there's been changes there recently to prepare for moving over to sysusers. might be worth following up in existing rpm-ostree issues or opening a new one

15:16 <apollo13> mhm, it gets confusing now, my manifest includes fedora-coreos-base and I just modified the passwd to see if it fails but it doesn't seem to bother :D

15:16 <jlebon> but quickly, yes, you can also use `preserve-passwd` + `check-passwd: previous`. to import the previous build, you can use `ostree container unencapsulate`

15:17 <apollo13> fair enough, though my main problem currently is that check-passwd doesn't seem to do anything

15:18 <jlebon> hmm, that sounds like a bug. might be worth filing an rpm-ostree issue :)

15:20 <apollo13> yeah dunno, is there any way to get some useful debug output or anything I should look for in the logs?

15:22 <apollo13> I think I have an idea, preserve-passwd defaults to true nowadays

15:23 <apollo13> so it would just copy the file and as such always match, trying with preserve-passwd false now

15:23 <apollo13> maybe that results in actually checking it

15:25 <apollo13> ok yeah, that is it: error: Validating user entries according to treefile check-passwd: passwd UID changed: chrony (994 to 996)
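
The check that produced the error above can be pictured as a comparison between a reference passwd file and the one from the new compose. What follows is only a minimal sketch of that idea, not rpm-ostree's actual implementation; the function names and the sample passwd contents (including the GID 992) are assumptions made for illustration.

```python
def parse_passwd(text):
    """Map user name -> UID from passwd-format text (name:pw:uid:gid:...)."""
    uids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        uids[fields[0]] = int(fields[2])
    return uids


def uid_changes(previous, current):
    """Return (name, old_uid, new_uid) for every user whose UID differs."""
    old, new = parse_passwd(previous), parse_passwd(current)
    return [
        (name, old[name], new[name])
        for name in sorted(old)
        if name in new and old[name] != new[name]
    ]


# Hypothetical sample data: only the chrony UIDs (994, 996) come from the log.
previous = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "chrony:x:994:992::/var/lib/chrony:/sbin/nologin\n"
)
current = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "chrony:x:996:992::/var/lib/chrony:/sbin/nologin\n"
)

for name, old_uid, new_uid in uid_changes(previous, current):
    print(f"passwd UID changed: {name} ({old_uid} to {new_uid})")
```

If the reference file is simply copied into the new build first, the two sides are identical and a comparison like this can never report a difference, which matches what apollo13 observed.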

15:26 saschagrunert has quit [Quit: Leaving]

15:28 <travier[m]> apollo13: I'd recommend using a systemd-sysusers config with a stable UID/GID. If you need to chown things, use tmpfiles.d configs

15:29 <travier[m]> You can do all that without an RPM

15:29 <travier[m]> apollo13: see https://github.com/travier/fedora-coreos-nomad for an example with nomad on FCOS

15:29 <apollo13> travier[m]: The thing is I want to install the nomad & consul RPMs, can I put sysusers into that *before* useradd from there runs?

15:29 <travier[m]> No need to rebuild FCOS

15:30 <travier[m]> Why install the RPMs? Why not use layering?

15:32 <apollo13> looking through the butane config: how would that work for nomad updates? nomad-binaries.service only runs once due to the conditions

15:32 <apollo13> as for layering: how? you mean ostree-layers in the treefile?

15:32 <apollo13> I am not married to the RPM approach but it seemed to be the easiest

15:42 ravanelli has quit [Remote host closed the connection]

15:44 <travier[m]> CoreOS layering, via a Containerfile: https://github.com/coreos/layering-examples

15:44 <travier[m]> To update you need to remove the binaries and reboot, they get re-downloaded on boot.

15:45 <travier[m]> This repo is an example, not a full production ready setup. A starting point

15:46 <apollo13> mhm good point about the layering, maybe that would be enough

15:47 <apollo13> any recommendations on how to add users with the layering examples? just write a sysusers config?

15:48 <apollo13> and since you seem to be playing with nomad as well: do you have any recommendations on how to nicely supply butane configurations in a bare metal environment? some aws metadata service emulation or so out there?

15:51 ravanelli has joined #fedora-coreos

15:52 bgilbert has joined #fedora-coreos

16:03 ravanelli has quit [Remote host closed the connection]

16:15 gursewak__ has joined #fedora-coreos

16:18 daMaestro has quit [Quit: Leaving]

16:33 <travier[m]> yes, just write a sysusers config file with fixed UIDs/GIDs and run systemd-sysusers and that should set things up. Otherwise it will be set on boot.
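
A sysusers config with fixed IDs, as suggested above, is a plain text file with one `u name ID "description" home` line per user. The sketch below only renders such lines; the helper name, the UID values 2001 and 2002, the descriptions, and the home directories are all hypothetical choices for illustration, not values from the nomad or consul packages.

```python
def sysusers_line(name, uid, description, home):
    """Render one sysusers.d 'u' entry with a fixed numeric ID."""
    return f'u {name} {uid} "{description}" {home}'


# Hypothetical fixed IDs; pick values that are free on your own systems.
entries = [
    ("nomad", 2001, "Nomad", "/var/lib/nomad"),
    ("consul", 2002, "Consul", "/var/lib/consul"),
]

config = "\n".join(sysusers_line(*entry) for entry in entries) + "\n"
print(config, end="")
```

The resulting text would typically be shipped as a file under /usr/lib/sysusers.d/, and because the IDs are written out explicitly they do not depend on the order in which packages are installed.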

16:33 <travier[m]> for bare metal I don't have more specific recommendations than what we have in the docs

16:35 ravanelli has joined #fedora-coreos

17:39 ravanelli has quit [Remote host closed the connection]

18:06 <apollo13[m]> Thanks, will try that tomorrow 🙂

18:12 vgoyal has quit [Quit: Leaving]

18:22 Betal has joined #fedora-coreos

18:58 <apollo13[m]> siosm[m]: if I layer via container files, do you have any recommendations on how to deploy that onto the machines? Currently I have my own qcow file that comes out of the cosa build process. Not sure how to do that with containers

19:00 <travier[m]> With a container you rebase to the image after the first boot. We don't have a good story for producing images from containers right now

19:12 vgoyal has joined #fedora-coreos

19:19 rsalveti has quit [Quit: Connection closed for inactivity]

19:46 ravanelli has joined #fedora-coreos

19:47 ravanelli has quit [Remote host closed the connection]

19:47 ravanelli has joined #fedora-coreos

19:51 <Kanibal> and on the same note - is it possible to build an aarch64 image with cosa on an x86_64 machine?

19:55 <dustymabe> Kanibal: we don't really support crossbuilding

20:01 <dustymabe> jlebon: I'm going to start merging the azure library update PRs (starting with https://github.com/coreos/coreos-assembler/pull/3349 and https://github.com/coreos/fedora-coreos-pipeline/pull/811) and then the backports (will disable the downstream pipeline until everything filters through) - WDYT?

20:04 <jlebon> dustymabe: SGTM

20:22 <apollo13[m]> Mhm yeah the rebasing after first boot is something I wanted to avoid, guess I'll stick with my suboptimal approach for now and switch over to containers once anaconda has support for directly kickstarting them or so. Nevertheless learned a lot, thanks for all the help!

20:22 <jlebon> dustymabe, ravanelli: should we try rolling out the testiso bits tmw?

20:36 <dustymabe> jlebon: I think she is half day tomorrow - but she did say she wanted to do it this week

20:36 <dustymabe> jlebon: I'm going to amend the commit in https://github.com/coreos/coreos-assembler/pull/3364 to fix the commit ID mentioned in the commit message. That's the only change I'm going to make, though so maybe you can LGTM and we can merge

20:37 <dustymabe> it has currently passed all tests already

20:38 <jlebon> dustymabe: from my POV, feel free to comment that and force merge it once you've updated it

20:44 <dustymabe> ok jlebon - backports to other branches are up: https://github.com/coreos/coreos-assembler/pulls

20:44 <dustymabe> all of them were a clean backport of the 4.12 commit

20:44 <dustymabe> which almost never happens

20:44 <dustymabe> now let's hope everything compiles :)

20:51 <jlebon> :lgtm:

21:06 <dustymabe> jlebon: do you think we should wait on openshift CI for those PRs ^^ ?

21:09 <jlebon> dustymabe: i think we can safely skip them. CoreOS CI checked it compiles, and neither it nor OpenShift CI does any azure uploading tests

21:09 <dustymabe> ok

21:14 <dustymabe> bgilbert: jlebon: thoughts on the best path forward for https://github.com/coreos/fedora-coreos-tracker/issues/1423#issuecomment-1438674260 ?

21:15 bgilbert has quit [Ping timeout: 246 seconds]

21:26 <dustymabe> jlebon: any idea what's going on with CI in https://github.com/coreos/coreos-assembler/pull/3366

21:26 <dustymabe> it looks like it builds the COSA container fine but when it starts to launch it to run `make check && make unittest` it has trouble

21:27 <dustymabe> it's happened twice now in the same spot

21:31 <jlebon> dustymabe: replied in the issue

21:32 <jlebon> re. CI: oh, i hadn't noticed until i read your msg that this was a rerun... i had started a new run :)

21:32 vgoyal_ has joined #fedora-coreos

21:34 <jlebon> might be worth checking the jenkins logs too

21:34 <dustymabe> like in the jenkins pod?

21:34 vgoyal has quit [Ping timeout: 246 seconds]

21:39 <dustymabe> jlebon: failed in the same spot the third time

21:41 <dustymabe> just kind of weird that it didn't fail for 412 411 or 49

21:46 <dustymabe> the same command that's failing seems to succeed for me locally: https://paste.centos.org/view/e550fd60

21:46 <jlebon> dustymabe: i've restarted it a third time, this time not doing a replay but "Build Now" to see if it makes a difference

21:46 <jlebon> did you check the pod logs to see if there were more details there?

21:47 <dustymabe> the jenkins pod or the cosa pod ?

21:47 <jlebon> the jenkins pod

21:47 <dustymabe> i looked at the jenkins pod but nothing stood out... it's really hard IMO to read java logs (there's just constant errors)

21:49 <jlebon> that warning looks suspicious, but i see it too in the other PRs that did pass

21:50 <dustymabe> if it fails this time just YOLO and merge?

21:50 <dustymabe> so we can re-enable the downstream pipeline

21:51 bgilbert has joined #fedora-coreos

21:51 <jlebon> if `make check && make unittest` passes for you locally on that branch, SGTM

22:06 nalind has quit [Quit: bye for now]

22:10 <dustymabe> it passed

22:12 <bgilbert> jlebon: re https://github.com/coreos/fedora-coreos-tracker/issues/1423#issuecomment-1439116591, I still need to write it up, but we agreed yesterday not to proceed with that

22:13 <bgilbert> the problem isn't ignition.firstboot but ignition.platform.id. we've been assuming that karg is always available, and that would break on live systems

22:13 <bgilbert> I have branches to make the appropriate adjustments, but it feels brittle compared to just making a docs fix

22:17 <jlebon> ahhh, missed that convo. :) so you wrote patches to fix the bits that assume ignition.platform.id? how invasive are they?

22:24 <bgilbert> not too invasive. it's one path in Afterburn (network kargs) and a few random units

22:24 <bgilbert> it's more a question of whether we want the special case, for something that's essentially user error

22:25 <bgilbert> it turns out we generally (but not universally) document that ignition.firstboot should be used in PXE configs

22:27 <jlebon> ok, i think i can get behind that. it just wasn't clear to me we should make ignition.platform.id a hard requirement. you could imagine framing it as "we default to `metal` if unspecified"

22:27 vgoyal_ has quit [Remote host closed the connection]

22:28 vgoyal_ has joined #fedora-coreos

22:28 <bgilbert> jlebon: yeah, but then we'd need to remember that in ConditionKernelArgument= lists, etc.

22:28 <bgilbert> there are a bunch of independent things that read the karg
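
The "default to `metal` if unspecified" framing discussed above could look roughly like the sketch below for a single reader of the kernel command line. This is an illustration only, not code from Ignition or Afterburn: the function name is made up, quoting in the command line is ignored, and letting a later occurrence override an earlier one is a choice of this sketch.

```python
def platform_id(cmdline, default="metal"):
    """Return the ignition.platform.id value from a kernel command line,
    falling back to `default` when the argument is absent."""
    value = default
    for arg in cmdline.split():
        key, sep, val = arg.partition("=")
        if key == "ignition.platform.id" and sep:
            value = val  # a later occurrence overrides an earlier one here
    return value


print(platform_id("ignition.firstboot ignition.platform.id=qemu"))
print(platform_id("ignition.firstboot"))
```

bgilbert's concern is visible here: every independent consumer of the karg, including unit conditions that only test for its presence, would need to apply the same fallback for the default to be consistent.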

22:28 vgoyal_ has quit [Client Quit]

22:28 <jlebon> having it baked in /etc/cmdline.d would've worked at least for dracut stuff, but yeah there's more stuff

22:29 <bgilbert> I don't have a strong opinion, mostly a sense of unease. we can always revisit if needed

22:29 <jlebon> anyway, just documenting it for now SGTM! though we should probably check we give a clear error if it's missing

22:29 <jlebon> yeah agreed (re. revisit)

22:30 <bgilbert> I was thinking of having pxe customize print out a reminder

22:30 <bgilbert> oh, also: there is a use case for not running ignition

22:30 <bgilbert> namely coreos.inst kargs

22:30 <dustymabe> right see https://github.com/coreos/fedora-coreos-tracker/issues/1423#issuecomment-1437828578

22:31 <bgilbert> yup

22:31 <bgilbert> full circle :-P

22:31 <jlebon> nice :)

22:31 * dustymabe needs to head out - see you tomorrow!

22:31 * bgilbert waves

22:48 bgilbert has quit [Ping timeout: 255 seconds]

23:21 mheon has quit [Ping timeout: 264 seconds]