#fedora-coreos on 2023-03-31 — irc logs at libera.irclog.whitequark.org

2022-05-11 12:42 dustymabe changed the topic of #fedora-coreos to: Fedora CoreOS :: Find out more at https://getfedora.org/coreos/ :: Logs at https://libera.irclog.whitequark.org/fedora-coreos

00:04 b100s has quit [Remote host closed the connection]

00:05 b100s has joined #fedora-coreos

00:38 plarsen has quit [Remote host closed the connection]

01:01 gursewak has joined #fedora-coreos

01:06 gursewak has quit [Ping timeout: 250 seconds]

01:31 bgilbert has quit [Ping timeout: 252 seconds]

01:50 b100s has quit [Remote host closed the connection]

01:58 b100s has joined #fedora-coreos

02:06 b100s has quit [Ping timeout: 250 seconds]

02:06 gursewak has joined #fedora-coreos

02:11 gursewak has quit [Ping timeout: 250 seconds]

02:20 gursewak has joined #fedora-coreos

02:25 gursewak has quit [Ping timeout: 260 seconds]

02:28 b100s has joined #fedora-coreos

02:35 gursewak has joined #fedora-coreos

02:38 troglodito has quit [Read error: Connection reset by peer]

02:39 gursewak has quit [Ping timeout: 250 seconds]

02:39 troglodito has joined #fedora-coreos

02:50 gursewak has joined #fedora-coreos

02:54 gursewak has quit [Ping timeout: 250 seconds]

03:31 jlebon has quit [Quit: leaving]

04:08 gursewak has joined #fedora-coreos

04:12 Betal has quit [Quit: WeeChat 3.8]

04:22 daMaestro has joined #fedora-coreos

05:06 gursewak has quit [Ping timeout: 250 seconds]

05:39 cyberpear has quit [Quit: Connection closed for inactivity]

05:40 jcajka has joined #fedora-coreos

05:47 sentenza has quit [Remote host closed the connection]

06:35 saschagrunert has joined #fedora-coreos

06:46 plundra has quit [Remote host closed the connection]

06:46 plundra has joined #fedora-coreos

07:12 jpn has joined #fedora-coreos

07:24 jpn has quit [Ping timeout: 250 seconds]

07:27 daMaestro has quit [Quit: Leaving]

08:04 jpn has joined #fedora-coreos

08:09 jpn has quit [Ping timeout: 255 seconds]

08:24 jpn has joined #fedora-coreos

08:46 bgilbert has joined #fedora-coreos

09:02 saschagrunert has quit [Remote host closed the connection]

09:35 bgilbert has quit [Ping timeout: 260 seconds]

10:01 fifofonix has joined #fedora-coreos

10:06 fifofonix has quit [Ping timeout: 265 seconds]

10:23 fifofonix has joined #fedora-coreos

11:53 jbrooks has quit [Read error: Connection reset by peer]

11:53 jbrooks has joined #fedora-coreos

12:21 vgoyal has joined #fedora-coreos

12:53 mheon has joined #fedora-coreos

13:38 jlebon has joined #fedora-coreos

13:39 <dustymabe> 👋 jlebon

13:44 <dustymabe> FYI all: no pods are getting started successfully in the FCOS pipeline

13:45 <dustymabe> looks like the jnlp container is having trouble: https://paste.centos.org/view/d3382090

13:47 <dustymabe> I can try to restart jenkins to see if that helps

14:00 <jlebon> dustymabe: 👋

14:01 <jlebon> will take a look too

14:08 millerthegorilla has joined #fedora-coreos

14:11 nalind has joined #fedora-coreos

14:17 <jlebon> restart did help

14:19 gursewak has joined #fedora-coreos

14:24 gursewak has quit [Ping timeout: 260 seconds]

14:36 cyberpear has joined #fedora-coreos

15:14 jcajka has quit [Quit: Leaving]

15:24 gursewak has joined #fedora-coreos

17:09 jpn has quit [Ping timeout: 255 seconds]

17:25 jpn has joined #fedora-coreos

17:27 plarsen has joined #fedora-coreos

17:31 Betal has joined #fedora-coreos

17:47 bgilbert has joined #fedora-coreos

17:51 <dustymabe> jlebon: any idea why secure boot doesn't seem to work for <= 33 ?

17:56 sentenza has joined #fedora-coreos

17:59 <bgilbert> dustymabe: what does "doesn't work" mean?

17:59 <dustymabe> it won't boot

17:59 <bgilbert> dustymabe: what does "won't boot" mean?

18:00 <dustymabe> BdsDxe: failed to load Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x2,0x0): Access Denied

18:00 <dustymabe> BdsDxe: No bootable option or device was found.

18:00 <dustymabe> BdsDxe: Press any key to enter the Boot Manager Menu.

18:00 <dustymabe> grab kola-ac334cf9.tar.xz from https://jenkins-fedora-coreos-pipeline.apps.ocp.stg.fedoraproject.org/blue/organizations/jenkins/kola-upgrade/detail/kola-upgrade/70/pipeline

18:00 <bgilbert> I vaguely recall some old GRUB vulnerabilities. have the old binaries been denylisted in UEFI?

18:01 <dustymabe> but i'm booting images that were created a long time ago (i.e. the bootloader baked in the image shouldn't deny itself)?

18:01 <dustymabe> or maybe there is something in newer qemu

18:01 <bgilbert> qemu is what I was thinking

18:01 <dustymabe> that makes it not boot

18:02 <dustymabe> ahh, yeah

18:02 <bgilbert> our tests probably shouldn't assume that old images will continue to Secure Boot forever

18:02 <dustymabe> yeah, they don't have to (we can add exceptions where needed)

18:03 <dustymabe> but when possible it would be nice to point to the reason something doesn't work (other than "we don't know why but this doesn't work")

18:07 <dustymabe> bgilbert: I'm guessing that means we have the same problem for secureboot too then

18:07 <dustymabe> i.e. anything pre-34 won't continue to boot

18:07 <dustymabe> unless someone applies `sudo bootupctl update` ?

18:07 <dustymabe> well

18:07 <bgilbert> dustymabe: there was a dangling thread from some time ago. maybe it was around "Boot Hole"? whichever vuln prompted the creation of bootupd

18:08 <dustymabe> I guess that depends on if they updated their hardware/firmware on their bare metal machines?

18:08 <bgilbert> dustymabe: we were told we'd need to update bootloaders in due course because a dbx update was coming

18:08 <bgilbert> dustymabe: ...actually, that might be it

18:08 <bgilbert> FCOS assumes it's not dual-booting with anything

18:09 <bgilbert> so if we don't perform dbx updates ourselves, we can continue to run with a denylisted bootloader and be none the wiser

18:09 <bgilbert> and for newer hardware that already has the newer dbx, users start with a newer FCOS anyway

18:09 <bgilbert> the July 2020 dbx update includes binaries identified as being from "Fedora Project"

18:10 <bgilbert> looking at the "CSV file" (actually .xlsx) at https://uefi.org/revocationlistfile

18:12 <dustymabe> bgilbert: so if I take my Compaq server (hypothetical example with unlikely hardware vendor) that has been running f32 with secure boot enabled for years now, unless I do something to the hardware (like run a vendor firmware update), my system will continue to boot?

18:12 <bgilbert> dustymabe: right, the OS is responsible for updating the denylist

18:13 <bgilbert> it feels like a bit of a time-bomb though

18:13 <dustymabe> bgilbert: so what is it in qemu that's making older systems (pre-baked from years ago) not boot?

18:13 guesswhat6 has joined #fedora-coreos

18:14 <bgilbert> dustymabe: I'm still digging, but presumably it ships with a newer denylist

18:14 <bgilbert> as will new hardware

18:15 <bgilbert> re the "dangling thread" ^: I don't think we ever followed through

18:17 <dustymabe> bgilbert: yeah, i wouldn't be surprised

18:18 <bgilbert> okay, looks like dbxtool is the low-level update tool, but it's wrapped by fwupd, which knows to refuse to update dbx if the ESP contains binaries with denied signatures

18:18 <bgilbert> (assuming nothing about the FCOS ESP handling confuses it; I haven't tested)

18:26 <bgilbert> dustymabe: I just spot-checked `dbxtool -l` inside `cosa run --qemu-firmware uefi-secure` and found hashes from the April 2021 dbx update

18:26 <bgilbert> I'd guess that OVMF builds eventually incorporate newer dbx

18:28 <bgilbert> hmm. `fwupdtool get-updates` in that VM says `cannot find default ESP: No ESP or BDP found`

18:28 <dustymabe> yeah. I think the thing that confuses me is the relationship between the different pieces

18:29 <dustymabe> i.e. is the controlling mechanism inside the image that's baked OR is it in the hardware/firmware of the machine

18:29 <dustymabe> the sense that I'm getting is that it's a combination of both

18:29 <bgilbert> the dbx denylist is baked into the firmware and then updated similar to a firmware update

18:30 <bgilbert> OSes are supposed to apply those updates to prevent the machine from later being compromised by a known vuln

18:30 <dustymabe> ok it makes sense now

18:30 <dustymabe> so.. if a person never updates their own firmware, they'll continue to be able to boot

18:31 <bgilbert> yes, though somewhat defeating the secure in Secure Boot

18:31 <dustymabe> but if they do update their firmware or take the disk and move it to a different machine, then boot could be denied

18:31 <bgilbert> right
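
A toy model of the exchange above, offered as a sketch only. It assumes the firmware's dbx can be treated as a plain set of revoked SHA-256 hashes and that a bootloader is identified by the hash of its bytes; real dbx entries can also be certificates, and none of that is modeled here. `Firmware`, `can_boot`, and `safe_dbx_update` are hypothetical names, and `safe_dbx_update` only imitates the fwupd behavior bgilbert describes (refusing a dbx update when the ESP holds a binary the update would revoke).

```python
import hashlib
from dataclasses import dataclass, field


def digest(binary: bytes) -> str:
    """Identify a boot binary by its SHA-256 hash (simplifying assumption)."""
    return hashlib.sha256(binary).hexdigest()


@dataclass
class Firmware:
    """Hypothetical firmware whose dbx is just a set of revoked hashes."""
    dbx: set[str] = field(default_factory=set)

    def can_boot(self, bootloader: bytes) -> bool:
        # Secure Boot refuses any binary whose hash is on the denylist.
        return digest(bootloader) not in self.dbx


def safe_dbx_update(fw: Firmware, new_entries: set[str],
                    esp_binaries: list[bytes]) -> bool:
    """Apply new dbx entries unless they would revoke a binary on the ESP."""
    if any(digest(b) in new_entries for b in esp_binaries):
        return False  # refuse: the update would make this install unbootable
    fw.dbx |= new_entries
    return True


# Placeholder bytes standing in for real bootloader binaries.
old_grub = b"grub from an f32-era image"
new_grub = b"grub from a current image"
revoked = {digest(old_grub)}

old_machine = Firmware()                  # firmware never updated: empty dbx
new_machine = Firmware(dbx=set(revoked))  # newer firmware or newer OVMF build

print(old_machine.can_boot(old_grub))  # same disk, old firmware
print(new_machine.can_boot(old_grub))  # same disk moved to newer firmware
```

In this model the same disk boots on `old_machine` and is refused on `new_machine`, which matches the "combination of both" reading: the bootloader lives in the image, the denylist lives in the firmware.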

18:31 <dustymabe> well at least I understand it a bit more now :)

18:32 <bgilbert> I think we should probably try booting an old FCOS with old UEFI firmware (in qemu), upgrade to current, and then see whether fwupd correctly prevents a dbx update

18:32 fifofonix has quit [Read error: Connection reset by peer]

18:33 <dustymabe> :) - care to take that one?

18:33 <bgilbert> the concern is that it won't find our ESP (not mounted by default) and then decide to update dbx without checking the existing bootloader

18:33 <dustymabe> or maybe open an issue for it?

18:33 <bgilbert> (or read the code, I suppose)

18:33 <dustymabe> yep. that's a worrying concern

18:33 <dustymabe> just when you think everything is all stable and nice.. dusty has to go writing tests

18:34 <bgilbert> yeah, no more writing tests, all they do is find problems

18:34 <bgilbert> I'll open an issue. it might be a good task for someone to get some experience in the space

18:35 <bgilbert> actually

18:35 <bgilbert> dustymabe: could you open a bug with the test results you saw, and I'll comment on it?

18:37 <dustymabe> yep.

18:37 <bgilbert> ty

18:38 <jlebon> do i understand correctly the concern here is that FCOS isn't "fulfilling its job" of updating the firmware?

18:38 <jlebon> presumably as sole owner of the system

18:39 <jlebon> and so users' systems are still running with out of date revocation lists?

18:40 <bgilbert> that's one concern. it undermines Secure Boot protections

18:41 <bgilbert> another concern is that we need to verify that a user deciding to update their firmware with fwupd won't brick their FCOS install

18:42 <jlebon> ack gotcha

18:42 <jlebon> one of the outputs of this should probably be an entry in the docs, for now at least, showing how to use fwupd

18:43 <bgilbert> yeah

18:44 <dustymabe> bgilbert: https://github.com/coreos/fedora-coreos-tracker/issues/1452

18:44 <dustymabe> fill in appropriate context ^^

18:44 <dustymabe> because I did a bad job of it

18:45 <bgilbert> +1, thanks

18:45 <jlebon> nice find dustymabe

18:45 gursewak has quit [Ping timeout: 248 seconds]

18:46 <bgilbert> dustymabe++

18:46 <zodbot> bgilbert: Karma for dustymabe changed to 4 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any

18:53 <millerthegorilla> Hi, in my butane/ignition I have a user with a supplementary group membership of the video group. Boot fails with journalctl message with failed to configure user - useradd: group 'video' does not exist. But the group 'video' is definitely in /etc/group (cat /etc/group in emergency mode), and if I try and create the group using the ignition

18:53 <millerthegorilla> file, boot fails complaining that the 'video' group already exists. I am using ansible so I can add the user to the group later on, but I am a bit confused as to why this would happen.

19:01 <bgilbert> millerthegorilla: known issue: https://github.com/coreos/fedora-coreos-tracker/issues/155

19:01 <bgilbert> the problem is that `video` is defined in /usr/lib/group and adduser doesn't see it there

19:03 <bgilbert> millerthegorilla: a bit more advice in https://github.com/coreos/butane/issues/411#issuecomment-1407544648

19:03 <bgilbert> it's definitely awkward

19:04 <jlebon> bgilbert: would you be against ignition learning to promote a group from /usr/lib to /etc ? compile-knob conditionalized probably

19:05 <bgilbert> I see this as a shadow-utils bug, but it's so painful that yeah, it probably deserves a workaround

19:05 <bgilbert> s/adduser/useradd/

19:06 <bgilbert> my Debian roots are showing

19:06 <jlebon> i can file an ignition issue and we can discuss there

19:06 <millerthegorilla> bgilbert thanks. I have been having all sorts of fun recently with the various group files, getent, lib/group, sssd and other forms of weird misdirection.

19:06 <bgilbert> jlebon: +1

19:06 <bgilbert> millerthegorilla: yeah, it's not intuitive
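
The "promote a group from /usr/lib to /etc" idea jlebon raises above could look roughly like the following. This is a sketch on in-memory strings, not Ignition code and not the workaround from the linked issues; `find_group` and `promote_group` are hypothetical helpers, and the sample file contents (including the GID) are made up for illustration.

```python
from typing import Optional


def find_group(group_text: str, name: str) -> Optional[str]:
    """Return the group(5) line for `name`, or None if it is not present."""
    for line in group_text.splitlines():
        # The group name is the first colon-separated field.
        if line.split(":", 1)[0] == name:
            return line
    return None


def promote_group(etc_text: str, usr_lib_text: str, name: str) -> str:
    """Return /etc/group contents with `name` copied in from /usr/lib/group.

    If /etc/group already defines the group, the text is returned unchanged.
    """
    if find_group(etc_text, name) is not None:
        return etc_text
    entry = find_group(usr_lib_text, name)
    if entry is None:
        raise KeyError(name)
    if etc_text and not etc_text.endswith("\n"):
        etc_text += "\n"
    return etc_text + entry + "\n"


# Illustrative contents only; real files will differ.
etc_group = "root:x:0:\nwheel:x:10:core\n"
usr_lib_group = "video:x:39:\n"

promoted = promote_group(etc_group, usr_lib_group, "video")
print(promoted, end="")
```

Once the entry exists in the /etc file, a tool that only reads that file can resolve the group, which is the gap bgilbert describes for `useradd`.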

19:26 <uny[m]> Is there a way to have the CoreOS ISO boot to the system it just installed?

19:26 <uny[m]> It takes a support request to my ISP to mount/unmount an ISO, so I'd like to try the new system without unmounting the ISO first.

19:31 <uny[m]> Was hoping for a boot selector ... or to be able to chainload to /dev/vda from the boot command line ... but I'm not seeing how to do it.

19:41 * uny[m] uploaded an image: (51KiB) < https://libera.ems.host/_matrix/media/v3/download/matrix.org/NvLFcAnSngkNhcxycyYneyeZ/image.png >

19:41 <uny[m]> Also, the installer doesn't seem to have expanded the root partition.

19:41 <uny[m]> Does it do this on first boot? I expected to see around a 52GB partition in there.

19:43 gursewak has joined #fedora-coreos

19:47 <uny[m]> Here's my .ign, just defaults for partitioning and filesystems: https://gist.githubusercontent.com/bronson/20e0735c9697570db28fec30700b0525/raw/fb7fb29483c2aa911ca83e0749c0e11cf529db4d/example.ign

19:50 <uny[m]> Right, it won't grow until I can figure out how to boot /dev/vda.

19:55 gursewak has quit [Remote host closed the connection]

19:57 <bgilbert> uny[m]: the root partition is expanded on first boot, yes. it's after Ignition runs, and doesn't happen if you've used the Ignition config to customize your storage.

19:58 <uny[m]> thanks, so far so good.

19:58 <bgilbert> uny[m]: the ISO doesn't have a built-in way to launch the installed system directly. kexec exists, but there are no purpose-built tools to use it for that purpose

19:58 <bgilbert> oh, maybe you're asking for a "boot to HD" entry in the boot menu

19:59 <uny[m]> I didn't see how to kexec to /dev/vda ... is it easy? Some web searching made it look pretty hard.

19:59 <bgilbert> haven't done it, but I don't think it's especially easy

19:59 <uny[m]> Yes, "Boot to HD" would have been awesome!

19:59 <bgilbert> I've seen ISOs with a "boot to HD" entry before. might be reasonable to add

20:00 <bgilbert> could you file a feature request? https://github.com/coreos/fedora-coreos-tracker/issues/new/choose

20:00 <uny[m]> sure, will do.

20:01 jpn has quit [Ping timeout: 260 seconds]

20:01 <uny[m]> So I had support unmount the ISO (they responded quick!) and I've booted into my new system ... and I notice I didn't give any users passwords and didn't assign the root user an ssh key in my .ign.

20:02 <uny[m]> Just went with the default example.bu in the docs.

20:02 <uny[m]> How do I get root permission?

20:02 <bgilbert> did you assign the core user an SSH key?

20:03 <uny[m]> AH, I didn't realize core was magic.

20:03 <bgilbert> it has sudo access by default

20:03 <uny[m]> I gave bronson an SSH key, but I don't seem to have sudo access.

20:03 <uny[m]> Crud.

20:04 <bgilbert> do you have console access?

20:04 <uny[m]> Well, I guess I need to reinstall with a core user?

20:04 <uny[m]> Yes I do

20:04 <jlebon> do you have write access to the console?

20:04 <uny[m]> think so

20:04 <jlebon> can you catch the grub boot menu?

20:04 <bgilbert> https://docs.fedoraproject.org/en-US/fedora-coreos/access-recovery/

20:05 <jlebon> right :)

20:06 <uny[m]> Thanks, giving it a shot.

20:11 <millerthegorilla> I included remote files in the config.ign but when I boot I am getting a tcp lookup error. Shouldn't networking be finished and working by the time these files are downloaded? I tried setting timeouts under ignition, of http_total 120, but the http requests didn't timeout, and so I am unable to see the log - is it written to disk at a path I

20:11 <millerthegorilla> can locate?

20:13 <bgilbert> millerthegorilla: does it drop to an emergency shell prompt, or wait forever?

20:13 <millerthegorilla> waits forever. So, I tried to set the timeouts and do it again, but it still tries forever (no timelimit)

20:14 <bgilbert> hmm. what platform?

20:14 <bgilbert> Ignition doesn't wait for networking to come up, which is why it retries forever

20:15 <millerthegorilla> its a rpi4b

20:15 <bgilbert> "networking up" is hard to define, so Ignition just assumes it can wait long enough

20:15 <bgilbert> DHCP?

20:15 <bgilbert> wired Ethernet?

20:16 <millerthegorilla> So, if I just leave it? DHCP should be automatic. If I boot the pi as coreos it is assigned an ip, eventually. It's not quick, but it gets there.

20:16 <bgilbert> I mean, if it's 30 seconds that seems fine, but if it's multiple minutes, something is wrong

20:17 <millerthegorilla> its longer than that, I think, but I will check. The http request isn't blocking then, and so the networking should come up in the background? I will try it again.

20:18 <bgilbert> correct, NetworkManager runs in parallel
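
The retry-until-the-network-appears behavior described above can be sketched as below. This is not Ignition's implementation (Ignition is not written in Python) and the delays are invented for the example; `fetch_with_retry` and `flaky_fetch` are hypothetical names. The point is only that the fetcher never checks whether networking is "up": it keeps retrying with a capped backoff while the network comes up in parallel.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(fetch: Callable[[], T],
                     initial_delay: float = 0.2,
                     max_delay: float = 5.0,
                     sleep: Callable[[float], object] = time.sleep) -> T:
    """Call `fetch` until it succeeds, doubling the wait up to `max_delay`.

    There is deliberately no overall deadline, mirroring "retries forever".
    """
    delay = initial_delay
    while True:
        try:
            return fetch()
        except OSError:
            # Lookup or connection failures are expected before DHCP finishes.
            sleep(delay)
            delay = min(delay * 2, max_delay)


attempts = {"count": 0}


def flaky_fetch() -> bytes:
    """Fail three times, as if DNS were not available yet, then succeed."""
    attempts["count"] += 1
    if attempts["count"] <= 3:
        raise OSError("lookup failed: network not ready")
    return b"file contents"


waits: list[float] = []
# Record the requested waits instead of sleeping so the demo runs instantly.
data = fetch_with_retry(flaky_fetch, sleep=waits.append)
print(attempts["count"], waits, data)
```

A total timeout would be a separate knob layered on top of this loop; the sketch leaves it out on purpose.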

20:20 <uny[m]> grub worked, thanks bgilbert and jlebon I can access the core user now.

20:20 <bgilbert> uny[m]: 🎉

20:21 <jlebon> nice! :)

20:21 <uny[m]> And /sysroot is expanded as expected. Time to get to work.

20:25 <millerthegorilla> looks like it worked. Just a missing overwrite and it should boot, thanks!

20:26 <bgilbert> millerthegorilla: 🎉

20:27 <millerthegorilla> fri night celebrations to all!

20:27 millerthegorilla has quit [Quit: millerthegorilla]

20:28 <dustymabe> 🎉

20:28 <dustymabe> walters: wrote up https://github.com/nmstate/nmstate/pull/2301#issuecomment-1492572838

20:32 <dustymabe> marmijo[m]: around? can you check something in azure for me?

20:33 <marmijo[m]> dustymabe: Yeah. What can I do?

20:34 <dustymabe> when you log in to azure web interface - you're able to see all the resource groups, right?

20:35 <marmijo[m]> normally, yes. Let me log in now

20:36 jpn has joined #fedora-coreos

20:39 <bgilbert> is CoreOS CI known to be having scheduling problems? https://jenkins-coreos-ci.apps.ocp.fedoraproject.org/blue/organizations/jenkins/fedora-coreos-config/detail/PR-2340/1/pipeline

20:40 <marmijo[m]> dustymabe: yes I can see them all

20:51 <jlebon> bgilbert: ughh, i think it's fallout from https://github.com/coreos/coreos-ci-lib/pull/145. testing a fix

20:53 <dustymabe> marmijo[m]: thanks!

20:53 <dustymabe> that's all

20:54 <marmijo[m]> cool, you're welcome!

20:57 brianmcarey[m] has joined #fedora-coreos

20:57 <dustymabe> jlebon: this one should be ready now (I hope): https://github.com/coreos/fedora-coreos-pipeline/pull/847

20:57 * dustymabe be back in 3 min

20:59 <jlebon> dustymabe or bgilbert: mind a review on https://github.com/coreos/coreos-ci-lib/pull/146

21:00 <uny[m]> <bgilbert> "I've seen ISOs with a "boot to..." <- https://github.com/coreos/fedora-coreos-tracker/issues/1453

21:01 <bgilbert> uny[m]: thanks!

21:08 <jlebon> dustymabe: had one comment

21:10 <dustymabe> jlebon: good catch, fixed

21:13 <jlebon> :lgtm:

21:14 <dustymabe> jlebon: do you understand the CI failure?

21:14 <dustymabe> from Codacy?

21:15 plarsen has quit [Quit: NullPointerException!]

21:15 <jlebon> i do not. i've been meaning to ask actually who set this up and what it is

21:16 <dustymabe> oh - so it's not something we added? weird

21:24 gursewak has joined #fedora-coreos

21:28 vgoyal has quit [Quit: Leaving]

21:35 Betal has quit [Quit: WeeChat 3.8]

21:47 gursewak has quit [Ping timeout: 255 seconds]

21:55 jpn has quit [Ping timeout: 255 seconds]

22:25 nalind has quit [Quit: bye]

22:38 Betal has joined #fedora-coreos

22:38 jpn has joined #fedora-coreos

23:25 mheon has quit [Ping timeout: 250 seconds]

23:34 jlebon has quit [Quit: leaving]