dustymabe changed the topic of #fedora-coreos to: Fedora CoreOS :: Find out more at https://getfedora.org/coreos/ :: Logs at https://libera.irclog.whitequark.org/fedora-coreos
daMaestro has joined #fedora-coreos
chrish136 has joined #fedora-coreos
misuto has quit [Remote host closed the connection]
misuto has joined #fedora-coreos
vgoyal has quit [Quit: Leaving]
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 240 seconds]
jlebon has quit [Quit: leaving]
misuto has quit [Remote host closed the connection]
misuto has joined #fedora-coreos
misuto has quit [Remote host closed the connection]
misuto has joined #fedora-coreos
ravanelli has quit [Remote host closed the connection]
ravanelli has joined #fedora-coreos
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 240 seconds]
ravanelli has quit [Ping timeout: 240 seconds]
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 240 seconds]
ebbex has quit [Remote host closed the connection]
dustymabe has quit [Ping timeout: 268 seconds]
dustymabe has joined #fedora-coreos
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 256 seconds]
sentenza has quit [Remote host closed the connection]
jpn has joined #fedora-coreos
saschagrunert has joined #fedora-coreos
jpn has quit [Ping timeout: 240 seconds]
bgilbert has quit [Ping timeout: 240 seconds]
jpn has joined #fedora-coreos
misuto has quit [Remote host closed the connection]
misuto has joined #fedora-coreos
daMaestro has quit [Quit: Leaving]
jpn has quit [Ping timeout: 246 seconds]
jpn has joined #fedora-coreos
apiaseck has joined #fedora-coreos
jpn has quit [Ping timeout: 246 seconds]
jcajka has joined #fedora-coreos
c4rt0 has joined #fedora-coreos
apiaseck has quit [Ping timeout: 268 seconds]
c4rt0 is now known as apiaseck
Betal has quit [Quit: WeeChat 3.8]
jpn has joined #fedora-coreos
apiaseck has quit [Ping timeout: 256 seconds]
apiaseck has joined #fedora-coreos
uny[m] has quit [Remote host closed the connection]
ravanelli has joined #fedora-coreos
ravanelli has quit [Ping timeout: 240 seconds]
apiaseck has quit [Ping timeout: 240 seconds]
apiaseck has joined #fedora-coreos
misuto has quit [Remote host closed the connection]
misuto has joined #fedora-coreos
vgoyal has joined #fedora-coreos
plarsen has joined #fedora-coreos
jpn has quit [Ping timeout: 248 seconds]
nalind has joined #fedora-coreos
ravanelli has joined #fedora-coreos
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 240 seconds]
jlebon has joined #fedora-coreos
saschagrunert has quit [Remote host closed the connection]
jpn has joined #fedora-coreos
ravanelli has quit [Remote host closed the connection]
ravanelli has joined #fedora-coreos
ravanelli has quit [Remote host closed the connection]
jcajka has quit [Quit: Leaving]
<travier[m]> https://github.com/fedora-silverblue/issue-tracker/issues/470 > Might impact rollbacks on FCOS as well
<jlebon> travier[m]: i.e. user does a fresh install of FCOS 39 and explicitly deploys FCOS 38 and reboots into it? not sure we need to go out of our way to support that
ravanelli has joined #fedora-coreos
ravanelli has quit [Remote host closed the connection]
ravanelli has joined #fedora-coreos
ravanelli has quit [Remote host closed the connection]
<dustymabe> jlebon: I agree. Though it does often happen where we've switched to FN in FCOS but OKD rebases to FN-1, so this would probably come up there. I'm not saying we should support this (TBH it's almost impossible to support it), just saying it's probably going to happen and we should know how to respond to issues that get reported.
ravanelli has joined #fedora-coreos
<travier[m]> hum, indeed, it's not a rollback, it's a pure downgrade
<jlebon> dustymabe: hmm, I thought OKD referenced its own bootimages, like RHCOS?
<travier[m]> Agree that we don't want to support that. I had misunderstood that as impacting F X+1 -> F X rollbacks like the one we had for F38
<travier[m]> OKD has its own boot images AFAIK
<jlebon> travier[m]: +1
bgilbert has joined #fedora-coreos
<bgilbert> there are going to be a few PRs landing in repo-templates today, so I'll let them accumulate in the downstream repos and merge at the end
<dustymabe> bgilbert: +1
<dustymabe> travier[m]: jlebon: I'm thinking of UPI OKD
<dustymabe> but there are probably documentation steps that tell the user which version of FCOS to grab? I don't know
<dustymabe> the only reason I'm saying something is because I recently hit an issue: https://github.com/okd-project/okd/issues/1607#issuecomment-1553625380
<dustymabe> as a user of OKD I don't necessarily know which version of FCOS it targets so I just grabbed the latest FCOS as a starting point
<travier[m]> yeah, you should not do that :)
<dustymabe> and we (FCOS) don't really tell people how to grab older versions of bootimages, so I maintain it's probably a legit problem
<dustymabe> travier[m]: what should I do? not run OKD on my own hardware? only run IPI?
<travier[m]> OKD has specific boot image versions just like RHCOS/OCP
<travier[m]> You should not pick an arbitrary FCOS image as boot image :)
<dustymabe> well, it's not arbitrary :) - it's whatever the latest is, but yeah - are there docs for this?
<dustymabe> it's definitely possible things have improved in the past few years (I'm working on some old experiences with OKD)
<travier[m]> It's the same for OCP & OKD
<travier[m]> openshift-install coreos print-stream-json | grep '.iso[^.]'
<dustymabe> travier[m]: nice
<dustymabe> I didn't know about that
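The grep above picks ISO locations out of the stream metadata that openshift-install prints. A minimal Python sketch of the same lookup follows. It assumes the JSON follows the CoreOS stream metadata layout (architectures, artifacts, metal, formats, iso, disk, location); the sample document and its URL are made up for illustration and are not real output.

```python
import json


def iso_locations(stream_json: str) -> dict[str, str]:
    """Map each architecture to the URL of its metal ISO, where one is listed."""
    stream = json.loads(stream_json)
    found = {}
    for arch, data in stream.get("architectures", {}).items():
        # Walk the nested objects defensively; any missing level yields {}.
        disk = (
            data.get("artifacts", {})
            .get("metal", {})
            .get("formats", {})
            .get("iso", {})
            .get("disk", {})
        )
        if "location" in disk:
            found[arch] = disk["location"]
    return found


# Hypothetical sample in the assumed layout (not real openshift-install output).
SAMPLE = json.dumps({
    "stream": "stable",
    "architectures": {
        "x86_64": {
            "artifacts": {
                "metal": {
                    "release": "0.0.0",
                    "formats": {
                        "iso": {
                            "disk": {
                                "location": "https://example.invalid/fcos-live.x86_64.iso"
                            }
                        }
                    },
                }
            }
        }
    },
})

print(iso_locations(SAMPLE))
```

To use it on real data, the output of the print-stream-json command could be captured to a file and its contents passed to iso_locations in place of SAMPLE.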
<jlebon> yeah, i think the reason you can get it wrong with OKD is that FCOS builds are public. whereas with RHCOS they're not, so the only thing you can easily do is the right thing.
<travier[m]> yes, I agree that it's not obvious and somewhat hidden
<dustymabe> though I will say that we (FCOS) really don't (or haven't) consider that use case as valid (public API)
<dustymabe> the unofficial builds browser is... unofficial
<travier[m]> Do you mean downgrades?
<dustymabe> travier[m]: no, using an old version of FCOS as a starting point
<travier[m]> We do support that
<dustymabe> emmm.. no :)
<travier[m]> Why would we not support using previous Fedora CoreOS releases to setup nodes?
<dustymabe> it works, but we don't go out of our way
<travier[m]> You've written the test that basically verifies that this works
<dustymabe> i'm channeling my inner bgilbert here
<travier[m]> :)
<bgilbert> \o/
<dustymabe> travier[m]: right. what that test is trying to do is verify that if you had started your node X months ago that it can continue to upgrade
<jlebon> travier[m]: the test is simulating users who installed when those old versions were the latest
<dustymabe> this ^^
<travier[m]> I agree that beyond a certain time frame, things get more complex, but just like RHCOS in OCP, OKD does not support updating boot images
<bgilbert> travier[m]: OKD has intentionally chosen to use a flow that's not supported by FCOS
<travier[m]> That's the same thing here
<dustymabe> it's my understanding that from our "public" stance we could disallow access to old bootimages and that would be in line with our desired level of support
<bgilbert> that's their right, but that choice doesn't transform it to a supported flow
<dustymabe> correct
<dustymabe> to be clear what I'm trying to do by starting this conversation is emphasize this so it's clear at least amongst us
ravanelli has quit [Remote host closed the connection]
<dustymabe> here's an example
<travier[m]> What OKD is doing is exactly what your test is doing
<dustymabe> OKD comes to us and says we need to stay on F37 but we also need Ignition to support "new feature X" in f37
<dustymabe> our answer to that is "no", sorry
<jlebon> travier[m]: though i don't think OKD is barrier-aware, right?
<dustymabe> does that make it more clear?
<travier[m]> If we say that we don't support that then barrier releases don't matter that much indeed then and all the discussion around that goes away and we say "you must update at least once per year"
<dustymabe> travier[m]: no no no
<travier[m]> When did that happen?
<travier[m]> That's not what OKD does
<travier[m]> neither RHCOS
<travier[m]> the MCO downgrades Ignition configs on demand
<travier[m]> to match the Ignition in the boot image
<dustymabe> what we are trying to do with barrier releases is keep existing nodes updating, not allow new nodes deployed with old media to get up to date. It happens to be a side effect. but the first goal is the real reason
<bgilbert> +1
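As a rough illustration of the barrier idea described above, here is a small Python sketch. It is not how the FCOS update graph is actually implemented; the function names and the version strings are hypothetical, and it only assumes that a node must stop at every barrier release newer than its current version before it can reach the latest release.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string into a tuple of ints so it sorts numerically."""
    return tuple(int(part) for part in version.split("."))


def upgrade_path(current: str, latest: str, barriers: list[str]) -> list[str]:
    """Return the releases a node steps through to get from current to latest.

    Assumption for this sketch: every barrier strictly between current and
    latest must be visited in order before moving on.
    """
    cur, last = parse_version(current), parse_version(latest)
    if cur >= last:
        return []  # already up to date
    hops = sorted(
        (b for b in barriers if cur < parse_version(b) < last),
        key=parse_version,
    )
    return hops + [latest]


# Made-up version numbers, for illustration only.
BARRIERS = ["32.2.2.0", "35.5.5.0"]
print(upgrade_path("31.1.1.0", "38.3.3.0", BARRIERS))
print(upgrade_path("36.0.0.0", "38.3.3.0", BARRIERS))
```

A node that started before both barriers takes two intermediate hops, while a node that started after them goes straight to the latest release. The barriers exist for the first kind of node.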
<dustymabe> ok that's a bad example then
<dustymabe> i'm just saying if they needed something new in F37 right now in a bootimage, they wouldn't be able to get it from us.. hence it's not really supported
<travier[m]> I don't understand how that would be different
<dustymabe> what OKD is doing works, but I'm just trying to draw the line and make it more clear
<dustymabe> travier[m]: ok here's another example
<travier[m]> That's incredibly unlikely to happen by design in OCP as we support older boot images
<travier[m]> on updated clusters
<jlebon> travier[m]: that's something we need to fix in OCP too :)
<dustymabe> so if you started on a version < 35 you wouldn't be able to update all the way to latest FCOS
<dustymabe> because of a gpgkey issue
<travier[m]> jlebon: agree!
<dustymabe> if you started on F31 we changed the cincinnati update URL - so you wouldn't get updates either
<travier[m]> note that all of those are pure FCOS issues that don't affect OKD
<dustymabe> right, but we are talking about supporting older bootimages and why we don't do it
<dustymabe> not why "none of those reasons matters and OKD works anyway"
<travier[m]> OK, I see the difference now
<jlebon> that said, even outside OKD, based on incoming issues, there are definitely users who do pin for a bit
<dustymabe> TL;DR it works in some cases, but if you need support for an older bootimage we're probably going to tell you to use latest
<jlebon> i think roughly, if users report upgrade issues, we should help them. most other issues would probably be "use the latest version"
<dustymabe> jlebon: correct. there is a difference between swimming in a pool with a lifeguard or swimming in a pool without one
<dustymabe> it still works without one, but if you have trouble...
<dustymabe> yeah, upgrading is something we do want to support
<dustymabe> it's definitely a subtle difference
<travier[m]> I see now why Colin says that we don't want to support barrier releases as this is the same discussion here
<travier[m]> If we decide that if your image is 2 Fedora releases old, you're not getting auto-updated / you're not guaranteed to update, then we don't need barrier releases.
<bgilbert> travier[m]: that's true for the most common reason we need barrier releases, but it's not true in general, unless we have a different way of running scripts on upgrade
<bgilbert> e.g. the pre-upgrade container idea
<bgilbert> xref the aarch64 bootloader issue we just had
<travier[m]> hum, indeed, that does not work here
<dustymabe> maybe we are talking past each other here
<dustymabe> Ignition: Fetching the Ignition config via the Virtio block driver is currently experimental and subject to change.
<dustymabe> wondering if we should promote that ^^
<dustymabe> to non-experimental
<bgilbert> dustymabe: we still don't have a solution for the race condition problem
<dustymabe> bgilbert: +1 - I wasn't familiar with the details, just was observing we've been using it a while (I assume without issue)
<bgilbert> dustymabe: "without issue" in the sense that we've set a five-minute timeout
<bgilbert> so anyone booting with that provider and _without_ an Ignition config always has to wait five minutes
<bgilbert> (on the first boot)
<dustymabe> interesting
<dustymabe> i guess the cases are few for that (I'm thinking openstack or another cloud platform, maybe IBM Cloud, where you could get an SSH key from a metadata service)?
<dustymabe> well.. here I am thinking about ppc64le only
<dustymabe> we use it for s390x too?
<jlebon> yes
<jlebon> the ignition PR to add this was initially just to allow us to use it in CI. it kinda leaked out though and is now used by users.
<dustymabe> jlebon: can I ask you some questions about ostree autoprune real quick?
<jlebon> sure
<dustymabe> is there a case where it won't prune even though it should?
<dustymabe> this seemed to work on aarch64 yesterday but isn't on ppc64le and I'm wondering if I'm doing something wrong or not
<dustymabe> I know if it does prune it will print a message, but maybe we should print a message too if pruning was requested and considered, but not performed
<bgilbert> dustymabe: I agree that the qemu image is less likely to be used without a config, but that approach would make the ability to omit the config dependent on the platform, which is unexpected
<jlebon> it won't do anything if even with autopruning we'd hit ENOSPC
<jlebon> we could log something indeed in that case
<jlebon> but i'm not sure if that's the case you're hitting
<dustymabe> right. maybe we should even have a log message (for at least the time period that autoprune is experimental) saying autoprune was requested or something
<dustymabe> it's hard to tell right now if the env var is plumbed through correctly OR if the code decided not to prune
<dustymabe> I swear i tested it yesterday :)
<dustymabe> but it was also on the system I was developing the fix, so it's possible something didn't get back into my PR that should have
<dustymabe> #thisiswhywetest
<bgilbert> btw kola caught a legitimate regression in an Ignition dependency update: https://github.com/coreos/ignition/pull/1634
<jlebon> yup, experimental logging sounds fine to me
<dustymabe> bgilbert: nice! this is a win for sure
<jlebon> bgilbert: nice! were you the one to report it?
<bgilbert> no, it was fixed before I got to it
<dustymabe> jlebon: here's the scenario I'm in: https://paste.centos.org/view/36c1fdb2
<dustymabe> [root@cosa-devsh ~]# rpm -q ostree
<dustymabe> ostree-2023.3-1.fc38.ppc64le
<jlebon> how large are the (kernel, initrd) pairs?
<dustymabe> [root@cosa-devsh ostree]# ls -lh */
<dustymabe> fedora-coreos-28983714bf02bf4d0cade8c13e72a487398daebab5ce3059415d7d956edd2dcd/:
<dustymabe> total 112M
<dustymabe> -rw-r--r--. 1 root root 70M May 26 15:46 initramfs-6.2.15-300.fc38.ppc64le.img
<dustymabe> -rwxr-xr-x. 1 root root 43M May 26 15:46 vmlinuz-6.2.15-300.fc38.ppc64le
<dustymabe> fedora-coreos-ba044800cd32c148c49bc3c464d9260d2bf51f8461e7068a3c5aade4593a29b6/:
<dustymabe> total 112M
<dustymabe> -rw-r--r--. 1 root root 70M May 26 15:57 initramfs-6.2.15-300.fc38.ppc64le.img
<dustymabe> -rwxr-xr-x. 1 root root 43M May 26 15:57 vmlinuz-6.2.15-300.fc38.ppc64le
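The behaviour jlebon described earlier (autoprune does nothing if the update would hit ENOSPC even after pruning) can be pictured with a small Python sketch. This is not ostree's code: the function name, its parameters, and the free-space figures are hypothetical, and only the 70M initramfs and 43M kernel sizes come from the listing above.

```python
from enum import Enum


class Decision(Enum):
    NO_PRUNE_NEEDED = "new pair fits without pruning"
    PRUNE = "prune old pair first, then write the new one"
    SKIP_PRUNE = "would hit ENOSPC even after pruning, so do nothing"


def autoprune_decision(free_bytes: int, new_pair_bytes: int, prunable_bytes: int) -> Decision:
    """Decide whether pruning is worthwhile before writing a new (kernel, initrd) pair."""
    if new_pair_bytes <= free_bytes:
        return Decision.NO_PRUNE_NEEDED
    if new_pair_bytes <= free_bytes + prunable_bytes:
        return Decision.PRUNE
    # Pruning would not free enough space, so nothing is pruned.
    return Decision.SKIP_PRUNE


MIB = 1024 * 1024
PAIR = (70 + 43) * MIB  # initramfs + vmlinuz sizes from the listing

# Hypothetical free-space values, for illustration only.
for free_mib, prunable in ((200, PAIR), (50, PAIR), (50, 20 * MIB)):
    print(free_mib, autoprune_decision(free_mib * MIB, PAIR, prunable).value)
```

The third case is the silent one discussed in the channel: pruning was considered but not performed, which is where an extra log message would help tell it apart from the env var not being plumbed through.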
* dustymabe brb - switching to home location
oo has joined #fedora-coreos
vgoyal has quit [Quit: Leaving]
<dustymabe> back
jpn has quit [Ping timeout: 268 seconds]
ravanelli has joined #fedora-coreos
jpn has joined #fedora-coreos
peko[m] has quit [Excess Flood]
Betal has joined #fedora-coreos
jpn has quit [Ping timeout: 256 seconds]
<mhayden> decided to write myself a little blog post on coreos as a "pet" instance: https://major.io/p/coreos-as-pet/
sentenza has joined #fedora-coreos
plarsen has quit [Ping timeout: 250 seconds]
plarsen has joined #fedora-coreos
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 240 seconds]
misuto has quit [Remote host closed the connection]
misuto has joined #fedora-coreos
jpn has joined #fedora-coreos
misuto has quit [Remote host closed the connection]
misuto has joined #fedora-coreos
jpn has quit [Ping timeout: 265 seconds]
plarsen has quit [Ping timeout: 250 seconds]
plarsen has joined #fedora-coreos
<dustymabe> mhayden: look at you!
<mhayden> writin' things and stuff
<dustymabe> mhayden: you could mention typhoon too https://typhoon.psdn.io/
<mhayden> whaaaaaaat? first time hearing about it. usually ended up in k3s
<dustymabe> yeah dghubble maintains it - from what I hear it's pretty solid
<mhayden> kubernetes often just ends up causing me too much frustration for my personal projects. i usually end up back with docker-compose 🙃
<dustymabe> yeah, it's a balance for sure
<dustymabe> it's not really a 2h per week thing (which is what most side projects are)
<fifofonix[m]> i think typhoon is especially appealing if you're already doing a lot via terraform. we've enjoyed it for our higher end needs when we've needed to grow beyond swarm (but we still use swarm a bunch for now).
<dustymabe> jlebon: I think this is what we had discussed: https://github.com/coreos/fedora-coreos-config/pull/2438
jpn has joined #fedora-coreos
<jlebon> thanks!
<dustymabe> though, I wonder if the "starting earlier" part could throw off some of our other tests. Maybe we should by default make the kola systemd units run after say systemd-user-sessions and then allow a tag or something to override that behavior
<jlebon> yeah, it's possible we might've unknowingly taken a dependency on the existing behaviour in other places. maybe let's keep an eye out for other fallout and then do something fancier if it's a nontrivial amount
jpn has quit [Ping timeout: 268 seconds]
<dustymabe> +1
<dustymabe> I enabled automerge on the linked PR above
<quentin9696[m]> Hey guys, I created the PRs to update the doc about wireguard as discussed during the weekly meeting
<quentin9696[m]> I created 2 PRs, 1 to add it, 1 to remove it
<dustymabe> quentin9696[m]: dropped in a review
<dustymabe> quentin9696[m]: for that particular issue the wireguard maintainer is busy and the selinux maintainer might not have enough expertise to drive it forward. If you're motivated you could work with the selinux maintainer or may have to wait for some time
misuto has quit [Remote host closed the connection]
misuto has joined #fedora-coreos
misuto has quit [Remote host closed the connection]
misuto has joined #fedora-coreos
plarsen has quit [Remote host closed the connection]
nalind has quit [Quit: bye for now]
<dustymabe> mhayden: that other PR merged so now I'm unblocked to open a PR to the google guest configs RPM
<dustymabe> let me check with upstairs how much time I have - might be able to whip something up now
apiaseck has quit [Quit: Konversation terminated!]
<jlebon> dustymabe: were you planning to carry https://github.com/GoogleCloudPlatform/guest-configs/pull/51 there?
<dustymabe> yep
<jlebon> +1
<dustymabe> jlebon: ^^
<dustymabe> if that looks good.. we need it in f38 and f39 if possible
* dustymabe has to run upstairs now
jpn has joined #fedora-coreos
gursewak has quit [Ping timeout: 240 seconds]
jpn has quit [Ping timeout: 240 seconds]
<quentin9696[m]> <dustymabe> "quentin9696: dropped in a review" <- thanks, will make the required changes
<quentin9696[m]> <dustymabe> "quentin9696: for that particular..." <- Sure, I can work with them. Where can I contact them?
<dustymabe> quentin9696[m]: you could start by offering up help with a comment in the BZ - it doesn't always work but is one way. In the comment you can ask for advice or say that you'll be in an IRC channel XYZ if they want to talk more real time. https://bugzilla.redhat.com/show_bug.cgi?id=2188714
oo has quit [Ping timeout: 256 seconds]
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 240 seconds]