dustymabe changed the topic of #fedora-coreos to: Fedora CoreOS :: Find out more at https://getfedora.org/coreos/ :: Logs at https://libera.irclog.whitequark.org/fedora-coreos
mnguyen has quit [Ping timeout: 245 seconds]
mnguyen has joined #fedora-coreos
mnguyen_ has quit [Ping timeout: 268 seconds]
mnguyen_ has joined #fedora-coreos
<bgilbert> justJanne: yes, you can merge multiple trees into the same folder. but the individual files are inlined into the Ignition config. there's no support for tar files.
jpn has quit [Ping timeout: 252 seconds]
bgilbert has quit [Ping timeout: 252 seconds]
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 245 seconds]
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 252 seconds]
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 252 seconds]
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 268 seconds]
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 252 seconds]
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 268 seconds]
gursewak has quit [Ping timeout: 272 seconds]
saqali_ has quit [Remote host closed the connection]
saqali_ has joined #fedora-coreos
jpn has joined #fedora-coreos
saqali_ has quit [Remote host closed the connection]
jpn has quit [Ping timeout: 245 seconds]
gursewak has joined #fedora-coreos
gursewak has quit [Ping timeout: 272 seconds]
gursewak has joined #fedora-coreos
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 252 seconds]
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 240 seconds]
paragan has joined #fedora-coreos
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 245 seconds]
gursewak has quit [Ping timeout: 272 seconds]
gursewak has joined #fedora-coreos
gursewak has quit [Remote host closed the connection]
gursewak has joined #fedora-coreos
gursewak has quit [Ping timeout: 240 seconds]
jpn has joined #fedora-coreos
jpn has quit [Ping timeout: 245 seconds]
jcajka has joined #fedora-coreos
gursewak has joined #fedora-coreos
gursewak has quit [Ping timeout: 240 seconds]
jpn has joined #fedora-coreos
Betal has quit [Ping timeout: 268 seconds]
jcajka has quit [Quit: Leaving]
jpn has quit [Ping timeout: 252 seconds]
jpn has joined #fedora-coreos
fifofonix has joined #fedora-coreos
jpn has quit [Ping timeout: 245 seconds]
jpn has joined #fedora-coreos
ravanelli has joined #fedora-coreos
HappyMan has quit [Quit: HappyMan]
HappyMan has joined #fedora-coreos
jcajka has joined #fedora-coreos
ravanelli has quit [Remote host closed the connection]
ravanelli has joined #fedora-coreos
ravanelli has quit [Ping timeout: 240 seconds]
jpn has quit [Ping timeout: 268 seconds]
vgoyal has joined #fedora-coreos
ravanelli has joined #fedora-coreos
jpn has joined #fedora-coreos
ravanelli has quit [Ping timeout: 244 seconds]
crobinso has joined #fedora-coreos
ravanelli has joined #fedora-coreos
mheon has joined #fedora-coreos
ravanelli has quit [Ping timeout: 260 seconds]
jcajka_ has joined #fedora-coreos
jcajka has quit [Ping timeout: 276 seconds]
<u1106> we just had an update to 36.20220716.3.1 and the serial console is silent
<u1106> I see the setting has changed
<u1106> # cat /proc/sys/kernel/printk
<u1106> 4 4 1 7
<u1106> used to be 7 4 1 7
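The four numbers in /proc/sys/kernel/printk are, per the kernel's sysctl documentation, console_loglevel, default_message_loglevel, minimum_console_loglevel and default_console_loglevel. A minimal Python sketch of reading them, assuming the logged "4417" stands for the four fields 4 4 1 7; the helper names parse_printk and read_printk are made up for illustration:

```python
# Field names follow the kernel's sysctl documentation for kernel.printk.
PRINTK_FIELDS = (
    "console_loglevel",
    "default_message_loglevel",
    "minimum_console_loglevel",
    "default_console_loglevel",
)


def parse_printk(text):
    """Map the four whitespace-separated integers to their field names."""
    values = [int(v) for v in text.split()]
    if len(values) != len(PRINTK_FIELDS):
        raise ValueError(f"expected 4 fields, got {len(values)}")
    return dict(zip(PRINTK_FIELDS, values))


def read_printk(path="/proc/sys/kernel/printk"):
    """Read and parse the live setting (Linux only)."""
    with open(path) as f:
        return parse_printk(f.read())


# Values reported in the log, before and after the update
# (assumption: "4417" was originally "4 4 1 7").
before = parse_printk("7 4 1 7")
after = parse_printk("4 4 1 7")
print(before["console_loglevel"], "->", after["console_loglevel"])
```

Only the first field differs between the two readings, which matches the report that the console went quiet while the other defaults stayed the same.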
<dustymabe> travier[m]: jlebon: mind a review on https://github.com/coreos/fedora-coreos-streams/pull/543
<u1106> is that an intended change?
jcajka_ is now known as jcajka
<u1106> we are using EC2 images
<dustymabe> u1106: we made a change recently to lower the kernel printk level after the boot completes
<u1106> yes thanks looking at those
<u1106> what is the easiest way to revert those on EC2? The commit message mentions debug on the kernel command line would prevent it, but that has other side effects
<u1106> we have a lot of stuff ongoing in "late boot" and we certainly want to see those messages on serial console
<jlebon> you can add a sysctl.d dropin via Ignition that restores it to 7
<u1106> jlebon: thanks, that sounds doable
<jlebon> u1106: see the last bit of the first butane config in this section: https://docs.fedoraproject.org/en-US/fedora-coreos/tutorial-autologin/#_first_ignition_config_via_butane
<jlebon> but s/4/7/ :)
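A sketch of what jlebon's suggestion could look like when the Ignition JSON is generated directly rather than through Butane. The file name 99-printk.conf, the function name, and the spec version 3.3.0 are assumptions chosen for illustration; the storage.files layout and the data URL form of contents.source follow the Ignition config spec.

```python
import json
from urllib.parse import quote


def sysctl_dropin_config(path="/etc/sysctl.d/99-printk.conf", level=7):
    """Build an Ignition config (as a dict) that writes one sysctl drop-in."""
    contents = f"kernel.printk={level}\n"
    return {
        "ignition": {"version": "3.3.0"},  # assumed spec version
        "storage": {
            "files": [
                {
                    "path": path,  # hypothetical file name
                    "mode": 0o644,  # serialized as decimal 420 in JSON
                    # Percent-encoded data URL carrying the file body.
                    "contents": {"source": "data:," + quote(contents)},
                }
            ]
        },
    }


config = sysctl_dropin_config()
print(json.dumps(config, indent=2))
```

Writing a single value to kernel.printk sets only the first field (the console log level), which is the one the update changed.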
ravanelli has joined #fedora-coreos
<dustymabe> walters: jlebon: we should probably add a section in our docs for this (separate from the tutorial)
<dustymabe> also, the tutorial should be changed now??
<jlebon> probably and yes :)
<dustymabe> also.. thinking on it a bit more.. I kind of wish we could lower the log level JUST for the messages that print to the console and not what gets put in the kernel log buffer (dmesg)
* dustymabe brb
ravanelli has quit [Ping timeout: 240 seconds]
vgoyal has quit [Quit: Leaving]
jpn has quit [Ping timeout: 268 seconds]
<walters> u1106: can you please comment in the tracker issue with a summary of your use case?
jpn has joined #fedora-coreos
<u1106> I will do so when I have the modification working, so others can use it, too. (It already works by just editing /etc/sysctl.d/ on the existing instance, but I have not yet tested deploying a new one.)
<u1106> in short the use case is: By default we have the EC2 serial console disabled. So we don't care about spamming. But when the machine does not come up as expected and we cannot log in via ssh, we want as much information as possible. We have no console login.
bgilbert has joined #fedora-coreos
jpn has quit [Ping timeout: 245 seconds]
jpn has joined #fedora-coreos
plarsen has joined #fedora-coreos
ravanelli has joined #fedora-coreos
ravanelli has quit [Ping timeout: 240 seconds]
<walters> u1106: we should still display kernel warnings and fatal errors.
<u1106> at the moment my instance is working fine, so I just wondered why it suddenly is silent
ravanelli has joined #fedora-coreos
<u1106> but when it doesn't work I rather have more messages than fewer
<dustymabe> walters: can you handle updating our docs?
<u1106> the audit messages (although a nuisance under normal circumstances) have been very helpful to understand what is going on if we cannot log in
<u1106> I guess most of the errors we have had were user space. So the kernel would not necessarily create any warning or error for those
<u1106> of course for full logs you always need the disk image so you can get the journal. But often the kernel messages from audit are just enough, so I wouldn't like to miss them. I'm not saying you should revert, I am fine with /etc/sysctl.d/99-foo.conf
<walters> do you agree that https://github.com/coreos/ignition/issues/585 would be even better?
<u1106> walters: this was not a question to me was it? At least I don't get how that issue relates to my question
<walters> if rather than scraping the console on failure, you had a way to configure an ignition config that would do whatever you want (e.g. send the full journal to a remote logging service, or store the logs directly in S3, or whatever you want)
<dustymabe> bgilbert: are we blocked on anything for https://github.com/coreos/fedora-coreos-pipeline/pull/541 ?
<dustymabe> walters: that would only cover first boot failures, right? I think u1106 is concerned with "machine has been up and running fine and is now unresponsive" failures
<bgilbert> dustymabe: the original motivation for that PR hasn't occurred yet, but if we need it for other reasons, the PR itself can land
<bgilbert> in which case I should update the commit message
<bgilbert> lmk
<dustymabe> bgilbert: i guess in a roundabout way I was asking if the original motivation for that PR was blocked on something else
<bgilbert> it is, yes
<dustymabe> ahh, I didn't realize that - waiting on an upstream project release?
<dustymabe> or just ENOTIME?
<bgilbert> it's waiting on coreos-installer, but that's not the root cause
<bgilbert> when I was writing up the announcement, I realized that the current design has a regression
<dustymabe> oh
<bgilbert> if you do want serial console, the current design doesn't let you get the GRUB boot menu over serial
<bgilbert> I need to write up a design for a fix
<dustymabe> cool. thanks for the info
<dustymabe> walters: i'm trying to do some "consolidation" of some code in cosa regarding pushing container images to registries
<u1106> dustymabe: Yes, any kind of failure. As said, normally we have the EC2 serial console completely disabled, so we are not worried about log spam. We only enable it when the machine has failed (or maybe if we are testing some change), and then having a high log level is good.
<dustymabe> walters: is the code in oscontainer.py and cmd-upload-oscontainer for the old style (non-native) format we used to use?
<dustymabe> AFAICT in FCOS we're not using it and we're using `cosa push-container` instead: https://github.com/coreos/fedora-coreos-pipeline/blob/main/jobs/release.Jenkinsfile#L132
<bgilbert> walters: not only would that not cover subsequent boot failures, it wouldn't cover boot failures unrelated to the running of Ignition (e.g. successfully applied a bad config)
vgoyal has joined #fedora-coreos
<bgilbert> u1106: thanks for pointing out this use case
<u1106> I mean one could say trying to understand user space errors from kernel messages is not a valid use case :) But getting the full disk image of a failed machine costs time and (a bit of) money, so if we can skip it we do so.
<bgilbert> u1106: sure, I didn't say we'd act on it. :-) but it's always helpful to better understand how the OS is used
<bgilbert> and sometimes there are tweaks we can make to existing behavior
<bgilbert> e.g. docs in this case
crobinso has quit [Remote host closed the connection]
<walters> dustymabe: correct, both those are rhcos-only format
<dustymabe> walters: and will be dropped in the future?
<walters> dustymabe: definitely
<walters> like a hot potato
<dustymabe> :)
<dustymabe> and... what about the use of the term oscontainer?
<dustymabe> in my head I think it still makes sense (basically the container that has an ostree in it and can be rebased to) even in the newly formatted work you've been doing
<dustymabe> WDYT?
<walters> i don't have a really strong opinion on that
<walters> i've sometimes said "bootable container" or "ostree container" too but, dunno
cyberpear has joined #fedora-coreos
<dustymabe> basically I'm thinking of renaming cmd-push-container to cmd-push-oscontainer (and add a comment to the top of cmd-upload-oscontainer to say it's going away in the future). The rename is to more accurately reflect what the file is doing (pushing the fcos oscontainer (all arches soon) and updating the meta.json).
gursewak has joined #fedora-coreos
<walters> I'd vote for adding e.g. `-deprecated-legacy-format` or something to the old format paths
<walters> but we also need to "ratchet" these types of changes carefully with the pipeline
fifofonix has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<dustymabe> yeah. maybe I can just move the files and symlink the old file to the new file or something. I'd prefer not to have to ratchet changes. We can just remove it when they are no longer used/needed
jcajka has quit [Ping timeout: 245 seconds]
<walters> move and symlink sounds fine to me
<dustymabe> will do
ravanelli has quit [Remote host closed the connection]
crobinso has joined #fedora-coreos
jpn has quit [Ping timeout: 252 seconds]
Betal has joined #fedora-coreos
fifofonix has joined #fedora-coreos
paragan has quit [Quit: Leaving]
ravanelli has joined #fedora-coreos
ravanelli has quit [Ping timeout: 240 seconds]
jpn has joined #fedora-coreos
ravanelli has joined #fedora-coreos
jpn has quit [Ping timeout: 252 seconds]
ravanelli has quit [Remote host closed the connection]
<u1106> jlebon: Creating a /etc/sysctl.d/99-serial.conf file in ignition brings the old console spam back for us. I will add that as a comment on the gitlab issue
<u1106> our dirty(?) secret is we are not using butane. I have never looked into how it works. We are using terraform and have it create the ignition JSON. We have been doing so for years (already before FCOS, in the original Container Linux).
<u1106> Terraform has support for JSON. How it would support butane I have no idea, haven't looked into what format that is.
<u1106> do you think using ignition could cause us any problems? (not limited to this single conf file, but in general)
<dustymabe> u1106: you're free to use Ignition directly - whatever works for you
<dustymabe> I would suggest running it through ignition-validate, though
<dustymabe> butane does validate the config up front for you
jpn has joined #fedora-coreos
<u1106> @dustymabe: Thanks for the hint, we should add this to our CI
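A CI step along those lines could be sketched as follows. It assumes the ignition-validate binary is on PATH, accepts the config file path as its only argument, and exits non-zero when validation fails; the function names and the injectable run parameter are made up for illustration.

```python
import subprocess


def validate_ignition(path, run=subprocess.run):
    """Run ignition-validate on one file and return (ok, output)."""
    result = run(
        ["ignition-validate", path],
        capture_output=True,
        text=True,
    )
    # Assumption: a zero exit status means the config passed validation.
    ok = result.returncode == 0
    return ok, (result.stdout + result.stderr).strip()


def failing_configs(paths, run=subprocess.run):
    """Return the subset of paths that did not pass validation."""
    failed = []
    for path in paths:
        ok, output = validate_ignition(path, run=run)
        if not ok:
            print(f"{path}: {output}")
            failed.append(path)
    return failed
```

A CI job could call failing_configs on the rendered JSON files and fail the build when the returned list is non-empty. The run parameter exists only so the function can be exercised without the real binary.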
mnguyen has quit [Ping timeout: 245 seconds]
mnguyen has joined #fedora-coreos
mnguyen_ has quit [Ping timeout: 268 seconds]
mnguyen_ has joined #fedora-coreos
hiredman has joined #fedora-coreos
<hiredman> I have a fedora-coreos install which I have previously used rpm-ostree to layer some packages over
<hiredman> When I go to use rpm-ostree to layer some additional packages, everything looks fine, rpm-ostree status shows those additional packages as layered, then I reboot, and the new packages disappear but the previously layered packages are still there
jpn has quit [Ping timeout: 268 seconds]
jpn has joined #fedora-coreos
<hiredman> I must just be missing something about the model, but this behavior has left me rather perplexed
<dustymabe> hiredman: what does `sudo rpm-ostree status` show ?
<dustymabe> most likely failed during finalization on the reboot
<hiredman> before the reboot it shows the old layered packages and the new layered packages, after just the old layered packages
<hiredman> I don't see anything like ERROR: or whatever that jumps out at me to explain things
<dustymabe> what does `sudo rpm-ostree status` show now?
<hiredman> just the old layered packages
<dustymabe> usually if there was an error on last boot it will show some info at the top about the error (gets this info from the journal)
<dustymabe> can you post your full journal log somewhere like https://paste.centos.org/ ?
jpn has quit [Ping timeout: 252 seconds]
jpn has joined #fedora-coreos
<bgilbert> u1106: short version is: Butane configs are YAML instead of JSON; Butane runs ignition-validate and also some additional validation; Butane provides some additional syntax for higher-level operations that get translated into Ignition configs
<u1106> no idea whether terraform would support yaml data, need to check that
<bgilbert> u1106: converting Ignition configs to Butane can be done by converting to YAML, adding "variant: fcos" and "version: 1.4.0", and converting camel-case fields to snake case
<bgilbert> and yes, the difficult part here is terraform
jpn has quit [Ping timeout: 245 seconds]
<bgilbert> Butane is just a converter tool that produces Ignition output. you still pass Ignition configs to the machine
<bgilbert> depending on how you use Terraform, Butane may not work for you, and that's fine
<bgilbert> though please feel free to file bugs if there are functionality gaps
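The conversion bgilbert outlines (add "variant: fcos" and "version: 1.4.0", turn camel-case fields into snake case) can be sketched in Python. This is an illustration under assumptions, not a full converter: the override table for size_mib and start_mib is probably not exhaustive, dropping ignition.version is an assumption based on the version moving to the top level, and the result is printed as JSON only to stay within the standard library (a YAML library could be used instead). All function names are made up.

```python
import json
import re

# Fields whose Butane spelling is not a plain camelCase split
# (assumption: this table is not exhaustive).
KEY_OVERRIDES = {"sizeMiB": "size_mib", "startMiB": "start_mib"}


def to_snake(name):
    """Convert one camelCase key to snake_case."""
    if name in KEY_OVERRIDES:
        return KEY_OVERRIDES[name]
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def convert_keys(node):
    """Recursively rename dict keys; values are left untouched."""
    if isinstance(node, dict):
        return {to_snake(k): convert_keys(v) for k, v in node.items()}
    if isinstance(node, list):
        return [convert_keys(item) for item in node]
    return node


def ignition_to_butane(ignition_config):
    """Return a Butane-shaped dict for an Ignition config dict."""
    body = convert_keys(ignition_config)
    # The spec version is stated at the top level in Butane, so drop
    # ignition.version (assumption) and the section itself if now empty.
    ignition_section = body.get("ignition", {})
    ignition_section.pop("version", None)
    if not ignition_section:
        body.pop("ignition", None)
    return {"variant": "fcos", "version": "1.4.0", **body}


# Placeholder config; the key below is not a real SSH key.
example = {
    "ignition": {"version": "3.3.0"},
    "passwd": {
        "users": [{"name": "core", "sshAuthorizedKeys": ["ssh-ed25519 AAAAexamplekey"]}]
    },
}
print(json.dumps(ignition_to_butane(example), indent=2))
```

Any output of a sketch like this would still need to go through Butane itself, since Butane performs its own validation.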
<u1106> of course terraform would support a string containing yaml. But we don't have a fixed string with the whole config, many parts are interpolated by Terraform as you can guess from my snippet in the gitlab comment. Because Terraform has built-in JSON support that is not difficult. How easy or difficult it would be to use yaml I have not evaluated. The code was originally written by someone else and I maintain it
<u1106> only as side responsibility, so I have not made fundamental changes to it.
<fifofonix> terraform does support interpolated yml format very nicely via poseidon/ct provider
<bgilbert> fifofonix: oh yeah, good call
<fifofonix> it is also very composable meaning if you have some security stuff for all hosts that can be one yml snippet
<fifofonix> one yml snippet with interpolations.
<u1106> yeah my web search returned the same URL. Bookmarked. Thanks fifofonix
<fifofonix> the only thing that might be iffy is that this is a self-signed provider presently (well actually I haven't upgraded beyond 0.9)
<fifofonix> hopefully now signed.
plarsen has quit [Remote host closed the connection]
jpn has joined #fedora-coreos
crobinso has quit [Remote host closed the connection]
ravanelli has joined #fedora-coreos
jpn has quit [Ping timeout: 252 seconds]
<hiredman> dustymabe: thank you, sorry for disappearing, I think I figured it out, I have to stop and disable zincati before layering
vgoyal has quit [Quit: Leaving]
<dustymabe> jlebon: are you done poking around in staging?
<dustymabe> i'm going to start testing out this cosa multi-arch work (related to https://github.com/coreos/coreos-assembler/pull/3015)
ravanelli has quit [Remote host closed the connection]
vgoyal has joined #fedora-coreos
jpn has joined #fedora-coreos
<jlebon> dustymabe: yup, done!
ravanelli has joined #fedora-coreos
jpn has quit [Ping timeout: 268 seconds]
ravanelli has quit [Ping timeout: 255 seconds]
mnguyen has quit [Ping timeout: 245 seconds]
mnguyen_ has quit [Ping timeout: 245 seconds]
mnguyen_ has joined #fedora-coreos
mnguyen has joined #fedora-coreos
vgoyal has quit [Quit: Leaving]
ravanelli has joined #fedora-coreos
mheon has quit [Ping timeout: 245 seconds]
ravanelli has quit [Remote host closed the connection]
ravanelli has joined #fedora-coreos
fifofonix has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
ravanelli has quit [Remote host closed the connection]
ravanelli has joined #fedora-coreos
arnulfo_7 has joined #fedora-coreos