<crobinso>
dustymabe: my experience with qemu-user and binfmt is basically non-existent, beyond packaging work. so any runtime questions I can't help with. no one at rh really works on it either so it's probably better to raise in regular qemu support channels
<dustymabe>
crobinso: thanks for the context - I appreciate it
paragan has quit [Ping timeout: 252 seconds]
<dustymabe>
jlebon ravanelli: need to chat with you briefly when you get a chance
<dustymabe>
it's about building cosa from a git ref versus git commit
<ravanelli>
dustymabe: I'm free now if you have time
<dustymabe>
yep.
rsalveti has quit [Quit: Connection closed for inactivity]
<dustymabe>
basically I'm trying to figure out if we'll ever need to be able to build cosa from a specific commit. The `podman build https://github.com/coreos/coreos-assembler.git#main` syntax we are using doesn't support commits specifically
<dustymabe>
oh hmm. wait let me check one more time (I was using the short hash)
<dustymabe>
ok, yeah, no - it doesn't seem to work
<dustymabe>
so basically if we ever need to do a build and it's not latest in a ref we'd need to make a tag
<dustymabe>
i think that will probably happen rarely enough that it's ok.
<ravanelli>
dustymabe: Yeah, I tried that, but couldn't find a way to use commits either
<dustymabe>
alternatively we modify the code to git checkout the git repo first
<dustymabe>
which is an easy enough fix
<dustymabe>
maybe I should just do that now to prevent race conditions
<dustymabe>
i'll do that
<dustymabe>
sorry for leading you down a wrong path ravanelli
<ravanelli>
dustymabe: The first time I did it that was the path I went, git checkout passing a dir to the build.
<ravanelli>
dustymabe: What about asking about it as a feature request for podman in the future?
<ravanelli>
dustymabe: aa that's ok. I didn't even know it existed, good to know anyway. It is strange to not have commits working
<ravanelli>
tag works fine too
<dustymabe>
I can ask over in podman.. maybe I'm doing something wrong
<ravanelli>
in the test I did, it seems to fetch the commit itself as a file, let's say, and not the commit in the tree.
bgilbert has joined #fedora-coreos
<jlebon>
yeah, i think for our sanity we really should make sure we're building the same commit for all arches. so +1 to workaround it for now but in parallel file an RFE with podman.
<jlebon>
"workaround it" = use git clone && checkout
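A minimal sketch of the clone-and-checkout workaround described above, assuming a POSIX shell with git available. The function name `checkout_commit` and the destination directory are hypothetical, and the `podman build` of the local directory appears only as a commented usage line.

```shell
#!/bin/sh
# Hypothetical helper: clone a repo and pin the working tree to one exact commit.
# usage: checkout_commit <repo-url-or-path> <commit> <dest-dir>
checkout_commit() {
    repo=$1; commit=$2; dest=$3
    git clone --quiet "$repo" "$dest" || return 1
    # Detached checkout of the exact commit (full or short hash).
    git -C "$dest" checkout --quiet --detach "$commit" || return 1
    # Print the resolved full hash so every arch can log what it built.
    git -C "$dest" rev-parse HEAD
}

# Example (not run here): build from the pinned checkout instead of a remote ref.
#   checkout_commit https://github.com/coreos/coreos-assembler.git "$COMMIT" ./cosa-src
#   podman build ./cosa-src
```

Printing the resolved hash makes it easy to compare what each architecture actually built.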
ravanelli has quit [Remote host closed the connection]
Betal has joined #fedora-coreos
jpn has quit [Ping timeout: 245 seconds]
jpn has joined #fedora-coreos
saqali has joined #fedora-coreos
jpn has quit [Ping timeout: 240 seconds]
gursewak has quit [Remote host closed the connection]
gursewak has joined #fedora-coreos
<dustymabe>
jlebon: i'm trying to run a cosa build on the staging cluster.. seems that my pods that get scheduled keep cycling.. were you and saqib working on some similar issue today?
<dustymabe>
reviewed.. I guess we can revisit how to apply it properly in git next week?
<jlebon>
+1 yeah let's
<dustymabe>
ok i'm going to nuke/pave staging
<jlebon>
you should be able to just redefine jcasc
<jlebon>
and respawn jenkins
<dustymabe>
ok
<jlebon>
...maybe :)
Guest24 has joined #fedora-coreos
<dustymabe>
will let you know soon
Guest6641 has joined #fedora-coreos
<dustymabe>
yep still broken
* dustymabe
pulls out the big hammer
<jlebon>
+1
Guest24 has quit [Quit: Client closed]
Guest6641 has quit [Quit: Client closed]
quentin96 has joined #fedora-coreos
Guidon has joined #fedora-coreos
<quentin96>
Hi Guys
<quentin96>
I've got an issue with the systemd network-online.target: I don't see why this target is failing. The issue seems to be caused by NetworkManager-wait-online.service. That unit never starts and I don't know why. The issue is random: some of my AWS instances have it, others don't.
<quentin96>
My version is Fedora CoreOS 36.20220716.3.1
ravanelli has quit [Remote host closed the connection]
<dustymabe>
quentin96: what is in your Ignition config? Must be a service you are creating that pulls in that target
<dustymabe>
what does `sudo journalctl -u NetworkManager-wait-online.service` show you ?
<dustymabe>
jlebon: I think I may have deployed the new staging instance with your change still in place.. how do I tell in the jenkins interface if that setting is still set?
<dustymabe>
if you `systemctl enable --now wg-setup@wg0.service` does the service come up fine?
<quentin96>
dustymabe I don't understand, because with the EXACT same Ignition config, sometimes it starts the unit, sometimes not. After investigation, I found that it's related to my requirement on `network-online.target`. When my unit doesn't start, it's because `network-online.target` is dead, and when my unit starts up correctly, I see `network-online.target` active.
<dustymabe>
quentin96: if you systemctl cat NetworkManager-wait-online.service you'll notice that it's just calling a program that waits 60s for the network to come up and then times out
<dustymabe>
so if networking isn't good for 60s it will fail
<dustymabe>
what does `sudo journalctl -u NetworkManager-wait-online.service` show you?
jpn has quit [Ping timeout: 245 seconds]
<dustymabe>
hmm I guess you showed me that before with systemctl status NetworkManager-wait-online.service
<dustymabe>
still. anything from journalctl?
<quentin96>
We thought about the same issue, regarding NetworkManager.
<quentin96>
Here is the output on a failing server:
<dustymabe>
oh I see. builds it using a buildconfig?
<dustymabe>
yeah I was just planning to use a new x86_64 builder
<jlebon>
it'll build it in the same namespace, and then you should be able to `skopeo copy` it to quay too
<jlebon>
+1
<jlebon>
that'd be nicer yeah
<dustymabe>
jlebon: what do you think about building with `--no-cache`?
<jlebon>
dustymabe: definitely :)
<dustymabe>
part of me thinks we should always `--no-cache` but another part of me thinks of days where we have a lot of commits go into cosa and the big waste --no-cache would be
<jlebon>
that should be our default IMO unless we find out it causes serious performance issues :)
<dustymabe>
I wish there was some sort of --cache-expire=1d
<jlebon>
fair
<dustymabe>
will go with --no-cache for now
<jlebon>
i guess we could implement that manually by passing `--no-cache` only if the last push was X time ago
<jlebon>
+1
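The manual cache-expiry idea above could be sketched roughly as follows, assuming a POSIX shell. The helper name `cache_flag` and the `LAST_PUSH` variable are hypothetical; how the timestamp of the last push is obtained is left open.

```shell
#!/bin/sh
# Hypothetical helper: print "--no-cache" when the last push is older than max_age.
# usage: cache_flag <last_push_epoch> <now_epoch> <max_age_seconds>
cache_flag() {
    last_push=$1; now=$2; max_age=$3
    if [ $((now - last_push)) -gt "$max_age" ]; then
        echo "--no-cache"
    fi
}

# Example (not run here): LAST_PUSH is an assumed epoch timestamp of the last image push.
#   flag=$(cache_flag "$LAST_PUSH" "$(date +%s)" 86400)
#   podman build $flag https://github.com/coreos/coreos-assembler.git#main
```

With a one-day threshold, busy days with many cosa commits reuse the cache, while a build after a quiet stretch starts clean.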
<quentin96>
jlebon I checked the `journalctl -b 0` output and didn't find anything
misuto has quit [Remote host closed the connection]
misuto has joined #fedora-coreos
<quentin96>
jlebon there's no obvious cycle in those logs
<jlebon>
quentin96: hmm sorry, I'm not sure. can you open an issue in https://github.com/coreos/fedora-coreos-tracker with the full Ignition and journal logs in the case where it misbehaves?
<jlebon>
this might not be an FCOS issue, but we can start diagnosing there
<dustymabe>
also I'm interested to know.. if you start 10 instances fresh in the exact same way.. how many of them succeed and how many fail
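One rough way to collect the success/failure count asked about here, assuming a POSIX shell. The helper name `tally_states` and the `HOSTS` list are hypothetical; the function only counts the states fed to it on stdin, one per line.

```shell
#!/bin/sh
# Hypothetical helper: read one unit state per line on stdin, print "ok=N failed=M".
# Any state other than "active" is counted as a failure.
tally_states() {
    ok=0; failed=0
    while IFS= read -r state; do
        if [ "$state" = "active" ]; then
            ok=$((ok + 1))
        else
            failed=$((failed + 1))
        fi
    done
    echo "ok=$ok failed=$failed"
}

# Example (not run here): HOSTS is an assumed space-separated list of instance addresses.
#   for host in $HOSTS; do
#       ssh -n "core@$host" systemctl is-active network-online.target
#   done | tally_states
```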
jpn has joined #fedora-coreos
<quentin96>
jlebon dustymabe thank you so much for your help, I will do that Monday and will post an issue with the full logs and details.
<quentin96>
Thanks a lot and have a good weekend!
<jlebon>
quentin96: have a good weekend!
Guidon has quit [Ping timeout: 252 seconds]
fifofonix has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]