00:01
crobinso has quit [Remote host closed the connection]
00:24
gursewak has quit [Ping timeout: 240 seconds]
01:53
gursewak has joined #fedora-coreos
02:22
jpn has joined #fedora-coreos
02:28
jpn has quit [Ping timeout: 268 seconds]
03:18
jpn has joined #fedora-coreos
03:32
gursewak has quit [Ping timeout: 240 seconds]
04:15
gursewak has joined #fedora-coreos
04:28
paragan has joined #fedora-coreos
04:42
paragan has quit [Quit: Leaving]
05:50
arnulfo_7 has quit [Read error: Connection reset by peer]
05:50
arnulfo_7 has joined #fedora-coreos
05:50
arnulfo_7 has quit [Changing host]
05:50
arnulfo_7 has joined #fedora-coreos
05:53
arnulfo_7 has quit [Read error: Connection reset by peer]
05:53
arnulfo_7 has joined #fedora-coreos
05:53
arnulfo_7 has quit [Changing host]
05:53
arnulfo_7 has joined #fedora-coreos
06:20
jpn has quit [Ping timeout: 252 seconds]
06:23
jpn has joined #fedora-coreos
06:27
jcajka has joined #fedora-coreos
06:35
paragan has joined #fedora-coreos
06:46
bgilbert has quit [Ping timeout: 268 seconds]
07:28
jpn has quit [Ping timeout: 268 seconds]
07:33
jpn has joined #fedora-coreos
07:42
jpn has quit [Ping timeout: 245 seconds]
08:18
jpn has joined #fedora-coreos
09:23
arnulfo_7 has quit [Read error: Connection reset by peer]
09:23
arnulfo_7 has joined #fedora-coreos
09:27
Betal has quit [Quit: WeeChat 3.6]
09:59
crobinso has joined #fedora-coreos
11:05
jpn has quit [Ping timeout: 268 seconds]
11:37
jpn has joined #fedora-coreos
11:50
jpn has quit [Ping timeout: 252 seconds]
12:00
nalind has joined #fedora-coreos
12:16
jpn has joined #fedora-coreos
12:20
ravanelli has joined #fedora-coreos
12:22
jpn has quit [Ping timeout: 240 seconds]
13:09
<
dustymabe >
the "provided port:50000 is not reachable" looks similar to what we saw in the staging pipeline the other day
13:10
<
dustymabe >
I was going to cycle the jenkins pod anyway (to pick up new secrets I created) so maybe that will help
13:15
aleeku_ has quit [Ping timeout: 268 seconds]
13:16
aleeku_ has joined #fedora-coreos
13:17
jpn has joined #fedora-coreos
13:30
aleeku_ has quit [Ping timeout: 245 seconds]
13:46
aleeku_ has joined #fedora-coreos
14:02
<
dustymabe >
I just disabled the pipeline for now and killed any jobs (that weren't completing anyway because no pods were coming up)
14:44
<
jlebon >
dustymabe: we need to make sure the jenkins is up to date and then cycle it
14:45
<
jlebon >
yup, the k8s-plugin-tweaks.yaml is there. i'm not sure how, I hadn't created it yet when I originally rolled out that PR.
14:46
<
jlebon >
new pod coming up. let's see... hopefully the PVC won't throw it off
14:51
<
jlebon >
ahh sorry, i had already cycled it. but i've got bad news
14:52
<
jlebon >
the kube cloud config wasn't restored, i think because of the PVC
14:53
<
jlebon >
casc runs on every jenkins start, but the auto-cloud configuration happens on the first start only
14:53
<
dustymabe >
hmm. I feel like cycling jenkins didn't render it inoperable in the past?
14:54
<
jlebon >
so i think we need to nuke the PVC. there's a way to retain logs i think.
14:55
<
jlebon >
i think for most things, yes. but in this case, the cloud config added by the s2i run script was clobbered
14:55
<
jlebon >
hmm, let me check something
14:56
<
jlebon >
yup, exactly
14:58
<
dustymabe >
ok, let me know how you want to proceed. we can nuke/pave if needed
15:00
<
dustymabe >
here's my plan once we are ready to roll out the build-cosa changes
15:00
<
dustymabe >
3. oc delete configmap/jenkins-casc-cfg
15:00
<
dustymabe >
4. oc create configmap jenkins-casc-cfg --from-file=jenkins/config
15:00
<
dustymabe >
5. oc scale dc/jenkins --replicas=0
15:00
<
dustymabe >
6. oc scale dc/jenkins --replicas=1
15:11
<
jlebon >
to be clear, there's definitely a chance we lose build logs with this, which would be unfortunate but not a big deal either
15:13
<
dustymabe >
jlebon: honestly I wouldn't mind losing logs every once in a while (I think starting fresh and making sure our steps and code for fresh bringup are accurate is worth the loss)
15:15
<
jlebon >
dustymabe: agreed
15:17
<
dustymabe >
jlebon: let me know when we should proceed
15:32
<
dustymabe >
working on it
15:34
<
dustymabe >
jlebon: updated
15:37
<
jlebon >
with that, the plan above SGTM
15:38
<
dustymabe >
I guess we're now blocked on CI for that PR?
15:38
<
jlebon >
let me try to get 582 ready
15:39
<
jlebon >
in the past, changing the mirroring bits required approval from other owners, but looks like that changed recently
15:40
<
jlebon >
dustymabe: can you insert a step between 5 and 6 to rerun ./deploy?
15:40
<
jlebon >
it's needed for #587
15:43
<
dustymabe >
jlebon: `./deploy --official`?
15:45
<
jlebon >
dustymabe: yup!
15:47
<
jlebon >
"Only merges with author openshift-bot are currently allowed"... interesting
15:47
stephan has quit [Ping timeout: 245 seconds]
15:47
<
jlebon >
let's ask internally about that
15:48
<
dustymabe >
can you tag me in the conversation?
15:51
paragan has quit [Quit: Leaving]
15:51
stephan has joined #fedora-coreos
15:54
jcajka has quit [Quit: Leaving]
15:54
<
dustymabe >
anything else we can do in the meantime?
15:56
<
jlebon >
we could temporarily drop the githubPush() trigger and roll it out now
15:57
<
jlebon >
then add it back in once the openshift/release PR is merged
15:59
<
dustymabe >
but won't anything pushed get clobbered by the syncing done by registry.ci?
15:59
<
dustymabe >
oh actually - let's just set the bot permissions to "read"
15:59
<
dustymabe >
then we can be unblocked, right?
16:00
BobSlept has quit [Quit: You have been kicked for being idle]
16:01
<
jlebon >
well, we would only test it on a side branch not covered by CI. but yeah, flipping the bot perms is nicer.
16:02
<
dustymabe >
ok so new set of steps
16:02
<
dustymabe >
1. change bot perms for openshift_ci_cosa_push to "read"
16:02
<
dustymabe >
3. oc delete configmap/jenkins-casc-cfg
16:02
<
dustymabe >
4. oc create configmap jenkins-casc-cfg --from-file=jenkins/config
16:03
<
dustymabe >
5. oc scale dc/jenkins --replicas=0
16:03
<
dustymabe >
6. ./deploy --official
16:03
<
dustymabe >
7. oc scale dc/jenkins --replicas=1
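The command steps above could be collected into a small script. This is a minimal sketch, not the team's actual tooling: the `run` and `rollout` helpers and the `DRY_RUN` variable are hypothetical names. Step 1 is a manual permissions change outside the shell and step 2 is not spelled out in the log, so neither appears here.

```shell
#!/bin/sh
# Hypothetical wrapper around steps 3-7 from the log.
# DRY_RUN defaults to 1, so commands are only printed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run the steps in order, stopping at the first failure.
rollout() {
    run oc delete configmap/jenkins-casc-cfg &&
    run oc create configmap jenkins-casc-cfg --from-file=jenkins/config &&
    run oc scale dc/jenkins --replicas=0 &&
    run ./deploy --official &&
    run oc scale dc/jenkins --replicas=1
}

rollout
```

Run as-is, it only echoes the five commands; setting `DRY_RUN=0` would execute them against whatever project `oc` is currently logged in to.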
16:06
<
dustymabe >
I'm at step 2 (just completed)
16:06
<
dustymabe >
i'll note before I execute further steps that the sync-stream-metadata job is having trouble starting pods
16:07
<
dustymabe >
jlebon: expected?
16:07
<
dustymabe >
ahh I think the answer is yes
16:07
<
jlebon >
yes, expected
16:07
<
dustymabe >
i.e. that's why I need to run deploy again
16:07
<
dustymabe >
continuing
16:10
<
dustymabe >
ok I completed all the steps
16:10
<
dustymabe >
let's see if the sync-stream-metadata pods come up now
16:11
<
dustymabe >
still seeing "provided port:50000 is not reachable" errors
16:11
<
jlebon >
hmm no, the cloud config is still missing
16:12
<
jlebon >
grrr. used the wrong var name.
16:13
<
jlebon >
though actually, we do want that one too, so i'll just leave it ;)
16:13
<
jlebon >
working on a patch
16:13
<
dustymabe >
after this do I need to start over at step 3 or step 5?
16:14
<
jlebon >
no wait, i did type it correctly
16:14
<
jlebon >
hmm, it's like deploy didn't apply the change
16:14
<
jlebon >
were you on the latest git main?
16:15
<
jlebon >
$ oc get dc jenkins -o yaml | grep OVERRIDE_PV_CONFIG_WITH_IMAGE_CONFIG
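The check above can be wrapped so it reports a clear result instead of silent grep output. A minimal sketch: `override_status` is a hypothetical helper name, and it assumes the variable shows up as a `name:` entry in the YAML that `oc get dc jenkins -o yaml` prints.

```shell
# Hypothetical helper: reads DeploymentConfig YAML on stdin and reports
# whether the OVERRIDE_PV_CONFIG_WITH_IMAGE_CONFIG env var is declared.
override_status() {
    if grep -q 'name: OVERRIDE_PV_CONFIG_WITH_IMAGE_CONFIG'; then
        echo "present"
    else
        echo "missing"
    fi
}

# Intended usage (requires a logged-in oc session, so it is left commented out):
#   oc get dc jenkins -o yaml | override_status
```

A result of "missing" would match what the log goes on to find: the change was never applied to the DeploymentConfig.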
16:18
<
jlebon >
oh right of course
16:18
<
jlebon >
jenkins.yaml isn't handled by deploy
16:18
<
dustymabe >
hand edit?
16:19
<
jlebon >
dustymabe: let me do it
16:19
<
jlebon >
lucab: sure, will do
16:20
<
jlebon >
dustymabe: new pod coming up
16:23
<
dustymabe >
interesting..
16:23
<
dustymabe >
only the seed job remains :)
16:23
<
dustymabe >
expected?
16:24
<
jlebon >
so if i'm right
16:24
<
jlebon >
once we seed, all logs should magically be there
16:24
ravanelli has quit [Remote host closed the connection]
16:25
<
dustymabe >
shall I run or you?
16:25
<
jlebon >
sadly not. oh well! :)
16:25
<
jlebon >
ran it already :)
16:25
<
dustymabe >
looks like you did!
16:25
<
dustymabe >
i'm running the build-cosa job!
16:28
<
lucab >
aaradhak davdunc dustymabe gursewak jaimelm jbrooks jcajka jdoss jlebon jmarrero lorbus miabbott nasirhm ravanelli saqali skunkerk walters
16:28
<
lucab >
FCOS community meeting in #fedora-meeting-1
16:31
mnguyen has joined #fedora-coreos
16:36
aaradhak has joined #fedora-coreos
16:39
crobinso has quit [Remote host closed the connection]
16:41
ravanelli has joined #fedora-coreos
16:41
bgilbert has joined #fedora-coreos
16:53
<
dustymabe >
(sorry for the non-public link)
17:09
mnguyen_ has joined #fedora-coreos
17:23
Betal has joined #fedora-coreos
17:28
ravanelli has quit [Remote host closed the connection]
17:44
<
dustymabe >
jlebon: can you help me with the webhook for COSA?
17:48
<
dustymabe >
actually I think I just added it - let's see if it works
17:48
<
jlebon >
dustymabe: it should be auto-added
17:48
<
dustymabe >
auto-added by what?
17:49
<
dustymabe >
the jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org one
17:49
<
jlebon >
i think it's done every X period or on some events or something
17:49
<
jlebon >
but you can ask it manually too
17:50
<
jlebon >
on the jenkins configuration page
17:50
<
dustymabe >
should I delete what I just created?
17:50
<
jlebon >
sure, and i'll tickle it
17:50
<
dustymabe >
deleted
17:51
<
dustymabe >
ok I see it now
17:51
<
dustymabe >
are all the other hooks in there needed?
17:51
<
jlebon >
i wonder why the coreos-ci one has issue_comment too
17:52
<
jlebon >
actually, the app.ci ones no. but let's leave them for now until we're sure we're not reverting the release PR
17:52
<
dustymabe >
ok i'm going to go eat lunch
18:25
jpn has quit [Ping timeout: 268 seconds]
18:46
aaradhak has quit [Quit: Connection closed for inactivity]
18:57
jpn has joined #fedora-coreos
19:11
ravanelli has joined #fedora-coreos
19:47
jpn has quit [Ping timeout: 268 seconds]
19:48
jpn has joined #fedora-coreos
19:53
jpn has quit [Ping timeout: 252 seconds]
19:53
jpn has joined #fedora-coreos
20:24
jpn has quit [Ping timeout: 268 seconds]
20:25
nalind has quit [Quit: bye]
20:25
jpn has joined #fedora-coreos
20:30
jpn has quit [Ping timeout: 268 seconds]
20:33
<
dustymabe >
jlebon: another option is that we just autotrigger builds (webhook) for `main` and require manual build for the other branches
20:37
<
jlebon >
dustymabe: not ideal, but that works, yeah
20:37
<
jlebon >
i'm confused why it's only spawning a single job. but anyway, even if it spawned for all branches, we still have the PVC problem
20:51
<
dustymabe >
jlebon: are you triggering the jobs manually?
20:51
<
jlebon >
i haven't so far. i was testing stuff by redelivering webhook events from the github UI
20:57
jpn has joined #fedora-coreos
20:59
<
dustymabe >
I have to head out for now
20:59
<
dustymabe >
will catch back up later
21:31
gursewak has quit [Ping timeout: 240 seconds]
21:33
ravanelli has quit [Remote host closed the connection]
21:35
jpn has quit [Ping timeout: 268 seconds]
21:47
gursewak has joined #fedora-coreos
21:56
ravanelli has joined #fedora-coreos
21:59
jpn has joined #fedora-coreos
22:06
jpn has quit [Ping timeout: 252 seconds]
22:16
ravanelli has quit [Remote host closed the connection]
22:20
jpn has joined #fedora-coreos
22:32
jpn has quit [Ping timeout: 268 seconds]
22:45
jpn has joined #fedora-coreos
22:52
jpn has quit [Ping timeout: 240 seconds]
22:58
jpn has joined #fedora-coreos
23:02
jpn has quit [Ping timeout: 252 seconds]
23:17
jpn has joined #fedora-coreos
23:23
jpn has quit [Ping timeout: 252 seconds]
23:32
gursewak has quit [Remote host closed the connection]
23:32
gursewak_ has joined #fedora-coreos
23:39
mnguyen has quit [Ping timeout: 268 seconds]
23:39
mnguyen has joined #fedora-coreos
23:40
mnguyen_ has quit [Ping timeout: 268 seconds]
23:40
mnguyen_ has joined #fedora-coreos
23:54
jpn has joined #fedora-coreos
23:56
ravanelli has joined #fedora-coreos
23:58
jpn has quit [Ping timeout: 252 seconds]