tardyp changed the topic of #buildbot to: A Software Freedom Conservancy Project | Buildbot-3.2.0 | docs: http://docs.buildbot.net/current/ | tutorial: http://docs.buildbot.net/current/tutorial | irclogs: https://libera.irclog.whitequark.org/buildbot
<tardyp> RP: indeed, it is unfortunate. buildbot is too fast at scheduling builds
<tardyp> one thing that comes to mind is that we should debounce the new buildset events.
<tardyp> if my theory is right, this should fix your symptom
<RP> tardyp: I did file an issue which had an attempt at a patch which is very similar https://github.com/buildbot/buildbot/issues/6285 ! :)
<RP> tardyp: We have been running with my version which does seem to help a lot (although I didn't make it delay the builds at all so we do get some out of order)
<RP> tardyp: confirmed your patch also works, although 1 second doesn't look like quite long enough on our infrastructure. Needs about 3s I suspect
<tardyp> so this might mean the events should be sent at once, rather than in between db access
<tardyp> probably your db is already busy with logs
<tardyp> grep 'added buildset' twistd.log
<tardyp> we can see the performance of adding a buildset. On the metabuildbot infra, the perf is about one buildset per second
<tardyp> well, more like 500ms: it takes 12s to trigger 24 builds.
<tardyp> this is not really expected.
<RP> tardyp: we only add one buildset which contains many different builderids so the logs don't help and it shouldn't be that
<tardyp> RP: ah.. you mean there is only one triggerable, which contains several builders in it?
<RP> tardyp: yes
<tardyp> metabuildbot configures each build with specific properties, so has multiple buildsets
<RP> but it seems to take a while to add all the builders (57 of them)
<tardyp> mmhh, in this case, the codepath is totally different, and the buildrequest events should be sent at the same time.
<RP> tardyp: I think they are sent at the same time but the receiving code sees them with delays between
<tardyp> you are multimaster?
<RP> tardyp: no, just one
<tardyp> then the message queue backend is just very simple, it shouldn't add delays
<RP> tardyp: What I see is that the requests arrive at _maybeStartBuildsOn spread over about 3s or so for the 57 builders. Why, I'm not sure :/
<tardyp> the self.master.data.get in the loop could be a delay
<RP> tardyp: I'll give that a try but the system is live and busy atm so it will have to be when that build completes
<tardyp> ofc
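[note] A minimal sketch of the hypothesis above, that an awaited lookup inside a per-builder loop spreads the handling of builders over time. This is not Buildbot code: `fake_data_get`, `notify_sequentially` and the 20 ms latency are hypothetical stand-ins, and asyncio is used here only to keep the example self-contained.

```python
import asyncio
import time


async def fake_data_get(builderid, latency):
    # Hypothetical stand-in for an awaited per-builder lookup; not Buildbot's API.
    await asyncio.sleep(latency)
    return {"builderid": builderid}


async def notify_sequentially(builderids, latency=0.02):
    """Return, for each builder, how long after the start it was handled."""
    start = time.monotonic()
    seen_at = []
    for builderid in builderids:
        # Each awaited lookup delays every builder that comes after it.
        await fake_data_get(builderid, latency)
        seen_at.append(time.monotonic() - start)
    return seen_at


if __name__ == "__main__":
    for offset in asyncio.run(notify_sequentially(range(5))):
        print(f"handled after {offset:.3f}s")
```

With a sequential loop like this, the last builder is handled roughly (number of builders) x (latency per lookup) after the first event, so even a small per-lookup cost adds up over 57 builders.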
<tardyp> https://github.com/buildbot/buildbot/pull/6286 now contains a better implementation with a real debouncer which waits until there are no more events
<tardyp> the timer is 100ms, because a longer one would impact the test time too much. I have to figure out how to increase it without impacting the test time
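[note] A minimal sketch of a debouncer that waits until no more events arrive, as described above. It is an illustration only, not the implementation in the pull request: the `Debouncer` class, its `notify` method and the `demo` helper are hypothetical names, and it uses asyncio timers to stay self-contained.

```python
import asyncio


class Debouncer:
    """Call `callback` once, `wait` seconds after the most recent event."""

    def __init__(self, wait, callback):
        self.wait = wait
        self.callback = callback
        self._timer = None  # pending asyncio.TimerHandle, if any

    def notify(self):
        # Each new event cancels the pending timer and starts a fresh one,
        # so the callback only fires after `wait` seconds of silence.
        if self._timer is not None:
            self._timer.cancel()
        loop = asyncio.get_running_loop()
        self._timer = loop.call_later(self.wait, self._fire)

    def _fire(self):
        self._timer = None
        self.callback()


async def demo(n_events=5, gap=0.01, wait=0.2):
    """Send a burst of events and count callback calls during and after it."""
    calls = []
    debouncer = Debouncer(wait, lambda: calls.append("start builds"))
    for _ in range(n_events):
        debouncer.notify()
        await asyncio.sleep(gap)
    fired_during_burst = len(calls)
    await asyncio.sleep(wait * 3)  # let the quiet period elapse
    return fired_during_burst, len(calls)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The trade-off raised in the discussion shows up directly in `wait`: a longer quiet period collapses slower bursts into a single callback, but every caller (including tests) then waits at least that long before anything starts.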