tardyp changed the topic of #buildbot to: A Software Freedom Conservancy Project | Buildbot-3.2.0 | docs: http://docs.buildbot.net/current/ | tutorial: http://docs.buildbot.net/current/tutorial | irclogs: https://libera.irclog.whitequark.org/buildbot
koobs has quit [Ping timeout: 258 seconds]
koobs has joined #buildbot
koobs has quit [Quit: koobs]
koobs has joined #buildbot
zware has quit [Quit: No Ping reply in 180 seconds.]
zware has joined #buildbot
aakashjain has joined #buildbot
<aakashjain> For a multi-master setup, is it possible to have masters on different machines (sharing the same database over the network, and loading the same buildbot config)? Are there any known issues with such a setup?
aakashjain has quit [Remote host closed the connection]
aakashjain has joined #buildbot
<p12tic> aakashjain: Just out of interest, why do you want to do that? Is there some resource limit that a master is hitting on a single machine?
<aakashjain> p12tic: Yeah (most likely), one of our buildbot instances (using multi-master) is running under heavy load. Web pages load slowly, and simple build steps are also visibly slow sometimes. I was thinking of moving the webserver master to a separate VM in order to speed up page loading
<p12tic> heavy load, how much is that?
<aakashjain> p12tic: how can I quantify that? I noticed that web pages load slowly (it varies, but sometimes takes 10-60 seconds), and simple build steps are also visibly slow sometimes
<p12tic> I mean, how many builds are running concurrently and how large are the logs created by them
<aakashjain> p12tic: between 200 and 300 builds concurrently, and the logs are somewhat large
<p12tic> right, that's significant load
<p12tic> a single buildbot master is effectively constrained to a single CPU core, so as long as there are more cores on the machine than there are buildbot masters it should not be slower than a separate machine
<p12tic> one potential optimization would be to send logs from workers in larger chunks. the chunking is currently hardcoded in the worker as the CHUNK_LIMIT, BUFFER_SIZE and BUFFER_TIMEOUT variables
<p12tic> I think it may make sense to bump BUFFER_TIMEOUT to something like 30 and see what happens in your case
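To illustrate why a larger BUFFER_TIMEOUT could reduce load on the master, here is a minimal sketch of size-or-timeout buffering. It is a toy model, not the buildbot-worker implementation: the function `count_flushes`, its event format, and the assumption that the timeout is measured in seconds from the first buffered byte are all hypothetical and chosen only for illustration.

```python
def count_flushes(events, buffer_size, buffer_timeout):
    """Count how many messages a size-or-timeout buffer would send.

    events: list of (timestamp_seconds, num_bytes) tuples in time order.
    A flush happens when the buffered bytes reach buffer_size, or when
    buffer_timeout seconds have passed since the first buffered byte.
    This is a hypothetical model, not buildbot-worker's actual code.
    """
    flushes = 0
    buffered = 0
    first_ts = None  # arrival time of the oldest byte still buffered

    for ts, nbytes in events:
        # The timeout for previously buffered data expired before this event.
        if buffered and ts - first_ts >= buffer_timeout:
            flushes += 1
            buffered = 0
        if buffered == 0:
            first_ts = ts
        buffered += nbytes
        # The size threshold was reached: send immediately.
        if buffered >= buffer_size:
            flushes += 1
            buffered = 0

    # Whatever is left is sent when the command finishes.
    if buffered:
        flushes += 1
    return flushes


# A build step that prints 100 bytes per second for one minute.
slow_log = [(t, 100) for t in range(60)]
print(count_flushes(slow_log, 64 * 1024, 5))   # 12 messages
print(count_flushes(slow_log, 64 * 1024, 30))  # 2 messages
```

In this model, a step that trickles out output is dominated by the timeout rather than the size threshold, so raising the timeout cuts the number of messages the master has to process, at the cost of log output appearing later in the web UI.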
<p12tic> I remember you were trying the buildbot profiler, did you get any useful data out of that?
_whitelogger has joined #buildbot
<aakashjain> p12tic: I didn't get much useful data from the profiler. I sent some profiles to tardyp_, and he indicated that the master is somewhat overloaded with the stdout log management. I added another master and moved a few workers to that master; it helped somewhat, but not much.
<aakashjain> Thanks for the suggestion about CHUNK_LIMIT, BUFFER_SIZE and BUFFER_TIMEOUT
<aakashjain> I guess this is the BUFFER_TIMEOUT you are referring to: https://github.com/buildbot/buildbot/blob/master/worker/buildbot_worker/runprocess.py#L274
<aakashjain> How do I change this on workers? (there are a few hundred bots)
<aakashjain> Do I have to re-compile the buildbot-worker package on each bot (or maybe directly change /Library/Python/2.7/site-packages/buildbot_worker/runprocess.py on the bots)?
<p12tic> You could change it directly in the python file
<p12tic> at the start you could just do this for the workers connected to the most loaded master and then you could see whether it improves the situation there
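With a few hundred bots, editing the file by hand does not scale, so the direct edit suggested above could be scripted. The sketch below is one possible approach, not an official Buildbot tool: the helper `set_buffer_timeout` is hypothetical, and it assumes the constant appears in runprocess.py exactly once as a plain `BUFFER_TIMEOUT = <integer>` assignment. It refuses to change anything if that assumption does not hold.

```python
import re

# Assumption: the constant is defined on its own line as an integer literal.
_PATTERN = re.compile(r"^([ \t]*BUFFER_TIMEOUT[ \t]*=[ \t]*)\d+([ \t]*)$",
                      re.MULTILINE)


def set_buffer_timeout(source, new_timeout):
    """Return `source` with the BUFFER_TIMEOUT assignment set to new_timeout.

    Raises ValueError unless exactly one matching assignment is found,
    so an unexpected file layout is never silently rewritten.
    """
    def replace(match):
        # Keep the original indentation and spacing, swap only the number.
        return match.group(1) + str(int(new_timeout)) + match.group(2)

    patched, count = _PATTERN.subn(replace, source)
    if count != 1:
        raise ValueError("expected exactly one BUFFER_TIMEOUT assignment, "
                         "found %d" % count)
    return patched


# Example on an in-memory snippet (illustrative, not the real file contents).
sample = "class RunProcess:\n    BUFFER_SIZE = 64 * 1024\n    BUFFER_TIMEOUT = 5\n"
print(set_buffer_timeout(sample, 30))
```

To apply it, read runprocess.py on the bot, pass the text through the helper, keep a backup of the original, and write the result back. The worker has to be restarted before the new value is used, and reinstalling or upgrading the buildbot-worker package will overwrite the edit.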
<aakashjain> p12tic: sounds good, Thanks!
tflink_ is now known as tflink
sknebel has quit [Remote host closed the connection]
sknebel has joined #buildbot
gmcdonald has quit [Ping timeout: 246 seconds]
aakashjain has quit [Remote host closed the connection]
aakashjain has joined #buildbot
aakashjain has quit [Read error: Connection reset by peer]
aakashjain has joined #buildbot
aakashjain has quit [Ping timeout: 252 seconds]