<aakashjain>
rjarry: how is %s used? would it be a parameter to the script?
<rjarry>
you should replace it with the number of days until which you want to keep stuff
<rjarry>
you can start with 365
<rjarry>
1 year$
<aakashjain>
rjarry: yeah. I have a copy of the database, on that I tried just the first delete statement from your script (DELETE FROM sourcestamps WHERE created_at < (SELECT max(created_at) - (365 * 24 * 60 * 60) FROM sourcestamps);), and it didn't finish it ~3 hours. This is even though sourcestamps table is just few mbs. Not sure why it was taking so long (cpu usage was at 100% by postgres during that time). That's why I was wondering if I will
<aakashjain>
be able to run the complete script in one go, when I wasn't able to run a single command.
<aakashjain>
rjarry: could it really lead to inconsistency if I execute each of the DELETE, UPDATE statements one by one sequentially, instead of inside a BEGIN.. COMMIT block?
aakashja_ has joined #buildbot
aakashjain has quit [Ping timeout: 265 seconds]
<rjarry>
aakashja_: maybe you can try to tune the postgres server params
<rjarry>
I managed to drastically improve performance by using hugepages
<rjarry>
also, by improving overall cache sizes
<rjarry>
how old is your database ?
gmcdonal1 has joined #buildbot
_whitelogger has joined #buildbot
deepy has joined #buildbot
<aakashja_>
rjarry: ~4 years
<aakashja_>
rjarry: which specific cache parameters?
<p12tic>
note that for hugepages support it must be enabled in the kernel and potentially preallocated
<p12tic>
tardyp: Could you please create a repository at buildbot/trac.buildbot.net or similar? I don't have permissions for that
<p12tic>
I've created a html archive of the old content of trac.buildbot.net from a DB dump
<p12tic>
Hopefully we can point trac.buildbot.net to the archive and avoid broken links which can sometimes be useful
<p12tic>
the archive contains only the tickets, the wiki was already broken and pointing to our github half of the time
<aakashja_>
rjarry: tardyp p12tic: would be there any inconsistency or other problem if I delete old data from some tables, but not from others (e.g.: deleting old data from builds, buildsets, steps etc, but not deleting from logs, logchunks tables)
<p12tic>
we have cascade delete enabled, so deleting stuff from builds will delete all related data that refers to them
<p12tic>
yes, at least in theory. if you dump the DB schema you should see that tables contain "ON DELETE CASCADE" on foreign keys
bdbaddog has joined #buildbot
<aakashja_>
p12tic: nice. In general if I delete old data from some tables, but not others (especially skip logs, logchunks), can it lead to any major issue?
<p12tic>
depends on which table. for example logs and log chunks are safe because it will appear to buildbot that steps just didn't produce any logs