That's great, thanks. We were thinking about staging it like that, and using days is simpler to trigger than waiting for the month.
We will also need to increase innodb_lock_wait_timeout first so we don't hit the problems described in https://bugs.schedmd.com/show_bug.cgi?id=4295.

Anyone know why sreport would suddenly take so much longer in the first place, though?

Many thanks,

Luke

--
Luke Sudbery
Architecture, Infrastructure and Systems
Advanced Research Computing, IT Services
Room 132, Computer Centre G5, Elms Road

Please note I don’t work on Monday.

> -----Original Message-----
> From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of ole.h.niel...@fysik.dtu.dk
> Sent: 23 February 2021 13:24
> To: slurm-users@lists.schedmd.com
> Subject: Re: [slurm-users] Slurmdbd purge settings
>
> On 2/23/21 1:25 PM, Luke Sudbery wrote:
> > We have suddenly got bad performance from sreport: querying a 1 hour
> > period (in the last 24 hours) for TopUsage went from taking under a
> > minute to timing out after the 15 minutes max slurmdbd query time –
> > although the SQL query on the DB server continued long after that.
> >
> > So firstly we were wondering what might have caused that.
> >
> > But while investigating we decided we should turn on purging records in
> > slurmdbd.conf, and wanted more detail about when the purge would occur
> > and whether it would lock the database for other Slurm processes. Docs
> > say “The purge takes place at the start of the each purge interval.”
> > But we assume it will also do so on a restart of slurmdbd so we can
> > manage exactly when that happens – is that true? And as we have many
> > years and millions of records to purge we need to know if this will
> > hang all database access, and what kind of outage that is likely to
> > cause.
> >
> > Anyone have experience of enabling purging after the fact?
>
> I worked on progressive database purging a while back and documented it in
> my Slurm Wiki page:
>
> https://wiki.fysik.dtu.dk/niflheim/Slurm_database#setting-database-purge-parameters
>
> Note in particular these recommendations:
>
> A monthly purge operation can be a huge amount of work for a database
> depending on its size, and you certainly want to cut down the amount of
> work required during the purges. If you did not use purges before, it is
> probably a good idea to try out a series of daily purges starting with:
>
> PurgeEventAfter=2000days
> PurgeJobAfter=2000days
> PurgeResvAfter=2000days
> PurgeStepAfter=2000days
> PurgeSuspendAfter=2000days
>
> If this works well over a few days, decrease the purge interval 2000days
> little by little and try again (1800, 1500, etc) until you after many
> iterations come down to the desired final purge intervals.
>
> I hope this helps.
>
> Best regards,
> Ole
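
For planning the stages, here is a minimal Python sketch that lists a descending series of retention values and prints the matching slurmdbd.conf lines for each one. The five Purge*After parameter names and the 2000days starting point are taken from Ole's message; the 365-day final target, the 300-day step, and the helper names are assumptions for illustration only. It does not edit slurmdbd.conf or talk to slurmdbd; applying each stage and checking that the purge went well is still a manual step.

```python
# Parameter names as quoted from the wiki recommendations above.
PURGE_KEYS = [
    "PurgeEventAfter",
    "PurgeJobAfter",
    "PurgeResvAfter",
    "PurgeStepAfter",
    "PurgeSuspendAfter",
]


def staged_purge_days(start_days, target_days, step_days):
    """Return retention values from start_days down to target_days, inclusive."""
    if step_days <= 0:
        raise ValueError("step_days must be positive")
    if target_days > start_days:
        raise ValueError("target_days must not exceed start_days")
    # Step down in fixed decrements, then always finish on the target itself.
    values = list(range(start_days, target_days, -step_days))
    values.append(target_days)
    return values


def render_purge_settings(days):
    """Render the five purge lines for one stage, e.g. PurgeJobAfter=2000days."""
    return "\n".join(f"{key}={days}days" for key in PURGE_KEYS)


if __name__ == "__main__":
    # Hypothetical plan: start at 2000 days, end at 365 days, 300 days per stage.
    for days in staged_purge_days(2000, 365, 300):
        print(f"# stage: keep {days} days")
        print(render_purge_settings(days))
        print()
```

A fixed step is the simplest choice; smaller steps near the end may make sense if the most recent years hold most of the records.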