That's great, thanks. We were thinking about staging it like that, and using days 
is simpler to trigger than waiting for the month.
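
For what it's worth, here is a minimal sketch of how the staged values could be generated, so each stage's slurmdbd.conf lines can be pasted in before the next purge. The parameter names are the ones in Ole's message quoted below. The helper names (purge_stages, render_stage), the 300-day step and the 400-day final target are hypothetical placeholders, not anything we have settled on.

```python
# Purge parameters named in Ole's message quoted below.
PURGE_KEYS = [
    "PurgeEventAfter",
    "PurgeJobAfter",
    "PurgeResvAfter",
    "PurgeStepAfter",
    "PurgeSuspendAfter",
]


def purge_stages(start_days, final_days, step_days):
    """Return the retention values to apply, from start_days down to final_days."""
    if step_days <= 0:
        raise ValueError("step_days must be positive")
    if final_days > start_days:
        raise ValueError("final_days must not exceed start_days")
    stages = list(range(start_days, final_days, -step_days))
    stages.append(final_days)  # always finish exactly on the target
    return stages


def render_stage(days):
    """Render the slurmdbd.conf purge lines for one stage."""
    return "\n".join(f"{key}={days}days" for key in PURGE_KEYS)


if __name__ == "__main__":
    # Assumed values: 2000 is Ole's starting point; 400 and 300 are placeholders.
    for days in purge_stages(2000, 400, 300):
        print(f"# stage: keep {days} days")
        print(render_stage(days))
        print()
```

Each stage would only be applied once the previous purge has completed cleanly; the script just produces the text and does not touch slurmdbd.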

We will also need to increase innodb_lock_wait_timeout first so we don't hit 
the problems described in https://bugs.schedmd.com/show_bug.cgi?id=4295.

Anyone know why sreport would suddenly take so much longer in the first place, 
though?

Many thanks,

Luke

-- 
Luke Sudbery
Architecture, Infrastructure and Systems
Advanced Research Computing, IT Services
Room 132, Computer Centre G5, Elms Road

Please note I don’t work on Monday.

> -----Original Message-----
> From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of
> ole.h.niel...@fysik.dtu.dk
> Sent: 23 February 2021 13:24
> To: slurm-users@lists.schedmd.com
> Subject: Re: [slurm-users] Slurmdbd purge settings
> 
> On 2/23/21 1:25 PM, Luke Sudbery wrote:
> > We have suddenly got bad performance from sreport, querying a 1 hour
> > period (in the last 24 hours) for TopUsage went from taking under a minute
> > to timing out after the 15 minutes max slurmdbd query time – although the
> > SQL query on the DB server continued long after that.
> >
> > So firstly we were wondering what might have caused that.
> >
> > But while investigating we decided we should turn on purging records in
> > slurmdbd.conf, and wanted more detail about when the purge would occur and
> > would it lock the database for other Slurm processes. Docs say “The purge
> > takes place at the start of the each purge interval.” But we assume it
> > will also do so on a restart of slurmdbd so we can manage exactly when
> > that happens – is that true? And as we have many years and millions of
> > records to purge we need to know if this will hang all database access,
> > and what kind of outage that is likely to cause.
> >
> > Anyone have experience of enabling purging after the fact?
> 
> I worked on progressive database purging a while back and documented it in
> my Slurm Wiki page:
> 
> https://wiki.fysik.dtu.dk/niflheim/Slurm_database#setting-database-purge-parameters
> 
> Note in particular these recommendations:
> 
> A monthly purge operation can be a huge amount of work for a database
> depending on its size, and you certainly want to cut down the amount of
> work required during the purges. If you did not use purges before, it is
> probably a good idea to try out a series of daily purges starting with:
> 
> PurgeEventAfter=2000days
> PurgeJobAfter=2000days
> PurgeResvAfter=2000days
> PurgeStepAfter=2000days
> PurgeSuspendAfter=2000days
> 
> If this works well over a few days, decrease the purge interval from
> 2000days little by little and try again (1800, 1500, etc.) until, after
> many iterations, you come down to the desired final purge intervals.
> 
> I hope this helps.
> 
> Best regards,
> Ole
