When I changed this on a running system, no jobs were killed, but Slurm lost track of the jobs on the nodes and could neither kill them nor tell when they finished until slurmd on each node was restarted. I let the running jobs complete, monitored them manually, and restarted slurmd on each node as its jobs finished.
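For concreteness, the change and restart sequence was roughly the following (a sketch assuming systemd-managed daemons; the clush invocation and the node[01-64] host list are placeholders for whatever your site uses):

    # slurm.conf, same copy on the controller and all nodes:
    #   old: SelectType=select/cons_res
    #   new: SelectType=select/cons_tres
    # Note: a SelectType change needs daemon restarts;
    # 'scontrol reconfigure' alone is not enough.

    # Restart the controller first:
    systemctl restart slurmctld

    # Then restart slurmd on each node (here via ClusterShell),
    # ideally as each node drains of running jobs:
    clush -w node[01-64] systemctl restart slurmd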
In desperation, you can do it, but it might be better to wait until no jobs (or few jobs) are running.

On Thu, Mar 26, 2020 at 10:40 AM Pär Lindfors <par.lindf...@uppmax.uu.se> wrote:
>
> Hi Nate,
>
> On Fri, 2020-02-21 at 11:38 -0800, Nathan R Crawford wrote:
> > If it just requires restarting slurmctld and the slurmd processes
> > on the nodes, I will be happy! Can you confirm that no running or
> > pending jobs were lost in the transition?
>
> Did you change your SelectType to cons_tres? How did it go?
>
> We need to do the same change on one of our clusters. I have done a few
> tests on a tiny test cluster which so far indicate that changing works
> even with jobs running, but a configuration change with even a small
> risk of purging the job list makes me a little nervous.
>
> Regards,
> Pär Lindfors,
> UPPMAX