Hey Lachlan,

Can you specify how and where you set the walltime, and which factor you use in the accounting system to deprioritise jobs?

Thanks, Nadav

On 27/05/2018 11:34, Lachlan Musicman wrote:
On 27 May 2018 at 18:23, Nadav Toledo <nadavtol...@cs.technion.ac.il> wrote:
Hello forum,

I have been trying to deal with idle sessions for some time and haven't found a solution I am happy with.
The scenario is as follows: users run jupyter-lab through srun (which is fine, and even encouraged by me) on an image-processing cluster with GPUs.

The problem is that I want some way to email users or cancel their job if their session has been idle for X hours.

The w command and xprintidle cannot be used, since both work with ssh sessions but not with Slurm (I checked).

Writing a script is not as easy as one might think. If I run a script in the admin user's scope, I later need to figure out which idle GPU belongs to which Slurm job.
Running a script in the user's scope is probably a better idea, but how? A crontab entry runs even when the user is not logged in, so how can I force users to run something only when the job starts?
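
To make the admin-scope idea concrete, here is a rough Python sketch of the "which idle GPU belongs to which Slurm job" step. It assumes it runs on the compute node itself, that `nvidia-smi` and `scontrol listpids` are available there, and that their output has the usual column layout; the function names and the 5% threshold are made up for illustration, not taken from any existing tool.

```python
import subprocess


def parse_gpu_util(text):
    """Parse `nvidia-smi --query-gpu=uuid,utilization.gpu --format=csv,noheader,nounits`."""
    util = {}
    for line in text.strip().splitlines():
        uuid, pct = (field.strip() for field in line.split(","))
        if pct.isdigit():  # skip values such as "[N/A]"
            util[uuid] = int(pct)
    return util


def parse_gpu_pids(text):
    """Parse `nvidia-smi --query-compute-apps=gpu_uuid,pid --format=csv,noheader`."""
    pids = {}
    for line in text.strip().splitlines():
        uuid, pid = (field.strip() for field in line.split(","))
        pids.setdefault(uuid, []).append(int(pid))
    return pids


def parse_listpids(text):
    """Parse `scontrol listpids` (columns: PID JOBID STEPID LOCALID GLOBALID)."""
    jobs = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit():  # skips the header row
            jobs[int(fields[0])] = fields[1]
    return jobs


def idle_gpu_jobs(util, gpu_pids, pid_jobs, threshold=5):
    """Return job ids whose GPU processes all sit on GPUs below `threshold` percent."""
    busy, seen = set(), set()
    for uuid, pids in gpu_pids.items():
        for pid in pids:
            job = pid_jobs.get(pid)
            if job is None:
                continue  # process was not started through Slurm
            seen.add(job)
            # A GPU with unknown utilisation is treated as busy, to be safe.
            if util.get(uuid, 100) >= threshold:
                busy.add(job)
    return sorted(seen - busy)


def snapshot():
    """Take one sample on the current node and return the idle-looking job ids."""
    def run(*cmd):
        return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

    util = parse_gpu_util(run("nvidia-smi", "--query-gpu=uuid,utilization.gpu",
                              "--format=csv,noheader,nounits"))
    gpu_pids = parse_gpu_pids(run("nvidia-smi", "--query-compute-apps=gpu_uuid,pid",
                                  "--format=csv,noheader"))
    pid_jobs = parse_listpids(run("scontrol", "listpids"))
    return idle_gpu_jobs(util, gpu_pids, pid_jobs)
```

A single sample says very little, so a periodic caller would have to count consecutive idle samples per job before emailing anyone or cancelling anything. Note also that a Jupyter session with no kernel currently holding the GPU has no GPU process at all, so it never appears in this mapping and would need a separate check.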

Perhaps some combination of sreport and TRES?

Hmm. We address this with accounting. A tight walltime (40 minutes) means that most jobs run without worrying about walltime, but some will need to set it. The accounting system keeps people honest by making "hogging" of resources bad for a user's job priority, in that their next job will be deprioritised.

Because people know that their next job will be deprioritised if they waste resources, we find our users behave responsibly.
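
The message above does not say which priority factor is involved. Purely as an illustration, and assuming the multifactor priority plugin with the classic fair-share formula F = 2^(-(U/S)/d) as described in the Slurm documentation, the effect of hogging on the next job could be sketched like this. The weight of 10000 and the usage figures are hypothetical, and sites using the Fair Tree algorithm will rank jobs differently.

```python
def fairshare_factor(usage, shares, dampening=1.0):
    """Classic fair-share factor F = 2 ** (-(usage / shares) / dampening).

    usage:  the account's normalised share of recent consumption (0..1)
    shares: the account's normalised entitlement (0..1]
    """
    if shares <= 0:
        return 0.0
    return 2.0 ** (-(usage / shares) / dampening)


def fairshare_priority(usage, shares, weight=10000):
    """Fair-share term of a job's priority; `weight` stands in for PriorityWeightFairshare."""
    return int(weight * fairshare_factor(usage, shares))


# An account entitled to 25% of the cluster:
print(fairshare_priority(0.00, 0.25))  # 10000, nothing used recently
print(fairshare_priority(0.25, 0.25))  # 5000, used exactly its share
print(fairshare_priority(0.50, 0.25))  # 2500, used twice its share
```

The point of the sketch is only the shape of the curve: every multiple of the entitled share that an account consumes halves the fair-share term of its next job, so idle sessions that still count as usage cost their owner queue position later.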

L.


 
