Hi Ole,

Ole Holm Nielsen via slurm-users <slurm-users@lists.schedmd.com> writes:

> We would like to put limits on interactive jobs (started by salloc) so
> that users don't leave unused interactive jobs behind on the cluster
> by mistake.
>
> I can't offhand find any configurations that limit interactive jobs,
> such as enforcing a timelimit.
>
> Perhaps this could be done in job_submit.lua, but I couldn't find any
> job_desc parameters in the source code which would indicate if a job
> is interactive or not.
>
> Question: How do people limit interactive jobs, or identify orphaned
> jobs and kill them?
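Regarding the job_submit.lua idea: I haven't tried this myself, but my
understanding is that jobs submitted via salloc (or srun outside an
allocation) arrive in the submit plugin with no batch script attached, so
checking whether job_desc.script is nil or empty might be one way to spot
them.  A completely untested sketch along those lines, where the field
names, the slurm.NO_VAL check and the 480-minute cap are all just my
assumptions and would need verifying against your Slurm version:

  function slurm_job_submit(job_desc, part_list, submit_uid)
     -- Interactive jobs (salloc, or srun outside an allocation) should
     -- have no batch script, so job_desc.script would be nil or empty.
     if job_desc.script == nil or job_desc.script == '' then
        local max_minutes = 480   -- arbitrary cap for interactive jobs
        if job_desc.time_limit == nil or
           job_desc.time_limit == slurm.NO_VAL or
           job_desc.time_limit > max_minutes then
           job_desc.time_limit = max_minutes
           slurm.log_user("interactive job: time limit capped at %d minutes",
                          max_minutes)
        end
     end
     return slurm.SUCCESS
  end

  function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
     return slurm.SUCCESS
  end

Whether that catches every kind of interactive job I don't know, so please
treat it as a starting point rather than something we actually run.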
We would be interested in this too.  Currently we have a very make-shift
solution: a script which simply pipes all running job IDs to 'sjeff'
(https://github.com/ubccr/stubl/blob/master/bin/sjeff) every 30s.  This
produces output like the following:

  Username  Mem_Request  Max_Mem_Use  CPU_Efficiency  Number_of_CPUs_In_Use
  able      3600M        0.94Gn       99.22%          (142.88 of 144)
  baker     8G           0.90Gn       0.60%           (0.02 of 4)
  charlie   varied       32.92Gn      42.54%          (5.96 of 14)
  ...
  == CPU efficiency: data above from Fri 25 Apr 11:17:09 CEST 2025 ==

where efficiencies under 50% are printed in red.  As long as one only has
about a screenful of users, it is fairly easy to spot users with a low CPU
efficiency, whether that is due to idle interactive jobs or to something
else.

Apart from that, we have a partition called 'interactive' which has an
appropriately short MaxTime.  We don't actually lie to our users by saying
that they have to use this partition, but we don't advertise the fact that
they could use any of the other partitions for interactive work.  This is
obviously also even more make-shift :-)

Cheers,

Loris

> Thanks a lot,
> Ole
>
> --
> Ole Holm Nielsen
> PhD, Senior HPC Officer
> Department of Physics, Technical University of Denmark

--
Dr. Loris Bennett (Herr/Mr)
FUB-IT, Freie Universität Berlin
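P.S.  In case it is of any use: the 'interactive' partition is nothing
exotic, just an ordinary partition with a short MaxTime.  In slurm.conf
that would look something like the following (the node list and the times
here are purely illustrative, not our actual values):

  PartitionName=interactive Nodes=node[001-004] Default=NO MaxTime=08:00:00 DefaultTime=01:00:00 State=UP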