[slurm-users] Use a portion of resources already allocated for a script

2018-09-20 Thread Michael Lamparski
Hello all, For years I've been looking for what I might consider the holy grail of composable resource allocation in Slurm jobs: * a command that can be run inside of an sbatch script... * ...which immediately and synchronously invokes another sbatch script (which may or may not invoke mpirun) ...
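[There is no such command in stock Slurm, but job steps get part of the way there: srun inside a batch script synchronously runs a command on a subset of the job's allocation. A minimal sketch, assuming an 8-task job and hypothetical inner scripts; in 2018-era Slurm, step-level --exclusive made a step wait for dedicated CPUs (newer releases call this --exact):

    #!/bin/bash
    #SBATCH -n 8
    # Run two steps concurrently, each on a disjoint subset of the
    # allocation; each srun blocks until its step finishes.
    srun -n 2 --exclusive ./inner.sh &
    srun -n 6 --exclusive ./other.sh &
    wait
]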

Re: [slurm-users] Slurm strigger configuration

2018-09-20 Thread Jodie H. Sprouse
Thank you to both Kilian and Chris. I now have this running on the slurm server to report once when any of the nodes goes into "Drain" state: sudo -u slurm bash -c "strigger --set -D -p /etc/slurm/triggers/slurm_admin_notify --flags=perm", where the trigger script runs: /bin/mail -s "ClusterName DrainedNode: $*" our_admin_email_address
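[For anyone reconstructing this setup: strigger invokes the program with the affected node name(s) as its arguments, so the referenced trigger script can be as small as the sketch below (the recipient address is the placeholder from the post). The --flags=perm option keeps the trigger registered after it fires; triggers are otherwise one-shot.

    #!/bin/bash
    # /etc/slurm/triggers/slurm_admin_notify
    # Slurm passes the drained node name(s) as the arguments.
    /bin/mail -s "ClusterName DrainedNode: $*" our_admin_email_address < /dev/null
]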

Re: [slurm-users] Defining constraints for job dispatching

2018-09-20 Thread Renfro, Michael
Partitions have an ExclusiveUser setting. Not exclusive per job, as I'd mis-remembered, but exclusive per user. In any case, none of my few Fluent users run graphically on the HPC. They do their pre- and post-processing on local workstations, copying their .cas.gz and .dat.gz files to the HPC and ...
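[For reference, ExclusiveUser is a per-partition flag in slurm.conf; a minimal sketch, where the partition name and node list are assumptions:

    PartitionName=fluent Nodes=node[01-04] ExclusiveUser=YES State=UP
]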

[slurm-users] priority: 'job size'-factor scaling parameter

2018-09-20 Thread Daan van Rossum
Dear Slurm users, Is there a 'Job Size'-factor equivalent of the 'Job Age'-factor's PriorityMaxAge parameter in Slurm? What I am looking for is a scaling parameter (or threshold parameter) to normalize the job size factor to a value of 1 for '10-core 1-hour' jobs instead of for '1-core 1-minute' ...
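[As far as stock Slurm goes, there is no direct analogue: the size factor is normalized against total cluster size rather than a configurable threshold. A sketch of the knobs that do exist (values are assumptions); the SMALL_RELATIVE_TO_TIME flag is the closest thing to folding time into the size factor, as it divides job size by the time limit:

    # slurm.conf
    PriorityWeightJobSize=1000
    PriorityFavorSmall=NO
    PriorityFlags=SMALL_RELATIVE_TO_TIME
]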

Re: [slurm-users] Dealing with wrong things that users do

2018-09-20 Thread Chris Samuel
On Thursday, 20 September 2018 5:57:56 PM AEST Mahmood Naderan wrote: > It seems that when their fluent job crashes for some reason, or they > decide to close the fluent window without terminating the job, or > close the terminal suddenly, or ... the fluent processes remain on > the node while the ...
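[The archive truncates this reply, but one standard fix for orphaned job processes is cgroup-based tracking, so slurmd can reap everything a job spawned, including detached GUI leftovers. A minimal slurm.conf sketch:

    ProctrackType=proctrack/cgroup
    TaskPlugin=task/cgroup
]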

Re: [slurm-users] Setting up a separate timeout for interactive jobs

2018-09-20 Thread Chris Samuel
On Thursday, 20 September 2018 1:50:39 AM AEST Siddharth Dalmia wrote: > Is it possible to have a separate timeout for interactive jobs? Or can > someone help me come up with a hack to do this? I believe you should be able to catch interactive jobs in the submit filter by looking for the absence of a batch script ...
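[A minimal sketch of that idea as a job_submit/lua plugin: interactive submissions from srun/salloc arrive with no batch script, so job_desc.script is nil. The partition name and the 8-hour cap are assumptions.

    -- job_submit.lua
    function slurm_job_submit(job_desc, part_list, submit_uid)
        if job_desc.script == nil or job_desc.script == '' then
            -- no script means srun/salloc, i.e. interactive
            job_desc.partition = "interactive"
            -- cap the time limit at 480 minutes if unset or larger
            if job_desc.time_limit == slurm.NO_VAL or job_desc.time_limit > 480 then
                job_desc.time_limit = 480
            end
        end
        return slurm.SUCCESS
    end

    function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
        return slurm.SUCCESS
    end
]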

Re: [slurm-users] Setting up a separate timeout for interactive jobs

2018-09-20 Thread Yair Yarom
Hi, We also have multiple partitions, but in addition we use a job submit plugin to distinguish between srun/salloc and sbatch submissions. This plugin forces a specific partition for interactive jobs (and the time limit with it), and using the license system it limits the number of simultaneous interactive ...
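[The license half of that scheme can be done with a cluster-local license in slurm.conf (name and count are assumptions), with a plugin like the sketch above also setting job_desc.licenses = "interactive" for script-less submissions; Slurm then queues interactive jobs beyond the cap:

    # slurm.conf
    Licenses=interactive:10
]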

[slurm-users] Dealing with wrong things that users do

2018-09-20 Thread Mahmood Naderan
Hi, Since users are not experts with cluster environments and Slurm, their mistakes take up my time. It seems that when their fluent job crashes for some reason, or they decide to close the fluent window without terminating the job, or close the terminal suddenly, or ... the fluent processes remain ...
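[Besides cgroup tracking (see Chris's reply above), a node Epilog can sweep up leftovers. A rough sketch, with the caveat that the completing job may still appear in squeue, so the check is approximate; all names are assumptions:

    #!/bin/bash
    # Configured via Epilog= in slurm.conf; runs on each node as a job ends.
    # SLURM_JOB_UID is set in the Epilog environment.
    user=$(id -un "$SLURM_JOB_UID") || exit 0
    # If this user has no other running jobs on the node, kill any
    # fluent processes they left behind.
    if [ -z "$(squeue -h -w "$(hostname -s)" -u "$user" -t R)" ]; then
        pkill -9 -u "$user" fluent || true
    fi
]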

Re: [slurm-users] Defining constraints for job dispatching

2018-09-20 Thread Mahmood Naderan
Hi Michael, Sorry for the late response. Do you mean supplying --exclusive to the srun command? Or do I have to do something else for the partitions? Currently they use

    srun -n 1 -c 6 --x11 -A monthly -p CAT --mem=32GB ./fluent.sh

where fluent.sh is

    #!/bin/bash
    unset SLURM_GTIDS
    /state/partition1/ansys...
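[For the record, job-level --exclusive on that srun line would dedicate the whole node to the job, which is different from the per-user ExclusiveUser partition setting Michael described earlier in the thread; a sketch of the former:

    srun -n 1 -c 6 --x11 --exclusive -A monthly -p CAT --mem=32GB ./fluent.sh
]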