Re: [slurm-users] User limits for multiple associated accounts

2018-05-11 Thread Chris Samuel
On Friday, 11 May 2018 11:15:49 PM AEST Mahmood Naderan wrote: > Excuse me... I see the output of squeue which says > 170 IACTIVE bash mahmood PD 0:00 1 (AssocGrpMemLimit) > > I don't understand why the memory limit is reached? That's based on what your job requests, not what is

Re: [slurm-users] How to check if there's a reservation

2018-05-11 Thread Chris Samuel
Hey Prentice, On Friday, 11 May 2018 6:23:06 AM AEST Prentice Bisbal wrote: > They would like to have their submission framework automatically > detect if there's a reservation that may interfere with their jobs, and > act accordingly. As an additional data point there is also srun's "--test-onl

Re: [slurm-users] Historical License Usage by Jobs

2018-05-11 Thread Chris Samuel
On Saturday, 12 May 2018 12:47:29 AM AEST Barry Moore wrote: > This works perfectly, I appreciate the pointer. Great to hear! My pleasure. -- Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

Re: [slurm-users] Issue with salloc

2018-05-11 Thread Chris Samuel
On Saturday, 12 May 2018 3:35:39 PM AEST Mahmood Naderan wrote: > Although I specified one compute node in an interactive partition, the > salloc doesn't ssh to that node. salloc doesn't do that. We use a 2 line script called "sinteractive" to do this, it's really simple. #!/bin/bash exec srun

[slurm-users] Issue with salloc

2018-05-11 Thread Mahmood Naderan
Hi, Although I specified one compute node in an interactive partition, the salloc doesn't ssh to that node. See below [mahmood@rocks7 ~]$ scontrol show partition IACTIVE PartitionName=IACTIVE AllowGroups=ALL AllowAccounts=em1 AllowQos=ALL AllocNodes=rocks7 Default=NO QoS=N/A DefaultTime=

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-11 Thread Eric F. Alemany
Hi Chris, Thank you for your comments. I will look at EasyBuild. There are quite a few options to automate the creation of software modules. I will be doing lots of reading this weekend. By the way, I signed up to the Beowulf mailing list. Thank you, Eric

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-11 Thread Eric F. Alemany
Hi John, Regarding NFS shares and Python, and plenty of other packages too, pay attention to where the NFS server is located on your network. The NFS server should be part of your cluster, or at least have a network interface on your cluster fabric. If you perhaps have a home directory server wh

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-11 Thread Eric F. Alemany
Hi Miguel, Thank you for your comment. That sounds pretty straightforward. You never had issues with programs relying on the system files or relying on the home directory location? Thanks, Eric

[slurm-users] GrpTRESRunMins and submitting to multiple partitions

2018-05-11 Thread Nathan R.M. Crawford
Hi All, I'm trying out using GrpTRESRunMins to prevent users from opportunistically flooding an empty partition with long jobs. We have a partition set up for each CPU type, and give each association (account/user/partition) a separate limit based on that account's share of the partition. It

Re: [slurm-users] Distribute jobs in similar nodes in the same partition

2018-05-11 Thread Antonio Lara
Thank you all for your answers, I will research some more along these lines! Any other opinion is welcome. Regards, Antonio On 11/05/18 at 16:05, Vicker, Darby (JSC-EG311) wrote: I’ll second that – we have a cluster with 4 generations of nodes. We assign a processor type feature to e

Re: [slurm-users] Historical License Usage by Jobs

2018-05-11 Thread Barry Moore
Chris, This works perfectly, I appreciate the pointer. Thanks, Barry On Fri, May 11, 2018 at 06:19:40PM +1000, Chris Samuel wrote: > On Friday, 11 May 2018 4:54:32 AM AEST Barry Moore wrote: > > > Is it possible to track all jobs which requested a specific license? I am > > using Slurm 16.05.6

Re: [slurm-users] How to check if there's a reservation

2018-05-11 Thread Douglas Jacobsen
A feature that many Slurm users might like is sbatch --time-min. Using both --time-min and --time, a user can specify the range of acceptable wall time limits. This can make it much easier to keep jobs running right up to the maintenance reservation. e.g.: sbatch --time-min=30:00 --time=48:00:
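The idea behind combining --time-min and --time can be illustrated with a small Python sketch. This is not Slurm's scheduler code, only a simplified model of the reasoning: given the time remaining before a reservation starts, choose the longest wall time inside the acceptable range that still fits. The function name `pick_walltime` and the example values (30 minutes minimum, 48 hours maximum) are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def pick_walltime(now: datetime,
                  reservation_start: datetime,
                  time_min: timedelta,
                  time_max: timedelta) -> Optional[timedelta]:
    """Return the longest wall time in [time_min, time_max] that ends
    before the reservation starts, or None if even time_min does not fit.

    Hypothetical helper: a simplified model, not what Slurm itself runs.
    """
    available = reservation_start - now
    if available >= time_max:
        # The full requested limit fits before the reservation.
        return time_max
    if available >= time_min:
        # Shrink the limit so the job ends right at the reservation start.
        return available
    # Not even the minimum acceptable wall time fits.
    return None


if __name__ == "__main__":
    now = datetime(2018, 6, 11, 6, 0, 0)
    maintenance = datetime(2018, 6, 12, 6, 0, 0)
    chosen = pick_walltime(now, maintenance,
                           time_min=timedelta(minutes=30),
                           time_max=timedelta(hours=48))
    print(chosen)
```

With only a fixed --time, a job asking for 48 hours in this situation would have to wait until after the reservation; with a range, it can start now under a shortened limit.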

Re: [slurm-users] impact of changing SelectTypeParameters?

2018-05-11 Thread Vicker, Darby (JSC-EG311)
In the “other ramifications” category, if you aren’t already planning to, you might consider making this change during a maintenance period when all jobs are drained. We tried to change from SelectType=select/linear to SelectType=select/cons_res on the fly once (via “scontrol reconfig”) and di

Re: [slurm-users] How to check if there's a reservation

2018-05-11 Thread Paul Edmon
In the past we used the Lua job submit plugin to block jobs that would intersect maintenance reservations. I would look at that. -Paul Edmon- On 05/11/2018 08:19 AM, Bill Wichser wrote: The problem is that reservations can be in there yet have no effect on the submitted job if they would run

Re: [slurm-users] Distribute jobs in similar nodes in the same partition

2018-05-11 Thread Vicker, Darby (JSC-EG311)
I’ll second that – we have a cluster with 4 generations of nodes. We assign a processor type feature to each node and require the users to ask for at least one of those features in their jobs via job_submit.lua – see the code below. For a job that can run on any processor type, you can use thi

Re: [slurm-users] User limits for multiple associated accounts

2018-05-11 Thread Mahmood Naderan
Excuse me... I see the output of squeue which says 170 IACTIVE bash mahmood PD 0:00 1 (AssocGrpMemLimit) I don't understand why the memory limit is reached? I cannot see the memory usage of a running job from sacct commands. However, using "top" on the compute node, I see 6 cores

Re: [slurm-users] --uid , --gid option is root only now :'(

2018-05-11 Thread John Marshall
On 10 May 2018, at 19:48, Christopher Benjamin Coffey wrote: > We noticed that recently --uid, and --gid functionality changed where > previously a user in the slurm administrators group could launch jobs > successfully with --uid, and --gid , allowing for them to submit jobs as > another use

Re: [slurm-users] How to check if there's a reservation

2018-05-11 Thread Bill Wichser
The problem is that reservations can be in there yet have no effect on the submitted job if they would run before the reservation takes place. One can pull the starting time simply using something like this scontrol show res -o | awk '{print $2}' with output StartTime=2018-06-12T06:00:00 Star
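The check described above, pulling StartTime out of the one-line `scontrol show res -o` output and deciding whether a job would run into it, can be sketched in Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the sample reservation line is made up, and the check ignores which nodes the reservation covers.

```python
import re
from datetime import datetime, timedelta
from typing import List

# Matches the StartTime=YYYY-MM-DDTHH:MM:SS field of each reservation line.
START_RE = re.compile(r"StartTime=(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})")


def reservation_starts(scontrol_output: str) -> List[datetime]:
    """Extract every reservation start time from `scontrol show res -o` text."""
    return [datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
            for stamp in START_RE.findall(scontrol_output)]


def overlaps_reservation(now: datetime,
                         walltime: timedelta,
                         starts: List[datetime]) -> bool:
    """True if a job starting now with this wall time would still be
    running when a future reservation begins. Reservations that have
    already started are ignored here (a simplification)."""
    end = now + walltime
    return any(now < start < end for start in starts)


if __name__ == "__main__":
    # Made-up sample line in the one-line-per-reservation format.
    sample = ("ReservationName=maint StartTime=2018-06-12T06:00:00 "
              "EndTime=2018-06-12T18:00:00")
    starts = reservation_starts(sample)
    now = datetime(2018, 6, 11, 12, 0, 0)
    print(overlaps_reservation(now, timedelta(hours=24), starts))
    print(overlaps_reservation(now, timedelta(hours=12), starts))
```

In a submission framework the sample string would be replaced by the real command output, for example captured with `subprocess.run`.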

[slurm-users] User limits for multiple associated accounts

2018-05-11 Thread Mahmood Naderan
Hi, I have added a user to multiple partitions. That account name actually corresponds to a set of limitations which I define for a user. [root@rocks7 ~]# sacctmgr list association format=partition,account,user,grptres,maxwall Partition Account User GrpTRES MaxWall -- --

Re: [slurm-users] Distribute jobs in similar nodes in the same partition

2018-05-11 Thread Hadrian Djohari
You can use node features to define the node types in slurm.conf. Then, when submitting the job, use -C to use just those node types. On Fri, May 11, 2018, 5:38 AM Antonio Lara wrote: > Hello everyone, > > Hopefully someone can help me with this, I cannot find in the manual if > this is e

[slurm-users] Distribute jobs in similar nodes in the same partition

2018-05-11 Thread Antonio Lara
Hello everyone, Hopefully someone can help me with this, I cannot find in the manual if this is even possible: I'm a system administrator, and the following question is from the administrator point of view, not the user's point of view: I work with a cluster which has a partition containing

Re: [slurm-users] Memory oversubscription and sheduling

2018-05-11 Thread Chris Samuel
Hey Michael! On Friday, 11 May 2018 1:00:24 AM AEST Michael Jennings wrote: > I'm surprised to hear that; this is the first time I've ever heard > that in regards to SLURM. I'd only ever heard folks complain about > TORQUE having that issue. Hmm, you might well be right, I might have done that

Re: [slurm-users] Historical License Usage by Jobs

2018-05-11 Thread Chris Samuel
On Friday, 11 May 2018 4:54:32 AM AEST Barry Moore wrote: > Is it possible to track all jobs which requested a specific license? I am > using Slurm 16.05.6. I looked through `sacct ... --format=all`, but maybe I > am missing something. I don't think licenses are stored in Slurmdbd by default, I t

Re: [slurm-users] --uid , --gid option is root only now :'(

2018-05-11 Thread Chris Samuel
On Friday, 11 May 2018 4:48:16 AM AEST Christopher Benjamin Coffey wrote: > What was the reasoning in making this change? Do people not trust the folks > in the slurm administrator group to allow this behavior? Seems odd. The change was here: https://github.com/SchedMD/slurm/commit/52086a9bc0ff2

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-11 Thread Chris Samuel
On Friday, 11 May 2018 5:11:38 PM AEST John Hearns wrote: > Eric, my advice would be to definitely learn the Modules system and > implement modules for your users. I will echo that, and the suggestion of shared storage (we use our Lustre filesystem for that). I would also suggest looking at a s

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-11 Thread John Hearns
Regarding NFS shares and Python, and plenty of other packages too, pay attention to where the NFS server is located on your network. The NFS server should be part of your cluster, or at least have a network interface on your cluster fabric. If you perhaps have a home directory server which is a ca

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-11 Thread Miguel Gutiérrez Páez
Hi, I install all my apps on shared storage, and change environment variables (path, vars, etc.) with Lmod. It's very useful. Regards. On Fri, 11 May 2018 at 6:19, Eric F. Alemany wrote: > Hi Lachlan, > > Thank you for sharing your environment. Everyone has their own set of > rules