Hi,

I don't think the statement below about --nodes=1 is true. It just means
you want one and not more than one node. This can be important if
multiple cores are requested but the program is not, say, an MPI program.
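For example (just a minimal sketch to illustrate the point; the core
count and the program name 'my_threaded_program' are made up), a script
like the one below asks for eight cores for a single non-MPI program.
Without --nodes=1, Slurm would be free to satisfy --ntasks=8 with cores
spread over several nodes, which a shared-memory program could not use:

#!/bin/bash
# All eight cores must come from a single node.
#SBATCH --nodes=1
#SBATCH --ntasks=8

# 'my_threaded_program' stands in for a shared-memory (non-MPI) program;
# it can only use the cores on the node it is started on.
./my_threaded_program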
You can see which cores a running job is using with

  scontrol show job --detail <job id>

HTH

Loris

Prentice Bisbal <pbis...@pppl.gov> writes:

> Remove this line:
>
> #SBATCH --nodes=1
>
> Slurm assumes you're requesting the whole node. --ntasks=1 should be
> adequate.
>
> On 11/7/19 4:19 PM, Mike Mosley wrote:
>
> Greetings all:
>
> I'm attempting to configure the scheduler to schedule our GPU boxes
> but have run into a bit of a snag.
>
> I have a box with two Tesla K80s. With my current configuration, the
> scheduler will schedule one job on the box, but if I submit a second
> job, it queues up until the first one finishes.
>
> My submit script:
>
> #SBATCH --partition=NodeSet1
> #SBATCH --nodes=1
> #SBATCH --ntasks=1
> #SBATCH --gres=gpu:k80:1
>
> My slurm.conf (the things I think are relevant):
>
> GresTypes=gpu
> SelectType=select/cons_tres
> PartitionName=NodeSet1 Nodes=cht-c[1-4],cph-gpu1 Default=YES
>   MaxTime=INFINITE OverSubscribe=FORCE State=UP
> NodeName=cph-gpu1 CPUs=16 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1
>   RealMemory=257541 Gres=gpu:k80:2 Feature=gpu State=UNKNOWN
>
> My gres.conf:
>
> NodeName=cph-gpu1 Name=gpu Type=k80 File=/dev/nvidia[0-1]
>
> and finally, the results of squeue:
>
> $ squeue
>   JOBID PARTITION   NAME     USER ST  TIME NODES NODELIST(REASON)
>     208  NodeSet1 job.sh jmmosley PD  0:00     1 (Resources)
>     207  NodeSet1 job.sh jmmosley  R  4:12     1 cph-gpu1
>
> Any idea what I am missing or have misconfigured?
>
> Thanks in advance.
>
> Mike
>
> --
> J. Michael Mosley
> University Research Computing
> The University of North Carolina at Charlotte
> 9201 University City Blvd
> Charlotte, NC 28223
> 704.687.7065                                      jmmos...@uncc.edu

--
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.benn...@fu-berlin.de