[slurm-users] Re: Software builds using slurm

2024-06-10 Thread Cutts, Tim via slurm-users
You have two options for managing those dependencies, as I see it:

  1.  You use SLURM’s native job dependencies, but this requires you to create
a build script specifically for SLURM (a sketch follows this list).
  2.  You use make to submit the jobs and take advantage of the -j flag to
make it run lots of tasks at once; just use a job starter prefix to prefix the
tasks you want run under SLURM with srun.
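
For illustration, a minimal sketch of the first approach, assuming hypothetical
submission scripts jobA.sh, jobB.sh and jobC.sh (--parsable makes sbatch print
only the job ID, which is then fed into --dependency):

#!/bin/bash
# Submit the two independent builds and capture their job IDs
jobA=$(sbatch --parsable jobA.sh)
jobB=$(sbatch --parsable jobB.sh)
# Job C is only released once both A and B have completed successfully
sbatch --dependency=afterok:${jobA}:${jobB} jobC.sh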

The first approach will get the jobs run soonest.  The second approach is a bit 
of a hack, and it means that the dependent jobs don’t get submitted until the 
previous jobs have finished, which isn’t ideal, but it does work, and it meets 
your requirement of having a single build process that works both with and 
without SLURM:


JOBSTARTER=srun -c 1 -t 00:05:00
SLEEP=60

all: jobC.out

clean:
	rm -f job[ABC].out

jobA.out:
	$(JOBSTARTER) sh -c "sleep $(SLEEP); echo done > $@"

jobB.out:
	$(JOBSTARTER) sh -c "sleep $(SLEEP); echo done > $@"

jobC.out: jobA.out jobB.out
	$(JOBSTARTER) sh -c "echo done > $@"

When you want to run it interactively, you set JOBSTARTER to be empty;
otherwise you use some suitable srun command to run the tasks under SLURM.
With srun as the job starter, the above makefile does this:


$ make -j

srun -c 1 -t 00:01:00 sh -c "sleep 60; echo done > jobA.out"

srun -c 1 -t 00:01:00 sh -c "sleep 60; echo done > jobB.out"

srun: job 13324201 queued and waiting for resources

srun: job 13324202 queued and waiting for resources

srun: job 13324201 has been allocated resources

srun: job 13324202 has been allocated resources

srun -c 1 -t 00:01:00 sh -c "echo done > jobC.out"

srun: job 13324220 queued and waiting for resources

srun: job 13324220 has been allocated resources
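
As a small usage note: JOBSTARTER can also be overridden on the make command
line rather than edited in the makefile, since command-line variable
assignments take precedence over assignments in the file (the resource values
in the second example below are arbitrary):

# Run everything locally, without SLURM:
make -j JOBSTARTER=

# Run the tasks under SLURM with a different resource request:
make -j JOBSTARTER="srun -c 4 -t 01:00:00"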

Regards,

Tim

--
Tim Cutts
Scientific Computing Platform Lead
AstraZeneca

Find out more about R&D IT Data, Analytics & AI and how we can support you by
visiting our Service Catalogue.


From: Duane Ellis via slurm-users 
Date: Sunday, 9 June 2024 at 15:50
To: slurm-users@lists.schedmd.com 
Subject: [slurm-users] Software builds using slurm
I have been lurking here for a while hoping to see some examples that would
help, but have not for several months.

We have a Slurm system set up for Xilinx FPGA builds (HDL); I want to use this
for software builds too.

What I seem to see is that Slurm talks about CPUs, GPUs, memory etc. I am
looking for a “run my makefile (or shell script) on any available node”.

In our case we have 3 top-level jobs: A, B and C.
These can all run in parallel and are independent (i.e. the bootloader, the
Linux kernel, and the Linux root file system via buildroot).

Job A (boot) is actually about 7 small builds that are independent.

I am looking for a means to fork n jobs (i.e. jobs A, B and C above) across the
cluster and wait for/collect the standard output and exit status of those n jobs.

Job A would then fork and build 7 to 8 sub-jobs.
When they are done it would assemble the result into what Xilinx calls boot.bin.

Job B is a Linux kernel build.

Job C is buildroot, so there are several (n=50) smaller builds, i.e. bash,
busybox, and other tools like Python for the target; again, each of these can
be executed in parallel.

I really do not want to (and cannot) re-architect my build to be a Slurm-only
build, because it also needs to be able to run without Slurm, i.e. build
everything on my laptop without Slurm present.

In that case the jobs would run serially and take an hour or so; the hope is
that by parallelizing the software build jobs our overall cycle time will
improve.

It would also be nice if the Slurm cluster would adapt to the available nodes
automatically.

Our hope is that we can run our lab PCs as dual boot: they normally boot
Windows, but we can dual-boot them into Linux so they become a compile node and
auto-join the cluster, and the cluster sees them as going offline when somebody
reboots the machine back to Windows.


Sent from my iPhone

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


AstraZeneca UK Limited is a company incorporated in England and Wales with 
registered number:03674842 and its registered office at 1 Francis Crick Avenue, 
Cambridge Biomedical Campus, Cambridge, CB2 0AA.

This e-mail and its attachments are intended for the above named recipient only 
and may contain confidential and privileged information. If they have come to 
you in error, you must not copy or show them to anyone; instead, please reply 
to this e-mail, highlighting the error to the sender and then immediately 
delete the message. For information about how AstraZeneca UK Limited and its 
affiliates may process information, personal data and monitor communications, 
please see our privacy notice at 
www.astrazeneca.com

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: sbatch: Node count specification invalid - when only specifying --ntasks

2024-06-10 Thread George Leaver via slurm-users
Noam,

Thanks for the suggestion but no luck:

sbatch -p multinode -n 80 --ntasks-per-core=1 --wrap="..."
sbatch: error: Batch job submission failed: Node count specification invalid

sbatch -p multinode -n 2 -c 40 --ntasks-per-core=1 --wrap="..."
sbatch: error: Batch job submission failed: Node count specification invalid

sbatch -p multinode -N 2 -n 80 --ntasks-per-core=1 --wrap="..."
Submitted batch job

I guess that the MinNodes=2 in the partition definition is now being enforced
somewhat more strictly, or earlier in the submission process, before it can be
determined that the request will satisfy the constraint.

Regards,
George

--
George Leaver
Research Infrastructure, IT Services, University of Manchester
http://ri.itservices.manchester.ac.uk | @UoM_eResearch


From: Bernstein, Noam CIV USN NRL WASHINGTON DC (USA) 

Sent: 09 June 2024 19:33
To: George Leaver; slurm-users@lists.schedmd.com
Subject: Re: sbatch: Node count specification invalid - when only specifying 
--ntasks

It would be a shame to lose this capability.  Have you tried adding 
`--ntasks-per-core` explicitly (but not number of nodes)?


Noam


-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: sbatch: Node count specification invalid - when only specifying --ntasks

2024-06-10 Thread Loris Bennett via slurm-users
Hi George,

George Leaver via slurm-users  writes:

> Hello,
>
> Previously we were running 22.05.10 and could submit a "multinode" job
> using only the total number of cores to run, not the number of nodes.
> For example, in a cluster containing only 40-core nodes (no
> hyperthreading), Slurm would determine two nodes were needed with
> only:
> sbatch -p multinode -n 80 --wrap=""
>
> Now in 23.02.1 this is no longer the case - we get:
> sbatch: error: Batch job submission failed: Node count specification invalid
>
> At least -N 2 must be used (-n 80 can be added):
> sbatch -p multinode -N 2 -n 80 --wrap=""
>
> The partition config was, and is, as follows (MinNodes=2 to reject
> small jobs submitted to this partition - we want at least two nodes
> requested)
> PartitionName=multinode State=UP Nodes=node[081-245]
> DefaultTime=168:00:00 MaxTime=168:00:00 PreemptMode=OFF PriorityTier=1
> DefMemPerCPU=4096 MinNodes=2 QOS=multinode Oversubscribe=EXCLUSIVE
> Default=NO

But do you really want to force a job to use two nodes if it could in
fact run on one?

What is the use-case for having separate 'uninode' and 'multinode'
partitions?  We have a university cluster with a very wide range of jobs
and essentially a single partition.  Allowing all job types to use one
partition means that the different resource requirements tend to
complement each other to some degree.  Doesn't splitting up your jobs
over two partitions mean that either one of the two partitions could be
full, while the other has idle nodes?

Cheers,

Loris

> All nodes are of the form
> NodeName=node245 NodeAddr=node245 State=UNKNOWN Procs=40 Sockets=2 
> CoresPerSocket=20 ThreadsPerCore=1 RealMemory=187000
>
> slurm.conf has
> EnforcePartLimits   = ANY
> SelectType  = select/cons_tres
> TaskPlugin  = task/cgroup,task/affinity
>
> A few fields from: sacctmgr show qos multinode
> Name|Flags|MaxTRES
> multinode|DenyOnLimit|node=5
>
> The sbatch/srun man page states:
> -n, --ntasks  If -N is not specified, the default behavior is to
> allocate enough nodes to satisfy the requested resources as expressed
> by per-job specification options, e.g. -n, -c and --gpus.
>
> I've had a look through release notes back to 22.05.10 but can't see anything 
> obvious (to me).
>
> Has this behaviour changed? Or, more likely, what have I missed ;-) ?
>
> Many thanks,
> George
>
> --
> George Leaver
> Research Infrastructure, IT Services, University of Manchester
> http://ri.itservices.manchester.ac.uk | @UoM_eResearch
-- 
Dr. Loris Bennett (Herr/Mr)
FUB-IT (ex-ZEDAT), Freie Universität Berlin

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: Software builds using slurm

2024-06-10 Thread Renfro, Michael via slurm-users
At a certain point, you’re talking about workflow orchestration. Snakemake [1] 
and its slurm executor plugin [2] may be a starting point, especially since 
Snakemake is a local-by-default tool. I wouldn’t try reproducing your entire 
“make” workflow in Snakemake. Instead, I’d define the roughly 60 parallel tasks 
you describe among jobs A, B, and C.

[1] https://snakemake.github.io
[2] 
https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/slurm.html
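
For illustration, a rough sketch of how this might be invoked, assuming
Snakemake >= 8 with the snakemake-executor-plugin-slurm installed (the target
name boot.bin is taken from the thread and stands in for whatever rules you
define):

# Local run, no SLURM, using up to 8 local cores:
snakemake --cores 8 boot.bin

# The same workflow dispatched through SLURM, up to 60 jobs in flight:
snakemake --executor slurm --jobs 60 boot.bin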

From: Duane Ellis via slurm-users 
Date: Sunday, June 9, 2024 at 9:51 AM
To: slurm-users@lists.schedmd.com 
Subject: [slurm-users] Software builds using slurm



I have been lurking here for a while hoping to see some examples that would
help, but have not for several months.

We have a Slurm system set up for Xilinx FPGA builds (HDL); I want to use this
for software builds too.

What I seem to see is that Slurm talks about CPUs, GPUs, memory etc. I am
looking for a “run my makefile (or shell script) on any available node”.

In our case we have 3 top-level jobs: A, B and C.
These can all run in parallel and are independent (i.e. the bootloader, the
Linux kernel, and the Linux root file system via buildroot).

Job A (boot) is actually about 7 small builds that are independent.

I am looking for a means to fork n jobs (i.e. jobs A, B and C above) across the
cluster and wait for/collect the standard output and exit status of those n jobs.

Job A would then fork and build 7 to 8 sub-jobs.
When they are done it would assemble the result into what Xilinx calls boot.bin.

Job B is a Linux kernel build.

Job C is buildroot, so there are several (n=50) smaller builds, i.e. bash,
busybox, and other tools like Python for the target; again, each of these can
be executed in parallel.

I really do not want to (and cannot) re-architect my build to be a Slurm-only
build, because it also needs to be able to run without Slurm, i.e. build
everything on my laptop without Slurm present.

In that case the jobs would run serially and take an hour or so; the hope is
that by parallelizing the software build jobs our overall cycle time will
improve.

It would also be nice if the Slurm cluster would adapt to the available nodes
automatically.

Our hope is that we can run our lab PCs as dual boot: they normally boot
Windows, but we can dual-boot them into Linux so they become a compile node and
auto-join the cluster, and the cluster sees them as going offline when somebody
reboots the machine back to Windows.


Sent from my iPhone

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] scontrol create partition fails

2024-06-10 Thread Long, Daniel S. via slurm-users
Hi,

I need to temporarily dedicate one of our compute nodes to a single account.
To do this, I was going to create a new partition, but I'm running into an error
where

scontrol create partition

outputs "scontrol: error: Invalid input: partition  Request aborted" regardless
of what parameters I give it. As far as I can tell, this should be allowed; the
man page for scontrol has a whole section titled "Partitions - Specifications
for Create, Update, and Delete Commands". What am I missing?

Also, is there a better way to approach this? This is really just a one or two 
day thing and I'm a little surprised there's no easy way to cordon off a node 
for a user or a project without spinning up an entire partition. Am I missing 
something obvious?

Thanks for any help you can provide.

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: scontrol create partition fails

2024-06-10 Thread Schneider, Gerald via slurm-users
Hi Daniel,

you can create a reservation for the node for the said account.
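
For illustration, a rough sketch of what that might look like (the node,
account and reservation names here are hypothetical):

scontrol create reservation ReservationName=dedicated_projA \
    StartTime=now Duration=2-00:00:00 \
    Nodes=node042 Accounts=projA

# Jobs from that account then request the reservation explicitly:
sbatch --reservation=dedicated_projA job.sh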

Regards,
Gerald Schneider


--
Gerald Schneider

Fraunhofer-Institut für Graphische Datenverarbeitung IGD
Joachim-Jungius-Str. 11 | 18059 Rostock | Germany
Tel. +49 6151 155-309 | +49 381 4024-193 | Fax +49 381 4024-199
gerald.schnei...@igd-r.fraunhofer.de | www.igd.fraunhofer.de

From: Long, Daniel S. via slurm-users 
Sent: Montag, 10. Juni 2024 15:03
To: slurm-us...@schedmd.com
Subject: [slurm-users] scontrol create partition fails

Hi,

I need to temporarily dedicate one of our compute nodes to a single account.
To do this, I was going to create a new partition, but I'm running into an error
where

scontrol create partition

outputs "scontrol: error: Invalid input: partition  Request aborted" regardless
of what parameters I give it. As far as I can tell, this should be allowed; the
man page for scontrol has a whole section titled "Partitions - Specifications
for Create, Update, and Delete Commands". What am I missing?

Also, is there a better way to approach this? This is really just a one or two 
day thing and I'm a little surprised there's no easy way to cordon off a node 
for a user or a project without spinning up an entire partition. Am I missing 
something obvious?

Thanks for any help you can provide.

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: scontrol create partition fails

2024-06-10 Thread Long, Daniel S. via slurm-users
This looks perfect. Thank you very much.

From: Schneider, Gerald via slurm-users 
Sent: Monday, June 10, 2024 9:14 AM
To: slurm-us...@schedmd.com
Subject: [slurm-users] Re: scontrol create partition fails

Hi Daniel,

you can create a reservation for the node for the said account.

Regards,
Gerald Schneider


--
Gerald Schneider

Fraunhofer-Institut für Graphische Datenverarbeitung IGD
Joachim-Jungius-Str. 11 | 18059 Rostock | Germany
Tel. +49 6151 155-309 | +49 381 4024-193 | Fax +49 381 4024-199
gerald.schnei...@igd-r.fraunhofer.de | www.igd.fraunhofer.de

From: Long, Daniel S. via slurm-users 
Sent: Montag, 10. Juni 2024 15:03
To: slurm-us...@schedmd.com
Subject: [slurm-users] scontrol create partition fails

Hi,

I need to temporarily dedicate one of our compute nodes to a single account.
To do this, I was going to create a new partition, but I'm running into an error
where

scontrol create partition

outputs "scontrol: error: Invalid input: partition  Request aborted" regardless
of what parameters I give it. As far as I can tell, this should be allowed; the
man page for scontrol has a whole section titled "Partitions - Specifications
for Create, Update, and Delete Commands". What am I missing?

Also, is there a better way to approach this? This is really just a one or two 
day thing and I'm a little surprised there's no easy way to cordon off a node 
for a user or a project without spinning up an entire partition. Am I missing 
something obvious?

Thanks for any help you can provide.

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Issue about selecting cpus for optimization

2024-06-10 Thread Purvesh Parmar via slurm-users
Hi,

We have a 16-node cluster of DGX A100 (80 GB) systems.

We have 128 cores of each node separated into a partition for CPU-only jobs,
and 8 GPUs plus 128 cores in another partition for CPU+GPU jobs.

We want to ensure that only the selected 128 cores are part of the CPU
partition (for NUMA symmetry / optimization). How can we achieve this?

Would the Cores parameter in gres.conf help?
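
For reference, a sketch of what such gres.conf entries might look like; the
node names, device paths and core ranges below are purely illustrative and
would have to match the real NUMA layout (e.g. as reported by nvidia-smi topo -m):

# gres.conf (sketch): bind each GPU to the cores on its local NUMA domain
NodeName=dgx[01-16] Name=gpu Type=a100 File=/dev/nvidia[0-3] Cores=0-63
NodeName=dgx[01-16] Name=gpu Type=a100 File=/dev/nvidia[4-7] Cores=64-127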

Regards,

Purvesh

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] srun hostname - Socket timed out on send/recv operation

2024-06-10 Thread Arnuld via slurm-users
I have two machines. When I run "srun hostname" on one machine (it's both a
controller and a node) then I get the hostname fine, but I get a socket timed
out error in these two situations:

1) "srun hostname" on 2nd machine (it's a node)
2) "srun -N 2 hostname" on controller

"scontrol show node" shows both mach2 and mach4. "sinfo" shows both nodes
too.  Also the job gets stuck forever in CG state after the error. Here is
the output:

$ srun -N 2 hostname
mach2
srun: error: slurm_receive_msgs: [[mach4]:6818] failed: Socket timed out on
send/recv operation
srun: error: Task launch for StepId=.0 failed on node hpc4: Socket
timed out on send/recv operation
srun: error: Application launch failed: Socket timed out on send/recv
operation
srun: Job step aborted


Output from "squeue", 3 seconds apart:

Tue Jun 11 05:09:56 2024
 JOBID PARTITION NAME USER ST   TIME  NODES
NODELIST(REASON)
   poxo hostname   arnuld  R   0:19  2
mach4,mach2

Tue Jun 11 05:09:59 2024
 JOBID PARTITION NAME USER ST   TIME  NODES
NODELIST(REASON)
   poxo hostname   arnuld CG   0:20  1 mach4

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com