Re: [slurm-users] Substituions for "see META file" in slurm.spec file of 15.08.11-1 release

2019-07-09 Thread Fulcomer, Samuel
...and for the SchedMD folks, it would be a lot simpler to drop/disambiguate the "year it was released" first element in the version number, and just use it as an incrementing major version number. On Tue, Jul 9, 2019 at 6:42 PM Fulcomer, Samuel wrote: > Hi Pariksheet, > > To confirm, "14", "15

Re: [slurm-users] Substituions for "see META file" in slurm.spec file of 15.08.11-1 release

2019-07-09 Thread Fulcomer, Samuel
Hi Pariksheet, To confirm, "14", "15", "16", and "17" do not denote major versions. For example, "17.02" and "17.11" are different major versions. Only "MM.NN" denotes a major version. This is somewhat unintuitive, and I've suggested some documentation clarification, but it's still somewhat easily
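The "MM.NN" rule described in this message can be illustrated with a small sketch. The helper name `slurm_major_version` is hypothetical, and the version strings other than "15.08.11-1" (taken from the thread subject) are illustrative only; this is not part of Slurm or its tooling.

```python
def slurm_major_version(version: str) -> str:
    """Return the 'MM.NN' prefix that identifies a Slurm major version.

    Hypothetical helper for illustration; assumes a string shaped like
    'MM.NN.PP' with an optional '-R' release suffix.
    """
    # Drop an optional release suffix such as '-1'.
    base = version.split("-", 1)[0]
    parts = base.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least 'MM.NN', got {version!r}")
    # The first two fields together, not the first field alone, name the major version.
    return ".".join(parts[:2])


# Same leading "17", but two different major versions.
print(slurm_major_version("17.02.0"))
print(slurm_major_version("17.11.0"))
print(slurm_major_version("15.08.11-1"))
```

Comparing on the two-field prefix rather than on the leading year avoids treating "17.02" and "17.11" as the same major release.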

Re: [slurm-users] Jobs waiting while plenty of cpu and memory available

2019-07-09 Thread Burian, John
To emphasize what Thomas wrote: backfill will only be useful if users submit jobs with realistic runtime limits. If every job is submitted with a default runtime of, for example, 7 days, then Slurm will not backfill your small jobs while it waits for the resources for the highest-priority large

Re: [slurm-users] Jobs waiting while plenty of cpu and memory available

2019-07-09 Thread Edward Ned Harvey (slurm)
> From: slurm-users On Behalf Of > Thomas M. Payerle > Sent: Tuesday, July 9, 2019 10:23 AM > > Do you have backfill enabled?  This can help in many cases. Yup - I checked for backfill yesterday. It's backfill. > If the job with highest priority is quite wide, Slurm will reserve resources > f

Re: [slurm-users] Problem with sbatch

2019-07-09 Thread Daniel Torregrosa
I see. Thanks a lot for the in depth explanation. On Tue, 9 Jul 2019 at 16:27, Jeffrey Frey wrote: > "-uid" is a perfectly valid sbatch flag: > > >-u, --usage > Display brief help message and exit. > >-i, --input= > Instruct Slurm to connect the batch

Re: [slurm-users] Problem with sbatch

2019-07-09 Thread Jeffrey Frey
"-uid" is a perfectly valid sbatch flag: -u, --usage Display brief help message and exit. -i, --input= Instruct Slurm to connect the batch script's standard input directly to the file name specified in the "filename pattern".
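The bundling behaviour described above can be reproduced with any POSIX-style short-option parser. The sketch below uses Python's standard `getopt` module purely as an illustration; it is not sbatch's own parser, and the option string "ui:" is an assumption that mirrors only the two flags quoted here (`-u` takes no argument, `-i` takes one).

```python
import getopt


def parse_short_flags(argv):
    """Parse argv with a POSIX-style short-option string.

    "ui:" is an assumed option string: 'u' takes no argument,
    'i' requires one. It is not sbatch's real option table.
    """
    opts, _remaining = getopt.getopt(argv, "ui:")
    return opts


# "-uid" is read as "-u" followed by "-i" with the argument "d".
print(parse_short_flags(["-uid"]))  # [('-u', ''), ('-i', 'd')]
```

Under these assumptions a single-dash "-uid" never reaches the `--uid` handling at all: the parser consumes `-u`, then treats the remaining "d" as the argument to `-i`.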

Re: [slurm-users] Problem with sbatch

2019-07-09 Thread Daniel Torregrosa
Slight correction, it does not look for a file named "d" in the home folder of the user in the (mistyped) -uid parameter, it looks for a file named "d" in the home folder of the user running sbatch. If this is not an expected behaviour, I will make a complete bug report. On Tue, 9 Jul 2019 at 15

Re: [slurm-users] Problem with sbatch

2019-07-09 Thread Daniel Torregrosa
Thanks a lot for your answers again! @Marcus Thanks a lot for the clarification. About --uid, you are correct, I was mistyping it as -uid. But, the behaviour is slightly inconsistent: * If correctly typed (--uid) sbatch properly complains about needing to be root * If not present at all, or ignor

Re: [slurm-users] Hints, Cheatsheets, etc

2019-07-09 Thread Marcus Boden
> > Yeah, on our systems, I get: > Sorry, gawk version 4.0 or later is required. Your version is: GNU Awk > 3.1.7 > (RHEL 6). So this one wasn't as useful for me. But thanks anyway! Just an FYI: Building gawk locally is pretty easy (a simple configure, make, make install), so that might b

[slurm-users] JobState=FAILED Reason=NonZeroExitCode Dependency=(null) ExitCode=1:0

2019-07-09 Thread Robert Kudyba
From this tutorial https://www.brightcomputing.com/blog/bid/174099/slurm-101-basic-slurm-usage-for-linux-clusters I am trying to run the below and it always fails. I've made sure to run 'module load slurm'. What could be wrong? Logs from slurmctld show ok: [2019-07-09T10:19:44.183] prolog_running_

Re: [slurm-users] Jobs waiting while plenty of cpu and memory available

2019-07-09 Thread Thomas M. Payerle
You can use squeue to see the priority of jobs. I believe it normally shows jobs in order of priority, even though it does not display the priority itself. If you want to see the actual priority, you need to request it in the format field. I typically use squeue -o "%.18i %.12a %.6P %.8u %.2t %.8m %.4D %.4C %12l

Re: [slurm-users] Jobs waiting while plenty of cpu and memory available

2019-07-09 Thread Edward Ned Harvey (slurm)
> From: slurm-users On Behalf Of > Ole Holm Nielsen > Sent: Tuesday, July 9, 2019 2:36 AM > > When some jobs are pending with Reason=Priority this means that other > jobs with a higher priority are waiting for the same resources (CPUs) to > become available, and they will have Pending=Resources i

Re: [slurm-users] scavenger partition/qos

2019-07-09 Thread Paul Edmon
a. You pretty much have to roll your own.  We do it with our serial_requeue partition which underlays all our hardware and is at the lower priority. b. I haven't used the suspend function for partition scheduling so I'm not aware of what quirks there are.  We use requeue.  A caution I would h

Re: [slurm-users] Problem with sbatch

2019-07-09 Thread Jeffrey Frey
> So, if I understand this correctly, for some reason, `srun` does not need > root privileges on the computation node side, but `sbatch` does when > scheduling. I was afraid doing so would mean users could do things such as > apt install and such, but it does not seem the case. The critical par

Re: [slurm-users] Job not running of the specified node

2019-07-09 Thread Marcus Wagner
Hi Mahmood, yes, that is totally normal. Please use sbatch instead of salloc. If you use salloc, you just create an allocation. You would normally srun to that allocation. To be clear, salloc does not create a batch job that gets executed on the remote host. After salloc returns (which migh

Re: [slurm-users] Problem with sbatch

2019-07-09 Thread Marcus Wagner
Hi Daniel, I strongly recommend letting SlurmdUser be root. slurmd starts slurmstepd, but without root privileges, as the specific user. That is the program that actually executes the job script. But slurmd needs to be root, e.g. to execute prolog and epilog scripts, which in many cases need

Re: [slurm-users] Hints, Cheatsheets, etc

2019-07-09 Thread Edward Ned Harvey (slurm)
> From: slurm-users On Behalf Of > Ole Holm Nielsen > Sent: Tuesday, July 9, 2019 2:17 AM > > * pestat Prints a Slurm cluster nodes status with 1 line per node and job > info. Yep, using it. :-) Definitely valuable, thanks. The one thing I wish was to have the ability to select which columns

[slurm-users] [Cross post - Slurm, PMIx, UCX] Using srun with SLURM_PMIX_DIRECT_CONN_UCX=true fails with input/output error

2019-07-09 Thread Daniel Letai
Cross posting to Slurm, PMIx and UCX lists. Trying to execute a simple openmpi (4.0.1) mpi-hello-world via Slurm (19.05.0) compiled with both PMIx (3.1.2) and UCX (1.5.0) results in: [root@n1 ~]# SLURM_PMIX_DIRECT_CONN_UCX=true SLURM_PMI

Re: [slurm-users] Problem with sbatch

2019-07-09 Thread Daniel Torregrosa
Thanks a lot for the answers! So, if I understand this correctly, for some reason, `srun` does not need root privileges on the computation node side, but `sbatch` does when scheduling. I was afraid doing so would mean users could do things such as apt install and such, but it does not seem the cas

[slurm-users] Job not running of the specified node

2019-07-09 Thread Mahmood Naderan
Hi, I use the following script for qemu run #!/bin/bash #SBATCH --nodelist=compute-0-1 #SBATCH --cores=8 #SBATCH --mem=40G #SBATCH --partition=QEMU #SBATCH --account=q20_8 USERN=`whoami` qemu-system-x86_64 -m 4 -smp cores=8 -hda win7_sp1_x64.img -boot c -usbdevice tablet -enable-kvm -device e

Re: [slurm-users] Slurm topology.conf file

2019-07-09 Thread Ole Holm Nielsen
On 7/9/19 10:14 AM, Priya Mishra wrote: Hi Ole, I am using slurm emulator and would soon start working with the slurm simulator. I need these larger topology files for the purpose of a project and not actual job scheduling. If there are any suitable resources for me to use, please let me know.

Re: [slurm-users] Slurm topology.conf file

2019-07-09 Thread Priya Mishra
Hi Ole, I am using slurm emulator and would soon start working with the slurm simulator. I need these larger topology files for the purpose of a project and not actual job scheduling. If there are any suitable resources for me to use, please let me know. Thanks, Priya

Re: [slurm-users] Slurm topology.conf file

2019-07-09 Thread Ole Holm Nielsen
On 7/9/19 9:04 AM, Priya Mishra wrote: Hi, I am using the slurmibtopology tool to generate the topology.conf file from the cluster at my institute which gives me a file with around 400 nodes. I need a topology file with a larger number of nodes for further use. Is there any way of generating a synt

[slurm-users] Slurm topology.conf file

2019-07-09 Thread Priya Mishra
Hi, I am using the slurmibtopology tool to generate the topology.conf file from the cluster at my institute which gives me a file with around 400 nodes. I need a topology file with a larger number of nodes for further use. Is there any way of generating a synthetic topology file, or any source where I