Since we’re running MPI jobs, we allocate entire nodes (i.e. partitions are set
up as OverSubscribe=Exclusive) to avoid resource contention. So most of our
scripts look like this:
#SBATCH --ntasks=288
#SBATCH --constraint=[wes|san|has|bro]
And each job gets enough nodes to satisfy the task cou
On Wed, Jun 20, 2018 at 10:29 AM, Vicker, Darby (JSC-EG311) <
darby.vicke...@nasa.gov> wrote:
> Yes, Boolean constraints do work with minimal configuration. We run mostly
> MPI jobs on our cluster and want our jobs to run on a single processor
> type. We assign the processor types as a feature and
Greetings,
I found the issue. It seems that the EL7 rpms do not properly configure
the auth_munge.so location. Here is the workaround for this problem.
ln -s /usr/lib64/slurm/auth_munge.so /usr/local/lib/slurm/auth_munge.so
Thanks!
~Stack~
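A slightly more defensive version of the symlink workaround above can be sketched as follows. The helper name `link_plugin` is hypothetical and not part of Slurm; it only checks that the source plugin exists and creates the target directory before linking. The two paths are the ones given in the message.

```shell
#!/bin/bash
# Hypothetical helper (not a Slurm tool): link a plugin into the directory
# where the daemon looks for it, but only if the source file really exists.
link_plugin() {
    local src="$1" dst_dir="$2"
    if [ ! -e "$src" ]; then
        echo "missing plugin: $src" >&2
        return 1
    fi
    # Create the target directory if it is not there yet.
    mkdir -p "$dst_dir" || return 1
    # -f replaces a stale link, -n avoids following an existing link.
    ln -sfn "$src" "$dst_dir/$(basename "$src")"
}

# Paths from the message above; run as root on the affected node:
# link_plugin /usr/lib64/slurm/auth_munge.so /usr/local/lib/slurm
```

The call is left commented out so that pasting the snippet has no effect until the paths have been confirmed on the node in question.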
You will get whatever cores Slurm can find which will be an assortment
of hosts.
-Paul Edmon-
On 6/20/2018 11:01 AM, Nathan Harper wrote:
sorry to hijack, but we've been considering a similar configuration,
but I was wondering what happens if you don't set a processor type?
Will it scatter
Greetings,
An update: I was unable to get any further. I removed all of the OHPC
packages and built Slurm 17.11.7 from source on the EL7 system, the exact
same version I have on the EL6 cluster. I end up with the EXACT same error.
The compile was done with `rpmbuild -ta`.
Here is what I find
sorry to hijack, but we've been considering a similar configuration, but I
was wondering what happens if you don't set a processor type? Will it
scatter across types?
On Wed, 20 Jun 2018 at 15:30, Vicker, Darby (JSC-EG311) <
darby.vicke...@nasa.gov> wrote:
> Yes, Boolean constraints do work wit
Yes, Boolean constraints do work with minimal configuration. We run mostly MPI
jobs on our cluster and want our jobs to run on a single processor type. We
assign the processor types as a feature and then the sbatch requests
--constraint=[wes|san|has|bro] to run on whatever processor type is free.
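For reference, a batch script along these lines might look like the sketch below. The feature names (wes, san, has, bro) and the task count come from this thread; the executable name `my_mpi_app` and the helper `describe_allocation` are placeholders for illustration. With the square brackets, Slurm picks one of the listed features and applies it to every node in the allocation.

```shell
#!/bin/bash
#SBATCH --ntasks=288
#SBATCH --constraint=[wes|san|has|bro]

# Report what the allocation looks like. The SLURM_* variables are set by
# Slurm inside a job; the fallbacks keep the script harmless outside one.
describe_allocation() {
    echo "tasks=${SLURM_NTASKS:-unknown} nodes=${SLURM_JOB_NUM_NODES:-unknown}"
}
describe_allocation

# Launch the MPI program only when actually running under Slurm.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    srun ./my_mpi_app   # placeholder executable name
fi
```

When the same constraint is passed on the command line instead of in an #SBATCH line, it needs quoting so the shell does not treat `|` as a pipe, e.g. `sbatch --constraint="[wes|san|has|bro]" job.sh`.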
Hello,
We currently use an in-house customized version of Torque and an in-house
scheduler. I'm now considering Slurm instead. We have a feature in our
scheduler whereby users can submit jobs with multiple prioritized job
specifications (constraints). For example, a job can request 4 nodes of
typ
Hi,
I haven't encountered this specific error (it probably means a
permissions issue somewhere), but the first thing I would try is looking at
the slurmd log file. You can also update the SlurmdDebug (and maybe
SlurmdLogFile) config option to get more information.
Yair.
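As a rough sketch of the suggestion above, the current slurmd logging settings can be read straight out of slurm.conf before changing anything. The function name `show_slurmd_logging` is made up for this example, and the default path /etc/slurm/slurm.conf is an assumption; the file may live elsewhere depending on how Slurm was installed.

```shell
#!/bin/bash
# Hypothetical helper: print the slurmd logging settings from a
# slurm.conf-style file. The default path is an assumption.
show_slurmd_logging() {
    local conf="${1:-/etc/slurm/slurm.conf}"
    # Match the two options regardless of capitalization or leading spaces.
    grep -Ei '^[[:space:]]*(SlurmdDebug|SlurmdLogFile)[[:space:]]*=' "$conf"
}

# Example usage on a compute node:
# show_slurmd_logging /etc/slurm/slurm.conf
# Then follow whatever file SlurmdLogFile points to, e.g. with tail -f.
```

If neither option is set in the file, the function prints nothing and returns a non-zero status.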
On Tue, Jun 19, 2018 at 6:1