Hi Edward,
The squeue command tells you about job status. You can get extra
information using format options (see the squeue man-page). I like to
set this environment variable for squeue:
export SQUEUE_FORMAT="%.18i %.9P %.6q %.8j %.8u %.8a %.10T %.9Q %.10M %.10V %.9l %.6D %.6C %m %R"
Wh
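If you would rather not set the environment variable, the same format string can be passed with squeue's -o/--format option for a one-off query; a quick sketch (the user name is just a placeholder):
# Same fields as the SQUEUE_FORMAT variable above, for a single invocation
squeue -o "%.18i %.9P %.6q %.8j %.8u %.8a %.10T %.9Q %.10M %.10V %.9l %.6D %.6C %m %R"
# Or a shorter report limited to one user's jobs (someuser is a placeholder)
squeue -u someuser -o "%.18i %.8j %.10T %.9Q %R"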
Hi Edward,
Besides my Slurm Wiki page https://wiki.fysik.dtu.dk/niflheim/SLURM, I
have written a number of tools which we use for monitoring our cluster,
see https://github.com/OleHolmNielsen/Slurm_tools. I recommend in
particular these tools:
* pestat Prints a Slurm cluster nodes status wi
I have a cluster where I submit a bunch of jobs (600), but the cluster only runs
about 20 at a time. Using pestat, I can see there are a number of nodes
with plenty of available CPU and memory.
Hostname Partition Node Num_CPU CPUload Memsize Freemem
Sta
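A rough sketch of what I usually run alongside pestat to see why the remaining jobs stay pending (the job ID is a placeholder):
# Pending jobs with the scheduler's reason code for each (%r)
squeue -t PENDING -o "%.18i %.9P %.8u %.10T %r"
# Full detail for one pending job, including its Reason field
scontrol show job 123456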
Hello,
I am trying to set up my SLURM cluster. One thing I want to achieve is to
schedule jobs that will only run when there are no high-priority tasks.
My understanding is that this can be achieved by either configuring a partition
with preempt mode 'Suspend/Requeue' with priority for this
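A minimal slurm.conf sketch of that idea, assuming two hypothetical partitions called 'high' and 'low' sharing the same nodes (node names and tiers are placeholders; REQUEUE could be used instead of SUSPEND,GANG if the jobs can be restarted):
# Preempt according to partition priority
PreemptType=preempt/partition_prio
PreemptMode=SUSPEND,GANG
# Jobs in 'high' preempt jobs in 'low'; 'low' only runs when 'high' leaves room
PartitionName=high Nodes=node[01-10] PriorityTier=10 State=UP
PartitionName=low  Nodes=node[01-10] PriorityTier=1  Default=YES State=UP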
Thanks Brian, indeed we did have it set in bytes. I set it to the MB value.
Hoping this takes care of the situation.
> On Jul 8, 2019, at 4:02 PM, Brian Andrus wrote:
>
> Your problem here is that the configuration for the nodes in question has an
> incorrect amount of memory set for them. Loo
Hi Samuel,
On Mon, Jul 8, 2019 at 8:19 PM Fulcomer, Samuel
wrote:
>
> The underlying issue is database schema compatibility/regression. Each
> upgrade is only intended to provide the capability to successfully upgrade the
> schema from two versions back.
--snip--
> ...and you should follow the upgrade i
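A cautious sketch of the order this usually implies for the accounting database, assuming a MySQL/MariaDB backend and the default slurm_acct_db database name (names and paths are assumptions):
# Dump the accounting database before touching slurmdbd
mysqldump -u root -p slurm_acct_db > slurm_acct_db-before-upgrade.sql
# Stop slurmdbd, install the new packages, then start it so it converts the schema
systemctl stop slurmdbd
# ... upgrade the slurmdbd packages here ...
systemctl start slurmdbd
# Only after slurmdbd is healthy: upgrade slurmctld, then the slurmd nodes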
Hi;
There is an official page which gives a lot of links to third-party
solutions you can use:
https://slurm.schedmd.com/download.html
In my opinion, the best Slurm page for system administration is:
https://wiki.fysik.dtu.dk/niflheim/SLURM
On this page, you can find a lot of links and inf
Hi Pariksheet,
Note that an "upgrade", in the sense that retained information is converted
to new formats, is only relevant for the slurmctld/slurmdbd (and backup)
node.
If you're planning downtime in which you quiesce job execution (i.e.,
schedule a maintenance reservation), and have image conf
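For the quiesce step, a sketch of such a maintenance reservation (the name, start time and duration are placeholders):
# Block job execution on all nodes during the upgrade window
scontrol create reservation reservationname=upgrade_maint \
    starttime=2019-08-01T08:00:00 duration=240 \
    users=root flags=maint,ignore_jobs nodes=ALL
# Remove it once the upgrade is done
scontrol delete reservation reservationname=upgrade_maint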
Hi Brian,
On Mon, Jul 8, 2019 at 8:09 PM Brian Andrus wrote:
>
> Yours are probably simple enough:
>
> Name: slurm
> Version: 15.08.11
> Release: 1
>
> which becomes slurm-15.08.11-1
> You may see some issues with License and/or changelog as the format of
> SPEC files changed a little while back, s
Yours are probably simple enough:
Name: slurm
Version: 15.08.11
Release: 1
which becomes slurm-15.08.11-1
You may see some issues with License and/or changelog as the format of
SPEC files changed a little while back, so the latest rpmbuild may not
like things.
However, I highly suggest you u
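For what it's worth, a sketch of building the RPMs straight from the release tarball, which uses the slurm.spec shipped inside it (the tarball name matches the 15.08.11 release; adjust for the version you are building):
# Builds the full set of Slurm RPM sub-packages from the spec embedded in the tarball
rpmbuild -ta slurm-15.08.11.tar.bz2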
Your problem here is that the configuration for the nodes in question
has an incorrect amount of memory set for them. It looks like you have it
set in bytes instead of megabytes.
In your slurm.conf you should look at the RealMemory setting:
RealMemory
    Size of real memory on the node in megabytes
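A sketch of what the corrected node definition might look like (node names and sizes are placeholders); running `slurmd -C` on a compute node prints the values Slurm detects, already in the right units:
# Wrong: 128 GB expressed in bytes
#NodeName=node[01-04] CPUs=32 RealMemory=137438953472 State=UNKNOWN
# Right: RealMemory is given in megabytes
NodeName=node[01-04] CPUs=32 RealMemory=128000 State=UNKNOWN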
Hi SLURM devs,
TL;DR: What magic incantations are needed to preprocess the slurm.spec file
in SLURM 15?
Our cluster is currently running SLURM version 15.08.11. We are planning
some downtime to upgrade to 17 and then to 19, and in preparation for the
upgrade I'm simulating the upgrade steps in l
I am an experienced sysadmin, new to being a Slurm admin, and I'm encountering
some difficulty:
If you have a simple question such as "how many CPUs are currently being used
in the foobar partition," or "give me an overview of the waiting jobs and what
are the reasons they're waiting," I don't
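For questions like those two, a sketch of the one-liners I tend to reach for ('foobar' is just the partition name from the example):
# Allocated/Idle/Other/Total CPUs in the foobar partition
sinfo -p foobar -o "%C"
# Waiting jobs in that partition, with the scheduler's reason for each (%r)
squeue -p foobar -t PENDING -o "%.18i %.8u %.9Q %r"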
I’m new to Slurm and we have a 3-node + head node cluster running CentOS 7 and
Bright Cluster 8.1. Their support sent me here, as they say Slurm is configured
optimally to allow multiple tasks to run. However, at times a job will hold up
new jobs. Are there any other logs I can look at and/or sett
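A sketch of where I would start looking, assuming a fairly standard configuration (the grep pattern is only an example):
# Find out where slurmctld and slurmd write their logs
scontrol show config | grep -i logfile
# See why queued jobs are not starting (Resources, Priority, QOS limits, ...)
squeue -t PENDING -o "%.18i %.8u %.10T %r"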
Hi
I can't find the reference here, but if I recall correctly the preferred
user for slurmd is actually root. It is the default.
> I assume this can be fixed by modifying the configuration so
"SlurmdUser=root", but does this imply that anything run with `srun` will
be actually executed by root?
Sudo is more flexible than that; for example, you can just give the
slurmd user sudo access to the chown command and nothing else.
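A minimal sudoers sketch of that idea, added with visudo, assuming the slurmd user is called 'slurm' and chown lives in /usr/bin (both are assumptions):
# Allow the slurm user to run chown as root, and nothing else
slurm ALL=(root) NOPASSWD: /usr/bin/chown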
On 7/8/19 11:37 AM, Daniel Torregrosa wrote:
> You are right. The critical part I was missing is that chown does not
> work without sudo.
>
> I assume this can be fix
You are right. The critical part I was missing is that chown does not work
without sudo.
I assume this can be fixed by modifying the configuration so
"SlurmdUser=root", but does this imply that anything run with `srun` will
be actually executed by root? This seems dangerous.
Thanks a lot.
On Mon
Hi all,
I am currently testing Slurm (slurm-wlm 17.11.2 from a newly installed and
updated Ubuntu Server LTS). I managed to make it work on a very simple
configuration with 1 master node and 2 compute nodes. All three nodes have the
same users (namely root, slurm and test), with slurm running both slur