On 11/12/19 11:31 am, Eli V wrote:
Look for libmariadb-client. That's needed for slurmdbd on debian.
Looking at the output from building some Slurm 19.05.4 RPMs earlier
tonight, this is what I see in the output of configure:
[...]
checking for mysql_config... /usr/bin/mysql_config
MySQL 10.
Sure; they’ll need to have the appropriate part of SLURM installed and the
config file. This is similar to having just one login node per user. Typically
login nodes don’t run either daemon.
Hi,
We are trying to setup a tiny Slurm cluster to manage shared access to the
GPU server in our team. Both slurmctld and slurmd are going to run on this
GPU server. But here is a problem. On one hand, we don't want to give
developers ssh access to that box, because otherwise they might bypass
Slu
Is that logged somewhere or do I need to capture the output from the make
command to a file?
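For reference, one common way to capture it (an assumption on my part, not something stated in this thread) is to pipe the build through tee, so the output is both displayed and saved:

```shell
# Run a command with stderr merged into stdout, showing the output
# on the terminal and saving it to build.log at the same time.
build_and_log() {
    "$@" 2>&1 | tee build.log
}

# Stand-in command; in practice this would be `build_and_log make`.
build_and_log echo "configure/make output captured"
```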
-----Original Message-----
From: slurm-users On Behalf Of Kurt
H Maier
Sent: Wednesday, December 11, 2019 6:32 PM
To: Slurm User Community List
Subject: Re: [slurm-users] Need help with controller issues
You prompted me to dig even deeper into my epilog. I was trying to
access a semaphore file in the user's home directory.
It seems that when the epilog is run, the ~ is not expanded in any way.
So I can't even use ~${SLURM_JOB_USER} to access their semaphore file.
Potentially problematic for a
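Since tilde expansion is a feature of an interactive shell, one common workaround (a sketch; the semaphore filename here is hypothetical) is to look the user's home directory up in the passwd database instead:

```shell
#!/bin/bash
# Epilog sketch: ~ and ~user are not expanded in this environment,
# so resolve the job owner's home directory via the passwd database.
USER_HOME=$(getent passwd "$SLURM_JOB_USER" | cut -d: -f6)

# Hypothetical semaphore file name -- adjust to your setup.
SEMAPHORE="$USER_HOME/.job_semaphore"
if [ -f "$SEMAPHORE" ]; then
    rm -f "$SEMAPHORE"
fi
```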
On Wed, Dec 11, 2019 at 04:04:44PM -0700, Dean Schulze wrote:
> I tried again with a completely new system (virtual machine). I used the
> latest source, I used mysql instead of mariadb, and I installed all the
> client and dev libs (below). I still get the same error. It doesn't
> build the /us
Snapshot of a job_submit.lua we use to automatically route jobs to a GPU
partition if the user asks for a GPU:
https://gist.github.com/mikerenfro/92d70562f9bb3f721ad1b221a1356de5
All our users just use srun or sbatch with a default queue, and the plugin
handles it from there. There’s more de
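For context, a minimal sketch of the kind of routing logic such a job_submit.lua performs (the partition name "gpu" and the gres pattern are assumptions here; the linked gist is the authoritative version):

```lua
-- Sketch of GPU-routing logic for a job_submit.lua plugin.
-- The partition name "gpu" is an assumption; see the gist above
-- for the plugin actually in use.
function slurm_job_submit(job_desc, part_list, submit_uid)
    -- job_desc.gres carries the job's --gres request, e.g. "gpu:2"
    if job_desc.gres ~= nil and string.find(job_desc.gres, "gpu") then
        job_desc.partition = "gpu"
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```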
Hi Brian,
can you maybe elaborate on how exactly you verified that your epilog
does not run when a job exceeds its walltime limit? Does it run when
jobs end normally, or when a running job is cancelled by the user?
I am asking because in our environment the epilog also runs when a job
hits
I tried again with a completely new system (virtual machine). I used the
latest source, I used mysql instead of mariadb, and I installed all the
client and dev libs (below). I still get the same error. It doesn't
build the /usr/lib/slurm/accounting_storage_mysql.so file.
Could the ./configure c
Look for libmariadb-client. That's needed for slurmdbd on debian.
On Wed, Dec 11, 2019 at 11:43 AM Dean Schulze wrote:
>
> Turns out I've already got libmariadb-dev installed:
>
> $ dpkg -l | grep maria
> ii libmariadb-dev 3.0.3-1build1
>
All,
So I have verified that the Epilog script is NOT run for any job that times
out. Even though in the documentation (
https://slurm.schedmd.com/prolog_epilog.html), it states "At job
termination"
I guess timeouts are not considered terminated??
So, is there a recommended way to have a cleanup s
We do this by looking at gres. The info is in the job_desc.gres
variable. We basically do the inverse, where we ensure someone is
asking for the gpu before allowing them to submit to a gpu partition.
-Paul Edmon-
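A sketch of that inverse check (the partition name and the user-facing message are illustrative assumptions, not the actual plugin code):

```lua
-- Reject jobs submitted to the GPU partition without a GPU request
-- (illustrative sketch; partition name "gpu" is assumed).
function slurm_job_submit(job_desc, part_list, submit_uid)
    if job_desc.partition == "gpu" then
        if job_desc.gres == nil or not string.find(job_desc.gres, "gpu") then
            slurm.log_user("gpu partition jobs must request a GPU, e.g. --gres=gpu:1")
            return slurm.ERROR
        end
    end
    return slurm.SUCCESS
end
```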
On 12/11/2019 12:32 PM, Grigory Shamov wrote:
Hi All,
I am trying the newest
Hi All,
I am trying the newest SLURM 19.05 and its new cons_tres plugin.
Is there a way to handle its new GPU options in Lua job submit plugin?
That is, something like "detect if a job has --gpus-per-node, assign it to
a GPU partition"?
Thank you very much in advance!
--
Grigory Shamov
WestGrid
Turns out I've already got libmariadb-dev installed:
$ dpkg -l | grep maria
ii libmariadb-dev     3.0.3-1build1  amd64  MariaDB Connector/C, development files
ii libmariadb3:amd64  3.0.3-1build1  amd64
Hi,
We have a strange behaviour of Slurm after updating from 18.08.7 to 18.08.8,
for jobs using --exclusive and --mem-per-cpu.
Our nodes have 128GB of memory, 28 cores.
$ srun --mem-per-cpu=3 -n 1 --exclusive hostname
=> works in 18.08.7
=> doesn’t work in 18.08.8
In 18.08.8 :
-
These are the packages I installed prior to building slurm:
libmariadb-client-lgpl-dev
libmysqlclient-dev
mariadb-server
This installs mariadb 10.1.43 which is old.
On the Ubuntu site (https://packages.ubuntu.com/search?keywords=mariadb)
there's a package called
libmariadb-dev
Maybe this is th
Partial progress. The scientist who developed the model took a look at the
output and found that instead of one model run being run in parallel, srun
had run multiple instances of the model, one per thread, which for this
test was 110 threads.
I have a feeling this just verified the same thing that
I tried a simple thing of swapping out mpirun in the sbatch script for
srun. Nothing more, nothing less.
The model is now working on at least two nodes, I will have to test again
on more but this is progress.
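For reference, the change amounts to something like the following sketch of the sbatch script (the script body, node counts, and binary name are invented; only the mpirun-to-srun swap comes from the message):

```shell
#!/bin/bash
#SBATCH --job-name=model_run
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16

# Previously: mpirun ./model
# srun uses Slurm's MPI integration to launch one task per allocated slot:
srun ./model
```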
Thanks,
Chris Woelkers
IT Specialist
National Oceanic and Atmospheric Administration
Great Lakes
Thanks all for the ideas and possibilities. I will answer all in turn.
Paul: Neither of the switches in use, Ethernet and Infiniband, have any
form of broadcast storm protection enabled.
Chris: I have passed on your question to the scientist who created
the sbatch script. I will also look into o
Hi Angelines,
I use a plugin for that - I believe this one
https://github.com/hpc2n/spank-private-tmp
which sort of does it all; your job sees an (empty) /tmp/.
(It doesn't do cleanup, I simply rely on OS cleaning up /tmp, at the
moment.)
Tina
On 05/12/2019 15:57, Angelines wrote:
> Hello,
>
I had a similar issue; please check whether the home drive, or the place
where the data should be stored, is mounted on the nodes.
On Tue, 2019-12-10 at 14:49 -0500, Chris Woelkers - NOAA Federal wrote:
> I have a 16 node HPC that is in the process of being upgraded from
> CentOS 6 to 7. All nodes are diskles