TL;DR: If the Slurm database times out, and a longer time limit
in InnoDB doesn't help, you might want to consider loosening the lock mode
in MariaDB.
The long story!
So, we’ve just upgraded our main cluster to 17.11.3 and moved our database
to MariaDB. There have been some glitches and
I should also mention that of course we are aware that R is a single-threaded
application, but users can be doing all sorts of things within their R
scripts. In this particular case I believe the user is using the FLARE
package. Often they are seeking to do embarrassingly parallel types of t
I posted a similar question a couple of months ago regarding CPU
utilization, which we figured out: sometimes too many threads on one CPU
create a high CPU load, and thus slower compute time because things are waiting.
A more proper allocation should be set in the submit script (e.g.
--cpu
This solution is even better.
I am actually using pestat for my own needs as an admin.
But I originally asked the question in order to extend slurm_exporter,
which is client-side code for Prometheus/Grafana that exports Slurm
statistics so they can be displayed as graphs.
On 02/13/2018 08:13 AM, Nadav Toledo wrote:
> Does anyone know of way to get amount of idle gpu per node or for all
> cluster?
sinfo -o %G gives the total amount of gres resources for each node. Is
there a way to get the idle amount, the same way you can for CPUs (%C)?
Perhaps if one uses lock file li
Thanks, that might be enough. I will check it out.
On 13/02/2018 16:33, Yair Yarom wrote:
Hi,
I haven't found a direct way. Here I have my own script that parses the
output of "scontrol show node" and "scontrol show job", summing up and
displaying the allocated g
Hi,
I haven't found a direct way. Here I have my own script that parses the
output of "scontrol show node" and "scontrol show job", summing up and
displaying the allocated gres.
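
For illustration only, here is a minimal Python sketch of that kind of parsing. It is not Yair's script. It assumes the text of "scontrol show node" and "scontrol show job" has already been captured, that records are separated by blank lines, and that the relevant fields are named NodeName, Gres, JobState and NodeList. Field names and layout vary between Slurm versions, so check them against your own output. It also assumes NodeList holds plain comma-separated host names (no bracket ranges) and that a job's Gres value is a per-node count. The function names are hypothetical.

```python
import re
from collections import defaultdict


def parse_records(text):
    """Split scontrol-style output into one dict of Key=Value pairs per record.

    Assumes records are separated by blank lines; values containing spaces
    are cut at the first space, which is fine for the fields used here.
    """
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = dict(re.findall(r"(\w+)=(\S*)", block))
        if fields:
            records.append(fields)
    return records


def gpu_count(gres):
    """Return the GPU count in a gres string such as 'gpu:4' or 'gpu:tesla:2'."""
    total = 0
    for item in gres.split(","):
        # Optional type name (assumed to start with a letter), then the count.
        match = re.match(r"gpu(?::[A-Za-z][\w.-]*)?:(\d+)", item)
        if match:
            total += int(match.group(1))
    return total


def idle_gpus(node_text, job_text):
    """Return {node name: idle GPU count} from the two captured outputs."""
    total = {
        rec["NodeName"]: gpu_count(rec.get("Gres", ""))
        for rec in parse_records(node_text)
        if "NodeName" in rec
    }
    used = defaultdict(int)
    for job in parse_records(job_text):
        if job.get("JobState") != "RUNNING":
            continue  # only running jobs hold GPUs
        per_node = gpu_count(job.get("Gres", ""))
        # Assumption: plain comma-separated host names, no 'node[01-04]' ranges.
        for node in job.get("NodeList", "").split(","):
            used[node] += per_node
    return {name: total[name] - used[name] for name in total}
```

The two text arguments could come from subprocess.run(["scontrol", "show", "node"], capture_output=True, text=True).stdout and the equivalent call for "show job". Bracketed node ranges would need expanding first, for example with "scontrol show hostnames".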
Yair.
On Tue, Feb 13 2018, Nadav Toledo wrote:
> Hello everyone,
>
> Does anyone know of way to get amount of i
Hello all,
As a workaround, I finally used an Epilog script to archive the jobs. I set

    EpilogSlurmctld=/cm/local/apps/cmd/scripts/epilog-postjob

in slurm.conf. The script does:

    scontrol show job -d "$SLURM_JOB_ID" >> "$JOBS_FILE"
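
If the epilog were written in Python rather than shell, the same step could be sketched roughly as below. This is an illustration under assumptions, not the script in use here: the function name archive_job is hypothetical, JOBS_FILE is assumed to be provided as an environment variable (with a made-up fallback of jobs.log), and capture_output needs Python 3.7 or later.

```python
import os
import subprocess


def archive_job(job_id, jobs_file, run=subprocess.run):
    """Append the output of `scontrol show job -d <job_id>` to jobs_file.

    `run` is injectable so the function can be exercised without Slurm.
    """
    result = run(
        ["scontrol", "show", "job", "-d", str(job_id)],
        capture_output=True,
        text=True,
        check=True,  # raise if scontrol exits with an error
    )
    with open(jobs_file, "a") as handle:
        handle.write(result.stdout)
        if not result.stdout.endswith("\n"):
            handle.write("\n")  # keep one record per block in the archive


# Only runs when invoked as a script with SLURM_JOB_ID set in the environment.
if __name__ == "__main__" and "SLURM_JOB_ID" in os.environ:
    # 'jobs.log' is a hypothetical default; set JOBS_FILE to the real path.
    archive_job(os.environ["SLURM_JOB_ID"], os.environ.get("JOBS_FILE", "jobs.log"))
```

Opening the file in append mode mirrors the >> redirection in the shell version.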
HTH,
Gérard
On 09/02/2018 at 17:58, Henry Gérard wrote:
Hello all,
we have