Howdy,
I am trying to get my cluster's native X11 support enabled, but am having some
difficulty. I generated the RPMs for slurm, installed them on my systems, and
added "PrologFlags=X11" to the slurm.conf file as well. However, when I
start up slurmd with the flag added, I get this error
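For reference, the relevant slurm.conf fragment for native X11 forwarding (the option name is from the slurm.conf man page; everything else about the site configuration is assumed):

```
# slurm.conf -- enable Slurm's built-in X11 forwarding (Slurm >= 17.11)
PrologFlags=X11
```

With this set on all nodes and the daemons restarted, forwarding is requested per job with `srun --x11 ...`. Note that in the 17.11 series the native X11 code also requires Slurm to have been built with libssh2 support; a missing build dependency is a common cause of slurmd startup errors with this flag.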
Nicholas,
Why do you have the following?
SchedulerParameters = (null)
I did not set these parameters, so I assume "(null)" means all the
default values are used.
John,
Thanks, I'll try that and look into these SchedulerParameters more.
Cheers,
Colas
On 2018-01-12 09:08, John DeSantis wrote:
Colas,
Hi all,
I am trying to figure out how to create a reservation that includes
reserving GPUs. We normally request them using something like --gres=gpu:2.
I looked through the documentation for reservations and scontrol and don't
see any reference to reserving gres's. The scontrol documentation se
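In versions of Slurm that support trackable resources (TRES) in reservations, GRES can be put directly into the reservation. A sketch, in which the reservation name, user, and counts are placeholders:

```
# Reserve 2 GPUs (and 8 CPUs) for user alice for two hours -- all names
# and counts here are placeholders for illustration.
scontrol create reservation ReservationName=gpu_maint \
    StartTime=now Duration=02:00:00 Users=alice \
    TRES=cpu:8,gres/gpu:2
```

Jobs then request the reservation with `--reservation=gpu_maint` and the usual `--gres=gpu:2`.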
Ciao Alessandro,
> Do we have to apply any particular setting to avoid incurring the
> problem?
What is your "MessageTimeout" value in slurm.conf? If it's at the
default of 10, try changing it to 20.
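The change John suggests is a one-line edit to slurm.conf (value per the slurm.conf man page; 20 is his suggested setting, not a universal recommendation):

```
# slurm.conf -- give slow RPCs more time before clients report an error
# (default is 10 seconds)
MessageTimeout=20
```

The new value takes effect after restarting or reconfiguring the daemons.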
I'd also check and see if the slurmctld log is reporting anything
pertaining to the server thr
You could do this using a job_submit.lua script that inspects for that
application and routes them properly.
-Paul Edmon-
On 01/12/2018 11:31 AM, Juan A. Cordero Varelaq wrote:
Dear Community,
I have a node (20 Cores) on my HPC with two different partitions: big
(16 cores) and small (4 core
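A minimal job_submit.lua along the lines Paul suggests might look like the sketch below; the partition names ("big", "small") come from the question, but the application string "softwareX" and the routing policy are placeholder assumptions:

```
-- job_submit.lua sketch: route jobs that run the restricted application
-- onto the partition allowed to run it. "softwareX" is a placeholder.
function slurm_job_submit(job_desc, part_list, submit_uid)
    if job_desc.script ~= nil and job_desc.script:find("softwareX") then
        -- Force the job onto the partition permitted to run the software.
        job_desc.partition = "big"
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```

The script is enabled with `JobSubmitPlugins=lua` in slurm.conf; an alternative policy would be to return slurm.ERROR to reject, rather than reroute, jobs submitted to the wrong partition.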
Dear Community,
I have a node (20 Cores) on my HPC with two different partitions: big
(16 cores) and small (4 cores). I have installed software X on this
node, but I want only one partition to have rights to run it.
Is it then possible to restrict the execution of a specific application
to a
Colas,
We had a similar experience a long time ago, and we solved it by adding
the following SchedulerParameters:
max_rpc_cnt=150,defer
HTH,
John DeSantis
On Thu, 11 Jan 2018 16:39:43 -0500
Colas Rivière wrote:
> Hello,
>
> I'm managing a small cluster (one head node, 24 workers, 1160 total
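For context, the parameters John recommends above go into slurm.conf like this (descriptions per the slurm.conf man page):

```
# slurm.conf -- shed load on a busy slurmctld:
#   max_rpc_cnt=150  defer scheduling attempts while more than 150 RPCs
#                    are queued, so the controller stays responsive
#   defer            don't try to start each job individually at submit
#                    time; let the main scheduling loop handle it
SchedulerParameters=max_rpc_cnt=150,defer
```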
Hi all,
we are setting up SLURM 17.11.2 on a small test cluster of about 100 nodes.
Sometimes we get the error in the subject when running any SLURM command (e.g.
sinfo, squeue, scontrol reconf, etc.).
Do we have to apply any particular setting to avoid incurring the problem?
We found t