Hi Bruno,
Bruno Santos writes:
> Hi everyone,
>
> I have recently set up Slurm to use priority/multifactor, giving fair share
> the major weight.
>
> Since the queue is up I have submitted about 1500 small jobs and just today 2
> other users jumped on the queue with their first job. However, i
On Friday, 1 December 2017 4:21:43 AM AEDT Brian W. Johanson wrote:
> Almost there, add in PriorityFlags=FAIR_TREE
+1 on fair tree.
To see the state of your fairshare config run:
sshare -l
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
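For reference, a minimal sketch of the relevant slurm.conf settings with fair
tree enabled might look like this (the weight values are illustrative
assumptions, not taken from this thread):

    PriorityType=priority/multifactor
    PriorityFlags=FAIR_TREE
    PriorityDecayHalfLife=7-0
    PriorityWeightFairshare=100000
    PriorityWeightAge=1000
    PriorityWeightJobSize=0
    PriorityWeightPartition=0
    PriorityWeightQOS=0

After restarting slurmctld, "sshare -l" lists each account and user with its
RawShares, NormShares, EffectvUsage and FairShare values, and "sprio -l" shows
how the fairshare term feeds into each pending job's priority.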
I added it as suggested and restarted both the controller and the node
daemon, but there is no change. The priority is still the same.
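One way to confirm the controller actually picked up the new flag is to query
its running configuration (a sketch using standard commands; nothing here is
specific to this cluster):

    scontrol show config | grep -i '^Priority'
    sshare -l

If PriorityFlags does not show FAIR_TREE in that output, slurmctld is reading
a different slurm.conf than the one that was edited. Note also that fairshare
can only separate users once some usage has been recorded in the accounting
database.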
On 30 Nov 2017 17:24, "Brian W. Johanson" wrote:
> Almost there, add in PriorityFlags=FAIR_TREE
>
> If you missed it, check out https://slurm.schedmd.com/fair_tree.html
> -b
>
> Also FWIW, in setting up 17.11 on CentOS 7, I encountered these minor
> gotchas:
>
> - Your head/login node's sshd MUST be configured with "X11UseLocalhost no" so
> the X11 TCP port isn't bound to the loopback interface alone
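For example, a minimal sketch of the relevant settings on the login node (the
file path is the usual default; adjust for your distribution):

    # /etc/ssh/sshd_config
    X11Forwarding yes
    X11UseLocalhost no

followed by a restart of sshd, e.g. "systemctl restart sshd".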
Anyone who read this, please ignore. Red herring thanks to slu
Hi,
You should look at that bug : https://bugs.schedmd.com/show_bug.cgi?id=4412
I thought it would be resolved in 17.11.0.
Regards
Matthieu
On 30 Nov 2017 00:56, "Andy Riebs" wrote:
> We've just installed 17.11.0 on our 100+ node x86_64 cluster running
> CentOS 7.4 this afternoon, and per
Almost there, add in PriorityFlags=FAIR_TREE
If you missed it, check out https://slurm.schedmd.com/fair_tree.html
-b
On 11/30/2017 12:10 PM, Bruno Santos wrote:
Hi everyone,
I have recently set up Slurm to use priority/multifactor, giving fair share the
major weight.
Since the queue is up I
Hi everyone,
I have recently set up Slurm to use priority/multifactor, giving fair
share the major weight.
Since the queue is up I have submitted about 1500 small jobs and just today
2 other users jumped on the queue with their first job. However, it seems
that slurm is not taking into account the
FWIW, though the "--x11" flag is available to srun in 17.11.0, neither the man
page nor the built-in --help mentions its presence or how to use it.
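For anyone searching the archives, a typical invocation (assumed, since the
documentation is silent) is simply to add the flag to an interactive run:

    srun --x11 --pty xterm
    srun --x11 -N1 -n1 --pty bash -i

The flag takes no mandatory argument and forwards X11 from the allocated node
back to the submission host.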
Also FWIW, in setting up 17.11 on CentOS 7, I encountered these minor
gotchas:
- Your head/login node's sshd MUST be configured with "X11UseLoc
Hello David,
The slurmd daemon is not running (while slurmctld and slurmdbd are).
slurmd.log (different from slurmctld.log) should contain more information.
Regards,
Pierre-Marie Le Biot
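A few commands that usually narrow this down (standard Slurm/systemd tooling;
the log path is the assumed default and may differ depending on SlurmdLogFile):

    systemctl status slurmd
    slurmd -Dvvv                  # run slurmd in the foreground with verbose output
    tail -f /var/log/slurmd.log

Running slurmd in the foreground in particular tends to print the exact reason
it refuses to start.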
From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of
david vilanova
Sent: Thursday, No
I now realised I probably need some kind of job preemption to make
things work the way I want them to. I'll take a look at how slurm does that.
Cheers, Christian.
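In case it helps others, a minimal preemption sketch in slurm.conf could look
like this (the partition and node names, and the choice of REQUEUE, are
illustrative assumptions):

    PreemptType=preempt/partition_prio
    PreemptMode=REQUEUE
    PartitionName=high Nodes=node[01-10] PriorityTier=10
    PartitionName=low  Nodes=node[01-10] PriorityTier=1 Default=YES

With that in place, jobs in the higher PriorityTier partition can preempt
running jobs in the lower one instead of waiting for all the single-core jobs
to finish.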
On 30-11-2017 13:29, Chris Samuel wrote:
On Thursday, 30 November 2017 9:40:53 PM AEDT Christian Anthon wrote:
The queue has a t
Dear Ole,
You are absolutely right! Thank you for pointing this out. I hadn't noticed
the RPMs were re-arranged so much as of 17.11.
Thanks again,
On Thu, Nov 30, 2017 at 4:04 PM Ole Holm Nielsen
wrote:
> On 11/30/2017 01:40 PM, Alan Orth wrote:
> > I just built SLURM 17.11.0 on a CentOS 7 mac
We've just installed 17.11.0 on our 100+ node x86_64 cluster running
CentOS 7.4 this afternoon, and periodically see a single node (perhaps
the first node in an allocation?) get drained with the message "batch
job complete failure".
On one node in question, slurmd.log reports
pam_unix(slur
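For what it's worth, the usual commands for inspecting and clearing a drained
node are (the node name below is a placeholder):

    sinfo -R                                       # drain reason per node
    scontrol show node node001 | grep -i reason
    scontrol update NodeName=node001 State=RESUME

The RESUME only makes sense once the underlying cause of the "batch job
complete failure" has been understood, of course.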
On 11/30/2017 01:40 PM, Alan Orth wrote:
I just built SLURM 17.11.0 on a CentOS 7 machine and was surprised to
see that several systemd unit files were missing from the RPMs. For some
reason the slurmdbd.service file is present though:
$ rpmbuild -ta slurm-17.11.0.tar.bz2
$ rpm -qlp slurm-17.1
Hello,
I just built SLURM 17.11.0 on a CentOS 7 machine and was surprised to see
that several systemd unit files were missing from the RPMs. For some reason
the slurmdbd.service file is present though:
$ rpmbuild -ta slurm-17.11.0.tar.bz2
$ rpm -qlp slurm-17.11.0-1.el7.centos.x86_64.rpm | egrep "
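Since 17.11 splits the daemons into separate sub-packages, one way to see
where the unit files ended up is to query every RPM produced by the build (the
paths below assume a default rpmbuild layout):

    for p in ~/rpmbuild/RPMS/x86_64/slurm-*.rpm; do
        echo "== $p"; rpm -qlp "$p" | grep '\.service$'
    done

On such a build, slurmctld.service and slurmd.service typically land in the
slurm-slurmctld and slurm-slurmd sub-packages rather than in the main slurm
package.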
On Thursday, 30 November 2017 9:58:08 PM AEDT Markus Köberl wrote:
> I also saw the wrong Sockets, CPUs and Threads. I did not notice the wrong
> values for RAM. Therefore I did define Sockets, CoresPerSocket,
> ThreadsPerCore and RealMemory.
You can run "slurmd -C" to get it to tell you the co
On Thursday, 30 November 2017 9:40:53 PM AEDT Christian Anthon wrote:
> The queue has a ton of single-core jobs and somebody submits a high
> priority multi-core job. Will the multi-core job not run before all
> single-core jobs are done or will slurm free up a node?
I can see you are weightin
Hi everyone,
is there any way to see the actual job running time with sacct, and not
the time resources were allocated to that job? I see there are many
fields like Elapsed (which to my understanding and testing is the total
time resources have been allocated to the job), but I can't figure ou
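As a concrete example, requesting both the allocation time and the CPU time
actually consumed makes the difference visible (the job id is a placeholder):

    sacct -j 12345 --format=JobID,JobName,Elapsed,CPUTime,TotalCPU,State

Elapsed (and CPUTime, which is Elapsed multiplied by the number of allocated
CPUs) reflects how long the resources were held, while TotalCPU reports the
user plus system CPU time the job's processes actually used.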
On Tuesday, 21 November 2017 16:38:48 CET Ing. Gonzalo E. Arroyo wrote:
> I have a problem detecting RAM and Arch (maybe some more), check this...
>
> NodeName=fisesta-21-3 Arch=x86_64 CoresPerSocket=1
>CPUAlloc=0 CPUErr=0 CPUTot=2 CPULoad=0.01
>AvailableFeatures=rack-21,2CPUs
>ActiveF
Okay,
how is slurm handling the following situation:
The queue has a ton of single-core jobs and somebody submits a high
priority multi-core job. Will the multi-core job not run before all
single-core jobs are done, or will Slurm free up a node?
Cheers, Christian.
On 30-11-2017 07:57, Ch
Sorry for the delay; I was trying to fix it but it is still not working.
The node is always down. The master machine is also the compute machine.
It's a single server that I use for that: 1 node and 12 CPUs.
In the log below i see this line
[2017-11-30T09:24:41.764] agent/is_node_resp: node:linuxcluster
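On a single machine that acts as both controller and compute node, the usual
checks are (the node name is taken from the log line above):

    scontrol show node linuxcluster                    # look at State= and Reason=
    systemctl status slurmd
    scontrol update NodeName=linuxcluster State=RESUME

The agent/is_node_resp message generally means slurmctld cannot reach slurmd
on that node, so the first step is to confirm slurmd is actually running and
that the NodeName in slurm.conf matches the machine's hostname.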