Ahh, one thing I forgot. The following is working again ...
--ntasks=24 --ntasks-per-node=24
NumNodes=1 NumCPUs=48 NumTasks=24 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=48,mem=12M,energy=63,node=1,billing=48
Socks/Node=* NtasksPerN:B:S:C=24:0:*:1 CoreSpec=*
MinCPUsNode=24 MinM
Hi Andreas,
I'll try to sum this up ;)
First of all, I have now used a Broadwell node, so there is no interference
from Skylake and sub-NUMA clustering.
We are using slurm 18.08.5-2
I have configured the node as slurmd -C tells me:
NodeName=lnm596 Sockets=2 CoresPerSocket=12 ThreadsPerCor
On 2/20/19 12:08 AM, Marcus Wagner wrote:
Hi Prentice,
On 2/19/19 2:58 PM, Prentice Bisbal wrote:
--ntasks-per-node is meant to be used in conjunction with --nodes
option. From https://slurm.schedmd.com/sbatch.html:
*--ntasks-per-node*=
Request that /ntasks/ be invoked on each node.
Hi Chris,
Hi Marcus,
Just want to understand the cause, too. I'll try to sum it up.
Chris you have
CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=20 ThreadsPerCore=2
and
srun -C gpu -N 1 --ntasks-per-node=80 hostname
works.
Marcus has configured
CPUs=48 Sockets=4 CoresPerSocket=12 Threa
Dear all,
I did a little bit more testing.
* I have reenabled CR_ONE_TASK_PER_CORE.
* My test node is still configured as slurmd -C tells me.
* "--ntasks=24" or "--ntasks=24 --ntasks-per-node=24" can both be
submitted, resulting in a job with the "free" hyperthread per task.
Nearly perfect.
Hi Chris,
I assume you have not set
CR_ONE_TASK_PER_CORE
CR_ONE_TASK_PER_CORE
Allocate one task per core by default.
Without this option, by default one task will be allocated per thread on
nodes with more than one ThreadsPerCore configured.
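For reference, a minimal slurm.conf fragment enabling this behaviour might look like the sketch below (the SelectType and CR_Core_Memory parts are assumptions about the site setup, not taken from the thread; only CR_ONE_TASK_PER_CORE is the option under discussion):

```
# slurm.conf (sketch) -- allocate whole cores and place only one task
# per core by default, leaving the sibling hyperthread free for the task
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory,CR_ONE_TASK_PER_CORE
```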
On Tuesday, 19 February 2019 10:14:21 PM PST Marcus Wagner wrote:
> sbatch -N 1 --ntasks-per-node=48 --wrap hostname
> submission denied, got jobid 199805
On one of our 40 core nodes with 2 hyperthreads:
$ srun -C gpu -N 1 --ntasks-per-node=80 hostname | uniq -c
80 nodename02
The spec is:
I just did a little bit of debugging, setting the debug level to debug5
during submission.
I submitted (or at least tried to) two jobs:
sbatch -n 48 --wrap hostname
got submitted, got jobid 199801
sbatch -N 1 --ntasks-per-node=48 --wrap hostname
submission denied, got jobid 199805
The only diff
Hi Prentice,
On 2/19/19 2:58 PM, Prentice Bisbal wrote:
--ntasks-per-node is meant to be used in conjunction with --nodes
option. From https://slurm.schedmd.com/sbatch.html:
*--ntasks-per-node*=
Request that /ntasks/ be invoked on each node. If used with the
*--ntasks* option, the
--ntasks-per-node is meant to be used in conjunction with --nodes
option. From https://slurm.schedmd.com/sbatch.html:
*--ntasks-per-node*=
Request that /ntasks/ be invoked on each node. If used with the
*--ntasks* option, the *--ntasks* option will take precedence and
the *--ntasks-
No, but that was expected ;)
Thanks nonetheless.
Best
Marcus
On 2/18/19 6:01 AM, Andreas Henkel wrote:
Not the answer you hoped for there I guess...
On 15.02.19 07:15, Marcus Wagner wrote:
I have filed a bug:
https://bugs.schedmd.com/show_bug.cgi?id=6522
Let's see what SchedMD has to tell
Not the answer you hoped for there I guess...
On 15.02.19 07:15, Marcus Wagner wrote:
> I have filed a bug:
>
> https://bugs.schedmd.com/show_bug.cgi?id=6522
>
>
> Let's see what SchedMD has to tell us ;)
>
>
> Best
> Marcus
>
> On 2/15/19 6:25 AM, Marcus Wagner wrote:
>> NumNodes=1 NumCPUs=48 NumT
I have filed a bug:
https://bugs.schedmd.com/show_bug.cgi?id=6522
Let's see what SchedMD has to tell us ;)
Best
Marcus
On 2/15/19 6:25 AM, Marcus Wagner wrote:
NumNodes=1 NumCPUs=48 NumTasks=48 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=48,mem=182400M,node=1,billing=48
--
Marcus Wagner, Di
Hi Chris,
that can't be right, or there is some bug elsewhere:
We have configured CR_ONE_TASK_PER_CORE, so two tasks won't share a core
and its hyperthread.
According to your theory, I configured 48 threads. But then using just
--ntasks=48 would give me two nodes, right?
But Slurm schedules t
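Marcus's objection can be checked with a little arithmetic: if CPUs=48 counted threads, then with ThreadsPerCore=2 the node would only have 24 cores, and under CR_ONE_TASK_PER_CORE a 48-task job could not fit on one node. A small sketch of that reasoning (the function and its names are mine, not Slurm's):

```python
# How many nodes would ntasks one-per-core tasks need, depending on
# whether the configured CPUs count means threads or cores?
def nodes_needed(ntasks, cpus, threads_per_core, cpus_are_threads):
    # Under CR_ONE_TASK_PER_CORE each task occupies a full core.
    cores_per_node = cpus // threads_per_core if cpus_are_threads else cpus
    return -(-ntasks // cores_per_node)  # ceiling division

# Theory "CPUs=48 means 48 threads": only 24 cores, so 48 tasks need 2 nodes.
print(nodes_needed(48, 48, 2, cpus_are_threads=True))   # -> 2
# But Slurm actually places the 48-task job on a single node, which only
# fits if the node really has 48 cores.
print(nodes_needed(48, 48, 2, cpus_are_threads=False))  # -> 1
```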
On 2/14/19 12:22 AM, Marcus Wagner wrote:
CPUs=96 Boards=1 SocketsPerBoard=4 CoresPerSocket=12 ThreadsPerCore=2
RealMemory=191905
That's different to what you put in your config in the original email
though. There you had:
CPUs=48 Sockets=4 CoresPerSocket=12 ThreadsPerCore=2
This config
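The mismatch Chris points out follows from the topology product CPUs = Sockets × CoresPerSocket × ThreadsPerCore. A quick sketch of the check (helper name is mine):

```python
# Consistency check: the CPUs count should equal the topology product.
def total_cpus(sockets, cores_per_socket, threads_per_core):
    return sockets * cores_per_socket * threads_per_core

# What slurmd -C reports for the node: 4 x 12 x 2 = 96 CPUs (threads).
print(total_cpus(4, 12, 2))        # -> 96
# The original slurm.conf line said CPUs=48 with the same topology,
# i.e. it counted cores only, conflicting with the 96 threads above.
print(total_cpus(4, 12, 2) // 2)   # -> 48
```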
Hi Andreas,
it might be that this is one of the bugs in Slurm 18.
I think I will open a bug report and see what they say.
Thank you very much, nonetheless.
Best
Marcus
On 2/14/19 2:36 PM, Andreas Henkel wrote:
Hi Marcus,
for us slurmd -C as well as numactl -H looked fine, too. But we're
Hi Marcus,
for us slurmd -C as well as numactl -H looked fine, too. But we're using
task/cgroup only and every job starting on a skylake node gave us
|error("task/cgroup: task[%u] infinite loop broken while trying " "to
provision compute elements using %s (bitmap:%s)", |
from src/plugins/task/cg
Hi Andreas,
as slurmd -C shows, it detects 4 NUMA nodes, taking these as sockets.
This was also the way we configured Slurm.
numactl -H clearly shows the four domains and which belongs to which socket:
node distances:
node 0 1 2 3
0: 10 11 21 21
1: 11 10 21 21
2: 21 2
Hi Marcus,
We have Skylake too and it didn't work for us. We used cgroups only, and process
binding went completely haywire with sub-NUMA enabled.
While searching for solutions I found that hwloc supports sub-NUMA only with
version > 2 (when looking for Skylake in hwloc you will get hits in versi
Hi Andreas,
On 2/14/19 8:56 AM, Henkel, Andreas wrote:
Hi Marcus,
More ideas:
CPUs doesn't always count as cores but may take the meaning of one thread, which
makes a difference.
Maybe the behavior of CR_ONE_TASK is still not solid nor properly documented
and ntasks and ntasks-per-node are honor
Hi Marcus,
More ideas:
CPUs doesn't always count as cores but may take the meaning of one thread, which
makes a difference.
Maybe the behavior of CR_ONE_TASK is still not solid nor properly documented,
and ntasks and ntasks-per-node are honored differently internally. If so, solely
using ntasks can mea
Hi Chris,
these are 96-thread nodes with 48 cores. You are right that if we set it
to 24, the job will get scheduled. But then only half of the node is
used. On the other hand, if I only use --ntasks=48, Slurm schedules all
tasks onto the same node. The hyperthread of each core is included i
Hi Andreas,
I get the same result if I set --ntasks-per-node=48 and --ntasks=48, or
96, or whatever.
What we wanted to achieve is that exactly ntasks-per-node tasks get
scheduled onto one host.
Best
Marcus
On 2/14/19 7:09 AM, Henkel, Andreas wrote:
Hi Marcus,
What just came to my mind:
On Wednesday, 13 February 2019 4:48:05 AM PST Marcus Wagner wrote:
> #SBATCH --ntasks-per-node=48
I wouldn't mind betting that if you set that to 24 it will work, and each
task will be assigned a single core with the 2 thread units on it.
All the best,
Chris
--
Chris Samuel : http://w
Hi Marcus,
What just came to my mind: if you don't set --ntasks, isn't the default just 1?
All examples I know using ntasks-per-node also set ntasks with ntasks >=
ntasks-per-node.
Best,
Andreas
> On 14.02.2019 at 06:33, Marcus Wagner wrote:
>
> Hi all,
>
> I have narrowed this down a litt
Hi all,
I have narrowed this down a little bit.
The really astonishing thing is that if I use
--ntasks=48
I can submit the job, it will be scheduled onto one host:
NumNodes=1 NumCPUs=48 NumTasks=48 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=48,mem=182400M,node=1,billing=48
but as soon as