Re: [slurm-users] sacct: error

2018-05-08 Thread Marcel Sommer
Thanks for the hint, Chris! Best regards, Marcel On 04.05.2018 at 16:06, Chris Samuel wrote: > On Friday, 4 May 2018 4:25:04 PM AEST Marcel Sommer wrote: > >> Does anyone have an explanation for this? > > I think you're asking for functionality that is only supported with slurmdbd. > > All the b

Re: [slurm-users] sacct: error

2018-05-07 Thread Eric F. Alemany
Thank you Chris, Marcus, Patrick and Ray. I guess I am still a bit confused. We will see what happens when we run a job asking for the CPUs of the cluster. _ Eric F. Alemany System Administrator

Re: [slurm-users] sacct: error

2018-05-07 Thread Chris Samuel
On Monday, 7 May 2018 5:41:27 PM AEST Marcus Wagner wrote: > To me it looks like CPUs is the synonym for hardware threads. Interesting, at ${JOB-1} we experimented with HT on a system back in 2013 and I didn't do the slurm.conf side at that time, but then you could only request physical cores a

Re: [slurm-users] sacct: error

2018-05-07 Thread Marcus Wagner
Hi Chris, this is not correct. From the slurm.conf manpage: CPUs: Number of logical processors on the node (e.g. "2"). CPUs and Boards are mutually exclusive. It can be set to the total number of sockets, cores or threads. This can be useful when you want to schedule only the cores on a hy

Re: [slurm-users] sacct: error

2018-05-06 Thread Chris Samuel
On Sunday, 6 May 2018 2:58:26 PM AEST Chris Samuel wrote: > Very very interesting - both slurmd and lscpu report 32 cores, but with > differing interpretations of the number of the layout. Meanwhile the AMD > website says these are 16 core CPUs, which means both Slurm and lscpu are > wrong! Of c

Re: [slurm-users] sacct: error

2018-05-05 Thread Chris Samuel
On Sunday, 6 May 2018 2:00:44 AM AEST Eric F. Alemany wrote: > Working on weekends - hey ? [...] This isn't my work. ;-) > It seems as the commands give different result (?) - What do you think ? Very very interesting - both slurmd and lscpu report 32 cores, but with differing interpretations

Re: [slurm-users] sacct: error

2018-05-05 Thread Eric F. Alemany
Hi Chris, Working on weekends - hey ? When I run "slurmd -C" on one of my execute nodes, I get: eric@radonc01:~$ slurmd -C NodeName=radonc01 slurmd: Considering each NUMA node as a socket CPUs=32 Boards=1 SocketsPerBoard=4 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=64402 UpTime=2-17:35:12 Al

Re: [slurm-users] sacct: error

2018-05-05 Thread Chris Samuel
On Saturday, 5 May 2018 2:45:19 AM AEST Eric F. Alemany wrote: > With Ray's suggestion I get an error message for each node. Here I am giving > you only one error message from a node. > sacct: error: NodeNames=radonc01 CPUs=32 doesn't match > Sockets*CoresPerSocket*ThreadsPerCore (16), resetting CP
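The arithmetic behind this error message can be checked by hand or with a few lines of Python. The sketch below is only an illustration, not Slurm's own parser: the function names are hypothetical, and treating a missing field as 1 is an assumption made here so that a line without ThreadsPerCore still yields a product.

```python
def parse_fields(line):
    """Collect KEY=VALUE tokens from a slurm.conf-style node line."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # keep only tokens that actually contain '='
            fields[key] = value
    return fields


def topology_cpus(fields):
    """Return Sockets * CoresPerSocket * ThreadsPerCore.

    Assumption for this sketch: a field that is absent counts as 1.
    """
    product = 1
    for key in ("Sockets", "CoresPerSocket", "ThreadsPerCore"):
        product *= int(fields.get(key, 1))
    return product


def cpus_consistent(line):
    """True when CPUs equals the product of the topology fields."""
    fields = parse_fields(line)
    return int(fields["CPUs"]) == topology_cpus(fields)


# Node line quoted in the thread: 2 * 8 * 2 = 32, which matches CPUs=32.
thread_line = ("NodeName=radonc[01-04] CPUs=32 RealMemory=64402 "
               "Sockets=2 CoresPerSocket=8 ThreadsPerCore=2")
print(cpus_consistent(thread_line))

# Hypothetical line (not from the thread) with ThreadsPerCore left out:
# the product is 2 * 8 * 1 = 16, so CPUs=32 no longer matches.
hypothetical_line = "NodeName=radonc01 CPUs=32 Sockets=2 CoresPerSocket=8"
print(cpus_consistent(hypothetical_line))
```

Running a check like this against each NodeName line before restarting the daemons may catch a mismatch earlier than the sacct error does.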

Re: [slurm-users] sacct: error

2018-05-04 Thread Eric F. Alemany
Hi Patrick, hi Ray, happy Friday! Thank you both for your quick replies. This is what I found out. With Patrick's one-liner it works fine: NodeName=radonc[01-04] CPUs=32 RealMemory=64402 Sockets=2 CoresPerSocket=8 ThreadsPerCore=2 With Ray's suggestion I get an error message for each node. Here I am g

Re: [slurm-users] sacct: error

2018-05-04 Thread Patrick Goetz
I concur with this. Make sure your nodes are in the /etc/hosts file on the SMS. Also, if you name them by base + numerical sequence, you can configure them with a single line in Slurm (using the example below): NodeName=radonc[01-04] CPUs=32 RealMemory=64402 Sockets=2 CoresPerSocket=8 Thread
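To see which hostnames a bracketed range such as radonc[01-04] stands for, the expansion can be sketched in Python. This is a minimal illustration that handles only a single zero-padded numeric range; the function name is hypothetical and it does not cover the full hostlist syntax that Slurm accepts. On a machine with Slurm installed, `scontrol show hostnames` does the real expansion.

```python
import re

# Matches prefix[LO-HI]suffix with one numeric range, e.g. radonc[01-04].
_RANGE = re.compile(
    r"^(?P<prefix>[^\[\]]*)\[(?P<lo>\d+)-(?P<hi>\d+)\](?P<suffix>[^\[\]]*)$"
)


def expand_simple_range(expr):
    """Expand 'base[NN-MM]' into a list of names, keeping zero padding.

    Expressions without a bracketed range are returned unchanged
    as a one-element list.
    """
    match = _RANGE.match(expr)
    if match is None:
        return [expr]
    lo, hi = match.group("lo"), match.group("hi")
    width = len(lo)  # preserve the padding width of the lower bound
    return [
        f"{match.group('prefix')}{n:0{width}d}{match.group('suffix')}"
        for n in range(int(lo), int(hi) + 1)
    ]


print(expand_simple_range("radonc[01-04]"))
# ['radonc01', 'radonc02', 'radonc03', 'radonc04']
```

Each expanded name is one that would need to resolve, for example through /etc/hosts as suggested above.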

Re: [slurm-users] sacct: error

2018-05-03 Thread Raymond Wan
Hi Eric, On Fri, May 4, 2018 at 6:04 AM, Eric F. Alemany wrote: > # COMPUTE NODES > NodeName=radonc[01-04] NodeAddr=10.112.0.5 10.112.0.6 10.112.0.14 > 10.112.0.16 CPUs=32 RealMemory=64402 Sockets=2 CoresPerSocket=8 > ThreadsPerCore=2 State=UNKNOWN > PartitionName=debug Nodes=radonc[01-04] Def

[slurm-users] sacct: error

2018-05-03 Thread Eric F. Alemany
Greetings, I installed SLURM on Ubuntu 18.04 and edited the slurm.conf file. I ran “sacct” and got the following error message: sacct sacct: error: Parse error in file /etc/slurm-llnl/slurm.conf line 166: " 10.112.0.6 10.112.0.14 10.112.0.16 CPUs=32 RealMemory=64402 Sockets=2 CoresPerSocket=8 ThreadsPerC
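The text quoted in the parse error starts at the second IP address, which suggests that the space-separated addresses after NodeAddr= could not be read as KEY=VALUE pairs. A small Python sketch can flag such tokens in a node line. It is an illustration only: the function name is hypothetical and the logic is a simple whitespace split, not Slurm's actual parser.

```python
def stray_tokens(line):
    """Return whitespace-separated tokens that are not KEY=VALUE pairs.

    Anything after a '#' is treated as a comment and ignored.
    """
    content = line.split("#", 1)[0]
    return [token for token in content.split() if "=" not in token]


# Node line as quoted elsewhere in the thread, joined onto one line.
node_line = (
    "NodeName=radonc[01-04] NodeAddr=10.112.0.5 10.112.0.6 10.112.0.14 "
    "10.112.0.16 CPUs=32 RealMemory=64402 Sockets=2 CoresPerSocket=8 "
    "ThreadsPerCore=2 State=UNKNOWN"
)
print(stray_tokens(node_line))
# ['10.112.0.6', '10.112.0.14', '10.112.0.16']
```

An empty result for a line means every token has the KEY=VALUE shape; it does not by itself mean that Slurm will accept the values.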