Your slurm.conf line doesn't specify the node's physical memory:
NodeName=ozd2485u Gres=gpu:2 Sockets=2 CoresPerSocket=14
ThreadsPerCore=2 State=UNKNOWN
See "man slurm.conf":
RealMemory
Size of real memory on the node in megabytes (e.g.
"2048"). The default value is 1.
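For illustration, a minimal sketch of the same node definition with RealMemory added. The value 64000 is only a placeholder assumption, not the real memory of this machine; running "slurmd -C" on the compute node prints the hardware configuration that Slurm detects, including a RealMemory value that can be copied into slurm.conf. The other parameters are left exactly as posted.

```
# slurm.conf (sketch): RealMemory is in megabytes.
# 64000 is a hypothetical placeholder; use the value reported by "slurmd -C".
NodeName=ozd2485u Gres=gpu:2 Sockets=2 CoresPerSocket=14 ThreadsPerCore=2 RealMemory=64000 State=UNKNOWN
```

After changing the node definition, the same slurm.conf should be present on both the controller and the compute node, and restarting slurmctld and slurmd is the safe way to make sure the new value is picked up.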
On
I am attaching my slurm.conf file. Can you please help me find the issue?
On Tue, Apr 9, 2019 at 12:08 PM Ole Holm Nielsen
wrote:
> On 09-04-2019 08:33, sudhagar s wrote:
> > Thanks Ole,
> >
> > when i give "scontrol show node" it list down the details. where i can
> > see RealMemory=1 is this wil
I didn't install any additional GPU card. I run this Z840 workstation with
the default GPU (P2000), which is used for display (VGA).
Could this be the reason for the error?
On Tue, Apr 9, 2019 at 12:01 PM Ole Holm Nielsen
wrote:
> On 09-04-2019 08:25, sudhagar s wrote:
> > Thanks For the respons
On 09-04-2019 08:33, sudhagar s wrote:
Thanks Ole,
When I run "scontrol show node" it lists the details, where I can
see RealMemory=1. Will this be a problem?
In your "scontrol show node" image I read RealMemory=1 (units of MB) and
mem=1M. I think you configured slurm.conf incorrectl
Thanks Ole,
When I run "scontrol show node" it lists the details, where I can see
RealMemory=1. Will this be a problem?
On Tue, Apr 9, 2019 at 11:53 AM Ole Holm Nielsen
wrote:
> On 09-04-2019 07:37, sudhagar s wrote:
> > Hi, Iam newbee in slurm. trying to setup a cluster for ML trainin
On 09-04-2019 08:25, sudhagar s wrote:
Thanks for the response.
Here is my node and partition information:
Well, 1 MB of real memory in the node is not a lot :-) This reminds me
of the very old days where PCs had 640 kB RAM...
On Tue, Apr 9, 2019 at 11:53 AM Ole Holm Nielsen
mailto:ole.h.
On 09-04-2019 07:37, sudhagar s wrote:
Hi, I am a newbie in Slurm, trying to set up a cluster for ML training
purposes. I created a control node and a compute node; both are up and running.
When I enter "srun -N 1 hostname" it says
" srun error memory specification can not be satisfied"
"unable to allo
Hi, I am a newbie in Slurm, trying to set up a cluster for ML training purposes.
I created a control node and a compute node; both are up and running.
When I enter "srun -N 1 hostname" it says
" srun error memory specification can not be satisfied"
"unable to allocate resources: requested node configurati