So there is a --chdir option for sbatch too. This implies that the same
path has to exist on all nodes, which is something to keep in mind when
creating a Slurm cluster.
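For example (a sketch with made-up paths; /shared/jobs/test would have to
exist on every node, e.g. on an NFS mount):

    sbatch --chdir=/shared/jobs/test test_job.sh

or, equivalently, as a directive inside the batch script itself:

    #SBATCH --chdir=/shared/jobs/test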
On Tue, Jan 21, 2020 at 12:58 PM William Brown wrote:
> The srun man page says:
>
> When initiating remote processes *srun* will propagate the current
> working directory, unless --chdir=<path> is specified, in which case
> path will become the working directory for the remote processes.
The srun man page says:
When initiating remote processes srun will propagate the current working
directory, unless --chdir=<path> is specified, in which case path will become
the working directory for the remote processes.
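A quick way to see this behavior (assuming a two-node allocation and that
the paths exist on the compute nodes):

    cd /shared/data && srun -N2 pwd    # remote tasks inherit /shared/data
    srun -N2 --chdir=/tmp pwd          # remote tasks run in /tmp instead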
William
From: slurm-users On Behalf Of Dean Schulze
Sent: 21 January 2020
I run this sbatch script from the controller:
===
#!/bin/bash
#SBATCH --job-name=test_job
#SBATCH --mail-type=NONE        # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --ntasks=1
#SBATCH --mem=1gb
#SBATCH --time=00:05:00 # Time limit hrs:min:sec
#SBATCH --output=test_job_
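A script like this is submitted and checked with (sketch; assuming it is
saved as test_job.sh):

    sbatch test_job.sh            # prints: Submitted batch job <jobid>
    squeue -u $USER               # list your pending and running jobs
    scontrol show job <jobid>     # details, including WorkDir and StdOut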
Thank you, thank you, thank you. It was the firewall on CentOS 7. Once I
disabled that it worked.
For anyone else who runs into this issue here is how to disable the
firewall on CentOS 7:
https://linuxize.com/post/how-to-stop-and-disable-firewalld-on-centos-7/
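In short, what that page amounts to (opening Slurm's ports in firewalld
instead of disabling it entirely is the more cautious option):

    sudo systemctl stop firewalld       # stop the running firewall
    sudo systemctl disable firewalld    # keep it from starting at boot
    sudo systemctl status firewalld     # confirm: inactive (dead), disabled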
On Tue, Jan 21, 2020 at 7:24 AM
>
>
> are you sure your 24-core nodes have 187 TERABYTES of memory?
>
> As you yourself cited:
>
> Size of real memory on the node in megabytes
>
> The settings in your slurm.conf:
>
> NodeName=node[001-003] CoresPerSocket=12 RealMemory=196489092 Sockets=2
> Gres=gpu:1
>
> so, your machines should have roughly 187 TB of RAM according to that
> setting, which is presumably not the case.
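(A general aside: running slurmd -C on a compute node prints that node's
hardware as slurm.conf-style parameters, with RealMemory already in
megabytes, which is an easy way to get the correct value. The output shown
below is illustrative only.)

    slurmd -C    # run on node001
    # NodeName=node001 CPUs=24 Boards=1 SocketsPerBoard=2 CoresPerSocket=12
    # ThreadsPerCore=1 RealMemory=191879 ...
    free -m      # cross-check; the "total" column is in megabytes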
On 1/21/2020 12:32 AM, Chris Samuel wrote:
On 20/1/20 3:00 pm, Dean Schulze wrote:
There's either a problem with the source code I cloned from GitHub,
or there is a problem when the controller runs on Ubuntu 19 and the
node runs on CentOS 7.7. I'm downgrading to a stable 19.05 build to
see if that fixes it.