On 6/3/22 11:39 am, Ransom, Geoffrey M. wrote:
Adding “--export=NONE” to the job avoids the problem, but I’m not seeing
a way to change this default behavior for the whole cluster.
There's an SBATCH_EXPORT environment variable that you could set for
users to force that (at $JOB-1 we used to d
Hi,
We're using a CLI filter to do this, but it's trickier than just
`--export=NONE`. For an srun inside an sbatch job, you want `--export=ALL`
again, because MPI will break otherwise.
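As a rough illustration of the policy described above (not the filter itself), the decision can be sketched in shell. The helper name `pick_export_default` and the file names `job.sh` and `./mpi_app` are hypothetical. The sketch assumes that `SLURM_JOB_ID` is present only inside a running job or allocation, which is how it tells a submission apart from a job step.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the original filter: pick a default
# value for --export depending on where the command is run.
# Assumption: SLURM_JOB_ID is set only inside a running job or allocation.
pick_export_default() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        # Inside a job: steps started by srun should inherit the job
        # environment, otherwise MPI launches may break.
        echo "ALL"
    else
        # At submission time: do not propagate the submitting shell's
        # environment into the job.
        echo "NONE"
    fi
}

# Illustrative use (commented out so the sketch runs without Slurm;
# job.sh and ./mpi_app are placeholder names):
# sbatch --export="$(pick_export_default)" job.sh      # on a login node: NONE
# srun   --export="$(pick_export_default)" ./mpi_app   # inside job.sh:   ALL
```

A CLI filter applies the same kind of decision centrally, so users do not have to remember it per command.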
We have this in our cli filter:
function slurm_cli_pre_submit(options, pack_offset)
local default_e
Well, after increasing the slurmctld log level to debug, we did find some
errors related to munge, such as:
[2022-06-04T15:17:21.258] debug: auth/munge: _decode_cred: Munge decode
failed: Failed to connect to "/run/munge/munge.socket.2": Resource temporarily
unavailable (retrying ...)
But when test m