Re: [slurm-users] How do you make --export=NONE the default behavior for our cluster?

2022-06-04 Thread Christopher Samuel
On 6/3/22 11:39 am, Ransom, Geoffrey M. wrote: Adding “--export=NONE” to the job avoids the problem, but I’m not seeing a way to change this default behavior for the whole cluster. There's an SBATCH_EXPORT environment variable that you could set for users to force that (at $JOB-1 we used to d
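
A minimal sketch of the SBATCH_EXPORT approach mentioned above, assuming the variable is set through a site-wide shell profile snippet. The file name and location are hypothetical and not taken from the thread. sbatch reads SBATCH_EXPORT as the equivalent of `--export`, and an explicit `--export` on the command line still takes precedence.

```shell
# Hypothetical file: /etc/profile.d/slurm_export.sh (name and path are assumptions).
# Give sbatch a default of --export=NONE, but only when the user
# has not already set SBATCH_EXPORT in their own environment.
: "${SBATCH_EXPORT:=NONE}"
export SBATCH_EXPORT
```

This only reaches shells that source the profile snippet, and it affects sbatch alone, so srun calls inside a batch script are not changed by it.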

Re: [slurm-users] How do you make --export=NONE the default behavior for our cluster?

2022-06-04 Thread Ward Poelmans
Hi, We're using a cli filter for doing this. But it's trickier than just `--export=NONE`. For an srun inside an sbatch, you want `--export=ALL` again because MPI will break otherwise. We have this in our cli filter: function slurm_cli_pre_submit(options, pack_offset) local default_e

[slurm-users] Re: what is the possible reason for secondary slurmctld node not allocate job after takeover?

2022-06-04 Thread taleintervenor
Well, after increasing the slurmctld log level to debug, we did find some munge-related errors, such as: [2022-06-04T15:17:21.258] debug: auth/munge: _decode_cred: Munge decode failed: Failed to connect to "/run/munge/munge.socket.2": Resource temporarily unavailable (retrying ...) But when test m