[slurm-users] How can I prevent a specific job from being preempted?

2021-09-13 Thread 顏文
Dear slurm users, I have some specific jobs that must not be terminated, otherwise they need to be rerun from the beginning. Can we simply apply some settings (either as user or administrator) so that these jobs will not be preempted? Thanks. With regards, Peter
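One possible approach, sketched here under the assumption that the cluster uses partition-based preemption (PreemptType=preempt/partition_prio), is to run the sensitive jobs in a partition whose per-partition PreemptMode is OFF; the partition name and node list below are placeholders:
# slurm.conf sketch: jobs in the "protected" partition are never preempted,
# because the partition-level PreemptMode=OFF overrides the cluster-wide setting.
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE
PartitionName=protected Nodes=node[01-06] PriorityTier=10 PreemptMode=OFF
With QOS-based preemption the analogous idea is to give these jobs a QOS that no other QOS lists in its Preempt= field.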

[slurm-users] FreeMem is not equal to (RealMem - AllocMem)

2021-09-13 Thread Pavel Vashchenkov
Hi all, I have a cluster with 6 nodes, 2 GPUs per node, and 256 GB of RAM per node. I'm interested in the status of one node (its name is node05). There is a job on this node (38 cores, 4 GB per core, 152 GB used in total on the node). When I run scontrol show node node05, I get the following output:
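For reference, the two views can be compared as sketched below. FreeMem is sampled by slurmd from the node's operating system (essentially free memory from /proc/meminfo, so it reflects real process usage and page cache), while AllocMem is only Slurm's bookkeeping of what jobs have been granted, so the two numbers are not expected to add up to RealMemory:
# Sketch: Slurm's accounting vs. the OS view on node05
scontrol show node node05 | grep -oE '(RealMemory|AllocMem|FreeMem)=[0-9]+'
ssh node05 grep -E 'MemTotal|MemFree|MemAvailable' /proc/meminfo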

[slurm-users] Slurm Job Error Output is Missing

2021-09-13 Thread Maria Semple
Hi all, I have some jobs which write error messages to stderr, and I've noticed that the stderr output is not being written to file. Here is a simple reproduction case, test.sh:
#!/bin/bash
echo "out"
echo "err" >&2
echo "err 2" 1>&2
>&2 echo "err 3"
echo "err 4" >/dev/stderr
echo "err 5" 1>/dev
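One thing worth checking (a sketch, not necessarily the cause here): unless --error is given, sbatch sends both stdout and stderr of the batch script to the same --output file, so directing stderr to its own file looks like the hypothetical wrapper below (test_%j.err is just an example pattern):
#!/bin/bash
#SBATCH --output=test_%j.out   # stdout goes here (and stderr too, if --error is omitted)
#SBATCH --error=test_%j.err    # send stderr to its own file
./test.sh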

Re: [slurm-users] max_script_size

2021-09-13 Thread Brian Andrus
Do they realize they can chain scripts? Have the script submitted to sbatch be something like:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --job-name="Test Job"
echo "do something"
/opt/path/script2.sh
Folks often use template or wrapper scripts to facilitate administering a cluster. This may
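A possible shape for the chained script (the path /opt/path/script2.sh above is just an example): the bulky logic lives there, so the batch script actually passed to sbatch stays well under max_script_size:
#!/bin/bash
# /opt/path/script2.sh (hypothetical): this file is not sent through sbatch,
# so its size does not count against max_script_size; only the small wrapper does.
set -euo pipefail
echo "running the real workload on $(hostname)"
# ... arbitrarily long logic here ...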

[slurm-users] max_script_size

2021-09-13 Thread Ozeryan, Vladimir
max_script_size=# Specify the maximum size of a batch script, in bytes. The default value is 4 megabytes. Larger values may adversely impact system performance. I have users who've requested an increase to this setting; what system performance issues might arise from changing that value?
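For context, max_script_size is set through SchedulerParameters in slurm.conf; a sketch raising it to 16 MB (the value is an example only) would be:
# slurm.conf sketch: batch scripts are held in slurmctld memory and state files
# (and can be stored with accounting records), so very large values inflate
# RPC sizes and controller memory use.
SchedulerParameters=max_script_size=16777216
# apply with: scontrol reconfigure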