Re: [slurm-users] Why every job will sleep 100000000

2022-11-04 Thread Jeffrey T Frey
If you examine the process hierarchy, that "sleep 1" process is probably the child of a "slurmstepd: [.extern]" process. This is a housekeeping step launched for the job by slurmd -- in older Slurm releases it would handle the X11 forwarding, for example. It should have no impact on th
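
One way to check the process hierarchy described above is to pair each sleep process with the command line of its parent. The following is only a minimal sketch: the helper name sleep_parents is hypothetical, and it assumes a procps-style ps that accepts "-e -o pid=,ppid=,args=". It only reads text on stdin, so it can be tried on saved ps output as well.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): pair each "sleep" process with
# the command line of its parent process.
# Input on stdin: one "PID PPID ARGS..." line per process, in the format
# printed by: ps -e -o pid=,ppid=,args=
sleep_parents() {
  awk '
    {
      pid = $1; ppid = $2
      # Rebuild the command line from fields 3..NF.
      cmd = $3
      for (i = 4; i <= NF; i++) cmd = cmd " " $i
      args[pid] = cmd
      parent[pid] = ppid
    }
    END {
      for (pid in args)
        if (args[pid] ~ /^sleep /)
          printf "%s (pid %s) <- %s (pid %s)\n",
                 args[pid], pid, args[parent[pid]], parent[pid]
    }'
}

# Usage on a compute node (assumes procps ps is installed):
#   ps -e -o pid=,ppid=,args= | sleep_parents
```

If the parent shown for the sleep process is a "slurmstepd: [<jobid>.extern]" process, that matches the explanation given in this reply.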

[slurm-users] Why every job will sleep 100000000

2022-11-04 Thread GHui
I found a sleep process, run by root, whenever I submit a job, and it sleeps for 1 seconds. Sometimes my job hangs: the job state is "R", though it runs nothing. The job script looks like the following:
--
#!/bin/bash
#SBATCH -J sub
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -p vpartition
--

[slurm-users] hierarchies/dependencies between QoSs

2022-11-04 Thread Sebastian Schmutzhard-Höfler
Hi, is there a way to have hierarchies/dependencies between different QoSs, apart from preemption? Is it possible to change the QoS of a running job? We have qos=gpus2, qos=gpus4 and qos=gpus6 (allowing a certain maximum total number of GPUs for the user). I want that the running/pending

[slurm-users] Why job memory request may be automatically set by Slurm to RealMemory of some node?

2022-11-04 Thread Taras Shapovalov
Hey, I noticed a weird behavior in Slurm 21 and 22. When the following conditions are satisfied, Slurm implicitly sets the job memory request equal to the RealMemory of some node (perhaps the first node that satisfies the job's other requests, but this is not documented, or I could not find it in the documen