[slurm-users] Re: Large memory jobs stuck Pending. Should use --time parameter?

2025-05-06 Thread Gerhard Strangar via slurm-users
Mike via slurm-users wrote:
> None of our users specify a --time param

If that results in a runtime of more than 28 days, the scheduler does
not reserve slots for those jobs.
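
A minimal sketch of the remedy implied above: give each job an explicit
wall-time limit so the scheduler can plan a reservation for it. The
memory request and 7-day limit are illustrative placeholders, not
values from the thread:

    #!/bin/bash
    #SBATCH --mem=500G           # large-memory job, as in the subject
    #SBATCH --time=7-00:00:00    # without --time, the partition default
                                 # applies, which may be far longer
    srun ./solver inputfile

On the admin side, setting DefaultTime on the partition in slurm.conf
gives a finite limit to jobs that omit --time.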

[slurm-users] Change a job from --exclusive to --exclusive=user

2024-09-18 Thread Gerhard Strangar via slurm-users
Hello,

is it possible to change a pending job from --exclusive to
--exclusive=user? I tried scontrol update jobid=... oversubscribe=user,
but it seems to only accept yes or no.

Gerhard
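
For reference, the attempt described above spelled out (the job id is a
placeholder; per the post, the field only accepts yes or no):

    # try to relax a pending job from --exclusive to --exclusive=user
    scontrol update jobid=12345 oversubscribe=user

Since --exclusive=user is a submission-time option, cancelling and
resubmitting with sbatch --exclusive=user is the obvious fallback if
scontrol cannot make the change; that is an assumption, not advice from
the thread.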

[slurm-users] Re: Limit GPU depending on type

2024-06-13 Thread Gerhard Strangar via slurm-users
Gestió Servidors via slurm-users wrote:
> What I want is that users could use all of them, but simultaneously a
> user could only use one of the RTX3080.

How about two partitions: one contains only the RTX3080, using the QoS
MaxTRESPerUser=gres/gpu=1, and another one with all the other GPUs not havin
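
A sketch of that two-partition layout, assuming hypothetical node names
and a QoS called rtx3080; only MaxTRESPerUser=gres/gpu=1 comes from the
post itself:

    # create the QoS that caps each user at one GPU
    sacctmgr add qos rtx3080 set MaxTRESPerUser=gres/gpu=1

    # slurm.conf (fragment); node names are placeholders
    PartitionName=rtx3080 Nodes=gpunode[01-02] QOS=rtx3080 State=UP
    PartitionName=gpu     Nodes=gpunode[03-08] State=UP

Attaching the QoS to the partition makes the per-user GPU cap apply to
every job submitted there, while the second partition stays unlimited.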

[slurm-users] Avoiding fragmentation

2024-04-08 Thread Gerhard Strangar via slurm-users
Hi,

I'm trying to figure out how to deal with a mix of few- and many-CPU
jobs. By that I mean most jobs use 128 CPUs, but sometimes there are
jobs with only 16. As soon as that job with only 16 is running, the
scheduler splits the next 128-CPU jobs into 96+16 each, instead of
assigning a full 128
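
One common way to keep the large jobs unfragmented (an assumption; the
truncated post does not say which fix was chosen) is to make them claim
whole nodes:

    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --ntasks=128
    #SBATCH --exclusive      # take the node outright, so a leftover
                             # 16-CPU hole cannot split this job 96+16
    srun ./solver inputfile

The trade-off is idle cores whenever a node is held exclusively by a
job that does not fill it.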

[slurm-users] Re: Suggestions for Partition/QoS configuration

2024-04-04 Thread Gerhard Strangar via slurm-users
thomas.hartmann--- via slurm-users wrote:
> My idea was to basically have three partitions:
>
> 1. PartitionName=short MaxTime=04:00:00 State=UP Nodes=node[01-99]
>    PriorityTier=100
> 2. PartitionName=long_safe MaxTime=14-00:00:00 State=UP Nodes=node[01-50]
>    PriorityTier=100
> 3. PartitionNam
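
For context, a general illustration of how PriorityTier interacts with
overlapping partitions (placeholder values, not the poster's actual
configuration): jobs in the partition with the higher PriorityTier are
considered first on the nodes the partitions share.

    # slurm.conf (fragment); "short" outranks "long_safe" on shared nodes
    PartitionName=short     MaxTime=04:00:00    Nodes=node[01-99] PriorityTier=200 State=UP
    PartitionName=long_safe MaxTime=14-00:00:00 Nodes=node[01-50] PriorityTier=100 State=UP

With equal tiers, as in the quoted proposal, neither partition's jobs
take precedence on the shared nodes.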

[slurm-users] Re: "Optimal" slurm configuration

2024-02-26 Thread Gerhard Strangar via slurm-users
Max Grönke via slurm-users wrote:
> (b) introduce a "small" partition for the <4h jobs with higher priority
> but we're unsure if this will block all the larger jobs to run...

Just limit the number of CPUs in that partition.

Gerhard
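
One way to implement such a limit (an assumption; the reply does not
name a mechanism) is a partition QoS with an aggregate CPU cap; names
and numbers are placeholders:

    # cap all jobs running in the "small" partition at 256 CPUs total
    sacctmgr add qos smallcap set GrpTRES=cpu=256

    # slurm.conf (fragment)
    PartitionName=small Nodes=node[01-99] MaxTime=04:00:00 QOS=smallcap State=UP

GrpTRES on the QoS bounds the sum over all running jobs that use it, so
short jobs keep their priority boost without being able to fill the
whole cluster.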

[slurm-users] Memory used per node

2024-02-09 Thread Gerhard Strangar via slurm-users
Hello,

I'm wondering if there's a way to tell how much memory my job is using
per node. I'm doing

#SBATCH -n 256
srun solver inputfile

When I run sacct -o maxvmsize, the result apparently is the maximum VSZ
of the largest solver process, not the maximum of the sum of them all
(unlike when calli
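
Commands that get closer to an answer (a sketch; the job id is a
placeholder, and none of these fields sums tasks per node by itself):

    # while the job runs: per-task peaks and the node they occurred on
    sstat --allsteps -j 12345 --format=JobID,MaxRSS,MaxRSSNode,MaxVMSize

    # after completion, from the accounting database
    sacct -j 12345 --format=JobID,MaxRSS,MaxRSSNode,MaxVMSize,MaxVMSizeNode

MaxRSS and MaxVMSize report the single largest task, which matches the
behavior described above; a true per-node sum still has to be computed
by hand, e.g. from cgroup memory stats on the node.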