Perhaps you could provide the exact error message or log output from a
failed attempt.
Brian Andrus
On 4/29/2025 7:56 AM, milad--- via slurm-users wrote:
My partition definition is super simple:
```
PartitionName=t4 Nodes=slurm-t4-[1-30] DEFAULT=YES MaxTime=INFINITE State=UP DefCpuPerGPU=16 DefMemPerGPU=14350
PartitionName=a100-40 Nodes=slurm-a100-40gb-[1-30] MaxTime=INFINITE State=UP DefCpuPerGPU=12 DefMemPerGPU=85486
PartitionName=a100-80 Node
```
Perhaps some of the partitions' defaults (maybe even implicit ones) are to blame?
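One way to check this is to compare the settings side by side. The following is only a rough sketch: the helper names `parse_partition` and `differing_settings` are made up for illustration, and it sees only the values written explicitly in the two complete partition lines quoted in this thread. Implicit defaults do not appear in the config text, so the output of `scontrol show partition` would be the place to look for those.

```python
from __future__ import annotations


def parse_partition(line: str) -> dict[str, str]:
    """Split one slurm.conf-style partition line into a {key: value} dict."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore anything that is not key=value
            fields[key.lower()] = value  # parameter names are case-insensitive
    return fields


def differing_settings(lines: list[str]) -> dict[str, dict[str, str | None]]:
    """Return the settings whose values are not identical across partitions."""
    parts = {p["partitionname"]: p for p in map(parse_partition, lines)}
    # PartitionName and Nodes always differ, so leave them out of the comparison.
    keys = {k for p in parts.values() for k in p} - {"partitionname", "nodes"}
    diff = {}
    for key in sorted(keys):
        values = {name: p.get(key) for name, p in parts.items()}
        if len(set(values.values())) > 1:
            diff[key] = values
    return diff


# The two complete partition lines quoted in this thread, unwrapped.
lines = [
    "PartitionName=t4 Nodes=slurm-t4-[1-30] DEFAULT=YES MaxTime=INFINITE "
    "State=UP DefCpuPerGPU=16 DefMemPerGPU=14350",
    "PartitionName=a100-40 Nodes=slurm-a100-40gb-[1-30] MaxTime=INFINITE "
    "State=UP DefCpuPerGPU=12 DefMemPerGPU=85486",
]

for key, values in differing_settings(lines).items():
    print(key, values)
```

A value of `None` means the setting was not written on that partition's line, which is exactly where an implicit default would apply.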
On Mon, Apr 28, 2025 at 7:56 AM milad--- via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> Update: I also noticed that specifying --ntasks makes a difference when
> --gpus is present.
>
> if I have two partitions a
Update: I also noticed that specifying --ntasks makes a difference when --gpus
is present.
If I have two partitions, a100 and h100, that both have free GPUs:
✅ h100 specified first in -p: works
sbatch -p h100,a100 --gpus h100:1 script.sh
❌ h100 specified second: doesn't work
sbatch -p a100,h100 --