Cool, node weights are useful. I will split this big partition into two
partitions: one for small jobs, one for 8-GPU jobs. This will also help.
On Wed, Aug 11, 2021 at 3:57 AM Brian Andrus wrote:
You may also want to look at node weights. By setting them at different
levels for each node, you can give a preference to one over the other.
That may be a way to do a "try this node first" method of job placement.
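As a rough illustration of the node-weight idea, a slurm.conf fragment might look like the sketch below. The node names nodeA and nodeB and the weight values are assumptions for illustration, not taken from this thread. All else being equal, Slurm allocates the nodes with the lowest Weight that can satisfy the job, so nodeA would be tried first here.

```
# Hypothetical node definitions: names and weights are placeholders.
# Lower Weight is preferred, so jobs go to nodeA while it has free resources.
NodeName=nodeA Gres=gpu:8 Weight=10
NodeName=nodeB Gres=gpu:8 Weight=20
```

Weights only express a preference among nodes that can run the job; a job still goes to nodeB once nodeA lacks any required resource.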
Brian Andrus
On 8/10/2021 9:19 AM, Jack Chen wrote:
Thanks for your reply! It's certain that Slurm will not place small jobs on
the same node if resources are not available. But I'm using default values in
my case; the job command is: srun -n 1 --cpus-per-task=2 --gres=gpu:1 'sleep
12000'.
When I submit another 8 one gpu jobs, they can run both on node A an
You may want to look at your resources. If the memory allocation adds up
such that there isn't enough left for any job to run, it won't matter
that there are still GPUs available.
The same goes for any other resource (CPUs, cores, etc.).
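To make the arithmetic concrete, here is a small sketch with made-up numbers (the memory sizes are assumptions, not values from this cluster). It shows how memory can run out on a node before its GPUs do:

```shell
# Hypothetical numbers for illustration only.
node_mem_mb=65536    # total memory configured for the node
job_mem_mb=10240     # memory charged to each 1-GPU job
gpus_per_node=8      # GPUs on the node

# How many jobs fit if memory is the only limit (integer division).
fit_by_mem=$(( node_mem_mb / job_mem_mb ))

# The node runs whichever is smaller: the memory limit or the GPU count.
jobs_on_node=$(( fit_by_mem < gpus_per_node ? fit_by_mem : gpus_per_node ))

echo "jobs that fit on one node: $jobs_on_node"
```

With these example values only 6 of the 8 GPUs can be used before memory is exhausted, so the remaining jobs would land on the next node.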
Brian Andrus
On 8/10/2021 8:07 AM, Jack Chen wrote:
Does anyone have any ideas on this?
On Fri, Aug 6, 2021 at 2:52 PM Jack Chen wrote:
> I'm using Slurm 15.08.11. When I submit several 1-GPU jobs, Slurm doesn't
> allocate nodes using a compact strategy. Does anyone know how to solve this?
> Will upgrading to the latest Slurm version help?
Hi.
Maybe your jobs are requesting so much RAM (or other resources) that, after
6 other jobs, it is no longer available on the first node?
Try checking with scontrol show node <nodename>.
BYtE,
Diego
On 06/08/2021 08:52, Jack Chen wrote:
I'm using Slurm 15.08.11. When I submit several 1-GPU jobs, Slurm doesn't
allocate nodes using a compact strategy. Does anyone know how to solve this?
Will upgrading to the latest Slurm version help?
For example, there are two nodes, A and B, with 8 GPUs per node. I submitted
eight 1-GPU jobs, slurm will allocate fir