Dear Slurm Users,

One of my cluster users would like to run a Ray cluster on Slurm. I noticed that the batch script example requires running the "srun" command on a compute node that is already allocated: https://docs.ray.io/en/latest/cluster/examples/slurm-template.html#slurm-template
This is the first time I have seen or heard of this type of usage, and I am having trouble wrapping my head around it. Is there anything wrong or unusual about it? My understanding is that this would allocate some resources on other nodes. Would Slurm still enforce limits properly (for example, "qos" or "partition" limits)?

Kind Regards
--
Kamil Wilczek [https://keys.openpgp.org/] [D415917E84B8DA5A60E853B6E676ED061316B69B]