After a bit more testing I can answer my original question: I was just
too impatient. When the ResumeProgram returns a nonzero exit code,
SLURM doesn't taint the node, i.e., it tries to resume the node again
after a while. Exactly what I want! :-)
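
For anyone finding this thread later, here is a rough sketch of what such a ResumeProgram could look like in Python. It is not my actual script. The names resume_nodes and create_instance are made up for illustration, and create_instance stands in for whatever call your cloud provider's SDK offers. The only SLURM-specific pieces are that the ResumeProgram receives the node names as a hostlist expression in its first argument and that `scontrol show hostnames` expands that expression.

```python
import subprocess
from typing import Callable, Iterable, List


def expand_hostlist(hostlist: str) -> List[str]:
    """Expand a SLURM hostlist expression (e.g. "node[1-3]") into node names.

    Relies on `scontrol show hostnames`, so scontrol must be on the PATH.
    """
    result = subprocess.run(
        ["scontrol", "show", "hostnames", hostlist],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.split()


def resume_nodes(nodes: Iterable[str],
                 create_instance: Callable[[str], None]) -> int:
    """Try to start one instance per node and return a process exit code.

    create_instance is a hypothetical callable (an assumption, not a real
    API) that raises an exception when the provider cannot deliver the
    instance. Returns 0 if every node started, 1 otherwise.
    """
    failed = []
    for node in nodes:
        try:
            create_instance(node)
        except Exception:
            # Remember the failure but keep going with the other nodes.
            failed.append(node)
    return 1 if failed else 0
```

In the real script you would finish with something like `sys.exit(resume_nodes(expand_hostlist(sys.argv[1]), my_create_function))`, so that a failed instance creation reaches SLURM as a nonzero exit code.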
@Lachlan Musicman: My slurm.conf Node and Partition
On 29 July 2018 at 04:32, Felix Wolfheimer
wrote:
> I'm experimenting with SLURM Elastic Compute on a cloud platform. I'm
> facing the following situation: Let's say, SLURM requests that a compute
> instance is started. The ResumeProgram tries to create the instance, but
> doesn't succeed because the cloud provider can't provide the instance type
> at this