On Tue, Aug 1, 2023 at 3:27 PM Daniel Letai <d...@letai.org.il> wrote:
> The other OTHER approach might be to use some epilog (or possibly 
> epilogslurmctld) to log exit codes for first 20 tasks in each array, and 
> cancel the array if non-zero. This is a global approach which will affect all 
> job arrays, so might not be appropriate for your use case.

You can set up a task prolog/epilog. Just test for the error condition
in the task epilog and then cancel your array if need be.

https://slurm.schedmd.com/prolog_epilog.html

I've not tried it, and I don't know how it interacts with job arrays, but it might work.
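
For illustration, here is a rough, untested sketch of what such an epilog script could look like in Python. It reads SLURM_ARRAY_JOB_ID and SLURM_ARRAY_TASK_ID from the environment and runs scancel on the array's job ID when an early task fails. Several things here are assumptions rather than anything confirmed in this thread: that the exit status is visible to the script as SLURM_JOB_EXIT_CODE2 in "<exit>:<signal>" form (check the page linked above for which variables your epilog type actually receives), that array indices start at 0 so the "first 20 tasks" are IDs 0 to 19, and the helper name array_to_cancel, which is made up for this example.

```python
#!/usr/bin/env python3
"""Hypothetical epilog sketch: cancel a job array when an early task fails."""
import os
import subprocess

# Assumption: array indices start at 0, so the first 20 tasks are IDs 0..19.
CHECKED_TASKS = 20


def array_to_cancel(env, checked_tasks=CHECKED_TASKS):
    """Return the array job ID to cancel, or None if nothing should be done."""
    array_job_id = env.get("SLURM_ARRAY_JOB_ID")
    task_id = env.get("SLURM_ARRAY_TASK_ID")
    if not array_job_id or not task_id:
        return None  # not part of a job array, leave it alone
    if int(task_id) >= checked_tasks:
        return None  # only the early tasks act as a canary
    # Assumption: the exit status is exposed as "<exit>:<signal>".
    status = env.get("SLURM_JOB_EXIT_CODE2", "0:0")
    exit_code, _, signal = status.partition(":")
    failed = exit_code not in ("", "0") or signal not in ("", "0")
    return array_job_id if failed else None


def main():
    array_job_id = array_to_cancel(os.environ)
    if array_job_id is not None:
        # Passing the array's job ID to scancel cancels all of its tasks.
        subprocess.run(["scancel", array_job_id], check=False)


if __name__ == "__main__":
    main()
```

Keeping the decision in a function that takes the environment as a plain mapping means it can be tried with hand-written values before it goes anywhere near a real cluster.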
