Typo on my part. Yes, it is SIGSTOP, which cannot be caught.
Maybe you could "pre-signal" any running job with SIGUSR1 before
sending the suspend command, at least if you are suspending the
job(s) manually. That could be caught and acted upon before the
SIGSTOP is received.
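A rough sketch of that manual "pre-signal" sequence, in Python. The grace period and the function name are assumptions made for illustration; `scancel --signal` and `scontrol suspend` are the stock Slurm commands. Depending on whether the handler lives in the batch script or in a job step, `scancel` may also need `--batch` or `--full`.

```python
import subprocess
import time


def presignal_then_suspend(job_id, grace_seconds=10,
                           run=subprocess.run, sleep=time.sleep):
    """Send SIGUSR1 to a job, wait a little, then suspend it.

    grace_seconds is an assumed value: the job needs long enough to
    release its licenses before the SIGSTOP from the suspend arrives.
    """
    commands = [
        ["scancel", "--signal=USR1", str(job_id)],
        ["scontrol", "suspend", str(job_id)],
    ]
    # Deliver the catchable signal first so the job's handler can run.
    run(commands[0], check=True)
    sleep(grace_seconds)
    # Only now issue the real suspend.
    run(commands[1], check=True)
    return commands
```

This only helps for suspends issued by hand (or by a script you control); it does nothing for suspends triggered by preemption inside the scheduler.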
Brian
On 10/15/2019 10:52 PM, Oytun Peksel wrote:
Brian,
Thanks for your response. I am looking into that option. I am a bit
confused about which signal is sent though. I thought it was SIGSTOP
not SIGSTP. And I read you can’t really catch and stop SIGSTOP or
SIGCONT signals but I am not very good at sys admin stuff anyway.
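For what it is worth, the difference is easy to check from Python on a POSIX system. This is a small sketch with a made-up helper name: the kernel refuses a handler for SIGSTOP (and SIGKILL), while SIGCONT, SIGTSTP and SIGUSR1 accept one. A SIGCONT handler cannot prevent the process from continuing, it only runs once the process is running again.

```python
import signal


def can_install_handler(signum):
    """Return True if this process is allowed to install a handler for signum."""
    try:
        previous = signal.signal(signum, lambda received, frame: None)
    except (OSError, ValueError):
        # The OS rejects handlers for SIGSTOP and SIGKILL.
        return False
    # Put back whatever was there before so the probe has no lasting effect.
    signal.signal(signum, previous if previous is not None else signal.SIG_DFL)
    return True


for name in ("SIGSTOP", "SIGKILL", "SIGTSTP", "SIGCONT", "SIGUSR1"):
    print(name, can_install_handler(getattr(signal, name)))
```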
So in the end, these feel like dirty tricks to me. The select/*
plugins should have mechanisms to run scripts and such before sending
signals. But apparently there is no such mechanism.
So probably I will dig deeper into what you suggested.
Thanks
*Oytun Peksel*
oytun.pek...@semcon.com
+46739205917
*From:* slurm-users <slurm-users-boun...@lists.schedmd.com> *On Behalf Of* Brian Andrus
*Sent:* 15 October 2019 20:58
*To:* slurm-users@lists.schedmd.com
*Subject:* Re: [slurm-users] Execute scripts on suspend and cancel
It seems that there are some details that would need to be addressed.
A suspend signal is nothing more than sending a SIGSTP (like hitting
ctrl-s), so the application is still in memory awaiting SIGCONT.
So what should happen when it continues and there are no more
licenses? So the proper place for what you are looking for is in the
application itself. If it is given a SIGSTP, it could release the
licenses and then check them out again when SIGCONT is received.
If you are able to tell your app to release/request a license
externally, you may want to have a wrapper to do the signal handling
until they have it as part of their app.
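A minimal sketch of such a wrapper in Python. The two license hooks are hypothetical placeholders for whatever external release/request command the application offers. Since SIGSTOP itself cannot be caught, the release hook is tied to SIGUSR1 and SIGTSTP (in case one of them is delivered before the stop), and the re-acquire hook to SIGCONT.

```python
import signal
import subprocess


def release_licenses():
    """Hypothetical hook: ask the application to hand back its licenses."""
    print("releasing licenses", flush=True)


def acquire_licenses():
    """Hypothetical hook: ask the application to check licenses out again."""
    print("re-acquiring licenses", flush=True)


def make_handlers(release=release_licenses, acquire=acquire_licenses):
    """Map each catchable signal to the hook that should run for it."""
    def on_presuspend(signum, frame):
        release()

    def on_resume(signum, frame):
        acquire()

    return {
        signal.SIGUSR1: on_presuspend,  # sent by hand before the suspend
        signal.SIGTSTP: on_presuspend,  # catchable stop request, unlike SIGSTOP
        signal.SIGCONT: on_resume,      # handler runs once the job is resumed
    }


def run_wrapped(command):
    """Start the real application and wait for it with the handlers installed."""
    child = subprocess.Popen(command)
    for signum, handler in make_handlers().items():
        signal.signal(signum, handler)
    return child.wait()
```

The batch script would then call something like `run_wrapped(["my_app", "--input", "case.dat"])` (names made up) instead of starting the application directly. What should happen if no license is available on resume is still an open question the hooks would have to answer.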
Brian Andrus
On 10/14/2019 4:40 AM, Oytun Peksel wrote:
It is quite weird if slurm has no mechanism as described. I have
been digging more into it and someone suggested a workaround using
mail notifications. You use a script instead of the mail
application to catch the event, then use sacct to see what is
happening.
Two problems with this:
· There is no mail sent with suspended preemption.
· If you use requeue instead there will be a mail event and you can
catch it. Sacct will flag it as “preempted” so you know it is
requeued. But then the state changes to pending, so you really need
to be quick to catch it. Also, there is no distinctive flag for
resuming.
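For reference, the mail workaround could look roughly like the Python sketch below, installed in place of the mail program. The subject format (`Job_id=<number>` somewhere in the subject line) is an assumption and may differ between Slurm versions; the helper names are made up. The `sacct` options used are `-X` (allocations only), `-n` (no header), `-P` (parsable output), `-j` and `-o`.

```python
import re
import subprocess

# Assumed subject shape, e.g. "Slurm Job_id=1234 Name=sim Ended, ...".
JOB_ID_PATTERN = re.compile(r"Job_id=(\d+)")


def job_id_from_subject(subject):
    """Pull the numeric job id out of the mail subject, or None if absent."""
    match = JOB_ID_PATTERN.search(subject)
    return match.group(1) if match else None


def sacct_state_command(job_id):
    """Build the sacct call that prints only the job's state."""
    return ["sacct", "-X", "-n", "-P", "-j", str(job_id), "-o", "State"]


def current_state(job_id, run=subprocess.run):
    """Ask sacct for the state right now; it may already have moved on."""
    result = run(sacct_state_command(job_id),
                 capture_output=True, text=True, check=True)
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else None
```

Both problems listed above remain: nothing fires on suspend, and by the time `current_state` runs after a requeue the job may already be pending again.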
Does anyone have any other method to execute scripts during preemption?
*Oytun Peksel*
*From:* slurm-users <slurm-users-boun...@lists.schedmd.com> *On Behalf Of* Oytun Peksel
*Sent:* 11 October 2019 09:10
*To:* slurm-users@lists.schedmd.com
*Subject:* [slurm-users] Execute scripts on suspend and cancel
Hi,
I was wondering whether there is an option in Slurm to execute custom
scripts before the suspend signal. What I need to do is to tell an
application to release its licenses before the suspend signal is
sent during preemption. I think I went through all the
documentation but could not find a mechanism like this.
BR
/Oytun