Thanks, Ole, for the clarification; this is very good to know.
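
For anyone finding this thread later, the check-then-reload sequence Ole
describes below could be sketched roughly as follows. This is only a sketch:
the function name and the LUAC/SCONTROL override variables are made up for
illustration, and the default path is simply the one that appears in the
logs quoted further down.

```shell
#!/bin/sh
# Sketch: syntax-check job_submit.lua before asking slurmctld to re-read it.
# LUAC and SCONTROL are hypothetical override hooks so the function can be
# dry-run on a machine without Lua or Slurm installed.
LUAC="${LUAC:-luac}"
SCONTROL="${SCONTROL:-scontrol}"

check_and_reload() {
    # Default path taken from the slurmctld log lines in this thread.
    script="${1:-/opt/slurm/job_submit.lua}"

    if [ ! -r "$script" ]; then
        echo "cannot read $script" >&2
        return 2
    fi

    # luac -p only parses the file: it catches syntax errors, not runtime
    # errors such as a bad argument to string.format.
    if ! "$LUAC" -p "$script"; then
        echo "syntax check failed, not reloading" >&2
        return 1
    fi

    # Ask slurmctld to re-read its configuration.
    "$SCONTROL" reconfigure
}
```

Calling `check_and_reload /opt/slurm/job_submit.lua` as a Slurm administrator
would run both steps. A clean `luac -p` does not rule out the runtime error
seen in the logs, so the slurmctld log still needs checking afterwards.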

However, the problem is that the very example shipped with Slurm itself
is the one that triggers the error. I removed the unpack call on the
variable arguments, and that fixed the logging error.
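
A quick way to locate such calls in a script is a plain grep. The helper
name below is hypothetical and the default path is just the one from my
logs; it only lists candidate lines for manual review and changes nothing.

```shell
# Sketch: list the lines of a job_submit.lua that mention unpack, so the
# logging helper can be found and reviewed by hand.
list_unpack_calls() {
    script="${1:-/opt/slurm/job_submit.lua}"
    # -n prints line numbers; the pattern matches both unpack(...) and
    # table.unpack(...). grep exits non-zero when nothing matches.
    grep -n 'unpack' "$script"
}
```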

Unfortunately, the job_desc table is always empty, so the whole
job_submit.lua seems like a moot point? Or is the example so outdated
(given that it cannot even log correctly) that this is now done
in a different way?
Davide

On Thu, Sep 8, 2022 at 12:23 AM Ole Holm Nielsen
<ole.h.niel...@fysik.dtu.dk> wrote:
>
> Hi Davide,
>
> In your slurmctld log you see an entry "error: job_submit/lua:
> /opt/slurm/job_submit.lua".
>
> What I think happens is that when slurmctld encounters an error in
> job_submit.lua, it will revert to the last known good script cached by
> slurmctld and ignore the file on disk from now on, even if it has been
> corrected.  An "scontrol reconfig" may make slurmctld reread the
> job_submit.lua; please try it.
>
> I believe that this slurmctld behavior is undocumented at present.  Please
> see https://bugs.schedmd.com/show_bug.cgi?id=14472#c15 for a description:
>
> > And, if after the reconfigure, the job_submit.lua is wrong formatted (or 
> > missing), it will use the previous version of the script (which we have 
> > stored backup previously):
>
> /Ole
>
>
> On 9/7/22 14:21, Davide DelVento wrote:
> > Thanks Ole, your wiki page sheds some light on this mystery.
> > Very frustrating that even the simple example provided in the release
> > fails, and it fails at the most basic logging functionality.
> >
> > Note that "my" job_submit.lua is now the unmodified, slurm-provided
> > one.... and that the luac command returns nothing in my case (this is
> > Lua 5.3.4) so syntax seems correct?
> >
> > Yet the logs report the problem I mentioned rather than the actual
> > content that the plugin is attempting to log.
> >
> > On Wed, Sep 7, 2022 at 2:13 AM Ole Holm Nielsen
> > <ole.h.niel...@fysik.dtu.dk> wrote:
> >>
> >> Hi Davide,
> >>
> >> I suggest that you check your job_submit.lua script with the Lua compiler:
> >>
> >> luac -p /etc/slurm/job_submit.lua
> >>
> >> I have written some more details in my Wiki page
> >> https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#job-submit-plugins
> >>
> >> Best regards,
> >> Ole
> >>
> >> On 9/7/22 01:51, Davide DelVento wrote:
> >>> Thanks again to both of you.
> >>>
> >>> I actually did not build Slurm myself, otherwise I'd keep extensive
> >>> logs of what I did. Other people did, so I don't know. However, I get
> >>> the same grep'ing results as yours.
> >>>
> >>> Looking at the logs reveals some info, but it's cryptic.
> >>>
> >>> [2022-09-06T17:33:56.513] debug3: job_submit/lua:
> >>> slurm_lua_loadscript: skipping loading Lua script:
> >>> /opt/slurm/job_submit.lua
> >>> [2022-09-06T17:33:56.513] error: job_submit/lua:
> >>> /opt/slurm/job_submit.lua: [string "slurm.user_msg
> >>> (string.format(table.unpack({...})))"]:1: bad argument #2 to 'format'
> >>> (no value)
> >>>
> >>> As you can see, there is no line number and there is nothing like
> >>> user_msg in this code. There is indeed an "unpack" which is used in
> >>> the SchedMD-defined logging helper function which has a comment
> >>> "Implicit definition of arg was removed in Lua 5.2" and that's where I
> >>> speculate the error occurs.
> >>>
> >>> I should stress, this is with their own example, not my code. I guess
> >>> I could forgo the logging and move forward, but that probably won't
> >>> lead me very far.
> >>>
> >>> I am contemplating submitting a GitHub issue about it. I did check
> >>> that the version of the job_submit.lua I have is the same as the one
> >>> currently in the repo at
> >>> https://github.com/SchedMD/slurm/blob/master/etc/job_submit.lua.example
> >>>
> >>> On Thu, Sep 1, 2022 at 11:55 PM Ole Holm Nielsen
> >>> <ole.h.niel...@fysik.dtu.dk> wrote:
> >>>>
> >>>> Did you install all prerequisite packages (including lua) on the server
> >>>> where you built the Slurm packages?
> >>>>
> >>>> On my system I get:
> >>>>
> >>>> $ strings `which slurmctld ` | grep HAVE_LUA
> >>>> HAVE_LUA 1
> >>>>
> >>>> /Ole
> >>>>
> >>>> https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#install-prerequisites
> >>>>
> >>>> On 9/2/22 05:15, Davide DelVento wrote:
> >>>>> Thanks.
> >>>>>
> >>>>> I did try a lua script as soon as I got your first email, but that
> >>>>> never worked (yes, I enabled it in slurm.conf and ran "scontrol
> >>>>> reconfigure" after). Slurm simply acted as if there was no job_submit
> >>>>> script.
> >>>>>
> >>>>> After various tests, all unsuccessful, today I found that link which I
> >>>>> mentioned saying that lua might not be compiled in, hence all my most
> >>>>> recent messages of this thread.
> >>>>>
> >>>>> That file is indeed there, so that's good news that I don't need to
> >>>>> recompile.
> >>>>> However, I'm puzzled about what might be missing...
> >>>>>
> >>>>>
> >>>>> On Thu, Sep 1, 2022 at 6:33 PM Brian Andrus <toomuc...@gmail.com> wrote:
> >>>>>>
> >>>>>> lua is the language you can use with the job_submit plugin.
> >>>>>>
> >>>>>> I was showing a quick way to see that job_submit capability is indeed in
> >>>>>> there.
> >>>>>>
> >>>>>> You can see if lua support is there by checking whether the
> >>>>>> job_submit_lua.so file is present.
> >>>>>> It would be part of the slurm rpm (not the slurm-slurmctld rpm)
> >>>>>>
> >>>>>> Usually it would be found at /usr/lib64/slurm/job_submit_lua.so
> >>>>>>
> >>>>>> If that is there, you should be good with trying out a job_submit lua
> >>>>>> script.
> >>>>>>
> >>>>>> Brian Andrus
> >>>>>>
> >>>>>> On 9/1/2022 1:24 PM, Davide DelVento wrote:
> >>>>>>> Thanks again, Brian. Indeed that grep returns many hits, but none of
> >>>>>>> them includes lua, i.e.
> >>>>>>>
> >>>>>>>      strings `which slurmctld ` | grep -i job_submit | grep -i lua
> >>>>>>>
> >>>>>>> returns nothing. So should I use the C interface rather than the more
> >>>>>>> convenient lua one, unless I recompile, or am I missing something?
> >>>>>>>
> >>>>>>> On Thu, Sep 1, 2022 at 12:30 PM Brian Andrus <toomuc...@gmail.com> wrote:
> >>>>>>>> I would be surprised if it were compiled without the support. However,
> >>>>>>>> you could check and run something like:
> >>>>>>>>
> >>>>>>>> strings /sbin/slurmctld | grep job_submit
> >>>>>>>>
> >>>>>>>> (or wherever your slurmctld binary is). There should be quite a few
> >>>>>>>> lines with that in it.
> >>>>>>>>
> >>>>>>>> Brian Andrus
> >>>>>>>>
> >>>>>>>> On 9/1/2022 10:54 AM, Davide DelVento wrote:
> >>>>>>>>> Thanks Brian for the suggestion, which I am now exploring.
> >>>>>>>>>
> >>>>>>>>> The documentation is a bit cryptic for me, but exploring a few things
> >>>>>>>>> and checking
> >>>>>>>>> https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/
> >>>>>>>>> I suspect my slurm install (provided by cluster vendor) was not
> >>>>>>>>> compiled with the lua plugin installed. Do you know how to verify if
> >>>>>>>>> that is the case or if it's something else? I don't see a way to show
> >>>>>>>>> if the plugin is actually being "seen" by slurm, and I suspect it's
> >>>>>>>>> not.
> >>>>>>>>>
> >>>>>>>>> Does anyone else have other suggestions or comment on either the
> >>>>>>>>> plugin or the prolog workaround?
> >>>>>>>>>
> >>>>>>>>> Thanks!
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, Aug 30, 2022 at 3:01 PM Brian Andrus <toomuc...@gmail.com> wrote:
> >>>>>>>>>> Not sure if you can do all the things you intend, but the job_submit
> >>>>>>>>>> script is precisely where you want to check submission options.
> >>>>>>>>>>
> >>>>>>>>>> https://slurm.schedmd.com/job_submit_plugins.html
> >>>>>>>>>>
> >>>>>>>>>> Brian Andrus
> >>>>>>>>>>
> >>>>>>>>>> On 8/30/2022 12:58 PM, Davide DelVento wrote:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to soft-enforce license utilization only when the -L
> >>>>>>>>>>> option is set. My idea: check in the prolog whether the license was
> >>>>>>>>>>> requested and, only if it was, set the environment variables needed
> >>>>>>>>>>> for the license.
> >>>>>>>>>>>
> >>>>>>>>>>> I looked at all environment variables set by slurm and did not find
> >>>>>>>>>>> any related to the license as I was hoping.
> >>>>>>>>>>>
> >>>>>>>>>>> As a workaround, I could check
> >>>>>>>>>>>
> >>>>>>>>>>> scontrol show job $SLURM_JOB_ID | grep License
> >>>>>>>>>>>
> >>>>>>>>>>> and that would work, but (as discussed in other messages in this list)
> >>>>>>>>>>> the documentation at https://slurm.schedmd.com/prolog_epilog.html
> >>>>>>>>>>> says:
> >>>>>>>>>>>
> >>>>>>>>>>>> Prolog and Epilog scripts should be designed to be as short as possible
> >>>>>>>>>>>> and should not call Slurm commands (e.g. squeue, scontrol, sacctmgr,
> >>>>>>>>>>>> etc). [...] Slurm commands in these scripts can potentially lead to
> >>>>>>>>>>>> performance issues and should not be used.
> >>>>>>>>>>> This is a bit of a concern, since the prolog would be invoked for
> >>>>>>>>>>> every job on the cluster, and it's a prolog (rather than the epilog,
> >>>>>>>>>>> as discussed in earlier messages).
> >>>>>>>>>>>
> >>>>>>>>>>> So two questions:
> >>>>>>>>>>>
> >>>>>>>>>>> 1) is there a better workaround to check in the prolog if the current
> >>>>>>>>>>> job requested a license and/or
> >>>>>>>>>>> 2) would this kind of use of scontrol be okay or is it indeed a concern
>
