Thanks, Ole, for this clarification; this is very good to know. However, the problem is that the very example provided by Slurm itself is the one that has the error. I removed the unpack part with the variable arguments, and that fixed that part.
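A minimal Lua sketch of one way an unpack-based logging helper can fail, assuming (this is a guess, not something confirmed in this thread) that a nil value reached the logger. `naive_log` and `safe_log` are hypothetical stand-ins written for illustration; they are not Slurm functions. Only the `string.format(table.unpack({...}))` pattern is taken from the error message in the slurmctld log quoted further down.

```lua
-- Hypothetical helpers for illustration only; not part of Slurm.

-- Same pattern as in the logged error. A trailing nil argument is lost:
-- {...} stores it, but table.unpack() stops at the table's length,
-- so string.format() receives fewer values than the format expects.
local function naive_log(...)
  local msg = string.format(table.unpack({...}))
  return msg
end

-- table.pack() records the real argument count in .n, nils included,
-- so each nil can be replaced with a printable placeholder first.
local function safe_log(fmt, ...)
  local args = table.pack(...)
  for i = 1, args.n do
    if args[i] == nil then
      args[i] = "nil"
    end
  end
  return string.format(fmt, table.unpack(args, 1, args.n))
end

-- pcall() captures the error instead of aborting the script.
local ok, err = pcall(naive_log, "account=%s", nil)
print(ok, err)

print(safe_log("account=%s", nil))  --> account=nil
```

The placeholder only helps with `%s`; a numeric specifier such as `%d` would still reject the string "nil", so this is a diagnostic aid rather than a general fix. It needs Lua 5.2 or later for `table.pack` and `table.unpack`.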
Unfortunately, the job_desc table is always empty so the whole job_submit.lua seems like a moot point? Or the example is so outdated (given that it cannot even log correctly) that this is now performed in a different way??

Davide

On Thu, Sep 8, 2022 at 12:23 AM Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk> wrote:
>
> Hi Davide,
>
> In your slurmctld log you see an entry "error: job_submit/lua:
> /opt/slurm/job_submit.lua".
>
> What I think happens is that when slurmctld encounters an error in
> job_submit.lua, it will revert to the last known good script cached by
> slurmctld and ignore the file on disk from now on, even if it has been
> corrected. An "scontrol reconfig" may make slurmctld reread the
> job_submit.lua, please try it.
>
> I believe that this slurmctld behavior is undocumented at present. Please
> see https://bugs.schedmd.com/show_bug.cgi?id=14472#c15 for a description:
>
> > And, if after the reconfigure, the job_submit.lua is wrong formatted (or
> > missing), it will use the previous version of the script (which we have
> > stored backup previously):
>
> /Ole
>
>
> On 9/7/22 14:21, Davide DelVento wrote:
> > Thanks Ole, your wiki page sheds some light on this mystery.
> > Very frustrating that even the simple example provided in the release
> > fails, and it fails at the most basic logging functionality.
> >
> > Note that "my" job_submit.lua is now the unmodified, slurm-provided
> > one.... and that the luac command returns nothing in my case (this is
> > Lua 5.3.4) so syntax seems correct?
> >
> > Yet the logs report the problem I mentioned rather than the actual
> > content that the plugin is attempting to log.
> >
> > On Wed, Sep 7, 2022 at 2:13 AM Ole Holm Nielsen
> > <ole.h.niel...@fysik.dtu.dk> wrote:
> >>
> >> Hi Davide,
> >>
> >> I suggest that you check your job_submit.lua script with the LUA compiler:
> >>
> >> luac -p /etc/slurm/job_submit.lua
> >>
> >> I have written some more details in my Wiki page
> >> https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#job-submit-plugins
> >>
> >> Best regards,
> >> Ole
> >>
> >> On 9/7/22 01:51, Davide DelVento wrote:
> >>> Thanks again to both of you.
> >>>
> >>> I actually did not build Slurm myself, otherwise I'd keep extensive
> >>> logs of what I did. Other people did, so I don't know. However, I get
> >>> the same grep'ing results as yours.
> >>>
> >>> Looking at the logs reveals some info, but it's cryptic.
> >>>
> >>> [2022-09-06T17:33:56.513] debug3: job_submit/lua:
> >>> slurm_lua_loadscript: skipping loading Lua script:
> >>> /opt/slurm/job_submit.lua
> >>> [2022-09-06T17:33:56.513] error: job_submit/lua:
> >>> /opt/slurm/job_submit.lua: [string "slurm.user_msg
> >>> (string.format(table.unpack({...})))"]:1: bad argument #2 to 'format'
> >>> (no value)
> >>>
> >>> As you can see, there is no line number and there is nothing like
> >>> user_msg in this code. There is indeed an "unpack" which is used in
> >>> the SchedMD-defined logging helper function which has a comment
> >>> "Implicit definition of arg was removed in Lua 5.2" and that's where I
> >>> speculate the error occurs.
> >>>
> >>> I should stress, this is with their own example, not my code. I guess
> >>> I could forgo the logging and move forward, but that won't probably
> >>> lead me very far.
> >>>
> >>> I am contemplating submitting a github issue about it?
> >>> I did check
> >>> that the version of the job_submit.lua I have is the same currently in
> >>> the repo at
> >>> https://github.com/SchedMD/slurm/blob/master/etc/job_submit.lua.example
> >>>
> >>> On Thu, Sep 1, 2022 at 11:55 PM Ole Holm Nielsen
> >>> <ole.h.niel...@fysik.dtu.dk> wrote:
> >>>>
> >>>> Did you install all prerequisite packages (including lua) on the server
> >>>> where you built the Slurm packages?
> >>>>
> >>>> On my system I get:
> >>>>
> >>>> $ strings `which slurmctld ` | grep HAVE_LUA
> >>>> HAVE_LUA 1
> >>>>
> >>>> /Ole
> >>>>
> >>>> https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#install-prerequisites
> >>>>
> >>>> On 9/2/22 05:15, Davide DelVento wrote:
> >>>>> Thanks.
> >>>>>
> >>>>> I did try a lua script as soon as I got your first email, but that
> >>>>> never worked (yes, I enabled it in slurm.conf and ran "scontrol
> >>>>> reconfigure" after). Slurm simply acted as if there was no job_submit
> >>>>> script.
> >>>>>
> >>>>> After various tests, all unsuccessful, today I found that link which I
> >>>>> mentioned saying that lua might not be compiled in, hence all my most
> >>>>> recent messages of this thread.
> >>>>>
> >>>>> That file is indeed there, so that's good news that I don't need to
> >>>>> recompile.
> >>>>> However I'm puzzled on what might be missing...
> >>>>>
> >>>>>
> >>>>> On Thu, Sep 1, 2022 at 6:33 PM Brian Andrus <toomuc...@gmail.com> wrote:
> >>>>>>
> >>>>>> lua is the language you can use with the job_submit plugin.
> >>>>>>
> >>>>>> I was showing a quick way to see that job_submit capability is indeed
> >>>>>> in there.
> >>>>>>
> >>>>>> You can see if lua support is there by checking that the
> >>>>>> job_submit_lua.so file is there.
> >>>>>> It would be part of the slurm rpm (not the slurm-slurmctld rpm)
> >>>>>>
> >>>>>> Usually it would be found at /usr/lib64/slurm/job_submit_lua.so
> >>>>>>
> >>>>>> If that is there, you should be good with trying out a job_submit lua
> >>>>>> script.
> >>>>>>
> >>>>>> Brian Andrus
> >>>>>>
> >>>>>> On 9/1/2022 1:24 PM, Davide DelVento wrote:
> >>>>>>> Thanks again, Brian, indeed that grep returns many hits, but none of
> >>>>>>> them includes lua, i.e.
> >>>>>>>
> >>>>>>> strings `which slurmctld ` | grep -i job_submit | grep -i lua
> >>>>>>>
> >>>>>>> returns nothing. So I should use the C rather than the more convenient
> >>>>>>> lua interface, unless I recompile or am I missing something?
> >>>>>>>
> >>>>>>> On Thu, Sep 1, 2022 at 12:30 PM Brian Andrus <toomuc...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>> I would be surprised if it were compiled without the support.
> >>>>>>>> However, you could check and run something like:
> >>>>>>>>
> >>>>>>>> strings /sbin/slurmctld | grep job_submit
> >>>>>>>>
> >>>>>>>> (or where ever your slurmctld binary is). There should be quite a few
> >>>>>>>> lines with that in it.
> >>>>>>>>
> >>>>>>>> Brian Andrus
> >>>>>>>>
> >>>>>>>> On 9/1/2022 10:54 AM, Davide DelVento wrote:
> >>>>>>>>> Thanks Brian for the suggestion, which I am now exploring.
> >>>>>>>>>
> >>>>>>>>> The documentation is a bit cryptic for me, but exploring a few
> >>>>>>>>> things and checking
> >>>>>>>>> https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/
> >>>>>>>>> I suspect my slurm install (provided by cluster vendor) was not
> >>>>>>>>> compiled with the lua plugin installed. Do you know how to verify if
> >>>>>>>>> that is the case or if it's something else? I don't see a way to
> >>>>>>>>> show if the plugin is actually being "seen" by slurm, and I suspect
> >>>>>>>>> it's not.
> >>>>>>>>>
> >>>>>>>>> Does anyone else have other suggestions or comment on either the
> >>>>>>>>> plugin or the prolog workaround?
> >>>>>>>>>
> >>>>>>>>> Thanks!
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, Aug 30, 2022 at 3:01 PM Brian Andrus <toomuc...@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>> Not sure if you can do all the things you intend, but the
> >>>>>>>>>> job_submit script is precisely where you want to check
> >>>>>>>>>> submission options.
> >>>>>>>>>>
> >>>>>>>>>> https://slurm.schedmd.com/job_submit_plugins.html
> >>>>>>>>>>
> >>>>>>>>>> Brian Andrus
> >>>>>>>>>>
> >>>>>>>>>> On 8/30/2022 12:58 PM, Davide DelVento wrote:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to soft-enforce license utilization only when the -L
> >>>>>>>>>>> is set. My idea: check in the prolog if the license was requested
> >>>>>>>>>>> and only if it were, set the environmental variables needed for
> >>>>>>>>>>> the license.
> >>>>>>>>>>>
> >>>>>>>>>>> I looked at all environmental variables set by slurm and did not
> >>>>>>>>>>> find any related to the license as I was hoping.
> >>>>>>>>>>>
> >>>>>>>>>>> As a workaround, I could check
> >>>>>>>>>>>
> >>>>>>>>>>> scontrol show job $SLURM_JOB_ID | grep License
> >>>>>>>>>>>
> >>>>>>>>>>> and that would work, but (as discussed in other messages in this
> >>>>>>>>>>> list) the documentation at
> >>>>>>>>>>> https://slurm.schedmd.com/prolog_epilog.html says
> >>>>>>>>>>>
> >>>>>>>>>>>> Prolog and Epilog scripts should be designed to be as short as
> >>>>>>>>>>>> possible and should not call Slurm commands (e.g. squeue,
> >>>>>>>>>>>> scontrol, sacctmgr, etc). [...] Slurm commands in these scripts
> >>>>>>>>>>>> can potentially lead to performance issues and should not be
> >>>>>>>>>>>> used.
> >>>>>>>>>>> This is a bit of a concern, since the prolog would be invoked for
> >>>>>>>>>>> every job on the cluster, and it's a prolog (rather than the
> >>>>>>>>>>> epilogue like discussed in earlier messages).
> >>>>>>>>>>>
> >>>>>>>>>>> So two questions:
> >>>>>>>>>>>
> >>>>>>>>>>> 1) is there a better workaround to check in the prolog if the
> >>>>>>>>>>> current job requested a license and/or
> >>>>>>>>>>> 2) would this kind of use of scontrol be okay or is indeed a
> >>>>>>>>>>> concern
>