ent between jobs, and number of
> jobs). We had it on and it nearly ran us out of space on our database host.
> That said the data can be really useful depending on the situation.
>
> -Paul Edmon-
>
> On 8/7/2024 8:51 AM, Juergen Salk via slurm-users wrote:
> > Hi Steffen
Hi Steffen,
not sure if this is what you are looking for, but with
`AccountingStoreFlags=job_env´
set in slurm.conf, the batch job environment will be stored in the
accounting database and can later be retrieved with `sacct -j
--env-vars´
command.
We find this quite useful for debugging purp
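A minimal sketch of the setting and query described above (the job ID is a placeholder):

    # slurm.conf: store the batch job environment in the accounting database
    AccountingStoreFlags=job_env

    # later, retrieve the stored environment for a given job
    sacct -j <jobid> --env-vars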
Hi,
to my very best knowledge, MaxRSS reports the aggregated memory consumption
of all tasks, but including all the shared libraries that the individual
processes use, even though a shared library is only loaded into memory
once regardless of how many processes use it.
So shared libraries do count
Hi Alan,
unfortunately, process placement in Slurm is kind of black magic for
sub-node jobs, i.e. jobs that allocate only a small number of CPUs of
a node.
I have recently raised a similar question here:
https://support.schedmd.com/show_bug.cgi?id=19236
And the bottom line was that to "reall
Hi Jason,
do or did you maybe have a reservation for user root in place?
sreport accounts resources reserved for a user as well (even if not
used by jobs) while sacct reports job accounting only.
Best regards
Jürgen
* Jason Simms via slurm-users [240429 10:47]:
> Hello all,
>
> Each week,
Hi Gerhard,
I am not sure if this counts as administrative measure, but we do
highly encourage our users to always explicitly specify --nodes=n
together with --ntasks-per-node=m (rather than just --ntasks=n*m and
omitting the --nodes option, which may lead to cores allocated here and
there and eve
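For example, a sketch of the recommended request for 2 nodes with 16 tasks each (the numbers are only placeholders):

    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=16
    # rather than only: #SBATCH --ntasks=32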
Hi Wirawan,
in general, `--gres=gpu:6´ actually means six units of a generic resource
named `gpu´ per node. Each unit may or may not be associated with a physical
GPU device. I'd check the node configuration for the number of gres=gpu
resource units that are configured for that node.
scont
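A sketch of how one might check the configured gres=gpu units on a node (node name is a placeholder), assuming the truncated `scont...´ refers to scontrol:

    scontrol show node <nodename> | grep -i gres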
Hi,
I am not sure if this is related to GPUs. I rather think the issue has to do with
how your OpenMPI has been built.
What does ompi_info command show? Look for "Configure command line" in
the output. Does this include '--with-slurm' and '--with-pmi' flags?
To my very best knowledge, both flags
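For example, one way to inspect this (just grepping the ompi_info output):

    ompi_info | grep -i "configure command line"
    # look for --with-slurm and --with-pmi in the output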
Hi Sebastian,
maybe it's a silly thought on my part, but do you have the
`enable_user_top´ option included in your SchedulerParameters
configuration?
This would allow regular users to use `scontrol top <jobid>´ to
push some of their jobs ahead of other jobs owned by them and this
works internally by
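A sketch of the option and command in question (the job ID is a placeholder):

    # slurm.conf
    SchedulerParameters=enable_user_top

    # a regular user reordering one of their own pending jobs
    scontrol top <jobid>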
Hi Thomas,
I think sreport should actually do what you want out of the box if you
have permissions to retrieve that information for other users than
yourself.
In my understanding, sacct is meant for individual job and job step
accounting while sreport is more suitable for aggregated cluster usa
term
> > salloc: Granted job allocation 65537
> >
>
> which works as advertised (I'm not sure that i miss xterms or not -- at
> least on our cluster we dont configure them explicitly as a primary
> terminal tool)
>
> And thanks also Chris and Jason
Hi Em,
this is most probably because in Slurm version 20.11 the behaviour of srun was
changed to not allow job steps to overlap by default any more.
An interactive job launched by `srun --pty bash´ always creates a regular
step (step .0), so mpirun or srun will hang when trying to launch
anoth
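Possible workarounds under that assumption, as also suggested further below in this list:

    # allow the nested launch to share resources with the interactive step
    srun --overlap ...

    # or set this before starting the interactive job
    export SLURM_OVERLAP=1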
Hi Martin,
to my very best knowledge, MaxRSS reports the aggregated memory consumption
of all tasks, but including all the shared libraries that the individual
processes use, even though a shared library is only loaded into memory
once regardless of how many processes use it.
So shared librarie
Hi,
SchedMD also recently changed their online documentation on building
RPM packages for Slurm: https://slurm.schedmd.com/quickstart_admin.html
They now refer to the '_slurm_sysconfdir' macro, while it was '_sysconfdir'
in previous versions of the documentation.
Now it reads:
--- snip ---
To bui
Hi Marko,
I have had a very similar issue with setting up a custom path for the
Slurm configuration files when using the '%_sysconfdir' macro in
.rpmmacros, but this also happened to me with version 21.08.6.
Does it work for you if you use '%_slurm_sysconfdir' instead of
'%_sysconfdir' macro in
Hi William,
do those jobs show up when you run the `sacctmgr show runaway` command?
This command will also give you an option to fix them if it finds jobs in
that state.
For some more details see https://slurm.schedmd.com/sacctmgr.html and
slide #14 from https://slurm.schedmd.com/SLUG19/Troubleshoo
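For example (requires Slurm administrator privileges):

    sacctmgr show runaway
    # if runaway jobs are listed, sacctmgr offers to fix them interactively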
Hi John,
this is really bad news. We have stopped our rolling update from Slurm
21.08.6 to Slurm 21.08.8-1 today for exactly that reason: State of
compute nodes already running slurmd 21.08.8-1 suddenly started
flapping between responding and not responding but all other nodes
that were still r
Hi Nicolas,
it looks like you have pam_access.so placed in your PAM stack *before*
pam_slurm_adopt.so, so this may get in your way. In fact, the logs
indicate that it's pam_access and not pam_slurm_adopt that denies access
in the first place:
Apr 8 19:11:32 magi46 sshd[20542]: pam_access(sshd:ac
Hi Bjørn-Helge,
that's very similar to what we did as well in order to avoid confusion with
Core vs. Threads vs. CPU counts when Hyperthreading is kept enabled in the
BIOS.
Adding CPUs=<number of cores> (not <number of threads>) will tell Slurm
to only schedule physical cores.
We have
SelectType=select/cons_res
SelectTypePara
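A sketch of a matching node definition under the assumptions above (node name and counts are placeholders); CPUs is set to the number of physical cores rather than hardware threads so that Slurm only schedules cores:

    # slurm.conf (node definition; name and counts are placeholders)
    NodeName=n[001-016] CPUs=48 Sockets=2 CoresPerSocket=24 ThreadsPerCore=2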
Hi,
does anybody know a simple way to enforce certain shell options
such as
set -o errexit (a.k.a. set -e)
set -o pipefail
and maybe also
set -o nounset (a.k.a. set -u)
for the job environment at job submission time (without modifying
the batch scripts themselves)?
Background of t
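For reference, the options listed above and their effect:

    set -o errexit   # same as set -e: abort the script when a command fails
    set -o pipefail  # a pipeline fails if any command in it fails
    set -o nounset   # same as set -u: treat unset variables as an error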
age correctly, `scontrol suspend` sends a SIGSTOP to all
> > job processes. The processes remain in memory, but are paused. What
> > happens to open file handles, since the underlying filesystem goes away
> > and comes back?
> >
> > Thank you,
> >
> > On Sat
in a staggered manner.
Best regards
Jürgen
* Paul Edmon [211019 15:15]:
> Yup, we follow the same process for when we do Slurm upgrades, this looks
> analogous to our process.
>
> -Paul Edmon-
>
> On 10/19/2021 3:06 PM, Juergen Salk wrote:
> > Dear all,
> >
Dear all,
we are planning to perform some maintenance work on our Lustre file system
which may or may not harm running jobs. Although failover functionality is
enabled on the Lustre servers, we'd like to minimize the risk for running jobs
in case something goes wrong.
Therefore, we thought about s
Hi,
I think setting MaxSubmitJobs=0 for the association should also do the
trick if you don't want to code something special in the submit.lua
script. E.g. for a single user:
sacctmgr update user set maxsubmitjobs=0
Setting MaxSubmitJobs=-1 will then release this limit.
Best regards
Jür
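A sketch with a placeholder user name (the name appears to have been stripped from the archived command line):

    # block new job submissions for one user
    sacctmgr update user someuser set maxsubmitjobs=0

    # later, lift the limit again
    sacctmgr update user someuser set maxsubmitjobs=-1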
Dear Jeherul,
sacct is for job accounting, sreport for cluster usage accounting.
Did you maybe have any resource reservations for this user in place during
that period of time?
To my very best knowledge, resource reservations for one or more users
do count in terms of cluster usage as reporte
Hi Hemanta,
is PrivateData also set in your slurmdbd.conf?
Best regards
Juergen
* Hemanta Sahu [210818 15:04]:
> I am still searching for a solution for this .
>
> On Fri, Aug 7, 2020 at 1:15 PM Hemanta Sahu
> wrote:
>
> > Hi All,
> >
> > I have configured in our test cluster "PrivateDa
Hi,
I can't speak specifically for arbiter but to my very best knowledge
this is just how cgroup memory limits work in general, i.e. both
anonymous memory and page cache always count against the cgroup
memory limit.
This also applies to memory constraints imposed on compute jobs if
Constrain
* David Chaffin [210607 14:44]:
>
> we get a lot of small sub-node jobs that we want to pack together. Maui
> does this pretty well with the smallest node that will hold the job,
> NODEALLOCATIONPOLICY MINRESOURCE
> I can't figure out the slurm equivalent. Default backfill isn't working
> well.
* Tina Friedrich [210521 16:35]:
> If this is simply about quickly accessing nodes that they have jobs on to
> check on them - we tell our users to 'srun' into a job allocation (srun
> --jobid=XX).
Hi Tina,
sadly, this does not always work in version 20.11.x any more because of the
new non-
Hi Loris,
this depends largely on whether host-based authentication is
configured (which does not seem to be the case for you) and also on
how exactly the PAM stack for sshd looks in /etc/pam.d/sshd.
As the rules are worked through in the order they appear in
/etc/pam.d/sshd, pam_slurm_adopt
* Juergen Salk [210515 23:54]:
> * Christopher Samuel [210514 15:47]:
>
> > > Usage reported in Percentage of Total
> > >
> > >
> > > Cluster TRES Name
* Christopher Samuel [210514 15:47]:
> > Usage reported in Percentage of Total
> >
> >
> > Cluster  TRES Name  Allocated  Down  PLND Dow  Idle  Reserved  Reported
> > -------- ---------- ---------- ----- --------- ----- --------- ---------
Hi John,
does it work with `srun --overlap ...´ or if you do `export SLURM_OVERLAP=1´
before running your interactive job?
Best regards
Jürgen
* John DeSantis [210428 09:41]:
> Hello all,
>
> Just an update, the following URL almost mirrors the issue we're seeing:
> https://github.com/open-
* Ryan Novosielski [210416 21:33]:
> Does anyone have a particularly clever way, either built-in or
> scripted, to find out which jobs will still be running at
> such-and-such time?
Hi Ryan,
coincidentally, I just did this today. For exactly the same reason.
squeue does have a "%L" format opti
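A sketch under that assumption (the format string is hypothetical): %L prints the remaining time of each job, so comparing it against the interval until the target time shows which jobs will still be running then.

    squeue --states=RUNNING -o "%.18i %.9P %.8u %.10L"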
* Matthias Leopold [210416 19:35]:
> can someone please explain to me why it's possible to set Grp* resource
> limits on user associations? What's the use for this?
Hi Matthias,
this probably does not fully answer your question, but Grp* limits on
user associations provide the ability to impos
* Heckes, Frank [210413 12:04]:
> This result from a mgmt. - question. How long jobs have to wait (in s, min,
> h, day) before they getting executed and
> how many jobs are waiting (are queued) for each partition in a certain time
> interval.
> The first one is easy to find with sacct and sub
Hi Mike,
for pushing environment variables into the job environment, you'll have to
use the TaskProlog script (not the regular Prolog script).
The location of the TaskProlog script needs to be defined in
slurm.conf, e.g.
TaskProlog=/etc/slurm/task_prolog
and the standard output of TaskProlog
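A minimal TaskProlog sketch (the variable is a made-up example): lines that the script writes to standard output starting with `export´ are set in the job's environment.

    #!/bin/bash
    # /etc/slurm/task_prolog
    echo "export MY_VARIABLE=some_value"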
Hi Paul,
maybe this is totally unrelated but we also have a similar issue with
pam_slurm_adopt in case that ConstrainRAMSpace=no is set in
cgroup.conf and more than one job is running on that node. There is a
bug report open at:
https://bugs.schedmd.com/show_bug.cgi?id=9355
As a workaround we
* Marcus Wagner [200120 09:17]:
> I was astonished about the "Modify Clusters" transactions, so I looked a bit
> further:
> $> sacctmgr list transactions Action="Modify Clusters" -p
> 2020-01-15T00:00:12|Modify
> Clusters|slurmadm|name='rcc'|control_host='134.61.193.19',
> control_port=6750, last
* Ole Holm Nielsen [200118 12:06]:
> When we have created a new Slurm user with "sacctmgr create user name=xxx",
> I would like inquire at a later date about the timestamp for the user
> creation. As far as I can tell, the sacctmgr command cannot show such
> timestamps.
Hi Ole,
for me (current
Hi Michael,
not sure if this is the root cause of your problem, but SchedMD
recommends setting TaskAffinity to "no" in cgroup.conf when using
both task/affinity and task/cgroup together for TaskPlugin in slurm.conf
(see the NOTE for TaskPlugin in slurm.conf).
Best regards
Ju
* Chris Samuel [200113 07:30]:
> On 1/13/20 5:55 am, Youssef Eldakar wrote:
>
> > In an sbatch script, a user calls a shell script that starts a Java
> > background process. The job immediately is completed, but the child Java
> > process is still running on the compute node.
> >
> > Is there a
Hello Angelines,
for me (Slurm 19.05.2) the following command seems to work:
sacctmgr update user set maxsubmitjobs=0
Job submission is then rejected with the following message:
$ sbatch job.slurm
sbatch: error: AssocMaxSubmitJobLimit
sbatch: error: Batch job submission failed: Job violates a
Hi Brian,
can you maybe elaborate on how exactly you verified that your epilog
does not run when a job exceeds its walltime limit? Does it run when
jobs end normally or when a running job is cancelled by the user?
I am asking because in our environment the epilog also runs when a job
hits
Hi Angelines,
we create a job specific scratch directory in the prolog script but
use the task_prolog script to set the environment variable.
In prolog:
scratch_dir=/your/path
/bin/mkdir -p ${scratch_dir}
/bin/chmod 700 ${scratch_dir}
/bin/chown ${SLURM_JOB_USER} ${scratch_dir}
In task_prolog:
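The task_prolog part is cut off above; presumably it echoes an export line along these lines (the variable name is a guess), since a TaskProlog sets job environment variables by printing `export NAME=value´:

    # hypothetical continuation; the archived mail is cut off here
    scratch_dir=/your/path
    echo "export SCRATCH_DIR=${scratch_dir}"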
* David Baker [191104 15:14]:
> It looks like the downside of the serial queue is that jobs from
> different users can interact quite badly.
Hi David,
what exactly do you mean by "jobs from different users can interact
quite badly"?
> [...] On the other hand I wonder if our cgroups setup i
Hi,
maybe I missed it, but what does squeue say in the reason field for
your pending jobs that you expect to slip in?
Is your partition maybe configured for exclusive node access, e.g. by
setting `OverSubscribe=EXCLUSIVE´?
Best regards
Jürgen
--
Jürgen Salk
Scientific Software & Compute Ser
Hi Mike,
IIRC, I once did some tests with the very same configuration as
yours, i.e. `JobAcctGatherType=jobacct_gather/linux´ and
`JobAcctGatherParams=OverMemoryKill´ and got this to work as expected:
Jobs were killed when they exceeded the requested amount of memory.
This was with Slurm 18.08.7.
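For reference, the configuration under discussion (as given in the mail, tested there with Slurm 18.08.7):

    # slurm.conf
    JobAcctGatherType=jobacct_gather/linux
    JobAcctGatherParams=OverMemoryKill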
Dear Chris,
I could not find this warning in the slurm.conf man page. So I googled
it and found a reference in the Slurm developers documentation:
https://slurm.schedmd.com/jobacct_gatherplugins.html
However, this web page says in its footer: "Last modified 27 March 2015".
So maybe (means: hop
> On 19-10-08 10:36, Juergen Salk wrote:
> > * Bjørn-Helge Mevik [191008 08:34]:
> > > Jean-mathieu CHANTREIN writes:
> > >
> > > > I tried using, in slurm.conf
> > > > TaskPlugin=task/affinity, task/cgroup
> > > >
* Bjørn-Helge Mevik [191008 08:34]:
> Jean-mathieu CHANTREIN writes:
>
> > I tried using, in slurm.conf
> > TaskPlugin=task/affinity, task/cgroup
> > SelectTypeParameters=CR_CPU_Memory
> > MemLimitEnforce=yes
> >
> > and in cgroup.conf:
> > CgroupAutomount=yes
> > ConstrainCores=yes
> > C
* Rafał Kędziorski [190927 14:58]:
> > >
> > > you may try setting `ReturnToService=2´ in slurm.conf.
> > >
> >
> > Caveat: A spontaneously rebooting machine may create a "black hole" this
> > way.
> >
>
> How do you mean this? Could ReturnToService=2 be a problem?
>
Hi Rafał,
black hole syndr
Hi Rafał,
you may try setting `ReturnToService=2´ in slurm.conf.
Best regards
Jürgen
--
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471
* Rafał Kędziorski [190927
* David Baker [190926 14:12]:
>
> Currently my normal QOS specifies MaxTRESPU=cpu=1280,nodes=32. I've
> tried a number of edits, however I haven't yet found a way of
> redefining the MaxTRESPU to be "cpu=1280". In the past I have
> resorted to deleting a QOS completely and redefining the whole
>
* David Baker [190925 15:58]:
> Thank you for your reply. So, in respond to your suggestion I
> submitted a batch of jobs each asking for 2 cpus. Again I was able
> to get 32 jobs running at once.
Dear David,
this was just meant as a test in order to support the assumption that
you do not hit
Dear David,
as it seems, Slurm counts allocated nodes on a per job basis,
i.e. every individual one-core job counts as an additional node
even if they all run on one and the same node.
Can you allocate 64 CPUs at the same time when requesting 2 CPUs
per job?
We've also had this (somewhat stra
Hallo Mahmood,
in our current system (which does not run with Slurm) we have deployed
the community edition of Singularity as a software module.
https://sylabs.io/singularity/
I have no practical experience yet but from what I've read so far,
Singularity is also supposed to work quite well wi
Dear Tina,
probably a stupid question, but is there any other MaxJobs limit
defined somewhere else above the user association in the resource
limit hierarchy?
For example, if MaxJobs=1 in the partition/job QOS and MaxJobs=100
in the user association, the QOS limit takes precedence over the
user
* Philip Kovacs [190917 07:43]:
> >> I suspect the question, which I also have, is more like:
> >>
> >> "What difference does it make whether I use 'srun' or 'mpirun' within
> >> a batch file started with 'sbatch'."
>
> One big thing would be that using srun gives you resource tracking
> an
* Loris Bennett [190917 07:46]:
> >
> >>But I still don't get the point. Why should I favour `srun
> >>./my_mpi_program´
> >>over `mpirun ./my_mpi_program´? For me, both seem to do exactly the same
> >>thing. No? Did I miss something?
> >
> >>Best regards
> >>Jürgen
> >
> > Running a single job
Dear all,
according to https://slurm.schedmd.com/mpi_guide.html I have built
Slurm 19.05 with PMIx support enabled and it seems to work for both
OpenMPI and Intel MPI. (I've also set MpiDefault=pmix in slurm.conf.)
But I still don't get the point. Why should I favour `srun ./my_mpi_program´
ove
* Ole Holm Nielsen [190903 11:14]:
> How do you dynamically update your gres=localtmp resource according to the
> current disk free space? I mean, there is already a TmpFS disk space size
> defined in slurm.conf, so how does your gres=localtmp differ from TmpFS?
Dear Ole,
I think (but please c
Dear Bjørn-Helge,
this is unfortunately not an answer to the question, but I'd be glad to
hear some more thoughts on that, too.
We are also going to implement disk quotas for the amount of local
scratch space that has been allocated for the job by means of generic
resources (e.g. `--gres=scratch:100´
* Andy Georges [190715 16:17]:
>
> On Fri, Jul 12, 2019 at 03:21:31PM +0200, Juergen Salk wrote:
> > Dear all,
> >
> > I have configured pam_slurm_adopt in our Slurm test environment by
> > following the corresponding documentation:
> >
> > http
Hallo,
the CPU vs. cores vs. threads issue also confused me at the very
beginning. Although, in general, we do not encourage our users to make
use of hyperthreading, we have decided to leave it enabled in the BIOS
as there are some use cases that are known to benefit from
hyperthreading.
I think
Dear all,
I have configured pam_slurm_adopt in our Slurm test environment by
following the corresponding documentation:
https://slurm.schedmd.com/pam_slurm_adopt.html
I've set `PrologFlags=contain´ in slurm.conf and also have task/cgroup
enabled along with task/affinity (i.e. `TaskPlugin=task/
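A sketch of the settings mentioned; the TaskPlugin line completes the truncated sentence under the stated assumption that both plugins are listed:

    # slurm.conf
    PrologFlags=contain
    TaskPlugin=task/affinity,task/cgroup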
* Christopher Samuel [190621 09:59]:
> On 6/13/19 5:27 PM, Kilian Cavalotti wrote:
>
> > I would take a look at the various *KmemSpace options in cgroups.conf,
> > they can certainly help with this.
>
> Specifically I think you'll want:
>
> ConstrainKmemSpace=no
>
> to fix this. This happens
e cache for specific
> files or directories.
>
> ---
> Sam Gallop
>
> -Original Message-
> From: slurm-users On Behalf Of
> Juergen Salk
> Sent: 14 June 2019 09:14
> To: Slurm User Community List
> Subject: Re: [slurm-users] ConstrainRAMSpace=yes and page
ey can certainly help with this.
>
> Cheers, -- Kilian
>
> On Thu, Jun 13, 2019 at 2:41 PM Juergen Salk
> wrote:
> >
> > Dear all,
> >
> > I'm just starting to get used to Slurm and play around with it in
> > a small test environment within our o
Dear all,
I'm just starting to get used to Slurm and play around with it in a small test
environment within our old cluster.
For our next system we will probably have to abandon our current exclusive user
node access policy in favor of a shared user policy, i.e. jobs from different
users will the