Hello all,
I've used the "scontrol write batch_script" command to output the job
submission script from completed jobs in the past, but for some reason, no
matter which job I specify, it tells me it is invalid. Any way to
troubleshoot this? Alternatively, is there another way - even if a manual
da...
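In case it helps others searching the archives, the two retrieval paths I
know of look roughly like this (12345 is a placeholder job ID):

    # Works while slurmctld still knows about the job:
    scontrol write batch_script 12345 job12345.sh

    # For completed jobs, pull the script from the accounting database
    # instead; requires AccountingStoreFlags=job_script in slurm.conf:
    sacct -j 12345 --batch-script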
Hello Daniel,
In my experience, if you have a high-speed interconnect such as IB, you
would do IPoIB. You would likely still have a "regular" Ethernet connection
for management purposes, and yes that means both an IB switch and an
Ethernet switch, but that switch doesn't have to be anything special...
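As a rough sketch (assuming NetworkManager; the interface name and
addresses are made up), the IPoIB side can be brought up independently of
the management Ethernet:

    # Hypothetical IPoIB interface; ib0 and 10.10.0.0/24 are placeholders:
    nmcli connection add type infiniband con-name ib0 ifname ib0 \
        ipv4.method manual ipv4.addresses 10.10.0.5/24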
Hello Thomas,
I know I'm a few days late to this, so I'm wondering whether you've made
any progress. We experience this, too, but in a different way.
First, though: you may already be aware, but you should use salloc rather
than srun --pty for an interactive session. That's been the preferred
method for...
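By way of illustration (the resource values are placeholders):

    # Older style, no longer recommended:
    srun --pty bash -i

    # Preferred: salloc makes the allocation and starts the interactive
    # step on the allocated node:
    salloc --nodes=1 --time=01:00:00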
Hello Matthew,
You may be aware of this already, but most sites would make these kinds of
checks/validations using job_submit.lua. I'm not an expert in that - though
plenty of others on this list are - but I'm positive you could implement
this type of validation logic. I'd like to say that I've co...
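I'm not in a position to share exact rules, but a minimal skeleton looks
something like this sketch (the time-limit check is a made-up example
policy, not a recommendation):

    -- /etc/slurm/job_submit.lua; requires JobSubmitPlugins=lua in slurm.conf
    function slurm_job_submit(job_desc, part_list, submit_uid)
        -- Hypothetical policy: reject jobs that don't set a time limit
        if job_desc.time_limit == slurm.NO_VAL then
            slurm.log_user("Please request a time limit with --time")
            return slurm.ERROR
        end
        return slurm.SUCCESS
    end

    function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
        return slurm.SUCCESS
    end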
As a related point, for this reason I mount /var/log separately from /. Ask
me how I learned that lesson...
Jason
On Tue, Apr 16, 2024 at 8:43 AM Jeffrey T Frey via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> AFAIK, the fs.file-max limit is a node-wide limit, whereas "ulimit -n"
> is per-process...
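For anyone comparing the two on their own nodes:

    # Node-wide ceiling on open file handles:
    sysctl fs.file-max

    # Per-process soft and hard limits in the current shell:
    ulimit -Sn
    ulimit -Hn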
Hello all,
Each week, I generate an automated report of the top users by CPU hours.
This week, for whatever reason, the user root accounted for a massive
number of hours:
Login    Proper Name    Used    A...
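For reference, the report comes from something along these lines (the dates
are placeholders for the week in question):

    # Top 10 users by CPU hours for the week:
    sreport user topusage start=2024-04-22 end=2024-04-29 TopCount=10 -t hours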
...r user root in place?
>
> sreport accounts resources reserved for a user as well (even if not
> used by jobs) while sacct reports job accounting only.
>
> Best regards
> Jürgen
>
>
> * Jason Simms via slurm-users [240429 10:47]:
> > Hello all,
> >
> >
Hello all,
The Slurm docs have me a bit confused... I'm wanting to enable job
preemption on certain partitions but not others. I *presume* I would
set PreemptType=preempt/partition_prio globally, but then on the partitions
where I don't want jobs to be preemptible, I would set PreemptMode...
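Concretely, I'm picturing something like this sketch (partition names are
hypothetical, and whether per-partition PreemptMode=OFF is the right way to
exempt a partition is exactly what I'm unsure about):

    PreemptType=preempt/partition_prio
    PreemptMode=REQUEUE
    PartitionName=scavenge PriorityTier=1  PreemptMode=REQUEUE Nodes=node[01-10]
    PartitionName=owners   PriorityTier=10 PreemptMode=OFF     Nodes=node[01-10]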
On the one hand, you say you want "to *allocate a whole node* for a single
multi-threaded process," but on the other you say you want to allow it
to "*share
nodes* with other running jobs." Those seem like mutually exclusive
requirements.
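For contrast, the two requests would look roughly like this (the CPU counts
are placeholders):

    # A whole node reserved for one multi-threaded process:
    sbatch --exclusive --ntasks=1 --cpus-per-task=32 job.sh

    # Willing to share the node with other jobs:
    sbatch --ntasks=1 --cpus-per-task=8 job.sh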
Jason
On Thu, Aug 1, 2024 at 1:32 PM Henrique Almeida via slurm-users <
slurm-users@lists.schedmd.com> wrote:
I know this doesn't particularly help you, but for me on 23.11.6 it works
as expected and immediately drops me onto the allocated node. In answer to
your question, yes, as I understand it the default/expected behavior is to
return the shell directly.
Jason
On Thu, Sep 5, 2024 at 8:18 AM Loris Bennett via slurm-users <
slurm-users@lists.schedmd.com> wrote:
Ours works fine, however, without the InteractiveStepOptions parameter.
JLS
On Thu, Sep 5, 2024 at 9:53 AM Carsten Beyer via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> Hi Loris,
>
> we use Slurm 23.02.7 (production) and 23.11.1 (test system). Our config
> contains a second parameter, InteractiveStepOptions...
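For the archive, the pair of settings in question looks like this in
slurm.conf (the InteractiveStepOptions value shown is, as far as I know,
the built-in default):

    LaunchParameters=use_interactive_step
    InteractiveStepOptions="--interactive --preserve-env --pty $SHELL"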
Hello Patrick,
Yeah, I'd recommend upgrading, and I imagine most others will, too. I have
found that with Slurm upgrades are nearly mandatory, at least annually or
so, mostly because Slurm only supports upgrading across roughly two major
releases at a time; from anything much older you have to bootstrap through
intermediate versions. Not sure about the minus sign; ...
Hello all,
Apologies for the basic question, but is there a straightforward,
best-accepted method for using Slurm to report on which GPUs are currently
in use? I've done some searching and people recommend all sorts of methods,
including parsing the output of nvidia-smi (seems inefficient, especially...
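For context, the closest I've gotten with plain Slurm tooling is along
these lines (the node name is a placeholder):

    # GRES configured on each node:
    sinfo -N -o "%N %G"

    # GPUs currently allocated on one node (AllocTRES includes gres/gpu):
    scontrol show node gpu01 | grep -iE 'gres|alloctres'

    # Running jobs and their per-node GPU requests:
    squeue -t RUNNING -O jobid,username,tres-per-node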