Hi Jeffrey,
Yes, those jobs (and the elements of other arrays which were also
listed one per line) had indeed been preempted and requeued.
So is this behaviour intended/documented or is it a bug?
Cheers,
Loris
Jeffrey T Frey writes:
> Did those four jobs
>
> 6577272_21 scavenger PD
Correction/addendum: If the node you want to exclude has RPMs that were
built without NVML autodetection, you probably want that gres.conf to
look like this:
NodeName=a1-10 Name=gpu File=/dev/nvidia0
I'm guessing if it was built without Autodetection, the AutoDetect=off
option wouldn't be und
How many nodes are we talking about here? What if you gave each node
its own gres.conf file, where all of them said
AutoDetect=nvml
except the one you want to exclude, which would have this in gres.conf:
NodeName=a1-10 AutoDetect=off Name=gpu File=/dev/nvidia0
It seems to me like Autodetect
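For illustration, that per-node scheme would look something like the sketch
below (a1-10 is the node from this thread; everything else is a placeholder):

# gres.conf on every GPU node except a1-10 - rely on NVML autodetection
AutoDetect=nvml

# gres.conf on a1-10 only - autodetection off, GPU listed explicitly
NodeName=a1-10 AutoDetect=off Name=gpu File=/dev/nvidia0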
I don't see how that bug is related. That bug is about requiring the
libnvidia-ml.so library for an RPM that was built with NVML Autodetect
enabled. His problem is the opposite - he's already using NVML
autodetect, but wants to disable that feature on a single node, where it
looks like that nod
Did those four jobs
6577272_21 scavenger PD 0:00 1 (Priority)
6577272_22 scavenger PD 0:00 1 (Priority)
6577272_23 scavenger PD 0:00 1 (Priority)
6577272_28 scavenger PD 0:00 1 (Priority)
run before and get requeued? Seems
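A sketch of one way to check that from accounting, assuming slurmdbd job
accounting is in use (the task ID is taken from the listing above):

sacct --duplicates -j 6577272_21 --format=JobID,State,Submit,Start,End,NodeList

With --duplicates, sacct should keep the records from earlier runs of a
requeued task instead of showing only the most recent one.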
Hi,
Does anyone have an idea why the pending elements of an array job in one
partition are displayed compactly by 'squeue', while those of another array
in a different partition are displayed one element per line? Please see below
(compact display in 'main', one element per line in 'scavenger').
This is
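One thing that can help when comparing the two partitions is to make squeue
expand every array task explicitly (a sketch; the partition names are the ones
from this message):

squeue -r -p main,scavenger -o "%.18i %.9P %.2t %.10M %.6D %R"

The -r/--array option prints one line per array element regardless of how the
pending tasks are grouped internally.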
Hi,
I created a new QoS and want everybody to be able to use it.
Is there a way to make all users able to use the new QoS, other than adding the
association one by one like:
sacctmgr modify user crock set qos+=alligator
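One approach worth checking (a sketch, not something verified in this thread)
is to attach the QoS at the top of the association tree instead of per user,
so that associations which have not set their own QoS list inherit it:

# add the QoS to the root account's association
sacctmgr modify account root set qos+=alligator
# optionally make it the default QoS as well
sacctmgr modify account root set defaultqos=alligator

Users whose associations already carry an explicit QoS list would still need
the new QoS added to that list.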
> -----Original Message-----
> From: slurm-users On Behalf Of
> ole.h.niel...@fysik.dtu.dk
> Sent: 23 February 2021 15:04
>
> Just a thought: Do you run a recent Slurm version? Which version of
> MariaDB/MySQL do you run?
> /Ole
We're currently running Slurm 20.02.6-1 and MariaDB 10.3.28.
But
Yes, we suspected something like that... we have already increased
innodb_buffer_pool_size from 32G to 64G (and have new DB nodes on the way) but
it didn't help. There aren't dedicated DB nodes though.
We assumed it must be some tipping point thing, hence looking into purging. But
like in the
On 23-02-2021 15:19, mercan wrote:
Hi;
Maybe the database no longer fits in the InnoDB buffer pool. If there is
enough room to increase that value (innodb_buffer_pool_size), you can try
increasing it to find out.
The details of modifying the InnoDB parameters are described in
https://wiki.fys
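For reference, the InnoDB parameters in question usually end up in a small
config fragment along these lines (file name and values are illustrative only,
not recommendations from this thread):

# e.g. /etc/my.cnf.d/innodb.cnf
[mysqld]
innodb_buffer_pool_size = 64G
innodb_log_file_size = 64M
innodb_lock_wait_timeout = 900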
Hi;
Maybe the database no longer fits in the InnoDB buffer pool. If there is
enough room to increase that value (innodb_buffer_pool_size), you can try
increasing it to find out.
Ahmet M.
On 23.02.2021 17:03, Luke Sudbery wrote:
That's great, thanks. We were thinking about staging it lik
That's great, thanks. We were thinking about staging it like that, and using days
is simpler to trigger than waiting for the month.
We will also need to increase innodb_lock_wait_timeout first so we don't hit
the problems described in https://bugs.schedmd.com/show_bug.cgi?id=4295.
Anyone know why
On 2/23/21 1:25 PM, Luke Sudbery wrote:
We have suddenly started getting bad performance from sreport: querying a 1-hour
period (in the last 24 hours) for TopUsage went from taking under a minute
to timing out after the 15-minute max slurmdbd query time, although the
SQL query on the DB server continued
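The staged purging mentioned above is normally driven by the Purge* options in
slurmdbd.conf; a rough sketch (the retention periods are made-up examples, not
advice from this thread):

# slurmdbd.conf - illustrative retention settings
PurgeEventAfter=12months
PurgeJobAfter=12months
PurgeResvAfter=2months
PurgeStepAfter=2months
PurgeSuspendAfter=1month

When the units are hours or days the purge runs at the start of each hour or
day rather than only at the start of the month, which is presumably what
"using days is simpler to trigger" refers to above.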
The command in question is:
sreport --parsable2 user topusage topcount=3 start=10/15/19 end=10/16/19
Similar to this: https://bugs.schedmd.com/show_bug.cgi?id=2315 where the
problem eventually just 'went away'. We also have >12000 associations and see a
large number of them (>9000) listed in the S
We have suddenly started getting bad performance from sreport: querying a 1-hour
period (in the last 24 hours) for TopUsage went from taking under a minute to
timing out after the 15-minute max slurmdbd query time, although the SQL query
on the DB server continued long after that.
So firstly we were wond