Hi,
my SLURM cluster has a partition configured with a "TimeLimit" of 8
hours. Now, a job has been running for 9h30m and it has not been
cancelled. During these nine and a half hours, a script has executed a
"scontrol update partition=mypartition state=down" for disabling this
partition (educationa
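A quick way to see why the job is still alive is to compare the limit actually
applied to the job with the partition limit and the cluster-wide grace
settings; a rough sketch (the job id and partition name are placeholders):

  $ scontrol show job 12345 | grep -i timelimit           # limit applied to this job
  $ scontrol show partition mypartition | grep -i maxtime
  $ scontrol show config | grep -iE 'overtimelimit|enforcepartlimits'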
Yes, it's odd.
-kkm
On Mon, Mar 9, 2020 at 7:44 AM mike tie wrote:
>
> Interesting. I'm still confused about where slurmd -C is getting the
> data. When I think of where the kernel stores info about the processor, I
> normally think of /proc/cpuinfo. (By the way, I am running CentOS 7 in
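For what it's worth, slurmd -C reports what it detects from the hardware at
run time (via hwloc when Slurm is built with it, otherwise by parsing
/proc/cpuinfo, as far as I know), not what slurm.conf says, so it can be
cross-checked against the kernel's view; the numbers below are made up, yours
will differ:

  $ slurmd -C
  NodeName=node01 CPUs=8 Boards=1 SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=64264
  $ grep -c ^processor /proc/cpuinfo      # logical CPUs as the kernel sees them
  8
  $ lscpu | grep -E 'Socket|Core|Thread'  # same information, summarized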
On 3/10/20 9:03 AM, sysadmin.caos wrote:
my SLURM cluster has a partition configured with a "TimeLimit" of 8 hours.
Now, a job has been running for 9h30m and it has not been cancelled. During
these nine and a half hours, a script has executed a "scontrol update
partition=mypartition state=down" for d
Hi,
On Tue, Mar 10, 2020 at 05:49:07AM, Rundall, Jacob D wrote:
> I need to update the configuration for the nodes in a cluster and I’d like to
> let jobs keep running while I do so. Specifically I need to add
> a RealMemory= value to the node definitions (the NodeName= lines). Is it safe to do this
> for
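For concreteness, the kind of edit being described would look roughly like
this in slurm.conf (node names and memory size are made up), pushed out to
every node and followed by a reconfigure or a daemon restart depending on the
Slurm version:

  # before
  NodeName=node[01-04] CPUs=32 State=UNKNOWN
  # after
  NodeName=node[01-04] CPUs=32 RealMemory=128000 State=UNKNOWN

  $ scontrol reconfigure    # after copying the same slurm.conf everywhere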
Hello,
I have checked my configuration with "scontrol show config" and these are the
values of those three parameters:
AccountingStorageEnforce = none
EnforcePartLimits = NO
OverTimeLimit = 500 min
...so now I understand why my job hasn't been cancelled after 8 hours... because
the OverTimeLimit of 500 minutes lets it keep running past the partition limit.
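With those values the effective cut-off works out like this (assuming the job
simply inherited the partition's 8-hour limit):

  # OverTimeLimit = minutes a job may exceed its time limit before being killed
  8 h TimeLimit           = 480 min
  480 min + 500 min grace = 980 min  (about 16 h 20 m)
  # a job at 9 h 30 m is therefore still well inside the window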
I built/ran a quick test on older slurm and do see the issue. Looks like
a possible bug. I would open a bug with SchedMD.
I couldn't think of a good work-around, since the job would get
rescheduled to a different node if you reboot, even if you have the node
update its own status at boot. It
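If it helps while waiting on SchedMD, the boot-time self-update mentioned above
is usually just a one-liner run late in startup, e.g. from rc.local or a
systemd unit ordered after slurmd; whether RESUME is the right target state
depends on what state the node is left in:

  $ scontrol update NodeName=$(hostname -s) State=RESUME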
Hi,
We are trying to set up accounts by user groups and I have one group that I'd
like to drop the priority from the default of 1 (FairShare). I'm assuming that
this is accomplished with the sacctmgr command, but haven't been able to figure
out the exact syntax. Assuming this is the correct me
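In case the syntax is the sticking point: shares are relative, so one common
approach is to give that group its own account and keep its fairshare at 1
while the other accounts get a larger value (account names here are made up):

  $ sacctmgr modify account name=normal_group set fairshare=10
  $ sacctmgr modify account name=lowprio_group set fairshare=1
  $ sshare -a     # check the resulting share tree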
Here is the output of lstopo
$ lstopo -p
Machine (63GB)
  Package P#0 + L3 (16MB)
    L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#0
    L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#1
    L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#2 + PU P#2
    L2 (4096KB) + L1d
On Tue, Mar 10, 2020 at 1:41 PM mike tie wrote:
> Here is the output of lstopo
>
> $ lstopo -p
> Machine (63GB)
>   Package P#0 + L3 (16MB)
>     L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#0
>     L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#1
>     L2 (4096KB