Hi,
I tried scontrol reconfigure some years ago, but this didn't work in all
cases.
Best regards
Werner
On 05/09/2018 04:27 PM, Mahmood Naderan wrote:
I think the problem was:
the python script
/opt/rocks/lib/python2.7/site-packages/rocks/commands/sync/slurm/__init__.py,
which is called by the command rocks sync slurm
Yep, exactly the same issue. Our dirty workaround is to ssh -X back into the
same host and it will work.
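For anyone hitting the same thing, a minimal sketch of that workaround,
assuming the native X11 support in 17.11 is enabled (hostname and command are
placeholders):
  ssh -X localhost        # re-enter the same login node over plain SSH
  srun --x11 xterm        # forwarding then works from this shell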
> On 24 Apr 2018, at 00:03, Brendan Moloney wrote:
>
> Hi,
>
> We recently upgraded to 17.11, and I was trying to set up the new integrated
> X11 forwarding instead of using the spank plugin.
Hi, we currently have an issue where we have a bunch of runaway jobs, but we
cannot clear them:
sacctmgr show runaway | wc -l
sacctmgr: error: slurmdbd: Sending message type 1488: 11: No error
sacctmgr: error: Failed to fix runaway job: Resource temporarily unavailable
58588
Has anyone run into this?
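In case it helps others debugging this, the error smells like a slurmdbd
communication problem; a few things one might check (log path and service name
are assumptions for a typical systemd install):
  sacctmgr show runawayjobs                  # full entity name, same listing
  systemctl status slurmdbd                  # is slurmdbd actually up?
  tail -n 50 /var/log/slurm/slurmdbd.log     # assumed log location
  scontrol show config | grep -i Accounting  # verify storage host/port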
Hi,
We recently upgraded to 17.11, and I was trying to set up the new integrated
X11 forwarding instead of using the spank plugin.
Initially I was testing with an SSH session into our login node and things
seemed fine. Then I switched to using X2Go to connect to the login node
and it broke. The
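For reference, a minimal slurm.conf sketch for the built-in X11 forwarding in
17.11, assuming Slurm was built with libssh2 support:
  PrologFlags=X11
After restarting the daemons, jobs request forwarding with srun --x11.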
Greetings,
I am setting up our new GPU cluster and trying to ensure that a user may issue
a request such that all the cores assigned to them are on the same socket to
which the GPU is bound; however, I guess I do not fully understand the settings
because I seem to be getting cores from multiple sockets.
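For comparison, a gres.conf sketch that pins each GPU to the cores of its local
socket; the device files and core ranges are assumptions for a two-socket,
16-core example node:
  Name=gpu File=/dev/nvidia0 Cores=0-7
  Name=gpu File=/dev/nvidia1 Cores=8-15
With that in place, a request like --gres=gpu:1 should prefer cores from the
socket listed for that device.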
Hi,
I have Slurm 17.02.10 installed in a test environment. When I use sacct
-o "JobID,JobName,AllocCPUs,ReqMem,Elapsed" and AccountingStorageType =
accounting_storage/filetxt, the fields AllocCPUS and ReqMem are empty.
JobID  JobName  AllocCPUS  ReqMem  Elapsed
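A note in case it saves someone time: the filetxt store only records a small
set of fields, so a slurmdbd-backed store is needed before sacct can report
values like AllocCPUS and ReqMem. A minimal slurm.conf sketch (the hostname is
a placeholder):
  AccountingStorageType=accounting_storage/slurmdbd
  AccountingStorageHost=dbhost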
Hi,
I'm testing (or trying to test) generic burst buffers,
but it is not really clear how to configure them.
I configured slurm.conf to use the generic plugin:
BurstBufferType=burst_buffer/generic
#Add Debug flags
DebugFlags=BurstBuffer
On startup it loads the module:
slurmctld: debug3: Try
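Not authoritative, but a burst_buffer.conf sketch for the generic plugin along
the lines of the 17.x man page; all script paths are placeholders, and the
parameter names should be checked against the burst_buffer.conf man page
shipped with your release:
  GetSysState=/usr/local/sbin/bb_get_state
  StartStageIn=/usr/local/sbin/bb_stage_in
  StartStageOut=/usr/local/sbin/bb_stage_out
  StopStageIn=/usr/local/sbin/bb_stop_stage_in
  StopStageOut=/usr/local/sbin/bb_stop_stage_out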
Hello all.
Is it possible to reserve some nodes for being used for jobs only in a
specific partition?
Our mini-cluster will be used for some lessons, so it will need to run
the jobs submitted by students "immediately".
A partition spanning the required nodes is already defined, but it
overlaps with other partitions.
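One possible approach, sketched with placeholder names: a standing reservation
on those nodes that only the students' account can use:
  scontrol create reservation ReservationName=lessons \
      Nodes=node[01-02] StartTime=now Duration=UNLIMITED \
      Accounts=students
Student jobs then run with --reservation=lessons, and other jobs stay off
those nodes.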
Is there a way to retrieve job step information similar to "scontrol show job"?
What I want to be able to do is see all job steps associated with a
particular job, whether the step is pending, running, or finished. It seems that
job step information is only available as long as the step is running.
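The closest thing I know of is the accounting side: sacct reports one line per
step (jobid.batch, jobid.0, ...) and keeps them after the step finishes, e.g.
(12345 is a placeholder job id):
  sacct -j 12345 --format=JobID,JobName,State,Start,Elapsed
Pending steps are the gap: a step only shows up once it has started.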
We are pleased to announce the availability of Slurm version 17.11.6.
This includes over 50 fixes made since 17.11.5 was released eight weeks
ago, including a fix for a race condition within the slurmstepd that can lead
to hung extern steps.
Slurm can be downloaded from https://www.schedmd.com/download
Hi,
I have a user trying to use %t to split the MPI rank outputs into different
files, and it's not working. I verified this too. Any idea why this might be?
This is the first time I've heard of a user trying to do this. Here is an
example job script:
-
#!/bin/bash
#SBATCH --job-name=m
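For what it's worth, %t is a filename pattern that srun expands (the task
rank), not one sbatch expands for the job's own output file, so a script along
these lines is how I'd expect it to work (the application name is a
placeholder):
  #!/bin/bash
  #SBATCH --job-name=ranks
  #SBATCH --ntasks=4
  # put the pattern on srun's --output: one file per task rank
  srun --output=out_%j_%t.log ./my_mpi_app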
Good Morning (at least for those on the West coast of the US)
My nodes are no longer “down”
eric@radoncmaster:~$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug* up infinite 4 idle radonc[01-04]
I think the NTP configuration did the trick
So one possibility there is
> I think the problem was:
> the python script
> /opt/rocks/lib/python2.7/site-packages/rocks/commands/sync/slurm/__init__.py,
> which is called by the command rocks sync slurm
> did not restart slurmd on the Head-Node.
Thanks for figuring that out. When I was digging into it, I tried rocks
sync co
I currently use a node feature plugin, like knl,
but I don't like using node features, because I must write the "feature" in
the slurm.conf file.
I find this solution rather cumbersome.
For example, if I want to add kernel 4.4:
srun -C kernel4.4
I must update the slurm.conf file on every node of my cluster, a
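For context, a minimal slurm.conf sketch of what that entails (node names are
placeholders), which is exactly the per-node bookkeeping that feels cumbersome:
  NodeName=node[01-04] Feature=kernel4.4
and then jobs select it with:
  srun -C kernel4.4 hostname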
On Wednesday, 9 May 2018 9:16:37 PM AEST Tueur Volvo wrote:
> if i use srun --reboot hostname, how to tell him to update the kernel before
> rebooting ?
Ah, now I understand why you mention a spank plugin, as that would allow you
to specify a new command line option for sbatch to specify a kernel.
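One more piece that may help: slurm.conf has a RebootProgram, which is what
slurmd runs for scontrol reboot and for jobs submitted with --reboot, so the
kernel install can live there. A sketch with placeholder paths and package
names (choosing the kernel per job is the part that still needs the new option
discussed above):
  # in slurm.conf:
  RebootProgram=/usr/local/sbin/node_update_reboot

  # node_update_reboot (sketch):
  #!/bin/bash
  yum -y install kernel-4.4 && exec /sbin/reboot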
I would like to update the Linux kernel, then reboot the machine and run the
job.
For example, I would like this:
srun --chooskernel=4.1 hostname
I would like to install kernel 4.1 on my machine, then reboot the machine
and run hostname.
If I use srun --reboot hostname, how do I tell it to update the kernel before
rebooting?
On Wednesday, 9 May 2018 7:09:12 PM AEST Tueur Volvo wrote:
> Hello, I have a question: is it possible to reboot a Slurm node in a spank
> plugin before executing a job?
I don't know about that, but sbatch has a --reboot flag and you could use a
submit filter to set it. We do the opposite and always str
On Wednesday, 9 May 2018 6:09:08 PM AEST Werner Saar wrote:
> I think the problem was:
> the python script
> /opt/rocks/lib/python2.7/site-packages/rocks/commands/sync/slurm/__init__.py,
> which is called by the command rocks sync slurm
> did not restart slurmd on the Head-Node.
Depending on the
Hello, I have a question: is it possible to reboot a Slurm node in a spank
plugin before executing a job?
Hi Benjamin,
thanks for getting back to me! I somehow failed to ever arrive at this page.
Andreas
-"slurm-users" wrote: -
To: slurm-users@lists.schedmd.com
From: Benjamin Matthews
Sent by: "slurm-users"
Date: 05/09/2018 01:20AM
Subject: Re: [slurm-users] srun seg faults immediately fr
Hi Mahmood,
I think the problem was:
the python script
/opt/rocks/lib/python2.7/site-packages/rocks/commands/sync/slurm/__init__.py,
which is called by the command rocks sync slurm
did not restart slurmd on the Head-Node.
After the restart of slurmctld, slurmd on the Head-node had the old
configuration.
Chester Langin writes:
> Is there no way to scancel a list of jobs? Like from job 120 to job 150?
scancel $(seq 120 150)
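Bash brace expansion works too, if you prefer not to spawn seq:
  scancel {120..150}
scancel simply takes a list of job ids, so any way of generating the list
works.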
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo