On Tue, Jul 10, 2018 at 10:34 AM, Taras Shapovalov
wrote:
> I noticed the commit that can be related to this:
>
> https://github.com/SchedMD/slurm/commit/bf4cb0b1b01f3e165bf12e69fe59aa7b222f8d8e
Yes. See also this bug: https://bugs.schedmd.com/show_bug.cgi?id=5240
This commit will be reverted in
On Tue, Jul 10, 2018 at 10:05 AM, Jessica Nettelblad
wrote:
> In the master branch, scontrol write batch_script also has the option to
> write the job script to STDOUT instead of a file. This is what we use in the
> prolog when we gather information for later (possible) troubleshooting. So I
> sup
Check the Gaussian log file for mention of it using just 8 CPUs -- just because
there are 12 CPUs available doesn't mean the program uses all of them. As I
recall, it will scale back if 12 isn't a good match to the problem.
/*!
@signature Jeffrey Frey, Ph.D
@email f...@udel.edu
@source iPh
Gaussian? Look for NProc=8 or similar lines (NProcShared, or possibly other
options) in their input files. There could also be some system-wide
parallel settings for Gaussian, but that wouldn’t be the default.
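As a hedged sketch only (the input file names and the g16 wrapper below are
assumptions, not taken from this thread), one common way to keep Gaussian and
Slurm in agreement is to derive the Link 0 CPU line from the allocation:

#!/bin/bash
#SBATCH --cpus-per-task=12
# Prepend the CPU count Slurm actually granted, then the original input.
{ echo "%NProcShared=${SLURM_CPUS_PER_TASK}"; cat job.gjf; } > job_run.gjf
g16 job_run.gjf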
> On Jul 10, 2018, at 2:04 PM, Mahmood Naderan wrote:
>
> Hi,
> I see that althoug
Hi,
I see that although I have specified a CPU limit of 12 for a user, his job
only utilizes 8 cores.
[root@rocks7 ~]# sacctmgr list association format=partition,account,user,grptres,maxwall
 Partition    Account       User    GrpTRES     MaxWall
---------- ---------- ---------- ---------- -----------
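(For reference only, and with placeholder names rather than anything from this
cluster, an association limit like that is usually set and inspected along
these lines:)

# Cap the user's association at 12 CPUs in total (placeholder account/user).
sacctmgr modify user where name=someuser account=someaccount set GrpTRES=cpu=12
# Confirm what the association actually carries.
sacctmgr list association user=someuser format=account,user,grptres,maxwall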
What is the change in the commit you're thinking about?
Original message
From: Taras Shapovalov
Date: 10/07/2018 19:34 (GMT+01:00)
To: slurm-us...@schedmd.com
Subject: [slurm-users] DefMemPerCPU is reset to 1 after upgrade
Hey guys,
When we upgraded to 17.11.7, then on some
Hi,
I'm currently playing with SLURM 17.11.7, cgroups and a node with 2
GPUs. Everything works fine if I set the GPU to be consumable. Cgroups
are doing their jobs and the right device is allocated to the right
job. However, it doesn't work if I set `Gres=gpu:no_consume:2`. For
some reason, SLURM d
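(For context, a minimal sketch of the kind of configuration being described;
the node name and device paths are placeholders, not the poster's setup:)

# slurm.conf (excerpt)
GresTypes=gpu
NodeName=gpunode01 Gres=gpu:no_consume:2 ...

# gres.conf on gpunode01
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1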
Hi,
I ran into this recently after upgrading from 16.05.10 to 17.11.7 and couldn’t
run any jobs on any partitions. The only way I got around this was to set this
flag on all “NodeName” definitions in slurm.conf: RealMemory=foo
where foo is the total memory of the node in MB. I believe the documen
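For example (the node range and memory value below are placeholders; slurmd -C
prints the values the daemon detects on each node, which is a safe upper bound):

# On a compute node, see what slurmd detects:
slurmd -C
# slurm.conf (excerpt), using a RealMemory at or below what slurmd reports:
NodeName=node[001-010] CPUs=16 RealMemory=64000 State=UNKNOWN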
Hey guys,
When we upgraded to 17.11.7, on some clusters all jobs are being killed with
these messages:
slurmstepd: error: Job 374 exceeded memory limit (1308 > 1024), being killed
slurmstepd: error: Exceeded job memory limit
slurmstepd: error: *** JOB 374 ON node002 CANCELLED AT 2018-06-28T0
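(As a workaround sketch only, until the cause of the changed default is
confirmed: either set DefMemPerCPU explicitly in slurm.conf or have users
request memory at submission time; the values below are placeholders.)

# slurm.conf (excerpt): explicit cluster-wide default instead of relying on
# whatever the upgrade left behind
DefMemPerCPU=2048

# or request memory explicitly per job
sbatch --mem-per-cpu=2G job.sh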
Thank you very much. I can see it.
Regards,
Mahmood
On Tue, Jul 10, 2018 at 9:35 PM, Jessica Nettelblad <
jessica.nettelb...@gmail.com> wrote:
> Since 17.11, there's a command to write the job script to a file:
> "scontrol write batch_script job_id optional_filename
> Write the batch script fo
Since 17.11, there's a command to write the job script to a file:
"scontrol write batch_script job_id optional_filename
Write the batch script for a given job_id to a file. The file will default
to slurm-<job_id>.sh if the optional filename argument is not given. The
batch script can only be retrieved by a
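A quick usage sketch (the job ID and path are made up; the stdout form with
"-" is the master-branch behaviour described above, so check that your
version supports it):

# Write job 374's script to the default file name (slurm-374.sh):
scontrol write batch_script 374
# Write it to a chosen file:
scontrol write batch_script 374 /tmp/job374.sh
# Where supported, write it to stdout instead of a file:
scontrol write batch_script 374 -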
Hi,
We've recently upgraded to Slurm 17.11.7 from 16.05.8.
We noticed that the environment variable HOSTNAME does not reflect the
compute node in an interactive job started with salloc/srun. Instead it
still points to the submit hostname, although SLURMD_NODENAME reflects the
correct co
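(A minimal way to see the discrepancy from inside an allocation; nothing here
is specific to the poster's site:)

salloc -N1
# HOSTNAME may still carry the inherited submit-host value, while
# SLURMD_NODENAME and hostname report the compute node.
srun bash -c 'echo "HOSTNAME=$HOSTNAME SLURMD_NODENAME=$SLURMD_NODENAME host=$(hostname)"'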
As in the submit script?
I believe "scontrol show jobid -dd $JOBID" (with $JOBID being the ID of the
job you're after) should show you. (Does for me anyway :) ).
Tina
On Tuesday, 10 July 2018 20:32:33 BST Mahmood Naderan wrote:
> Hi
> How can I check the submitted script of a running based on
scontrol show job -dd JOBID
then search
Command=
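for example, with a made-up job ID:

scontrol show job -dd 17797604 | grep Command=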
Best,
Shenglong
> On Jul 10, 2018, at 12:02 PM, Mahmood Naderan wrote:
>
> Hi
> How can I check the submitted script of a running job based on its jobid?
>
>
> Regards,
> Mahmood
>
>
Hi Mahmood,
You should be able to find the script like so with the squeue command:
[mkandes@comet-ln2 ~]$ squeue -j 17797604 --Format="command:132"
COMMAND
/home/jlentfer/S2018/cometTS_7_phe_me_proximal_anti_398_2.sb
[mkandes@comet-ln2 ~]$
Marty
From: slurm-u
Hi
How can I check the submitted script of a running job based on its jobid?
Regards,
Mahmood