Hi Sushil,
Try changing NodeName specification to:
NodeName=localhost CPUs=96 State=UNKNOWN Gres=gpu:8
Also:
TaskPlugin=task/cgroup
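If it isn't there already, you will likely also need GresTypes=gpu in
slurm.conf and a matching gres.conf on the node. A minimal sketch, assuming
eight NVIDIA devices (adjust the File= pattern to your hardware):
GresTypes=gpu
# gres.conf on the node (hypothetical device paths)
Name=gpu File=/dev/nvidia[0-7]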
Best,
Steve
On Wed, Apr 6, 2022 at 9:56 AM Sushil Mishra wrote:
> Dear SLURM users,
>
> I am very new to Slurm and need some help in configuring slurm in
Is using "#!/bin/bash -l" enough to make it work?
On Thu, Mar 31, 2022 at 6:46 AM Sebastian Potthoff <s.potth...@uni-muenster.de> wrote:
> Just a quick follow-up to say that I was able to resolve the issue. Maybe
> this helps someone in the future.
>
> $BASH_ENV was pointing to a deprecated script, resetti
If you want to have the same number of processes per node, like:
#PBS -l nodes=4:ppn=8
then what I am doing (maybe there is another way?) is:
#SBATCH --ntasks-per-node=8
#SBATCH --nodes=4
#SBATCH --mincpus=8
This is because "--ntasks-per-node" is actually "maximum number of tasks
per node" and
I see at https://slurm.schedmd.com/cons_res_share.html that there are some
ways to share a node between partitions but I don't see how to specify a
set number of cores to each partition. Is this possible? If I have some
nodes with 36 cores, is there a way to put 16 of them in one partition and
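One possible sketch, offered as an assumption rather than a confirmed answer:
the MaxCPUsPerNode partition parameter caps how many CPUs jobs from a
partition may use on any one node (node and partition names are hypothetical):
NodeName=node[01-04] CPUs=36 State=UNKNOWN
PartitionName=part16 Nodes=node[01-04] MaxCPUsPerNode=16
PartitionName=part20 Nodes=node[01-04] MaxCPUsPerNode=20
Note that this caps per-node CPU counts per partition; it does not pin
specific cores to a partition.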
PM Stephen Cousins wrote:
> What I'm saying is that the job might not be able to run in that
> partition. Ever. The job might be asking for more resources than the
> partition can provide. Maybe I'm wrong but it would help to know what the
> partition definition is, along w
what the partition definition is and what the nodes in that partition have
specified (both of these in slurm.conf), and then what the job is asking for.
On Tue, Feb 8, 2022, 7:36 PM wrote:
> Yes, the partition does not meet the requirements right now.
>
> The job should still be accepted and wait until the required resources are
> available.
>
>
> On 09.02.
I think this message comes up when no nodes in that partition have the
resources to meet the job's requirements. Can you show the partition
definition in slurm.conf, along with what the job is asking for?
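Both can be pulled with scontrol, e.g. (partition name and job id are
hypothetical):
scontrol show partition debug
scontrol show job 12345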
On Tue, Feb 8, 2022, 5:25 PM wrote:
>
> Dear all,
>
> sbatch jobs are im
Hi Jeremy,
What is the value of TreeWidth in your slurm.conf? If there is no entry
then I recommend setting it to a value a bit larger than the number of
nodes you have in your cluster and then restarting slurmctld.
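For example, on a hypothetical cluster of about 400 nodes:
TreeWidth=450
The exact value here is just a sketch; the point is to keep it above your
node count.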
Best,
Steve
On Wed, Feb 2, 2022 at 12:59 AM Jeremy Fix wrote:
> Hi,
>
> A fol
I'd take a look at:
https://slurm.schedmd.com/cpu_management.html#Example2
I think this might be what you want:
SelectType=select/cons_res
SelectTypeParameters=CR_Core
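With cores as the consumable resource, jobs can share a node by requesting
only the cores they need, e.g. (the script name is hypothetical):
sbatch --ntasks=4 --cpus-per-task=1 job.sh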
Best,
Steve
On Tue, Nov 23, 2021, 7:35 PM Anne Hammond wrote:
> We are running slurm 20.11.2-1 from CentOS 7 rpms.
>
> Th
> *From:* slurm-users on behalf of Stephen Cousins
> *Reply-To:* Slurm User Community List
> *Date:* Tuesday, 16 November 2021 at 19:09
> *To:* Slurm User Community List
> *Subject:* Re: [slurm-users] Unable to start slurmd service
>
>
>
> I think you just need to use sco
I think you just need to use scontrol to "resume" that node.
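Something like (the node name is hypothetical):
scontrol update NodeName=node001 State=RESUME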
On Tue, Nov 16, 2021, 10:10 AM Jaep Emmanuel wrote:
> Hi,
>
>
>
> It might be a newbie question since I'm new to slurm.
>
> I'm trying to restart the slurmd service on one of our Ubuntu boxes.
>
>
>
> The slurmd.service is defined by:
>