Re: [slurm-users] [EXT] Job ended with OUT_OF_MEMORY even though MaxRSS and MaxVMSize are under the ReqMem value

2021-03-15 Thread Chin,David
RAMPercent=100.00 MaxSwapPercent=100.00 MinRAMSpace=200 Cheers, Dave -- David Chin, PhD (he/him) Sr. SysAdmin, URCF, Drexel dw...@drexel.edu 215.571.4335 (o) For URCF support: urcf-supp...@drexel.edu https://proteusmaster.urcf.drexel.edu/urcfwiki github:preh

[slurm-users] Questions about adding new nodes to Slurm

2021-04-27 Thread David Henkemeyer
ources for adding/removing nodes to Slurm would be much appreciated. Perhaps there is a "toolkit" out there to automate some of these operations (which is what I already have for PBS, and will create for Slurm, if something doesn't already exist). Thank you all, David

[slurm-users] slurmd -C vs lscpu - which do I use to populate slurm.conf?

2021-04-28 Thread David Henkemeyer
ting: NodeName=devops2 CPUs=4 Boards=1 SocketsPerBoard=1 CoresPerSocket=4 ThreadsPerCore=1 RealMemory=9913 Why is there a discrepancy? Which should I use to populate slurm.conf? The OS of this machine is Centos 8. Thank you, David

[slurm-users] Configless mode enabling issue

2021-05-07 Thread David Henkemeyer
Hello all. My team is enabling slurm (version 20.11.5) in our environment, and we got a controller up and running, along with 2 nodes. Everything was working fine. However, when we try to enable configless mode, I ran into a problem. The node that has a GPU is coming up in "drained" state, and s

Re: [slurm-users] Configless mode enabling issue

2021-05-07 Thread David Henkemeyer
Thank you for the reply, Will! The slurm.conf file only has one line in it: AutoDetect=nvml During my debug, I copied this file from the GPU node to the controller. But, that's when I noticed that the node w/o a GPU then crashed on startup. David On Fri, May 7, 2021 at 12:14 PM Will D

Re: [slurm-users] Different GPU types on the same server

2021-05-14 Thread David Gauchard
s` on each line should partition the host or not (=> CPUs=0-3 for all lines) david On 5/14/21 12:28 PM, Emyr James wrote: Dear all, We currently have a single gpu capable server with 10x RTX2080Ti in it. One of our research groups wants to replace one of these cards with an RTX3090 but o

Re: [slurm-users] What is an easy way to prevent users run programs on the, master/login node.

2021-05-20 Thread David Schanzenbach
ee more login node abuse, we would probably try and layer on the use of cgroups to try and limit memory and cpu usage. Thanks, David Date: Wed, 19 May 2021 19:00:38 +0300 From: Alan Orth To: Ole Holm Nielsen , Slurm User Community List Subject: Re: [slurm-users] What is an easy way to pre

Re: [slurm-users] slurmrestd

2021-06-06 Thread David Schanzenbach
and http-parser-devel under CentOS. Thanks, David On 6/6/2021 1:15 PM, Sid Young wrote: Hi all, I'm interested in using the slurmrestd but it does not appear to be built when you do an rpmbuild; reading through the docs does not indicate a switch needed to include it (unless I missed

[slurm-users] Maui equivalent Nodeallocationpolicy

2021-06-07 Thread David Chaffin
Hi all, we get a lot of small sub-node jobs that we want to pack together. Maui does this pretty well with the smallest node that will hold the job, NODEALLOCATIONPOLICY MINRESOURCE I can't figure out the slurm equivalent. Default backfill isn't working well. Anyone know of one? Thanks, David

Re: [slurm-users] Maui equivalent Nodeallocationpolicy

2021-06-08 Thread David Chaffin
nd I think this is working SelectTypeParameters=CR_Core,CR_Pack_Nodes,CR_CORE_DEFAULT_DIST_BLOCK Thanks, David On Mon, Jun 7, 2021 at 2:44 PM David Chaffin wrote: > Hi all, > > we get a lot of small sub-node jobs that we want to pack together. Maui > does this pretty well with the s

[slurm-users] Question about adding and removing features in Slurm

2021-06-18 Thread David Henkemeyer
r will manually edit slurm.conf to add/remove features? I've searched the docs and this seems to be the case, but I just wanted to check with the experts to be sure. Thanks so much, David

[slurm-users] New node w/ 3 GPUs is not accepting GPUs tasks

2021-06-23 Thread David Henkemeyer
batch --export=NONE -N 1 --constraint foo --wrap "ls" Submitted batch job 385 Thanks for the help, David

[slurm-users] When using RequeueExit in Slurm.conf, can you limit the # of requeues?

2021-07-01 Thread David Henkemeyer
Hello, I am investigating Slurm's ability to do requeuing of jobs. I like the fact that I can set RequeueExit= in the slurm.conf file, since this will automatically requeue jobs that exit with the specified exit codes. But, is there a way to limit the # of requeues? Thanks David

[slurm-users] Can I get the original sbatch command, after the fact?

2021-07-16 Thread David Henkemeyer
If I execute a bunch of sbatch commands, can I use sacct (or something else) to show me the original sbatch command line for a given job ID? Thanks David

[slurm-users] Bug when I run "sinfo --states=idle"

2021-10-28 Thread David Henkemeyer
E NODELIST debug up infinite 1 drain node6 debug1* up infinite 0 n/a (! 809)-> sinfo --states=down PARTITION AVAIL TIMELIMIT NODES STATE NODELIST debug up infinite 1 down* node1 debug1* up infinite 0 n/a Is this a known issue? We are running 21.08.0. David

[slurm-users] Possible to get cluster utilization by partition?

2021-11-04 Thread Chin,David
understanding usage if a similar report could be produced for each partition. I tried the obvious, adding "partitions=gpu", but that option isn't applicable to the cluster utilization report: it just produces the same output as the above command. Cheers, Dave -- David Chin, PhD (he/him)

Re: [slurm-users] Possible to get cluster utilization by partition?

2021-11-05 Thread Chin,David
would be a generally useful feature. Cheers, Dave -- David Chin, PhD (he/him) Sr. SysAdmin, URCF, Drexel dw...@drexel.edu 215.571.4335 (o) For URCF support: urcf-supp...@drexel.edu https://proteusmaster.urcf.drexel.edu/urcfwiki github:preh

[slurm-users] A Slurm topological scheduling question

2021-12-07 Thread David Baker
not happy, by the way, to have node/switch connections across racks. Best regards, David

[slurm-users] How to limit # of execution slots for a given node

2022-01-06 Thread David Henkemeyer
tion, it seems to me. At least, it left me feeling like there has to be a better way. Thanks! David

[slurm-users] Questions about default_queue_depth

2022-01-12 Thread David Henkemeyer
selected? 3) Is there a way to see the order of the jobs in the queue? Perhaps squeue lists the jobs in order? 4) If we had several partitions, would the default_queue_depth apply to all partitions? Thank you David

[slurm-users] monitoring and update regime for Power Saving nodes

2022-02-23 Thread David Simpson
anything else) to a node which is down due to power saving (during a maintenance/reservation) what is your approach? Do you end up with 2 slurm.confs (one for power saving and one that keeps everything up, to work on during the maintenance)? thanks David - David Simpson - Senior

Re: [slurm-users] monitoring and update regime for Power Saving nodes

2022-02-24 Thread David Simpson
nd any down nodes will automatically read the latest. Yes, currently we use file based and config written to the compute node’s disks themselves via ansible. Perhaps we will consider moving the file to a shared fs. regards David - David Simpson - Senior Systems Engineer ARCCA, Redwood

Re: [slurm-users] monitoring and update regime for Power Saving nodes

2022-02-24 Thread David Simpson
a dummy job to bring powered down nodes up then a clustershell slurmd stop is probably the answer regards David - David Simpson - Senior Systems Engineer ARCCA, Redwood Building, King Edward VII Avenue, Ca

[slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread David Henkemeyer
functional difference. But if there is even a subtle difference, I would love to know what it is! Thanks David -- Sent from Gmail Mobile

Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread David Henkemeyer
is significant. > > > On Mar 24, 2022, at 12:32 PM, David Henkemeyer < > david.henkeme...@gmail.com> wrote: > > > > Assuming -N is 1 (meaning, this job needs only one node), then is there > a difference between any of these 3 flag combinations: > > > > -n

Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread David Henkemeyer
Thank you! We recently converted from PBS, and I was converting “ppn=X” to “-n X”. Does it make more sense to convert “ppn=X” to “--cpus-per-task=X”? Thanks again David On Thu, Mar 24, 2022 at 3:54 PM Thomas M. Payerle wrote: > Although all three cases ( "-N 1 --cpus-per-task 64 -n 1

[slurm-users] Why is --cpu_bind not an option for sbatch? Why only srun?

2022-03-31 Thread David Henkemeyer
We noticed that we can pass --cpu_bind into an srun commandline, but not sbatch. Why is that? Thanks David

[slurm-users] Can I define and use custom env vars in slurm.conf?

2022-04-04 Thread David Henkemeyer
=$NODEPOOL1 MaxTime=INFINITE State=UP PartitionName=interactive Nodes=$NODEPOOL1 MaxTime=INFINITE State=UP PriorityJobFactor=2 PartitionName=perf Nodes=$PERFNODES MaxTime=INFINITE State=UP PriorityJobFactor=2 Is this possible? Thanks, David

Re: [slurm-users] Can I define and use custom env vars in slurm.conf?

2022-04-04 Thread David Henkemeyer
That's exactly what I needed! Thank you, David On Mon, Apr 4, 2022 at 1:17 PM Brian Andrus wrote: > I think you are looking at nodesets: > > > From the slurm.conf man: > > NODESET CONFIGURATION > The nodeset configuration allows you to define a name for a specifi

Re: [slurm-users] Memory usage not tracked

2022-04-06 Thread Chin,David
TIME TIME_LIMIT NODES MIN_MEMO NODELIST(REASON) 2514854 def ClusterJobStart_ sbradley RUNNING 5:05:27 8:00:00 1 36G node003 -- David Chin, PhD (he/him) Sr. SysAdmin, URCF, Drexel dw...@drexel.edu 215.571.4335 (o) For URCF support: urcf-supp

[slurm-users] Looking for examples of daily job reports

2022-04-15 Thread David Henkemeyer
c and not that easy to parse. I know that there are 3rd party tools that can help with this. I'd love to hear/see what others are doing. Thanks David

Re: [slurm-users] gres/gpu count lower than reported

2022-05-03 Thread David Henkemeyer
then try taking them back into the idle state. Also, keep an eye on the slurmctld and slurmd logs. They usually are quite helpful to highlight what the issue is. David On Tue, May 3, 2022 at 11:50 AM Jim Kavitsky wrote: > Hello Fellow Slurm Admins, > > > > I have a new Slurm instal

[slurm-users] Is sacct not handling quotes properly?

2022-05-04 Thread David Henkemeyer
's stripping the quotes? This seems unlikely to me. Thanks in advance! David

Re: [slurm-users] Is sacct not handling quotes properly?

2022-05-04 Thread David Henkemeyer
-- sbatch --export=NONE --wrap=uname -a --exclusive So, it's storing properly, now I need to see if I can figure out how to preserve/add the quotes on the way out of the DB... David On Wed, May 4, 2022 at 11:15 AM Michael Jennings wrote: > On Wednesday, 04 May 2022, at 10:00:57 (-0700), > Davi

[slurm-users] How to run a job at the end of a set of jobs

2022-05-09 Thread David Henkemeyer
e the last job. What would be the various ways to achieve this? Thanks David

[slurm-users] Question about having 2 partitions that are mutually exclusive, but have unexpected interactions

2022-05-12 Thread David Henkemeyer
lurm parameter I can tweak to make slurm recognize that these partition B jobs shouldn't ever have a pending state of "priority". Or to treat these as 2 separate queues. Or something like that. Spinning up a 2nd slurm controller is not ideal for us (unless there is a lightweight method to do it). Thanks David

Re: [slurm-users] Question about having 2 partitions that are mutually exclusive, but have unexpected interactions

2022-05-12 Thread David Henkemeyer
other focuses on the rest. Or something similar. David On Thu, May 12, 2022 at 9:13 AM Brian Andrus wrote: > I suspect you have too low of a setting for "MaxJobCount" > > *MaxJobCount* > The maximum number of jobs SLURM can have in its active database >

[slurm-users] Is there a way create reservations w/o being Operator or Admin?

2022-07-11 Thread David Henkemeyer
our users into the accounting DB (and then add new users when we bring new people onboard). Thanks in advance, David

Re: [slurm-users] Rolling reboot with at most N machines down simultaneously?

2022-08-04 Thread David Simpson
Another way might be to implement slurm power off/on (if not already) and induce it as required. - David Simpson - Senior Systems Engineer ARCCA, Redwood Building, King Edward VII Avenue, Cardiff, CF10 3NB

[slurm-users] srun: error: io_init_msg_unpack: unpack error

2022-08-06 Thread David Magda
e backwards compatibility of the protocol not extend to srun(1)? Is there any way around this, or should we simply upgrade slurmd(8) on the work nodes, but leave the paths to the older user CLI utilities alone until all the compute nodes have been upgraded? Thanks for any info. Regards, David

Re: [slurm-users] srun: error: io_init_msg_unpack: unpack error

2022-08-08 Thread David Magda
On Aug 6, 2022, at 15:13, Chris Samuel wrote: > > On 6/8/22 10:43 am, David Magda wrote: > >> It seems that the new srun(1) cannot talk to the old slurmd(8). >> Is this 'on purpose'? Does the backwards compatibility of the protocol not >> extend t

Re: [slurm-users] Possible to get cluster utilization by partition?

2022-08-24 Thread Chin,David
I cooked one up myself using Python (with Pandas) which I feel is more maintainable. https://github.com/prehensilecode/slurm_utilization/blob/main/utilization_from_sacct.py It's still in pretty rough shape, and could certainly use some refining. Cheers, Dave -- David Chin, PhD (h

[slurm-users] Slurm v22 for Alma 8

2022-12-02 Thread David Thompson
ol test in testsuite/slurm_unit/common/slurm_protocol_defs: FAIL: slurm_addto_id_char_list-test Before I start digging in, I thought I would check here and see if anyone has a successful RHEL/Alma/Rocky 8 slurm v22 SRPM they'd be willing to share. Thanks much! David Thompson University of Wisc

Re: [slurm-users] Slurm v22 for Alma 8

2022-12-02 Thread David Thompson
appreciate the help. David Thompson University of Wisconsin – Madison Social Science Computing Cooperative From: slurm-users On Behalf Of Paul Edmon Sent: Friday, December 2, 2022 11:26 AM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] Slurm v22 for Alma 8 Yup, here is the spec we use

Re: [slurm-users] Slurm v22 for Alma 8

2022-12-02 Thread David Thompson
group name. Again, thanks Paul/Brian for the assistance. David Thompson University of Wisconsin – Madison Social Science Computing Cooperative From: David Thompson Sent: Friday, December 2, 2022 1:13 PM To: slurm-users@lists.schedmd.com Subject: RE: [slurm-users] Slurm v22 for Alma 8 Hi Paul

[slurm-users] speed / efficiency of sacct vs. scontrol

2023-02-23 Thread David Laehnemann
ral? Many thanks and best regards, David

Re: [slurm-users] speed / efficiency of sacct vs. scontrol

2023-02-23 Thread David Laehnemann
sacct that slurmdbd will gracefully handle (per second)? Or any suggestions how to roughly determine such a rate for a given cluster system? cheers, david P.S.: @Loris and @Noam: Exactly, snakemake is a software distinct from slurm that you can use to orchestrate large analysis workflows---on

Re: [slurm-users] speed / efficiency of sacct vs. scontrol

2023-02-23 Thread David Laehnemann
more efficiently (and better tailored to how Slurm does things) is appreciated. cheers, david On Thu, 2023-02-23 at 09:46 -0500, Sean Maxwell wrote: > Hi David, > > On Thu, Feb 23, 2023 at 8:51 AM David Laehnemann < > david.laehnem...@hhu.de> > wrote: > > > Quick

[slurm-users] snakemake and slurm in general

2023-02-23 Thread David Laehnemann
eeper knowledge of such cluster systems providing their help along the way, which is why I am on this list now, asking for insights. So feel free to dig into the respective code bases with a bit of that grumpy energy, making snakemake or nextflow a bit better in how they deal with Slurm. cheers,

Re: [slurm-users] snakemake and slurm in general

2023-02-23 Thread David Laehnemann
workflow management system giving you additional control over things. So I'm not sure what exactly we are arguing about, right here... cheers, david On Thu, 2023-02-23 at 17:41 +0100, Ole Holm Nielsen wrote: > On 2/23/23 17:07, David Laehnemann wrote: > > In addition, there are very clear

Re: [slurm-users] snakemake and slurm in general

2023-02-24 Thread David Laehnemann
logic, but probably isn't impossible. And it seems to have been discussed, even recently (and I think, even with a recent contribution by you;): https://github.com/snakemake/snakemake/issues/301 I'll try to keep revisiting this, if I can find the time. cheers, david On Fri, 2023-02-24 at 08:

Re: [slurm-users] speed / efficiency of sacct vs. scontrol

2023-02-24 Thread David Laehnemann
Id can be non- unique? That would indeed spell trouble on a different level, and make status checks much more complicated... cheers, david On Thu, 2023-02-23 at 11:59 -0500, Sean Maxwell wrote: > Hi David, > > On Thu, Feb 23, 2023 at 10:50 AM David Laehnemann < > david.laehnem...@hhu

Re: [slurm-users] speed / efficiency of sacct vs. scontrol

2023-02-27 Thread David Laehnemann
k heads-up: I am documenting your input by linking to the mailing list archives, I hope that's alright for you? https://github.com/snakemake/snakemake/pull/2136#issuecomment-1446170467 cheers, david On Sat, 2023-02-25 at 10:51 -0800, Chris Samuel wrote: > On 23/2/23 2:55 am, Davi

Re: [slurm-users] speed / efficiency of sacct vs. scontrol

2023-02-27 Thread David Laehnemann
y used workflow manager in my field (bioinformatics), there's also an issue discussing Slurm job array support: https://github.com/nextflow-io/nextflow/issues/1477 cheers, david On Mon, 2023-02-27 at 13:24 +0100, Ward Poelmans wrote: > On 24/02/2023 18:34, David Laehnemann wrote: > > Those

Re: [slurm-users] speed / efficiency of sacct vs. scontrol

2023-02-27 Thread David Laehnemann
, so that others can hopefully reuse as much as they can in their contexts. But maybe some publicly available best practices (and no-gos) for slurm cluster users would be a useful resource that cluster admins can then point / link to. cheers, david On Mon, 2023-02-27 at 06:53 -0800, Brian Andr

[slurm-users] batched and efficient job status queries by snakemake using sacct

2023-03-15 Thread David Laehnemann
d to the current solution! cheers, david

[slurm-users] seff in slurm-23.02

2023-05-25 Thread David Gauchard
Hello, slurm-23.02 on ubuntu-20.04, seff is not working anymore: ``` # ./seff 4911385 Use of uninitialized value $FindBin::Bin in concatenation (.) or string at ./seff line 11. Name "FindBin::Bin" used only once: possible typo at ./seff line 11, line 602. perl: error: slurm_persist_conn_open:

Re: [slurm-users] seff in slurm-23.02

2023-05-25 Thread David Gauchard
Advanced Research Computing* Information and Technology Solutions (ITS) 303-273-3786 | mrobb...@mines.edu *Our values:* Trust | Integrity | Respect | Responsibility *From:* slurm-users on behalf o

Re: [slurm-users] running mpi from inside an mpi job

2023-06-20 Thread David Schanzenbach
adding the --overlap flag to the srun call for the parent mpi process fixes the problem. https://slurm.schedmd.com/srun.html#OPT_overlap Thanks, David On 6/20/2023 4:08 AM, Vanhorn, Mike wrote: I have a user who is submitting a job to slurm which requests 16 tasks, i.e. #SBATCH --ntasks 16

[slurm-users] A fairshare policy that spans multiple clusters

2024-01-05 Thread David Baker
r clusters – if that makes sense. Does anyone have any thoughts on this question, please? Am I correct in thinking that federating clusters is related to my question? Do I gather correctly, however, that federation only works if there is a common database on a shared file system? Best regards, David

[slurm-users] failed to open persistant connection to localhost:6819

2017-11-28 Thread david vilanova
Hello, I don't understand why I can't connect to the controller. This is a fresh install using slurmdbd on Ubuntu 16.04. It seems that a persistent connection to MySQL cannot be made. *slurmctl.log:* [2017-11-27T20:22:48.056] Job accounting information stored, but details not gathered [2017-11

[slurm-users] slurm conf with single machine with multi cores.

2017-11-29 Thread david vilanova
Hello, I have installed the latest 7.11 release and my node is shown as down. I have a single physical server with 12 cores, so I am not sure the conf below is correct. Can you help? In slurm.conf the node is configured as follows: NodeName=linuxcluster CPUs=1 RealMemory=991 Sockets=12 CoresPerSocket=1

Re: [slurm-users] slurm conf with single machine with multi cores.

2017-11-29 Thread david vilanova
(null). ondemand set. [2017-11-29T16:29:31.169] SchedulerParameters=default_queue_depth=100,max_rpc_cnt=0,max_sched_time=2,partition_job_depth=0,sched_max_job_start=0,sched_min_interval=2 David On Wed, 29 Nov 2017 at 15:59, Steffen Grunewald < steffen.grunew...@aei.mpg.de> wrote: > H

Re: [slurm-users] slurm conf with single machine with multi cores.

2017-11-30 Thread david vilanova
ps -ef | grep slurm slurm 11388 1 0 09:24 ? 00:00:00 /usr/local/sbin/slurmdbd slurm 11430 1 0 09:24 ? 00:00:00 /usr/local/sbin/slurmctld Any idea? On Wed, 29 Nov 2017 at 18:21, Le Biot, Pierre-Marie < pierre-marie.leb...@hpe.com> wrote:

Re: [slurm-users] "command not found"

2017-12-15 Thread david vilanova
Thanks Manuel, The shared folder between master and slave sounds like a good option. I'll go and try that one, Thanks On Fri, 15 Dec 2017 at 12:36, Manuel Rodríguez Pascual < manuel.rodriguez.pasc...@gmail.com> wrote: > Hi David, > > The command to be executed must be

[slurm-users] Autoscaling slurm

2017-12-18 Thread david martin
Hi, I'm using slurm together with clustercfn autoscaling. I have a problem and thought that you might be able to help. When I run a script #Script.sh # /bin/bash ./myprogram --threads=5 inputfile outputfile The program uses 5 threads; assuming only 1 thread per cpu is launched it would requi

[slurm-users] Dependencies problem with cfncluster

2017-12-20 Thread david martin
-ntasks=$n --ntasks-per-core=1 --output=$n.slurmout jobscript`; echo "ntasks $n jobid $id" done jobscript file: #! /bin/bash echo $hostname Looks like clustercfn is not aware of job dependencies. Or is it a slurm problem ? Thanks, David

[slurm-users] LAST TASK ID

2018-02-06 Thread david martin
Hi, I'm running a batch array script and would like to execute a command after the last task #SBATCH --array 1-10%10:1 sh myscript.R inputdir/file.${SLURM_ARRAY_TASK_ID} # Would like to run a command after the last task For example, when I was using SGE there was something like this | if($

[slurm-users] Allocate more memory

2018-02-07 Thread david martin
Hi, I would like to submit a job that requires 3 GB. The problem is that I have 70 nodes available, each with 2 GB of memory, so the command sbatch --mem=3G will wait for resources to become available. Can I run sbatch and tell the cluster to use 3 GB out of the 70 GB available or is

Re: [slurm-users] Allocate more memory

2018-02-07 Thread david vilanova
/2018 15:50, Loris Bennett wrote: Hi David, david martin writes: Hi, I would like to submit a job that requires 3 GB. The problem is that I have 70 nodes available, each with 2 GB of memory, so the command sbatch --mem=3G will wait for resources to become available. Can I run sbatch and

Re: [slurm-users] Allocate more memory

2018-02-07 Thread david vilanova
[mailto:slurm-users-boun...@lists.schedmd.com] On > Behalf Of r...@open-mpi.org > Sent: Wednesday, February 7, 2018 10:03 AM > To: Slurm User Community List > Subject: Re: [slurm-users] Allocate more memory > > Afraid not - since you don’t have any nodes that meet the 3G requir

Re: [slurm-users] Allocate more memory

2018-02-07 Thread david vilanova
Afraid not - since you don’t have any nodes that meet the 3G > requirement, you’ll just hang. > > > >> On Feb 7, 2018, at 7:01 AM, david vilanova wrote: > >> > >> Thanks for the quick response. > >> > >> Should the following script do the trick ??

[slurm-users] Multithreads config

2018-02-16 Thread david martin
?* ** *Thanks,* ** *David*

[slurm-users] Multithreads config

2018-02-16 Thread david MARTIN
?* ** *Thanks,* ** *David*

Re: [slurm-users] Multithreads config

2018-02-16 Thread david martin
Benjamin Redling wrote: On 16.02.2018 at 15:28, david martin wrote: *I have a single physical server with :* * *64 cpus (each cpu has 16 cores) * * *480Gb total memory* *NodeNAME= Sockets=1 CoresPerSocket=16 ThreadsPerCore=1 Procs=63 REALMEMORY=48*** *This configuration will not work.

[slurm-users] Advice on managing GPU cards using SLURM

2018-03-05 Thread david baker
specify certain user limits on the partition definition rather than define another QOS. Any help or tips on getting the configuration started -- so that the user interface is not too complex -- would be really appreciated, please. Best regards, David

Re: [slurm-users] What version I should install?

2018-04-17 Thread David Rodríguez
does not appear "slurm-plugins-$VER*rpm" Your wiki helped me a lot! Thanks! David 2018-04-17 8:34 GMT+02:00 Ole Holm Nielsen : > On 04/16/2018 08:20 PM, David Rodríguez Galiano wrote: > >> Dear Slurm community, >> >> I am a sysadmin who needs to make a fr

[slurm-users] Trouble disabling core specialization

2019-06-27 Thread Guertin, David S.
w the incorrect value (and jobs still do not run on those cores)? Dave David Guertin Information Technology Services Middlebury College 700 Exchange St. Middlebury, VT 05753 (802)443-3143

[slurm-users] How to turn off core specialization?

2019-08-09 Thread Guertin, David S.
'd like it back for jobs to use. Thanks, Dave David Guertin Information Technology Services Middlebury College 700 Exchange St. Middlebury, VT 05753 (802)443-3143

Re: [slurm-users] How to turn off core specialization?

2019-08-09 Thread Guertin, David S.
> Have you restarted all your slurm daemons? Yes, I have done that on every node, but it still shows one specialized core. > Not sure whether "scontrol reconfigure" can deal with that change. I tried "scontrol reconfigure", but it also had no effect. Thanks, Dave

Re: [slurm-users] How to turn off core specialization?

2019-08-09 Thread Guertin, David S.
no idea what that could be. Dave David Guertin Information Technology Services Middlebury College 700 Exchange St. Middlebury, VT 05753 (802)443-3143 From: slurm-users on behalf of Guertin, David S. Sent: Friday, August 9, 2019 4:28 PM To: Slurm User Community

Re: [slurm-users] How to turn off core specialization?

2019-08-29 Thread Guertin, David S.
Thanks! That worked. David Guertin Information Technology Services Middlebury College 700 Exchange St. Middlebury, VT 05753 (802)443-3143 From: slurm-users on behalf of Taras Shapovalov Sent: Wednesday, August 28, 2019 11:20 AM To: Slurm User Community

[slurm-users] Node is not allocating all CPUs

2022-04-05 Thread Guertin, David S.
CurrentWatts=0 AveWatts=0 ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s Why isn't this node allocating all 32 cores? Thanks, David Guertin

Re: [slurm-users] Node is not allocating all CPUs

2022-04-06 Thread Guertin, David S.
g 16. David Guertin From: slurm-users on behalf of Brian Andrus Sent: Tuesday, April 5, 2022 6:14 PM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] Node is not allocating all CPUs You want to see what is output on the node itself when yo

Re: [slurm-users] Node is not allocating all CPUs

2022-04-06 Thread Guertin, David S.
uld the number of trackable resources be different from the number of actual CPUs? David Guertin From: slurm-users on behalf of Sarlo, Jeffrey S Sent: Wednesday, April 6, 2022 10:30 AM To: Slurm User Community List Subject: Re: [slurm-users] Node is

Re: [slurm-users] Node is not allocating all CPUs

2022-04-06 Thread Guertin, David S.
slurm.conf contains the following: SelectType=select/cons_tres SelectTypeParameters=CR_Core AccountingStorageTRES=gres/gpu Could this be constraining CfgTRES=cpu=16 somehow? David Guertin From: Guertin, David S. Sent: Wednesday, April 6, 2022 12:27 PM To: Slurm

[slurm-users] What version I should install?

2018-04-16 Thread David Rodríguez Galiano
version of slurm and munge in CentOS 7. Sorry for the inconvenience of reading this basic question. Thanks for your help, thanks for your work! David

[slurm-users] Fwd: Slurm/cgroups on a single head/compute node

2019-08-21 Thread David da Silva Pires
ys/fs/cgroup" CgroupAutomount=yes AllowedRAMSpace=100 AllowedSwapSpace=0 ConstrainCores=no ConstrainDevices=yes ConstrainKmemSpace=no ConstrainRAMSpace=no ConstrainSwapSpace=no MaxRAMPercent=100 MaxSwapPercent=100 TaskAffinity=no Thanks in advance for any help. -- David da Silva Pires

Re: [slurm-users] Fwd: Slurm/cgroups on a single head/compute node

2019-08-22 Thread David da Silva Pires
latest SLURM version instead of using the default one from Ubuntu packages? Is there anything wrong in my configuration? By the way, the absolute paths of the last two configuration files are: /etc/slurm-llnl/slurm.conf /etc/slurm-llnl/cgroup.conf Best regards. -- David da Silva Pires

Re: [slurm-users] Fwd: Slurm/cgroups on a single head/compute node

2019-08-29 Thread David da Silva Pires
=== root cpu,memory / slurm cpu,memory / * cpuset,memory /interactive The idea is to create a cgroup with a cpuset for slurm, from 1 to 216. Let's see if it works. Best. -- David da Silva Pires

[slurm-users] How does slurm keep track of latest jobid

2020-05-19 Thread Flynn, David P. (Dave)
Where does Slurm keep track of the latest jobid? Since it is persistent across reboots, I suspect it’s in a file somewhere. — Dave Flynn

[slurm-users] Re: Best practices for tracking jobs started across multiple clusters for accounting purposes.

2024-08-29 Thread David via slurm-users
Hello, What is meant here by "tracking"? What information are you looking to gather and track? I'd say the simplest answer is using sacct, but I am not sure how federated/non-federated setups come into play while using it. David On Tue, Aug 27, 2024 at 6:23 AM Di Bernardini,

[slurm-users] Node (anti?) Feature / attribute

2024-06-14 Thread David Magda via slurm-users
itions, and not (AFAICT) on a per node basis. We’re currently on 22.05.x, but upgrading is fine. Regards, David -- slurm-users mailing list -- slurm-users@lists.schedmd.com To unsubscribe send an email to slurm-users-le...@lists.schedmd.com

[slurm-users] Re: Node (anti?) Feature / attribute

2024-06-17 Thread David Magda via slurm-users
This functionality in slurmd was added in August 2023, so not in the version we’re currently running: https://github.com/SchedMD/slurm/commit/0daa1fda97c125c0b1c48cbdcdeaf1382ed71c4f Perhaps something for the future. Currently looking like the job_submit.lua is the best candidate.

[slurm-users] Re: Node (anti?) Feature / attribute

2024-06-17 Thread David Magda via slurm-users
Could you post that snippet? > On Jun 14, 2024, at 14:33, Laura Hild via slurm-users > wrote: > > I wrote a job_submit.lua also. It would append "&centos79" to the feature > string unless the features already contained "el9," or if empty, set the > features string to "centos79" without the ampe

[slurm-users] Re: Temporarily bypassing pam_slurm_adopt.so

2024-07-08 Thread David Schanzenbach via slurm-users
Hi Daniel, Utilizing pam_access  with pam_slurm_adopt might be what you are looking for? https://slurm.schedmd.com/pam_slurm_adopt.html#admin_access Thanks, David On 7/8/2024 10:54 AM, Daniel L'Hommedieu via slurm-users wrote: Hi, all. We have a use case where we need to allow a gro

[slurm-users] QOS MaxTRESPU node=X intepretation

2024-08-30 Thread David Magda via slurm-users
info. Regards, David -- slurm-users mailing list -- slurm-users@lists.schedmd.com To unsubscribe send an email to slurm-users-le...@lists.schedmd.com

[slurm-users] pam error, related to accounting?

2025-04-09 Thread David Bremner via slurm-users
Recently I enabled accounting on my tiny (1 compute node, one head node) slurm cluster. slurmdbd.conf looks like AuthType=auth/munge DbdHost=vertex DbdPort=6819 SlurmUser=slurm StorageHost=localhost StorageType=accounting_storage/mysql StorageUser=slurm StoragePa
