Sounds like you figured it out, but I misremembered and had the behavior of
CR_LLN backwards. Setting it spreads the jobs out across the nodes, rather
than filling one up first. Also, I believe it can be set per partition as
well.
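
For reference, the per-partition form would look something like this
(untested sketch; the partition and node names are just placeholders):

NodeName=compute-[1-10] CPUs=16 State=CLOUD
PartitionName=cloud Nodes=compute-[1-10] Default=YES State=UP LLN=YES

LLN=YES on the partition spreads jobs across its nodes; leave it off (and
keep CR_LLN out of SelectTypeParameters) if you want jobs packed onto as
few nodes as possible.
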
On Tue, Sep 11, 2018 at 5:24 PM Felix Wolfheimer
<f.wolfhei...@googlemail.com> wrote:
>
> Thanks for the input! I tried a few more things but wasn't able to get the 
> behavior I want.
> Here's what I tried so far:
> - Set SelectTypeParameters to "CR_CPU,CR_LLN".
> - Set SelectTypeParameters to "CR_CPU,CR_Pack_Nodes". The documentation for
> this parameter seems to describe the behavior I want (pack jobs as densely
> as possible on instances, i.e., minimize the number of instances).
> - Assign Weights to nodes as follows:
> NodeName=compute-X Weight=X
>
> All of these configurations result in the same behavior: if jobs come in
> after the start of a node has been triggered, but before that node is up
> and running, SLURM won't consider it as an available resource and instead
> triggers the creation of yet another node. Since I expect this to happen
> pretty regularly in the scenario I'm dealing with, that's kind of critical
> for me. BTW: I'm using SLURM 18.08, and of course I restarted slurmctld
> after each change to the configuration.
>
> Am Di., 11. Sep. 2018 um 00:33 Uhr schrieb Brian Haymore 
> <brian.haym...@utah.edu>:
>>
>> I re-read the docs and I was wrong on the default behavior.  The default is
>> "no", which just means don't oversubscribe the individual resources, whereas
>> I thought it defaulted to "exclusive".  So I think I've been taking us down
>> a dead end in terms of what I thought might help. :\
>>
>>
>> I have a system here that we are running with the elastic setup, but there
>> we are doing exclusive scheduling (and it's set that way in the conf), so
>> I've not run into the same circumstances you have.
>>
>> --
>> Brian D. Haymore
>> University of Utah
>> Center for High Performance Computing
>> 155 South 1452 East RM 405
>> Salt Lake City, Ut 84112
>> Phone: 801-558-1150, Fax: 801-585-5366
>> http://bit.ly/1HO1N2C
>>
>> ________________________________________
>> From: slurm-users [slurm-users-boun...@lists.schedmd.com] on behalf of Chris 
>> Samuel [ch...@csamuel.org]
>> Sent: Monday, September 10, 2018 4:17 PM
>> To: slurm-users@lists.schedmd.com
>> Subject: Re: [slurm-users] Elastic Compute
>>
>> On Tuesday, 11 September 2018 12:52:27 AM AEST Brian Haymore wrote:
>>
>> > I believe the default value of this would prevent jobs from sharing a node.
>>
>> But the jobs _do_ share a node when the resources become available, it's just
>> that the cloud part of Slurm is bringing up the wrong number of nodes compared
>> to what it will actually use.
>>
>> --
>>  Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
>>