Hi John,

Thanks for the tips. I got it to work. The trick was to use
SchedulerParameters=bf_busy_nodes. With that set, the scheduler only powers up a new node once the previous one is filled up.
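For the archives, the relevant pieces of slurm.conf end up looking roughly like the sketch below. The node names, node count, partition line, and SelectTypeParameters are placeholders of mine, and CPUs=8 just mirrors the c4.2xlarge example from the AWS post:

    # prefer nodes that are already in use when planning pending jobs, so new
    # cloud nodes are only powered up once the existing ones are filled
    SchedulerParameters=bf_busy_nodes

    # per-CPU allocation needs a fixed CPU count on every cloud node, since
    # slurmctld cannot probe instances that do not exist yet
    SelectType=select/cons_res
    SelectTypeParameters=CR_Core

    NodeName=aws-compute[1-100] CPUs=8 State=CLOUD
    PartitionName=cloud Nodes=aws-compute[1-100] Default=YES State=UP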
Jordan

> On Oct 26, 2018, at 1:13 AM, John Hearns <hear...@googlemail.com> wrote:
>
> Hi Jordan.
> Regarding filling up the nodes, look at
> https://slurm.schedmd.com/elastic_computing.html
>
> SelectType
> Generally must be "select/linear". If Slurm is configured to allocate
> individual CPUs to jobs rather than whole nodes (e.g.
> SelectType=select/cons_res rather than SelectType=select/linear), then Slurm
> maintains bitmaps to track the state of every CPU in the system. If the
> number of CPUs to be allocated on each node is not known when the slurmctld
> daemon is started, one must allocate whole nodes to jobs rather than
> individual processors. The use of "select/cons_res" requires each node to
> have a CPU count set, and the node eventually selected must have at least
> that number of CPUs.
>
> If I am not wrong, you can configure the number of CPUs per node as a fixed
> amount if you select a fixed instance type:
>
> NOTE: This demo uses c4.2xlarge instance types for the compute nodes, which
> have statically set the number of CPUs=8 in slurm_nodes.conf. If you want to
> experiment with different instance types (in slurm-aws-startup.sh), ensure
> you change the CPUs in slurm_nodes.conf.
>
> On Fri, 26 Oct 2018 at 07:13, J.R. W <jwillis0...@gmail.com> wrote:
> >
> > Hello everyone,
> >
> > I set up a SLURM cluster based on this post and plugin:
> > https://aws.amazon.com/blogs/compute/deploying-a-burstable-and-event-driven-hpc-cluster-on-aws-using-slurm-part-1/
> >
> > When I submit jobs to the queue, the AWS instances start configuring.
> > Because I have so many potential instances, they spool up one instance for
> > each job. For example, if I submit 10 jobs, AWS will configure 10
> > instances. What would be ideal is a slurm.conf option I’m missing that
> > tells the power-save plugin to configure only N nodes, even though there
> > are hundreds of “available” nodes to configure in the cloud. Some
> > potential solutions I have thought of:
> >
> > 1. Have the scheduler fill up nodes even if they are in the configuring
> > state. SLURM knows how many CPUs are available for the nodes that are
> > being configured. Is there a way to have jobs fill up a node, even if it’s
> > in the configuring state? That way, a queued job will not trigger the
> > “power save resume” of a new node.
> >
> > 2. Some parameter in slurm.conf that sets the maximum number of nodes
> > that can be available.
> >
> > 3. Modify my slurm_resume script to check how many nodes are configured.
> > If that number is greater than the N nodes I want spun up, then do
> > nothing. Hopefully that will just send the job back to the queue to await
> > one of those configured nodes.
> >
> > I hope I’m making sense. I know elastic computing is a new feature.
> >
> > Jordan
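P.S. For anyone finding this thread later: option 3 above could probably be approximated with a guard at the top of the resume script, roughly as sketched below. The node cap, the sinfo state check, and the assumption that bailing out just leaves the jobs queued are all guesses on my part and untested; bf_busy_nodes was the cleaner answer for me.

    #!/bin/bash
    # Guard for the ResumeProgram script: skip powering up more cloud nodes
    # once MAX_UP of them are already up (untested sketch).
    MAX_UP=10

    # Powered-down cloud nodes carry a "~" after their state in sinfo output,
    # so count the node lines without that suffix.
    up_count=$(sinfo -h -N -o "%T" | grep -cv '~')

    if [ "$up_count" -ge "$MAX_UP" ]; then
        echo "already $up_count nodes up, not resuming $1" >&2
        exit 1
    fi

    # ...otherwise fall through to the usual EC2 launch logic for the node
    # list passed in $1 (as in slurm-aws-startup.sh from the AWS post).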