Re: [slurm-users] SLURM: reconfig

2022-05-06 Thread Ole Holm Nielsen
On 5/6/22 10:26, Mark Dixon wrote: On Thu, 5 May 2022, Ole Holm Nielsen wrote: ... You're right, probably the correct order for Configless must be:
* stop slurmctld
* edit slurm.conf etc.
* start slurmctld
* restart the slurmd nodes to pick up new slurm.conf
See also slides 29-34 in https://sl
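
The Configless order quoted in this message could be sketched roughly as follows. This is only a dry-run illustration, not something posted in the thread: it assumes systemd units named `slurmctld` and `slurmd` and passwordless ssh from the controller to each node, and the node names are hypothetical. The manual edit of slurm.conf happens between the two plans.

```python
import shlex
import subprocess

# Assumptions (not from the thread): systemd units "slurmctld" and "slurmd",
# and passwordless ssh from the controller host to every compute node.


def plan_before_edit():
    """Commands to run before editing slurm.conf: stop the controller."""
    return [["systemctl", "stop", "slurmctld"]]


def plan_after_edit(nodes):
    """Commands to run after editing: start slurmctld, then restart each slurmd."""
    plan = [["systemctl", "start", "slurmctld"]]
    # Each slurmd is restarted so that it fetches the new slurm.conf.
    plan += [["ssh", node, "systemctl", "restart", "slurmd"] for node in nodes]
    return plan


def run_plan(plan, dry_run=True):
    """Print every command; execute it only when dry_run is False."""
    for argv in plan:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


# Hypothetical node names, printed only (dry run):
run_plan(plan_before_edit())
run_plan(plan_after_edit(["node01", "node02"]))
```

Keeping `dry_run=True` as the default means nothing is stopped or restarted until the printed commands have been checked by hand.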

Re: [slurm-users] SLURM: reconfig

2022-05-06 Thread Mark Dixon
On Thu, 5 May 2022, Ole Holm Nielsen wrote: ... You're right, probably the correct order for Configless must be:
* stop slurmctld
* edit slurm.conf etc.
* start slurmctld
* restart the slurmd nodes to pick up new slurm.conf
See also slides 29-34 in https://slurm.schedmd.com/SLUG21/Field_Notes_5

Re: [slurm-users] SLURM: reconfig

2022-05-05 Thread Christopher Samuel
On 5/5/22 7:08 am, Mark Dixon wrote: I'm confused how this is supposed to be achieved in a configless setting, as slurmctld isn't running to distribute the updated files to slurmd. That's exactly what happens with configless mode, slurmd's retrieve their config from the slurmctld, and will g

Re: [slurm-users] SLURM: reconfig

2022-05-05 Thread Christopher Samuel
On 5/5/22 5:17 am, Steven Varga wrote: Thank you for the quick reply! I know I am pushing my luck here: is it possible to modify slurm: src/common/[read_conf.c, node_conf.c] src/slurmctld/[read_config.c, ...] such that the state can be maintained dynamically? -- or cheaper to write a job manag

Re: [slurm-users] SLURM: reconfig

2022-05-05 Thread Ole Holm Nielsen
On 5/5/22 16:08, Mark Dixon wrote: On Thu, 5 May 2022, Ole Holm Nielsen wrote: ... That is correct.  Just do "scontrol reconfig" on the slurmctld server.  If all your slurmd's are truly running Configless[1], they will pick up the new config and reconfigure without restarting. Details are su

Re: [slurm-users] SLURM: reconfig

2022-05-05 Thread Ole Holm Nielsen
On 5/5/22 15:53, Ward Poelmans wrote: Hi Steven, I think truly dynamic adding and removing of nodes is something that's on the roadmap for slurm 23.02? Yes, see slide 37 in https://slurm.schedmd.com/SLUG21/Roadmap.pdf from the Slurm publications site https://slurm.schedmd.com/publications.ht

Re: [slurm-users] SLURM: reconfig

2022-05-05 Thread Mark Dixon
On Thu, 5 May 2022, Ole Holm Nielsen wrote: ... That is correct. Just do "scontrol reconfig" on the slurmctld server. If all your slurmd's are truly running Configless[1], they will pick up the new config and reconfigure without restarting. Details are summarized in https://wiki.fysik.dtu.dk/n

Re: [slurm-users] SLURM: reconfig

2022-05-05 Thread Ward Poelmans
Hi Steven, I think truly dynamic adding and removing of nodes is something that's on the roadmap for slurm 23.02? Ward On 5/05/2022 15:28, Steven Varga wrote: Hi Tina, Thank you for sharing. This matches my observations when I checked if slurm could do what I am upto: manage AWS EC2 dynamic(

Re: [slurm-users] SLURM: reconfig

2022-05-05 Thread Ole Holm Nielsen
Hi Tina, On 5/5/22 14:54, Tina Friedrich wrote: Hi List, out of curiosity - I would assume that if running configless, one doesn't manually need to restart slurmd on the nodes if the config changes? That is correct. Just do "scontrol reconfig" on the slurmctld server. If all your slurmd's
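
For reference, the reconfigure step mentioned in this message could be wrapped as below. This is a small sketch under stated assumptions, not something posted in the thread: it assumes `scontrol` is on the PATH of the slurmctld host, it uses the full `reconfigure` spelling from the original question, and the helper names are made up for illustration. Later messages in the thread suggest that adding or removing nodes may still need daemon restarts, so treat this as covering ordinary config changes only.

```python
import subprocess


def reconfigure_command():
    """argv that asks the Slurm daemons to re-read their configuration."""
    return ["scontrol", "reconfigure"]


def reconfigure(dry_run=True):
    """Return what would be run, or run it on the slurmctld host and return its output."""
    argv = reconfigure_command()
    if dry_run:
        return "would run: " + " ".join(argv)
    # check=True raises CalledProcessError if scontrol reports a failure.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


print(reconfigure())
```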

Re: [slurm-users] SLURM: reconfig

2022-05-05 Thread Brian Andrus
@Tina, Figure slurmd reads the config in once and runs with it. You would need to have it recheck regularly to see if there are any changes. This is exactly what 'scontrol reconfig' does: tells all the slurm nodes to recheck the config. @Steven, It seems to me you could just have a monitor

Re: [slurm-users] SLURM: reconfig

2022-05-05 Thread Steven Varga
Hi Tina, Thank you for sharing. This matches my observations when I checked if slurm could do what I am up to: manage AWS EC2 dynamic (spot) instances. After replacing MySQL with REDIS, I now wonder what it would take to make slurm node addition | removal dynamic. I've been looking at the source code

Re: [slurm-users] SLURM: reconfig

2022-05-05 Thread Tina Friedrich
Hi List, out of curiosity - I would assume that if running configless, one doesn't manually need to restart slurmd on the nodes if the config changes? Hi Steven, I have no idea if you want to do it every couple of minutes and what the implications are of that (although I've certainly manage

Re: [slurm-users] SLURM: reconfig

2022-05-05 Thread Steven Varga
Thank you for the quick reply! I know I am pushing my luck here: is it possible to modify slurm: src/common/[read_conf.c, node_conf.c] src/slurmctld/[read_config.c, ...] such that the state can be maintained dynamically? -- or cheaper to write a job manager with less features but supporting dynamic

Re: [slurm-users] SLURM: reconfig

2022-05-04 Thread Christopher Samuel
On 5/4/22 7:26 pm, Steven Varga wrote: I am wondering what is the best way to update node changes, such as addition and removal of nodes to SLURM. The excerpts below suggest a full restart, can someone confirm this? You are correct, you need to restart slurmctld and slurmd daemons at present

[slurm-users] SLURM: reconfig

2022-05-04 Thread Steven Varga
Hello, I am wondering what is the best way to update node changes, such as addition and removal of nodes to SLURM. The excerpts below suggest a full restart; can someone confirm this? Or perhaps `scontrol reconfigure` or `kill -s SIGHUP` does it? best wishes: steven // src/slurmctld/read_config.c