Re: [slurm-users] Issues with Slurm 23.11.1

2024-01-23 Thread Brian Haymore
Do you have a firewall between the slurmd and the slurmctld daemons? If yes, do you know what kind of idle timeout that firewall has for expiring idle sessions? I ran into something somewhat similar but for me it was between the slurmctld and slurmdbd where a recent change they made had one di

Re: [slurm-users] Upgrade from 17.02.11 to 21.08.2 and state information

2022-02-02 Thread Brian Haymore
Are you running slurmdbd in your current setup? If you are then the upgrade path there might have additional considerations moving this far in versions. -- Brian D. Haymore University of Utah Center for High Performance Computing 155 South 1452 East RM 405 Salt Lake City, Ut 84112 Phone: 801-558

Re: [slurm-users] How to do a clean restart of slurmctld under systemd?

2019-12-27 Thread Brian Haymore
From the man page for slurmctld you can feed it the '-c' option to clear things. I'd suggest reading that man page to see what it says there and if that matches your needs. -- Brian D. Haymore University of Utah Center for High Performance Computing 155 South 1452 East RM 405 Salt Lake City

Re: [slurm-users] question about partition definition

2019-12-09 Thread Brian Haymore
Have you looked at the limits you can set at the QOS or Account level in slurmdbd? There seems to be better granularity at those levels from what I've seen. -- Brian D. Haymore University of Utah Center for High Performance Computing 155 South 1452 East RM 405 Salt Lake City, Ut 84112 Phone: 8

[slurm-users] CaRCC Systems Facing track

2018-12-13 Thread Brian Haymore
Greetings, As multiple aspects of the national Campus Research Computing (CaRCC) Consortium continue to move forward, we are ready and excited to invite membership in the newly-forming Systems-Facing track within the People Network, whe

Re: [slurm-users] Elastic Compute

2018-09-10 Thread Brian Haymore
:17 PM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] Elastic Compute On Tuesday, 11 September 2018 12:52:27 AM AEST Brian Haymore wrote: > I believe the default value of this would prevent jobs from sharing a node. But the jobs _do_ share a node when the resources become availa

Re: [slurm-users] Elastic Compute

2018-09-10 Thread Brian Haymore
18 at 03:55, Brian Haymore <brian.haym...@utah.edu> wrote: What do you have the OverSubscribe parameter set to on the partition you're using? -- Brian D. Haymore University of Utah Center for High Performance Computing 155 South 1452 East RM 405 Salt Lake City, Ut 84112 Phone: 801-5

Re: [slurm-users] Elastic Compute

2018-09-09 Thread Brian Haymore
What do you have the OverSubscribe parameter set to on the partition you're using? -- Brian D. Haymore University of Utah Center for High Performance Computing 155 South 1452 East RM 405 Salt Lake City, Ut 84112 Phone: 801-558-1150, Fax: 801-585-5366 http://bit.ly/1HO1N2C

Re: [slurm-users] [UCE] Slurm From email address configuration

2018-01-04 Thread Brian Haymore
We handle this for slurm, as well as a number of other "internal" services, by having a mail server with rewrite rules set up. We have run into a number of devices and software packages over the years that aren't able to address this on their own. Our mail server doesn't allow incoming mail from the outside,