Yes, the system is an HPE Cray EX, and I am trying to use
switch/hpe_slingshot.
RC
On 10/28/2022 11:21 AM, Ole Holm Nielsen wrote:
On 10/28/22 07:35, Richard Chang wrote:
I have observed that when I specify a switch type in the slurm.conf
file and that switch type is not present on the slurmctld node,
slurmctld panics and shuts down. Is this expected? My slurmctld node
doesn't have the switch type, but the compute nodes do. How can I set
it up so that the computes can utilise the feature without breaking
Slurm?
What is your line in slurm.conf? The manual page seems to describe
what you have observed:
SwitchType
    Identifies the type of switch or interconnect used for application
    communications. Acceptable values include "switch/cray_aries" for
    Cray systems, "switch/none" for switches not requiring special
    processing for job launch or termination (Ethernet, and InfiniBand)
    and The default value is "switch/none". All Slurm daemons, commands
    and running jobs must be restarted for a change in SwitchType to
    take effect. If running jobs exist at the time slurmctld is
    restarted with a new value of SwitchType, records of all jobs in
    any state may be lost.
Why do you want to use this configuration? Is your system a Cray?
/Ole