Hi, everyone.

My team runs a SLURM cluster of about 800 servers, currently on SLURM 17, though
we are working to upgrade to 22.  We currently have only x64 front-end servers, but
we are looking to add some ARM servers.  I have deployed some new ARM front-end
servers in exactly the same way the x64 ones are deployed, but srun does not
work on the ARM systems.  To be clear: a job is created, but srun does not
connect the user to that job.  My command is “srun --pty bash”, the same as on the
x64 system.

On the x64 system, “srun --pty bash” results in a job and a shell on a server.
On the ARM system, “srun --pty bash” results in the creation of a job, but srun
never connects me to the shell on the server.

The controller log shows the job, and “squeue -u $USER” shows the job, but srun 
just doesn’t connect to the job.

I have done web searches and have not gotten any ideas on what might be causing 
this.  Anyone seen this?  Any ideas on how to fix it?

Thanks for any guidance or ideas.

Daniel
