Greetings -- When we started using Slurm some years ago, obtaining interactive resources through "srun ... --pty bash" was the standard we adopted. We are now happily running Slurm v22.05, though we recently noticed some limitations when claiming resources to demonstrate or develop in an MPI environment. A colleague was today revisiting a finding dating back to January, which is:
> I am having issues running interactive MPI jobs in the traditional way. It
> just stays there without executing:
>
> srun -N 2 -n 4 --mem=4gb --pty bash
> mpirun -n 4 ~/prime-mpi
>
> However, it does run with:
>
> srun -N 2 -n 4 --mem=4gb ~/prime-mpi

As indicated, the first approach -- claiming the resources to test/demo MPI jobs via "srun ... --pty bash" -- no longer supports launching the job. We also checked the srun environment with verbosity enabled, and found that the job steps are executed and terminate before the prompt is presented in the requested shell. We infer that changes were implemented; would someone be able to direct us to documentation or a discussion of the changes and the motivation behind them? We do not doubt that the motivation is compelling; we ask to improve our understanding.

As was summarized and shared amongst our team following our review of the current operational behaviour:

> - "srun ... <executable>" works fine
> - "salloc -n4", then "ssh <node>", then "srun -n4 <executable>" works;
>   using "mpirun -n4 <executable>" does not work
> - In batch mode, both mpirun and srun work.

Thanks to any and all who take the time to shed light on this matter.

-- 
E.M. (Em) Dragowsky, Ph.D.
Research Computing -- UTech
Case Western Reserve University
(216) 368-0082
they/them
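For reference, the pattern we have been experimenting with in the meantime is below -- a sketch only, and it assumes a cluster where the administrators have set "LaunchParameters=use_interactive_step" in slurm.conf (available since Slurm 20.11), which makes salloc itself open a shell on the first allocated node. The "~/prime-mpi" binary is the same test program quoted above; these commands of course only run on a Slurm cluster, not standalone.

```shell
# Request an interactive allocation. With LaunchParameters=use_interactive_step
# configured, salloc places the shell on the first allocated compute node
# rather than leaving it on the login node.
salloc -N 2 -n 4 --mem=4gb

# Inside the allocation, launch the MPI program as a job step with srun,
# which is the Slurm-native launcher and cooperates with the allocation's
# step/resource accounting:
srun -n 4 ~/prime-mpi

# mpirun may also work here if the MPI library was built with Slurm
# support (PMI/PMIx), but srun avoids that dependency.
```

We mention this only as the workaround our team converged on; we would still appreciate pointers to the authoritative discussion.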