I run into this problem occasionally. In my organization, most accounts
are created with tcsh as the default shell, and then users copy my bash
submission script example from my online documentation, or copy someone
else's submission script written in bash. And then when the job runs, it
fails with an error about the module command not being found.
The problem you are describing occurs because the module command is
defined differently in bash and tcsh. In bash it's a shell function, but
in tcsh it is an alias. Slurm jobs inherit the environment of the shell
that submits the script, but when that shell is tcsh and the script's
interpreter is bash, or vice versa, the definition of the 'module'
command doesn't survive.
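To illustrate why the definition gets lost, here is a minimal sketch that does not need Slurm at all. It assumes bash is on the PATH; demo_module and child_sees are hypothetical names standing in for the real module function, not anything shipped with environment modules. It shows that a function defined in one shell is not visible to a newly started shell process, even though environment variables are passed along.

```shell
#!/bin/bash
# Hypothetical stand-in for the real 'module' shell function.
demo_module() { echo "demo_module called with: $*"; }

# Report whether a freshly started bash process can see a given command name.
child_sees() {
    if bash -c "type $1" >/dev/null 2>&1; then
        echo "child: $1 is defined"
    else
        echo "child: $1 is not defined"
    fi
}

demo_module list        # works in the shell that defined the function
child_sees demo_module  # the unexported function does not reach the child
```

The first call prints "demo_module called with: list" and the second prints "child: demo_module is not defined". A tcsh alias behaves the same way in this respect: it lives in the shell that defined it, not in the environment that gets handed to the job.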
On RHEL-based systems, if you're using the environment modules RPM, the
module command itself is defined in the files
/etc/profile.d/modules.{sh,csh}
One easy fix: if someone uses tcsh as their login shell but submits a
bash script, they can make the interpreter of that script a login shell,
which will process /etc/profile.d/*.sh, by adding -l to the interpreter
line of the script:
#!/bin/bash -l
I imagine that this will work with someone using bash as their login
shell, but writing their sbatch script in tcsh, but I've never come
across that scenario.
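As a rough sketch, a complete submission script using the login-shell interpreter line might look like the following. The job name and the report_module helper are made up for illustration, and the #SBATCH lines are only placeholders. The script just reports whether 'module' is defined, so it is safe to run on a machine with or without environment modules installed.

```shell
#!/bin/bash -l
#SBATCH --job-name=module-check   # hypothetical job name
#SBATCH --ntasks=1

# Report whether the 'module' command is defined in this shell.
report_module() {
    if type module >/dev/null 2>&1; then
        echo "module: available"
    else
        echo "module: missing"
    fi
}

report_module
# A real job script would go on to load whatever modules it needs here.
```

If the script reports "module: missing" even with -l, the init files are probably not under /etc/profile.d on that system, and an explicit source of the init file is the fallback.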
Prentice
On 1/22/21 9:34 AM, Thomas M. Payerle wrote:
On our clusters, we typically find that an explicit source of the
initialization dot files is needed IF the default shell of
the user submitting the job does _not_ match the shell being used to
run the script. I.e., for sundry historical and other reasons,
the "default" login shell for users on our cluster is tcsh, so if a
user with a login shell of tcsh submits a bash job script, they generally
need to do an explicit "source ~/.profile".
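One possible way to write that explicit source defensively is sketched below. The source_first_readable helper is a hypothetical name, and the two candidate paths are simply the ones mentioned in this thread; other sites keep their init files elsewhere. The idea is to source an init file only when 'module' is not already defined, and to take the first candidate that is readable.

```shell
#!/bin/bash
# Source the first readable file from the argument list.
# Returns 1 if none of the candidates is readable.
source_first_readable() {
    for init_file in "$@"; do
        if [ -r "$init_file" ]; then
            . "$init_file"
            return 0
        fi
    done
    return 1
}

# Only do the work when 'module' is not already defined in this shell.
if ! type module >/dev/null 2>&1; then
    source_first_readable /etc/profile.d/modules.sh "$HOME/.profile" ||
        echo "no module init file found" >&2
fi
```

Placed near the top of a bash job script, this leaves an already working environment alone and only falls back to the init files when the function is missing.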
On Fri, Jan 22, 2021 at 5:42 AM Gestió Servidors
<sysadmin.c...@uab.cat> wrote:
Hello,
I use “Environment Modules” (http://modules.sourceforge.net/) in my
SLURM cluster. In my scripts I need to add an explicit “source
/soft/modules-3.2.10/Modules/3.2.10/init/bash”. However, in
several examples of SLURM scripts that I have read, nobody mentions
that. So, have I forgotten a SLURM parameter to “capture”
environment variables into the script, or is it a problem due to my
distribution (CentOS-7)?
Thanks.
--
Tom Payerle
DIT-ACIGS/Mid-Atlantic Crossroads paye...@umd.edu
5825 University Research Park (301) 405-6135
University of Maryland
College Park, MD 20740-3831
--
Prentice Bisbal
Lead Software Engineer
Research Computing
Princeton Plasma Physics Laboratory
http://www.pppl.gov