Is there a way to configure Slurm not to export the environment of the submission node by default?
--
Davide Vanzo, PhD
Application Developer
Adjunct Assistant Professor of Chemical and Biomolecular Engineering
Advanced Computing Center for Research and Education (ACCRE)
www.accre.vanderbilt.edu

On 2017-12-19 08:12:39-06:00 Jeffrey Frey wrote:

Don't propagate the submission environment:

srun --export=NONE myprogram

> On Dec 19, 2017, at 8:37 AM, Yair Yarom <ir...@cs.huji.ac.il> wrote:
>
> Thanks for your reply,
>
> The problem is that users are running on the submission node e.g.
>
> module load tensorflow
> srun myprogram
>
> So they get the tensorflow version (and PATH/PYTHONPATH) of the
> submission node's version of tensorflow (and any additional default
> modules).
>
> There is never a chance to run the "module add ${SLURM_CONSTRAINT}" or
> remove the unwanted modules that were loaded (maybe automatically) on
> the submission node and aren't working on the execution node.
>
> Thanks,
> Yair.
>
> On Tue, Dec 19 2017, "Loris Bennett" <loris.benn...@fu-berlin.de> wrote:
>
>> Hi Yair,
>>
>> Yair Yarom <ir...@cs.huji.ac.il> writes:
>>
>>> Hi list,
>>>
>>> We use here lmod[1] for some software/version management. There are two
>>> issues encountered (so far):
>>>
>>> 1. The submission node can have different software than the execution
>>>    nodes - different cpu, different gpu (if any), infiniband, etc. When
>>>    a user runs 'module load something' on the submission node, it will
>>>    pass the wrong environment to the task in the execution
>>>    node. e.g. "module load tensorflow" can load a different version
>>>    depending on the nodes.
>>>
>>> 2. There are some modules we want to load by default, and again this can
>>>    be different between nodes (we do this by source'ing /etc/lmod/lmodrc
>>>    and ~/.lmodrc).
>>>
>>> For issue 1, we instruct users to run the "module load" in their batch
>>> script and not before running sbatch, but issue 2 is more problematic.
>>>
>>> My current solution is to write a TaskProlog script that runs "module
>>> purge" and "module load" and export/unset the changed environment
>>> variables. I was wondering if anyone encountered this issue and have a
>>> less cumbersome solution.
>>>
>>> Thanks in advance,
>>> Yair.
>>>
>>> [1] https://www.tacc.utexas.edu/research-development/tacc-projects/lmod
>>
>> I don't fully understand your use-case, but, assuming you can divide
>> your nodes up by some feature, could you define a module per feature
>> which just loads the specific modules needed for that category, e.g. in
>> the batch file you would have
>>
>> #SBATCH --constraint=shiny_and_new
>>
>> module add ${SLURM_CONSTRAINT}
>>
>> and would have a module file 'shiny_and_new', with contents like, say,
>>
>> module add tensorflow/2.0
>> module add cuda/9.0
>>
>> whereas the module 'rusty_and_old' would contain
>>
>> module add tensorflow/0.1
>> module add cuda/0.2
>>
>> Would that help?
>>
>> Cheers,
>>
>> Loris
>
::::::::::::::::::::::::::::::::::::::::::::::::::::::
Jeffrey T. Frey, Ph.D.
Systems Programmer V / HPC Management
Network & Systems Services / College of Engineering
University of Delaware, Newark DE 19716
Office: (302) 831-6034  Mobile: (302) 419-4976
::::::::::::::::::::::::::::::::::::::::::::::::::::::
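
On the question at the top of the thread (making this the default rather than typing --export=NONE each time): one possible approach is to set the default through the environment on the submission nodes. The snippet below is only a sketch. It assumes that sbatch honours SBATCH_EXPORT and srun honours SLURM_EXPORT_ENV as the environment-variable equivalents of --export, so check the man pages of the installed Slurm version first. The file path and the function name set_slurm_export_default are hypothetical.

```shell
#!/bin/sh
# Sketch of a profile snippet for the submission nodes, e.g. a
# hypothetical /etc/profile.d/slurm_no_export.sh.
#
# Assumption: sbatch reads SBATCH_EXPORT and srun reads SLURM_EXPORT_ENV
# as equivalents of the --export option.

set_slurm_export_default() {
    # Only fill in a default; keep any value the user has already set.
    : "${SBATCH_EXPORT:=NONE}"
    : "${SLURM_EXPORT_ENV:=NONE}"
    export SBATCH_EXPORT SLURM_EXPORT_ENV
}

set_slurm_export_default
```

Users should still be able to opt back in for a single job with --export=ALL, since command-line options take precedence over these variables. With NONE the task no longer inherits PATH from the submission node, so the program may need to be given by absolute path, and any "module load" has to happen on the execution node, for example in the batch script.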