Hello,
the documentation for heterogeneous jobs [1] says that the environment variable
SLURM_JOB_ID should be different for each component. However, I cannot reproduce
this on a fresh slurm-19.05.1 installation.
$ salloc -pcompute -N1 : -pcompute2 -N1
[...]
salloc: Granted job allocation 108453
[...]
bash-4.1$ sque
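For comparison, this is one way to inspect the IDs per component from inside such
an allocation (only a sketch; it assumes srun's --pack-group option from the
heterogeneous-job support in this release):
# run printenv in each pack component and compare the reported job IDs
srun --pack-group=0 printenv SLURM_JOB_ID
srun --pack-group=1 printenv SLURM_JOB_ID
# the allocation environment itself may also carry per-component variables
env | grep SLURM_JOB_ID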
Hi,
we had the same issue and solved it by using the 'plane' distribution in
combination with an MPMD-style srun, e.g. for your example:
#SBATCH -N 3 # 3 nodes with 10 cores each
#SBATCH -n 21 # 21 MPI tasks in total
#SBATCH --cpus-per-task=1 # if you do not want hyperthreading
cat > mpmd.conf <<
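Spelled out for this 21-task example, the config plus srun call can look like the
following (only a sketch; the task ranges and binary names are placeholders, and
plane=7 simply spreads 7 tasks per pass over the 3 nodes):
cat > mpmd.conf << EOF
# task ranks : program (names are placeholders)
0-2   ./master.exe
3-20  ./worker.exe
EOF
srun -n 21 --multi-prog --distribution=plane=7 mpmd.conf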
Hi,
based on the information given in job_submit_lua.c we decided not to use
pn_min_memory anymore. The comment in the source says:
/*
* FIXME: Remove this in the future, lua can't handle 64bit
* numbers!!!. Use min_mem_per_node|cpu instead.
*/
Instead, we check in job_submit.lua for something like
if
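A sketch of such a check (the limit is a placeholder, and whether an unset field
arrives as nil may depend on the Slurm version; min_mem_per_node and
min_mem_per_cpu are the fields the quoted comment points to):
-- reject jobs whose per-node memory request exceeds a site limit (sketch)
local mem_limit_mb = 256000   -- placeholder value

function slurm_job_submit(job_desc, part_list, submit_uid)
   if job_desc.min_mem_per_node ~= nil and
      job_desc.min_mem_per_node > mem_limit_mb then
      slurm.log_user("requested memory per node exceeds the site limit")
      return slurm.ERROR
   end
   return slurm.SUCCESS
end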
Hello,
we recently updated from Slurm 16.05.x to 17.11.5 and found that the sbatch
option --propagate is no longer honored. Although it is documented in the man
pages, the following does not modify the core file and stack size limits on the
compute nodes:
#SBATCH --propagate=STACK,CORE
but it can sti
st 15 minutes!
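A minimal way to check this on a compute node (sketch; the limit values and the
script name are just examples):
ulimit -c unlimited        # set on the login node before submitting
ulimit -s unlimited
cat > propagate_test.sh << 'EOF'
#!/bin/bash
#SBATCH -N 1
#SBATCH --propagate=STACK,CORE
ulimit -c -s               # print what actually arrives on the compute node
EOF
sbatch propagate_test.sh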
Best,
Hendryk
--
Dr. Hendryk Bockelmann
Wissenschaftliches Rechnen
Abteilung Anwendungen
Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45 a, D-20146 Hamburg, Germany