We do the same at TACC in our base module (which happens to be called “TACC”),
and then we document it.
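As a hedged illustration, here is how a user might verify what such a
base module exports before relying on it (the grep pattern is mine;
"TACC" is the module name mentioned above):

    # Show what the base module sets; "module show" prints to stderr.
    module show TACC 2>&1 | grep -i NUM_THREADS
    # Confirm the values in the current shell.
    echo "OMP_NUM_THREADS=${OMP_NUM_THREADS:-unset}"
    echo "MKL_NUM_THREADS=${MKL_NUM_THREADS:-unset}"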
Best,
Bill.
--
Bill Barth, Ph.D., Director, HPC
bba...@tacc.utexas.edu | Phone: (512) 232-7069
Office: ROC 1.435 | Fax: (512) 475-9445
On 3/6/18, 5:13 PM, "slurm-
Thanks again! I’d seen the second one but not the first one.
> On Mar 6, 2018, at 6:28 PM, Martin Cuma wrote:
>
MKL is trying to be flexible, as it has different potential levels of
parallelism inside. Having both MKL_NUM_THREADS and OMP_NUM_THREADS can
be beneficial in programs where you may want to use your own OpenMP
threading but restrict MKL's, or vice versa.
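For instance, a minimal job-script sketch (the CPU count and binary
name are illustrative, not from this thread; MKL_NUM_THREADS takes
precedence over OMP_NUM_THREADS for MKL's internal parallelism):

    #!/bin/bash
    #SBATCH --cpus-per-task=8
    # Run our own OpenMP threads across the allocation...
    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
    # ...but keep each MKL call single-threaded.
    export MKL_NUM_THREADS=1
    srun ./my_openmp_app    # hypothetical OpenMP+MKL binary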
A good article on the different options that MKL provides is here:
Thanks, Martin. I almost mentioned Utah in my original e-mail, as I
turned up your support page in a search.
It is good to know definitively that MKL honors that variable; that is
preferable to having to know about several different ones.
> On Mar 6, 2018, at 6:07 PM, Martin Cuma wrote:
>
Ryan,
we set OMP_NUM_THREADS=1 in the R and Python modules (MKL will honor
that), and instruct those users who want to run multi-threaded to set
OMP_NUM_THREADS themselves after loading the module, and to make sure
they don't oversubscribe the node.
In our experience the majority of R and Python
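A sketch of that workflow from the user's side (the module and script
names are placeholders):

    # The module pins OMP_NUM_THREADS=1 by default.
    module load R
    # A user who wants multi-threading overrides it after loading,
    # sized to the Slurm allocation so the node is not oversubscribed.
    export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
    Rscript my_analysis.R    # placeholder script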
Hi SLURM users,
Software compiled against MKL (R, or Python with NumPy/SciPy built
against MKL, and probably many other examples) presents a problem: a
user makes resource choices via the scheduler, and the software then
does not respect them.
Our most recent example is that someone is r
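To illustrate the mismatch (a hedged sketch; the flags and the Python
one-liner are mine, not from the original message):

    # The job asks Slurm for a single CPU, but with no thread limits
    # set, MKL-linked NumPy may spawn one thread per core on the node,
    # oversubscribing the one-CPU allocation unless cgroup confinement
    # prevents it.
    srun -n 1 -c 1 python -c 'import numpy; numpy.linalg.svd(numpy.random.rand(2000, 2000))'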
Hi All,
I am having an issue with jobs that end, whether via "scancel", by
being killed at the job's wall-time limit, or even by exiting the shell
of an srun --pty interactive session. An excerpt from /var/log/slurmd
where a typical job was running:
[2018-03-05T12:48:49.165] _run_prolog: run job
Hi all,
I would like to implement Slurm on my current HPC system.
I have many jobs divided into job arrays, which makes me cross Slurm's
67 million job ID limit.
I've looked into the source code, and it looks like the IDs are being
reused (a 67-million-job cycle), but Slurm can handle identical ID
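A hedged sketch for checking the relevant bounds on an existing
cluster (FirstJobId and MaxJobId are the slurm.conf parameters that
define the job ID range):

    # Inspect the configured job ID range; IDs wrap once MaxJobId is hit.
    scontrol show config | grep -Ei 'FirstJobId|MaxJobId'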