Dan, thank you very much for a comprehensive and understandable reply.
On 5 March 2018 at 16:28, Dan Jordan wrote:
John/Chris,
Thanks for your advice. I'll need to do some reading on cgroups, I've never
even been exposed to that concept. I don't even know if the SLURM setup I
have access to has the cgroups or PAM plugin/modules enabled/available.
Unfortunately I'm not involved in the administration of SLURM,
Dan, completely off topic here. May I ask what type of simulations you are
running?
You probably have a large investment of time in Trick.
However, as a fan of the Julia language, let me leave this link here:
https://juliaobserver.com/packages/RigidBodyDynamics
On 5 March 2018 at 07:31, John He
I completely agree with what Chris says regarding cgroups. Implement them,
and you will not regret it.
I have worked with other simulation frameworks that work in a similar
fashion to Trick, i.e. a master process which spawns off independent
worker processes on compute nodes. I am thinking on an
On 05/03/18 12:12, Dan Jordan wrote:
What is the /correct/ way to clean up processes across the nodes
given to my program by SLURM_JOB_NODELIST?
I'd strongly suggest using cgroups in your Slurm config to ensure that
processes are corralled and tracked correctly.
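For readers who have not seen what "using cgroups in your Slurm config" looks like, here is a minimal sketch of the relevant settings. It is an illustration only: the parameter names are standard Slurm options, but the values chosen here are assumptions, the right set depends on the site and the Slurm version, and only the cluster administrators can change these files.

```text
# slurm.conf (fragment): track and confine job processes with cgroups
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup

# cgroup.conf (fragment): example constraints, site-specific assumptions
ConstrainCores=yes
ConstrainRAMSpace=yes
```

With process tracking done through cgroups, Slurm can find and kill every process that belongs to a job when the job ends, including workers that have detached from the master process.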
You can use pam_slurm_adopt fr
Sorry, you are right, the documentation is clear about it being available
only in EpilogSlurmctld. I'm quite new to SLURM and I've read some of the
documentation, but obviously I haven't grasped it all. I don't quite
understand the difference between --epilog vs --task-epilog, EpilogSlurm
vs. Epil
On 05/03/18 10:16, Dan Jordan wrote:
> In my particular case, I need SLURM_JOB_NODELIST, which should be
> available but it is not.
This is only available in PrologSlurmctld, not Prolog, according to
those docs. Does that match what you're trying?
cheers,
Chris
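As a rough illustration of what a script that does receive SLURM_JOB_NODELIST might do with it, the sketch below expands the compact node list into individual host names. On a real cluster, `scontrol show hostnames "$SLURM_JOB_NODELIST"` is the reliable way to do this; the function `expand_nodelist` is a hypothetical helper written for this example, and it only handles simple lists with at most one bracketed range group per name.

```python
import os
import re
from typing import List


def expand_nodelist(nodelist: str) -> List[str]:
    """Expand a simple Slurm host list such as 'node[01-03,07],gpu5'.

    Simplified sketch: supports at most one [...] group per entry.
    """
    hosts: List[str] = []
    # Split on commas that are not inside square brackets.
    for item in re.findall(r"[^,\[]+(?:\[[^\]]*\][^,\[]*)?", nodelist):
        match = re.fullmatch(r"([^\[]*)\[([^\]]*)\](.*)", item)
        if match is None:
            # Plain host name with no range to expand.
            hosts.append(item)
            continue
        prefix, ranges, suffix = match.groups()
        for part in ranges.split(","):
            low, _, high = part.partition("-")
            high = high or low          # single index such as '07'
            width = len(low)            # preserve zero padding
            for index in range(int(low), int(high) + 1):
                hosts.append(f"{prefix}{index:0{width}d}{suffix}")
    return hosts


if __name__ == "__main__":
    # Empty outside a Slurm context where the variable is exported.
    for host in expand_nodelist(os.environ.get("SLURM_JOB_NODELIST", "")):
        print(host)
```

A cleanup script could loop over the returned names, but with cgroup-based process tracking enabled Slurm itself handles the cleanup, which is why that route was recommended above.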