Hello all,
For years I've been looking for what I might consider to be the holy grail
of composable resource allocation in Slurm jobs:
* A command that can be run inside an sbatch script...
* ...which immediately and synchronously invokes another sbatch script
(which may or may not invoke mpir
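As a hedged sketch of one way to get that behaviour (not necessarily what the
thread settled on): sbatch --wait blocks until the child job terminates and
propagates its exit code, so an outer batch script can call a second batch
script synchronously. "inner.sbatch" is a made-up name.

#!/bin/bash
#SBATCH -J outer
# Submit a second batch job and wait for it to finish before continuing.
sbatch --wait inner.sbatch
echo "inner job finished with exit code $?"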
Thank you to both Kilian and Chris,
Here is what I have running on the Slurm server to report once when any of
the nodes go into "Drain" state:
sudo -u slurm bash -c "strigger --set -D -p /etc/slurm/triggers/slurm_admin_notify --flags=perm"
and the notify script sends the alert with:
/bin/mail -s "ClusterName DrainedNode:$*" our_admin_email_address
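For reference, a hedged sketch of what a minimal notify script along those
lines might look like (only the mail line above is from the thread; the rest
is an assumption). strigger runs the program as SlurmUser and passes the
affected node name(s) as arguments, which is what $* carries:

#!/bin/bash
# Hypothetical body for /etc/slurm/triggers/slurm_admin_notify.
echo "Node(s) $* are in Drain state" | \
  /bin/mail -s "ClusterName DrainedNode:$*" our_admin_email_address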
Partitions have an ExclusiveUser setting. Not exclusive per job, as I'd
misremembered, but exclusive per user.
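For anyone looking for the knob, a hedged slurm.conf sketch (partition and
node names are invented): ExclusiveUser=YES dedicates allocated nodes to a
single user at a time, so jobs from different users never share a node in
that partition.

PartitionName=peruser Nodes=node[01-04] ExclusiveUser=YES State=UP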
In any case, none of my few Fluent users run graphically on the HPC. They do
their pre- and post-processing on local workstations, copying their .cas.gz and
.dat.gz files to the HPC an
Dear Slurm users,
Is there a 'Job Size'-factor equivalent of the 'Job Age'-factor's
PriorityMaxAge parameter in Slurm?
What I am looking for is a scaling parameter (or threshold parameter) to
normalize the job size factor to a value of 1 for '10-core 1-hour' jobs instead
of for 1-core 1-mi
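Not an answer, but for context a hedged sketch of the slurm.conf knobs that
currently shape the size factor (the weight value is arbitrary); whether they
can emulate the threshold described above is exactly the open question:

PriorityType=priority/multifactor
# Weight applied to the normalized job-size factor.
PriorityWeightJobSize=1000
# NO means larger allocations receive the larger size factor.
PriorityFavorSmall=NO
# Base the size factor on job size divided by time limit.
PriorityFlags=SMALL_RELATIVE_TO_TIME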
On Thursday, 20 September 2018 5:57:56 PM AEST Mahmood Naderan wrote:
> It seems that when their fluent job crashes for some reason, or they
> decide to close the fluent window without terminating the job, or
> close the terminal suddenly, or ... the fluent processes remain on
> the node while t
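As general background rather than the rest of this reply, one common
mitigation for stray processes is cgroup-based tracking, so that Slurm can
reap everything a job spawned when the job ends, e.g. in slurm.conf:

ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup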
On Thursday, 20 September 2018 1:50:39 AM AEST Siddharth Dalmia wrote:
> Is it possible to have a separate timeout for interactive jobs? Or can
> someone help me come up with a hack to do this?
I believe you should be able to catch interactive jobs in the submit filter by
looking for the absence
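A hedged illustration of the distinction being relied on (12345 is a made-up
job id): jobs submitted with sbatch carry a batch script and show BatchFlag=1,
while srun/salloc jobs show BatchFlag=0, and it is that kind of absence a
submit filter can test for.

scontrol show job 12345 | grep -o 'BatchFlag=[01]'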
Hi,
We also have multiple partitions, but in addition we use a job submit
plugin to distinguish between srun/salloc and sbatch submissions. This
plugin forces a specific partition for interactive jobs (and the time limit
with it) and, using the license system, limits the number of simultaneous
int
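A hedged sketch of the slurm.conf side such a plugin could pair with (all
names and numbers are invented): a dedicated interactive partition with a
short time limit, plus a license pool the plugin can charge interactive jobs
against.

PartitionName=interactive Nodes=node[01-02] MaxTime=08:00:00 State=UP
Licenses=interactive:20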
Hi,
Since users are not experts with the cluster environment and Slurm, their
mistakes take up my time.
It seems that when their fluent job crashes for some reason, or they
decide to close the fluent window without terminating the job, or
close the terminal suddenly, or ... the fluent processes r
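As a hedged sketch (not something proposed in the thread so far) of the usual
epilog-based cleanup, assuming Epilog= in slurm.conf points at a script like
the one below: it kills a user's leftover processes once their last job on
the node has ended.

#!/bin/bash
# Hypothetical epilog sketch; Slurm exports SLURM_JOB_ID and SLURM_JOB_USER
# to the epilog environment.
[ -n "$SLURM_JOB_USER" ] || exit 0
# Never touch system accounts.
[ "$(id -u "$SLURM_JOB_USER")" -ge 1000 ] || exit 0
# If the user still has other jobs on this node, leave their processes alone.
other_jobs=$(squeue -h -u "$SLURM_JOB_USER" -w "$(hostname -s)" -o %A | grep -vc "^${SLURM_JOB_ID}$")
[ "$other_jobs" -gt 0 ] && exit 0
pkill -KILL -u "$SLURM_JOB_USER"
# Always exit 0: a non-zero epilog exit code would drain the node.
exit 0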
Hi Michael,
Sorry for the late response. Do you mean supplying --exclusive to the
srun command? Or do I have to do something else for the partitions? Currently
they use
srun -n 1 -c 6 --x11 -A monthly -p CAT --mem=32GB ./fluent.sh
where fluent.sh is
#!/bin/bash
# Clear Slurm's global task IDs so Fluent's bundled MPI does not get confused
# by the Slurm launch environment.
unset SLURM_GTIDS
/state/partition1/ansys
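If the suggestion was indeed the srun flag, a hedged illustration with the
same parameters as above and only --exclusive added (whether it suits the
partition setup here is the open question); at submission time the flag asks
for the allocated nodes not to be shared with other jobs:

srun --exclusive -n 1 -c 6 --x11 -A monthly -p CAT --mem=32GB ./fluent.sh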