Registration for the 2018 Slurm User Group Meeting is open. You can
register at https://slug18.eventbrite.com
The meeting will be held on 25-26 September 2018 in Madrid, Spain, at CIEMAT.
- *Early registration*
  - May 29 through July 2
  - $300 USD
- *Standard registration*
  - J
Slurm User Group Meeting 2018
25-26 September 2018
Madrid, Spain
You are invited to submit an abstract of a tutorial, technical presentation
or site report to be given at the Slurm User Group Meeting 2018. This event
is sponsored and organized by CIEMAT and SchedMD. This international event
is open
One thing that seems concerning to me is that you may start a job on a
node before a currently running job has 'expanded' as much as it will.
If there is 128G on the node and the current job is using 64G but will
eventually use 112G, your approach could start another similar job and
they would both
On 05/25/2018 11:19 AM, Will Dennis wrote:
> Not yet time for us... There are problems with U18.04 that render it unusable
> for our environment.
What problems have you run into with 18.04?
Thanks for your input; the automatic reporting is definitely a great idea and
seems easy to implement in Slurm. At our site we have a web portal, developed
internally, where users can see in real time everything that is happening on the
cluster and every metric of their own job. There is especia
John Hearns writes:
> Alexandre, you have made a very good point here. "Oftentimes users only input
> 1G as they really have no idea of the memory requirements,"
> At my last job we introduced cgroups (this was in PBSPro). We had to enforce
> a minimum request for memory.
> Users then asked us
Alexandre, you have made a very good point here. "Oftentimes users only
input 1G as they really have no idea of the memory requirements."
At my last job we introduced cgroups (this was in PBSPro). We had to
enforce a minimum request for memory.
Users then asked us how much memory their jobs use.
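As a minimal sketch of how users can answer that themselves once a job has
finished (the job ID 12345 is only a placeholder, and this assumes job
accounting is enabled), sacct can print the requested memory next to the peak
resident set size:

# Placeholder job ID; ReqMem and MaxRSS come from Slurm's accounting records
sacct -j 12345 --format=JobID,JobName,ReqMem,MaxRSS,State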
Hello John, this behavior is needed because the memory usage of the codes
executed on the nodes is particularly hard to guess. Usually, when the request
is exceeded, actual usage is between 1.1 and 1.3 times what was expected;
sometimes it is much larger.
A) Indeed there is a partition running only exclusive jobs, but
Also regarding memory, there are system tunings you can set for the
behaviour of the Out-Of-Memory (OOM) killer and also for VM overcommit.
I have seen the VM overcommit parameters discussed elsewhere, and for HPC
the general advice is to disable overcommit:
https://www.suse.com/support/kb/doc/?id=
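As a hedged sketch of the kind of tuning meant above (the values below are
assumptions for illustration, not recommendations taken from the linked
article), disabling overcommit comes down to two sysctls:

# Illustrative values only; tune vm.overcommit_ratio to the site's needs
sysctl -w vm.overcommit_memory=2   # never overcommit: enforce the commit limit
sysctl -w vm.overcommit_ratio=95   # commit limit = swap + 95% of physical RAM

The same keys can be placed in /etc/sysctl.conf (or a drop-in under
/etc/sysctl.d/) to keep the setting across reboots.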
Alexandre, it would be helpful if you could say why this behaviour is
desirable.
For instance, do you have codes which need a large amount of memory, and
your users are seeing that these codes are crashing because other codes
running on the same nodes are using memory?
I have two thoughts:
A) en
Hi,
When I submit the following script, I receive a job ID. However, the job
doesn't show up in squeue. Moreover, there is no log file as I specified in the
script:
hamid@rocks7:scripts$ cat slurm_script.sh
#!/bin/bash
#SBATCH --job-name=hvacSteadyFoam
#SBATCH --output=hvacSteadyFoam.log
#SBATCH --nta
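As a rough first check (the job ID 12345 is a placeholder, and this assumes
accounting is configured), the accounting record usually shows whether the job
failed right after starting, which would also explain the missing log file:

sacct -j 12345 --format=JobID,State,ExitCode,Elapsed
scontrol show job 12345   # only works while Slurm still keeps the job record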
Hi,
in the cluster where I'm deploying Slurm, job allocation has to be based on
the actual free memory available on the node, not just the memory allocated by
Slurm. This is non-negotiable, and I understand that it's not how Slurm is
designed to work, but I'm trying anyway.
Among the solutions that
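As one hedged building block for that kind of approach (node001 is a
placeholder hostname): Slurm already samples each node's real free memory and
reports it next to the configured size, which an external tool could consult
before deciding where to place a job:

# %e = free memory reported by the node's OS, %m = configured RealMemory (MB)
sinfo -N -o "%N %e %m"
scontrol show node node001 | grep -E "RealMemory|AllocMem|FreeMem"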