Hi,
I cannot figure out why the following MPI script failed to start.
[siadati@rocks7 ~]$ sacctmgr list association format=partition,account,user,grptres | grep siadati
othersem1siadati cpu=6,mem=8G
[siadati@rocks7 ~]$ cat slurm_script.sh
#!/bin/bash
#SBATCH --output=test.out
#SBA
> On 30 Apr 2018, at 22:37, Nate Coraor wrote:
>
> Hi Shawn,
>
> I'm wondering if you're still seeing this. I've recently enabled task/cgroup
> on 17.11.5 running on CentOS 7 and just discovered that jobs are escaping
> their cgroups. For me this is resulting in a lot of jobs ending in
> OU
Never mind - it appears to happen when Puppet runs. I have no hand in that,
so I'll kick it to those admins and report back with what I find.
I ruled out Slurm by simply creating a non-Slurm cgroup, with e.g.
`cgcreate -g memory:test`, and that cgroup also disappeared unexpectedly.
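The check described above could be scripted roughly as follows. This is only a sketch: the `watch_cgroup` helper, the poll interval, and the path `/sys/fs/cgroup/memory/test` are assumptions for illustration (the path is where a cgroup v1 memory controller is commonly mounted), not details taken from the original message.

```shell
#!/bin/bash
# watch_cgroup DIR [INTERVAL] [MAX_CHECKS]   (hypothetical helper)
# Poll DIR until it disappears and print a timestamp when it does.
# Returns 0 if the directory vanished, 1 if it survived MAX_CHECKS polls.
watch_cgroup() {
    local dir=$1 interval=${2:-5} max_checks=${3:-0} n=0
    while [ -d "$dir" ]; do
        n=$((n + 1))
        # max_checks=0 means keep polling until the directory is gone
        if [ "$max_checks" -gt 0 ] && [ "$n" -ge "$max_checks" ]; then
            return 1
        fi
        sleep "$interval"
    done
    echo "$(date '+%F %T') $dir is gone"
    return 0
}

# Example usage (as root), assuming cgroup v1 mounted under /sys/fs/cgroup:
#   cgcreate -g memory:test
#   watch_cgroup /sys/fs/cgroup/memory/test 5
```

The printed timestamp can then be compared against the times of other periodic jobs on the node, such as a configuration-management run.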
--nate
On Mon,
Hi Shawn,
I'm wondering if you're still seeing this. I've recently enabled
task/cgroup on 17.11.5 running on CentOS 7 and just discovered that jobs
are escaping their cgroups. For me this is resulting in a lot of jobs
ending in OUT_OF_MEMORY that shouldn't, because it appears slurmd thinks
the oom
Hi,
Unfortunately that can't be a solution in my running production environment for
a number of reasons. I did consider it (
Thanks!
-John
On 4/30/18, 2:40 AM, "slurm-users on behalf of Bjørn-Helge Mevik"
wrote:
"Roberts, John E." writes:
> So now the issue remains on why I c
Hi Ole,
Ole Holm Nielsen writes:
> Hi Loris,
>
> On 04/30/2018 01:09 PM, Loris Bennett wrote:
>> Your example of how to use 'Organisation' to setup separate groups
>> within one department is illuminating. However, I am still unable to
>> set up 'geochemie' as a sibling of 'geophysik' and a chi
Hi Loris,
On 04/30/2018 01:09 PM, Loris Bennett wrote:
Your example of how to use 'Organisation' to setup separate groups
within one department is illuminating. However, I am still unable to
set up 'geochemie' as a sibling of 'geophysik' and a child of 'geowiss':
$ sacctmgr list acc where a
Hi Ole,
Ole Holm Nielsen writes:
> Hi Loris,
>
> On 04/30/2018 10:12 AM, Loris Bennett wrote:
>> Thanks, I should have spotted that, although I don't understand the
>> difference between 'parent' and 'organisation' and in fact asked this
>> question:
>>
>> https://groups.google.com/forum/#!top
Hi Loris,
On 04/30/2018 10:12 AM, Loris Bennett wrote:
Thanks, I should have spotted that, although I don't understand the
difference between 'parent' and 'organisation' and in fact asked this
question:
https://groups.google.com/forum/#!topic/slurm-users/f1vftgIRcVk
on the subject recently.
Hi Simon,
Simon Flood writes:
> Hi Loris,
>
> On 27/04/18 13:46, Loris Bennett wrote:
>> Hi,
>>
>> If I dump my account structure with sacctmgr, I get
>>
>>Parent - 'geowiss'
>>Account - 'geochemie':Fairshare=2
>>Account -
>> 'geographie':Description='geographie':Organization='geowi
Hi Paul,
Thanks for the reply. I just went over the backfill options again.
It looks reasonable that after a certain number of jobs in the first cluster,
the other one doesn't even get tested, since there are too many jobs to backfill
in the first cluster.
I will try to look at the partition_job_dep
"Roberts, John E." writes:
> So now the issue remains on why I can’t use decimals to bill for time…
As a workaround, perhaps you can just scale up all numbers so you get integers.
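A minimal sketch of that scaling, assuming a set of decimal weights that share a common factor. The `scale_weights` helper and the weight names and values below are hypothetical and not taken from the thread.

```shell
#!/bin/bash
# scale_weights FACTOR NAME=VALUE...   (hypothetical helper)
# Multiply every decimal weight by FACTOR, round to the nearest integer,
# and print the result as a comma-separated NAME=INTEGER list.
scale_weights() {
    local factor=$1; shift
    local out="" pair name value scaled
    for pair in "$@"; do
        name=${pair%%=*}     # text before the first '='
        value=${pair#*=}     # text after the first '='
        scaled=$(awk -v v="$value" -v f="$factor" 'BEGIN { printf "%d", v * f + 0.5 }')
        out="${out:+$out,}$name=$scaled"
    done
    echo "$out"
}

# Illustrative weights only
scale_weights 100 CPU=0.25 Mem=0.5 GRES/gpu=2.75
```

Because every value is multiplied by the same factor, the ratios between the weights are unchanged; any total computed from them simply comes out larger by that factor.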
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo