think that that is the better way.
> If you have e.g. tasks with different memory needs, Slurm (or the
> oom_killer to be precise) would kill the job, if that limit gets exceeded.
> If the limit is set for the step, the tasks can "steal" memory from each
> other.
>
>
>
Hello everyone,
I came across a weird behavior and was wondering if this is a bug,
oversight, or intended?
It appears that Slurm does not set memory.limit_in_bytes at the task level,
but it does set it at the step level and above. Observe:
$ grep memory /proc/$$/cgroup
10:memory:/slurm/uid_2001/
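To see which level of the hierarchy actually carries a limit, one can walk from the process's memory cgroup up to the root and print memory.limit_in_bytes at each level. The following is only a minimal sketch: it assumes cgroup v1 with the memory controller mounted at /sys/fs/cgroup/memory, and the function names are made up for illustration.

```python
from pathlib import Path
from typing import Iterator, Optional, Tuple


def memory_cgroup_path(cgroup_text: str) -> Optional[str]:
    """Return the cgroup v1 path of the memory controller, or None."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:path".
        parts = line.strip().split(":", 2)
        if len(parts) == 3 and "memory" in parts[1].split(","):
            return parts[2]
    return None


def limits_up_the_tree(
    cgroup_path: str,
    root: Path = Path("/sys/fs/cgroup/memory"),  # assumed v1 mount point
) -> Iterator[Tuple[str, int]]:
    """Yield (directory, limit) for cgroup_path and each ancestor up to root."""
    current = root / cgroup_path.lstrip("/")
    while True:
        limit_file = current / "memory.limit_in_bytes"
        if limit_file.is_file():
            yield str(current), int(limit_file.read_text())
        if current == root:
            break
        current = current.parent


if __name__ == "__main__":
    proc_file = Path("/proc/self/cgroup")
    path = memory_cgroup_path(proc_file.read_text()) if proc_file.is_file() else None
    if path is None:
        print("no cgroup v1 memory controller found")
    else:
        for directory, limit in limits_up_the_tree(path):
            print(directory, limit)
```

Run inside a job step, this prints one line per cgroup level, so a level that only inherits the default (very large) limit stands out from the levels where a limit was set explicitly.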
We have a decent number of associations in our Slurm database (several
thousand), and sometimes the sacctmgr command is a bit finicky. We've found
it to claim it added an association before but actually didn't when the
load was high enough. We have API scripts that rely on associations being
added
Dear Slurm Community,
We recognize that the SlurmdTimeout has a default value of 300 seconds, and
that if the controller is unable to communicate with a node during that
time it will mark it down. We have two questions regarding this:
1. Won't individual compute nodes also kill their own jobs if
ty of Kentucky
jacob.chapp...@uky.edu
On Wed, Oct 30, 2019 at 2:28 PM Paul Edmon wrote:
> All the aggregate historic data should be accessible via sacct. sstat is
> for live jobs but sacct is for completed jobs.
>
> -Paul Edmon-
> On 10/30/2019 2:13 PM, Jacob Chappell wrote:
Is there a simple way to store sstat information permanently on job
completion? We already have job accounting on, but the information
collected from cgroups doesn't seem to be stored once a job finishes (sstat
-j $JOB_ID on a dead job returns an error).
Thanks,
___
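For reference, the memory high-water marks of a finished job can be pulled from the accounting records with sacct. The sketch below assumes job accounting is already being stored; the field list is only one possible choice, the helper names are hypothetical, and the job ID in the usage block is a placeholder.

```python
import subprocess
from typing import Dict, List

# Accounting fields to request; adjust to taste.
FIELDS = ["JobID", "State", "Elapsed", "MaxRSS", "AveRSS", "MaxVMSize"]


def sacct_command(job_id: str) -> List[str]:
    """Build an sacct call that prints one '|'-separated row per job step."""
    return [
        "sacct",
        "-j", job_id,
        "--format=" + ",".join(FIELDS),
        "--parsable2",   # '|' delimiter, no trailing delimiter
        "--noheader",
    ]


def parse_sacct(output: str) -> List[Dict[str, str]]:
    """Turn --parsable2 output into a list of {field: value} dicts."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        rows.append(dict(zip(FIELDS, line.split("|"))))
    return rows


if __name__ == "__main__":
    job_id = "1234"  # placeholder job ID
    try:
        result = subprocess.run(
            sacct_command(job_id), capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print("sacct not available or failed:", exc)
    else:
        for row in parse_sacct(result.stdout):
            print(row)
```

The memory fields are usually empty on the job's own row and filled in on the step rows (for example the batch step), so the per-step rows are the ones to look at.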
On Tue, May 28, 2019 at 12:56 PM Paddy Doyle wrote:
> Hi Jacob,
>
> On Tue, May 28, 2019 at 11:38:23AM -0400, Jacob Chappell wrote:
>
> > Hello all,
> >
> > Is it possible in Slurm to check RawUsage against GrpTRESMins and prevent a
> > job from bein
Hello all,
Is it possible in Slurm to check RawUsage against GrpTRESMins and prevent a
job from being submitted if the RawUsage exceeds the GrpTRESMins? My center
needs this feature for detailed accounting constraints. The RawUsage is
important to us because we weight certain resource types and wa
Hi all,
It seems that "raw usage," i.e. what "sshare" shows, reflects TRES
minutes on the whole. With sacctmgr, I can configure GrpTRESMins to set
limits on TRES at the individual TRES level. However, is it possible to set
limits irrespective of the individual TRES? I.e., I'd like to do someth
bring it all back online. Obviously
> you could have issues in the future when/if the database schema changes and
> Slurm tries to auto-upgrade your extant database.
>
>
>
>
>
> > On Dec 10, 2018, at 11:33 AM, Jacob Chappell wrote:
> >
> > Hi all,
Hi all,
We've come across an issue recently with Slurm account names. Our center
uses fairly long Slurm account names, as they record various important
pieces of information about the account such as the user's unique id,
department, project name, etc. Consequently, our account naming structure
is
's QOS's UsageFactor).
>
>
>
>
>
> > On Dec 5, 2018, at 11:33 AM, Jacob Chappell wrote:
> >
> > Hi All,
> >
> > Can you set TRESBillingWeights on something besides partitions
> > (specifically nodes)? I might want to bill higher for
Hi All,
Can you set TRESBillingWeights on something besides partitions (specifically
nodes)? I might want to bill higher for nodes with faster CPUs but also
have a global partition with all nodes in it.
Thanks,
__
*Jacob D. Chappell, CSM*
*Research Com
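As background for the question, here is a rough sketch of the arithmetic behind billing weights: by default, as far as I understand the documentation, the billable value of a job is the sum of each allocated TRES multiplied by its weight. The weights and TRES names below are invented for illustration and do not come from any real configuration.

```python
from typing import Dict


def billable_tres(allocated: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum of allocated amount times weight; TRES without a weight count as 0."""
    return sum(amount * weights.get(name, 0.0) for name, amount in allocated.items())


if __name__ == "__main__":
    # Hypothetical weights for two partitions sharing the same job shape.
    job = {"cpu": 4, "mem_gb": 16}
    standard = {"cpu": 1.0, "mem_gb": 0.25}
    fast_cpu = {"cpu": 2.0, "mem_gb": 0.25}
    print("standard partition:", billable_tres(job, standard))
    print("fast-CPU partition:", billable_tres(job, fast_cpu))
```

With per-partition weights like these, billing faster CPUs at a higher rate means placing those nodes in a partition of their own with a larger CPU weight.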
ec 10, 2017 at 7:53 PM, Marcin Stolarek wrote:
> You can use a slurmctld prolog (PrologSlurmctld) to save it the way you want.
>
> cheers,
> Marcin
>
>
>
> 2017-11-30 0:25 GMT+01:00 Chris Samuel :
>
>> On 30/11/17 8:57 am, Jacob Chappell wrote:
>>
>> Using "s
All,
Using "scontrol show jobid X" I can see info about running jobs, including
the command used to launch the job, the user's working directory, values of
stdout, stdin, stderr, etc. With Slurm accounting configured, sacct seems
to show *some* of this information about jobs that have completed. H