On Thu, Apr 18, 2013 at 7:21 PM, Mark Hahn <h...@mcmaster.ca> wrote:
>> Only for benchmarking? We have done this for years on our production
>> clusters (and SGI provides a tool that does this and more to clean up
>> nodes). We have this in our epilogue so that we can clean out memory on
>> our diskless nodes so there is nothing stale sitting around that can
>> impact the next user's job.
>
> understood, but how did you decide that was actually a good thing?

Mark,

Because it stopped the random out-of-memory conditions that we were having.

> if two jobs with similar file reference patterns run, for instance,
> drop_caches will cause quite a bit of additional IO delay.

For our workloads, this is a highly unlikely scenario: nodes are not shared
and the workload is very diverse, so the chance of the next job having any
connection to the previous job is negligible.

Craig

> I guess the rationale would also be much clearer for certain workloads,
> such as big-data reduction jobs, where things like executables would have
> to be re-fetched, but presumably much larger input data might never be
> re-referenced by following jobs. it would have to be jobs that have a lot
> of intra- but not inter-job read-only file re-reference,
> and where clean-page scavenging is a noticeable cost.
>
> I'm guessing this may have been a much bigger deal on strongly NUMA
> machines of a certain era (high-memory ia64 SGI, older kernels).
>
> regards, mark.
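(For context, the mechanism under discussion is the Linux drop_caches
interface. A minimal epilogue sketch, assuming root on the compute node and
a shell-script epilogue of the Torque/SGE style, might look like the
following; this is illustrative only, not the poster's actual script:

    #!/bin/sh
    # Hypothetical epilogue sketch: flush dirty pages first so that
    # drop_caches only has clean data left to discard.
    sync
    # 1 = free pagecache, 2 = free reclaimable slab (dentries/inodes),
    # 3 = both. Only clean, reclaimable kernel caches are released;
    # application memory is never touched.
    echo 3 > /proc/sys/vm/drop_caches

Because only clean caches are dropped, the cost Mark describes is the extra
IO needed for a following job to re-read anything it would otherwise have
found already cached.)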