On Mon, Apr 22, 2013 at 11:40 AM, Mark Hahn <h...@mcmaster.ca> wrote:

>>> understood, but how did you decide that was actually a good thing?
>>
>> Mark,
>>
>> Because it stopped the random out of memory conditions that we were
>> having.
>
> aha, so basically "rebooting windows resolves my performance problems" ;)


Not really.  We are saying "we know better than you, flush your buffers".
Maybe in a perfect world we bring some kernel engineers in and make sure
that the OOM killer and other memory subsystem controllers work as we
desire when there is no swap.  That isn't something we have resources to
do.  While we figured this out on our in-house white-box clusters, it is
also needed on the more "supported" SGI ICE system.
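
For readers who want something concrete, here is a minimal sketch of what
"flush your buffers" could look like on a Linux node. It assumes the flush is
done through the standard /proc/sys/vm/drop_caches interface, which this
message does not actually name, and the function names are hypothetical, not
taken from our scripts. It reads /proc/meminfo so the effect can be compared
before and after.

```python
import os

MEMINFO_PATH = "/proc/meminfo"
DROP_CACHES_PATH = "/proc/sys/vm/drop_caches"  # assumed mechanism, see note above


def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values (mostly kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_kb(fields):
    """Page cache plus buffers: the memory a cache drop is meant to release."""
    return fields.get("Cached", 0) + fields.get("Buffers", 0)


def drop_caches(path=DROP_CACHES_PATH):
    """Write back dirty pages, then ask the kernel to drop clean caches.

    Writing "3" requests that page cache, dentries and inodes be freed.
    Needs root when pointed at the real /proc file.
    """
    os.sync()
    with open(path, "w") as f:
        f.write("3\n")


def main():
    """Report cache size, drop caches, and report again."""
    with open(MEMINFO_PATH) as f:
        before = cache_kb(parse_meminfo(f.read()))
    drop_caches()
    with open(MEMINFO_PATH) as f:
        after = cache_kb(parse_meminfo(f.read()))
    print("cache + buffers: %d kB -> %d kB" % (before, after))
```

Calling main() as root between jobs (for example from a scheduler epilogue,
which is also an assumption here) would show how much cache was released.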


>>> I'm guessing this may have been a much bigger deal on strongly NUMA
>>> machines of a certain era (high-memory ia64 SGI, older kernels).
>
> and the situation you're referring to was actually on Altix, right?
> (therefore not necessarily a good idea with current machines and kernels.)
>
No, this is on two-socket Intel x86_64 systems: standard cluster nodes
running IB, Lustre, and RHEL6 (we did the same thing in the past on
RHEL5).

Craig
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
