On 31/07/15 04:51, Tom Harvill wrote:

> Thank you for your reply.  Yes, it's 'bad' code.  It's WRF mostly.

We've also seen this same issue with NAMD, where rank 0 uses more memory
than the other ranks because it tracks information about all of them in
order to load balance correctly.  Admittedly this was on BlueGene/Q, where
you're running thousands of ranks and each node only has 1GB/core, so if
you're running 16 ranks per node you can hit that 1GB limit easily.

The solution there was (IIRC) to guide the user to use the SMP build so
they could run 1 rank per node and multithread on the node instead.
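
As a rough illustration of the arithmetic (a sketch only: the 16GB node
size follows from 1GB/core on a 16-core node, the 2.5GB rank 0 footprint
is a made-up figure, and an even split of node memory between ranks is an
assumption that ignores OS overhead):

```python
NODE_MEMORY_GB = 16.0   # 16 cores at 1GB/core, as described above
RANK0_NEED_GB = 2.5     # hypothetical footprint for a heavy rank 0


def memory_per_rank_gb(node_memory_gb: float, ranks_per_node: int) -> float:
    """Even share of node memory available to each MPI rank (assumed split)."""
    if ranks_per_node < 1:
        raise ValueError("ranks_per_node must be at least 1")
    return node_memory_gb / ranks_per_node


def rank_fits(need_gb: float, node_memory_gb: float, ranks_per_node: int) -> bool:
    """True if a rank needing need_gb fits inside its per-rank share."""
    return need_gb <= memory_per_rank_gb(node_memory_gb, ranks_per_node)


if __name__ == "__main__":
    for ranks in (16, 1):
        share = memory_per_rank_gb(NODE_MEMORY_GB, ranks)
        ok = rank_fits(RANK0_NEED_GB, NODE_MEMORY_GB, ranks)
        print(f"{ranks:2d} ranks/node: {share:.1f}GB per rank, rank 0 fits: {ok}")
```

With 1 rank per node the whole node's memory is available to rank 0, and
the remaining cores are used by threads instead of extra ranks.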

cheers,
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf