On 31/07/15 04:51, Tom Harvill wrote:

> Thank you for your reply. Yes, it's 'bad' code. It's WRF mostly.

We've also seen this same issue with NAMD, where rank 0 uses more memory as it's tracking all the information about other ranks in order to load balance correctly.

Admittedly this was on BlueGene/Q, where you're running thousands of ranks and each node only has 1GB/core, so if you're running 16 ranks per node you can hit that 1GB limit easily.

The solution there was (IIRC) to guide the user to use the SMP build so they could run 1 rank per node and multithread on the node instead.

cheers,
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf