On Mon, 16 Feb 2009, Tiago Marques wrote:
> I must ask, doesn't anybody on this list run like 16 cores on two nodes well, for a code and job that completes like in a week?
For GROMACS and other MD programs, how well a job runs depends on many factors that define the simulation: the size of the molecular system, the force field in use, the cutoff distances, etc. Furthermore, what you call a job contains one very important variable - the number of MD steps - which can take the total runtime from seconds to months (or more). Asking for someone who runs under the same conditions as you probably means that he/she has already done the simulations you are about to begin, which from a scientific point of view means you would be better off investing your time in something else, as he/she would publish first ;-)
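To make that concrete, here is a rough back-of-envelope sketch in Python. The timestep, simulated time and per-step cost below are illustrative assumptions I made up for the example, not measurements from any real run:

# Total runtime is (number of MD steps) x (wall time per step).
timestep_fs = 2.0        # integration timestep in femtoseconds (assumed)
simulated_ns = 100.0     # total simulated time requested (assumed)
seconds_per_step = 0.01  # wall-clock cost of one step on some node (assumed)

n_steps = simulated_ns * 1e6 / timestep_fs   # 1 ns = 1e6 fs
wall_seconds = n_steps * seconds_per_step
print("%.0f steps -> %.1f days of wall time" % (n_steps, wall_seconds / 86400.0))
# 50,000,000 steps -> about 5.8 days; change simulated_ns or seconds_per_step
# and the wall time scales linearly, from seconds up to months.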
I have found several MD codes to scale rather poorly on clusters built from 8-core nodes, especially when those 8 cores come from 2 quad-core Intel CPUs; the poor scaling showed up even with InfiniBand (Mellanox ConnectX), so IB will not magically solve your problems. The setup that seemed to me like a good compromise was 4-core nodes, where those 4 cores come from 2 dual-core CPUs, combined with Myrinet or IB.
You have to understand that, the way most MD programs are written these days, MD simulations of small molecular systems are simply not going to scale: the communication dominates the total runtime. Communication through shared memory is still the best way to scale such a job, so having a node with as many cores as possible and running a job that uses all of them will probably give you the best performance.
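A toy strong-scaling model makes the point; this is my own back-of-envelope cost model, not the one used by any particular MD code, and the constants are invented. Once the job spills past one node and per-step communication gets more expensive, the speedup flattens or even drops - and the smaller the compute part, the earlier that happens:

# per-step time = compute / n_cores + comm_cost(n_cores), all units arbitrary
def step_time(n_cores, compute=1.0, comm_intra=0.01, comm_inter=0.10,
              cores_per_node=8):
    # cheap communication inside a node (shared memory), expensive across nodes
    comm = comm_intra if n_cores <= cores_per_node else comm_inter
    return compute / n_cores + comm * (n_cores - 1) ** 0.5  # assumed comm growth

base = step_time(1)
for n in (1, 2, 4, 8, 16, 32):
    print("%3d cores: speedup %5.2f" % (n, base / step_time(n)))
# Speedup is near-linear up to 8 cores, then collapses once the job spans two
# nodes; shrinking 'compute' (a small molecular system) makes it collapse sooner.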
-- 
Bogdan Costescu
IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
Phone: +49 6221 54 8240, Fax: +49 6221 54 8850
E-mail: bogdan.coste...@iwr.uni-heidelberg.de