On Thu, Sep 28, 2006 at 07:06:14PM +0100, Peter Wainwright wrote:
> What (in your opinion) is the right tradeoff between more cores,
> more processors and more individual compute nodes?

$/performance. Once your code is written in pure MPI form, you can
run it on any of the above alternatives. Then you simply work out the
price of the various configurations, make a guess at the performance
of each, and run a few benchmarks to check your guesses.

The general rules work like this:

* The more cores per node, the less performance per core, due to
  imperfect scaling plus the fact that you generally have only 1
  interconnect card per node.

* Most interconnects don't scale very well to more cores per node.
  For example, the "latency" number everyone quotes for interconnects
  is measured at 1 core/node; at 4 cores/node that number is much
  worse for most interconnects. (A sketch of the usual ping-pong
  measurement is at the end of this message.)

* More cores per node often means a higher price per core, although
  this varies: you buy less interconnect, but you pay more for
  fancier processors and motherboards.

We talk about a "sweet spot"; in my opinion that's still 2 dual-core
CPUs per node.

> However, I do not understand what happens when you have
> multi-processor/multi-core nodes in a cluster. Do you just use MPI
> (with each thread using its own non-shared memory) or is there any
> way to do "mixed-mode" programming which takes advantage of shared
> memory within a node (like, an MPI/OpenMP hybrid?).

The first is the easiest, and MPI already takes advantage of shared
memory within the node. The hybrid model (sketched below) is a lot
more work for the programmer and is often slower than pure MPI. It
also hurts interconnect performance, because you usually end up with
just 1 core driving the interconnect.
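To make the latency point concrete, here is a minimal, untested
sketch of the kind of ping-pong test those quoted latency numbers
come from, using only standard MPI calls. Run it once with 1 rank per
node and once with 4 ranks per node and compare. The pairing scheme
(rank i talks to rank i + size/2) is just one simple choice for
putting the two ends of each pair on different nodes; it assumes your
launcher fills nodes with consecutive ranks.

/* pingpong.c -- rough half-round-trip latency measurement.
 * Compile: mpicc -O2 -std=c99 pingpong.c -o pingpong
 * Run:     mpirun -np <N> ./pingpong   (N even)
 */
#include <mpi.h>
#include <stdio.h>

#define REPS 10000

int main(int argc, char **argv)
{
    int rank, size;
    char byte = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size % 2 != 0) {
        if (rank == 0)
            fprintf(stderr, "run with an even number of ranks\n");
        MPI_Finalize();
        return 1;
    }

    /* Pair rank i with rank i + size/2; the low half sends first. */
    int half = size / 2;
    int peer = (rank < half) ? rank + half : rank - half;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    for (int i = 0; i < REPS; i++) {
        if (rank < half) {
            MPI_Send(&byte, 1, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(&byte, 1, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
        }
    }

    double t1 = MPI_Wtime();
    if (rank == 0)
        printf("avg one-way latency: %.2f us\n",
               (t1 - t0) / REPS / 2.0 * 1e6);

    MPI_Finalize();
    return 0;
}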
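And for the curious, here is a minimal sketch of what the hybrid
model looks like: one MPI rank per node, OpenMP threads within the
node. This assumes an MPI library built with thread support and a
GCC-style toolchain for the compile line in the comment. The
MPI_THREAD_FUNNELED level means only the thread that initialized MPI
may make MPI calls, which is exactly how you end up with one core
driving the interconnect.

/* hybrid.c -- MPI across nodes, OpenMP within a node.
 * Compile: mpicc -fopenmp -O2 -std=c99 hybrid.c -o hybrid
 * Run:     OMP_NUM_THREADS=<cores> mpirun -np <nodes> ./hybrid
 */
#include <mpi.h>
#include <stdio.h>

#define N 1000000

int main(int argc, char **argv)
{
    int provided, rank;

    /* FUNNELED: only the main thread talks to MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    if (provided < MPI_THREAD_FUNNELED) {
        fprintf(stderr, "MPI library lacks thread support\n");
        MPI_Finalize();
        return 1;
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Threads share the node's memory: no messages inside the node. */
    double local = 0.0;
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < N; i++)
        local += (double)(i + rank);

    /* Back to a single thread for the off-node communication. */
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0,
               MPI_COMM_WORLD);
    if (rank == 0)
        printf("global sum = %g\n", total);

    MPI_Finalize();
    return 0;
}

The pure-MPI version of the same loop would just give every core its
own rank and let the library use shared memory for the intra-node
messages -- less code for you, and all the cores can drive the
interconnect.

--
greg
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf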