Afternoon

I was involved in the design, procurement, and initial setup of a 1936-CPU SGI Altix 3700 Bx2 cluster built from 32-processor nodes, with NUMAlink serving as both the shared-memory fabric within a node and the cluster interconnect.

The machine mostly works as expected, and you can treat it as a standard Beowulf cluster... except for the queue and scheduler software. Your scheduler really needs to be NUMA-aware (i.e. it knows the topology of your interconnect and of the shared memory within each node, and tries to keep a job's processes as close together as possible), and with such large nodes it also needs to be able to use cpusets to lock MPI processes to specific CPU/memory sets. Without this, threads migrate, pages get sprayed all over memory, and performance goes out the window.
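The pinning idea can be sketched with Linux's per-process affinity interface (a simpler mechanism than full cpusets, which also partition memory nodes; this is just an illustrative sketch, and the `pin_to_cpus` helper is hypothetical, not part of any scheduler mentioned above):

```python
import os

def pin_to_cpus(pid, cpus):
    """Pin a process to a fixed set of CPUs so the kernel cannot
    migrate its threads away; on a NUMA machine this also keeps the
    pages it allocates close to the CPUs that touch them (under the
    default first-touch placement policy)."""
    os.sched_setaffinity(pid, cpus)
    # Return the mask actually in effect, so the caller can verify it.
    return os.sched_getaffinity(pid)

# Pin the current process (pid 0 means "self") to CPU 0 only.
print(pin_to_cpus(0, {0}))
```

A NUMA-aware launcher would do something like this for each MPI rank, choosing CPU sets that sit on the same node (or adjacent NUMAlink hops) rather than scattering ranks across the machine.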

We were lucky: one of my colleagues maintains a heavily modified OpenPBS that is NUMA-aware, and another rewrote SGI's mpirun to place MPI processes into cpusets. As a result, users get excellent performance and reliable run times, which matters in our environment because users are expected to request the walltime their jobs will actually run for.

Stu.


On 21/09/2006, at 22:59, Clements, Brent M (SAIC) wrote:

Out of my own curiosity, would those of you who have dealt with current/next-generation Intel-based NUMA systems give me your opinions on why you would or would not buy or use them as cluster nodes?

I'm looking for primarily technical opinions.

Thanks!

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


--
Dr Stuart Midgley
[EMAIL PROTECTED]


