> Perry E. Metzger wrote:
>> You realize that most big HPC systems are using interconnects that
>> don't generate many or any interrupts, right?
>
> Of course. Usually one even uses interrupt pacing/mitigation even in
> gig ethernet on a modern machine -- otherwise you're not going to get
> reasonable performance. (For 10Gig, you have to do even uglier
> tricks.)
What Greg is trying to say is that the high-speed interconnects used in
HPC do not raise interrupts at all. Data is delivered directly into
user space, and the app (or the communication library) busy-polls on
it, with no kernel/OS involvement. There is one app process per core
(usually bound to it, to improve locality on NUMA architectures). When
a daemon wakes up, it preempts a core, and the app process just has to
wait. If the app is tightly coupled, that delays everybody.
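To make the pattern concrete, here is a minimal sketch of that busy
poll in C. The names (rx_flag, rx_buffer) are invented for the example
and don't belong to any real interconnect API; a real communication
library would poll a completion queue instead:

  #include <stdint.h>

  /* The NIC DMAs the payload into rx_buffer, then sets rx_flag,
   * all from user space: no interrupt, no system call.          */
  static volatile uint32_t rx_flag;
  static char rx_buffer[4096];

  static char *wait_for_message(void)
  {
      while (rx_flag == 0)
          ;                 /* spin: the core is 100% busy here  */
      rx_flag = 0;          /* re-arm for the next message       */
      return rx_buffer;     /* payload is ready for the app      */
  }

A daemon that steals this core doesn't just slow one process down: it
stretches the spin, and every peer waiting on this rank stalls too.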
You can say that a daemon waking up every couple of hours is no big
deal. However, if these wakeups are uniformly distributed across a
couple thousand nodes, one will happen somewhere in the machine a
couple thousand times more often. You can solve this by gang-scheduling
the daemons across all the nodes, or you can turn them off.
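The back-of-envelope arithmetic, with illustrative numbers (a 2-hour
wakeup period and 2000 nodes, both made up for the example):

  #include <stdio.h>

  int main(void)
  {
      double period_s = 2.0 * 3600.0; /* one wakeup per node per 2 h */
      int nodes = 2000;
      /* Mean time between wakeups somewhere in the machine:
       * 7200 s / 2000 nodes = 3.6 s. A tightly coupled code doing
       * a collective every few seconds hits one almost every time. */
      printf("%.1f s between wakeups\n", period_s / nodes);
      return 0;
  }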
However, this only matters for large machines running tightly coupled
codes. For the majority of cases, it's just being anal.
Patrick