On Sun, Aug 9, 2009 at 9:34 PM, Gus Correa<[email protected]> wrote:
> See answers inline. Thanks!
> So it is to me.
> The good news is that according to all reports I read,
> hyperthreading in Nehalem works well

What I am more concerned about is its implications for benchmarking and schedulers.

(a) I am seeing strange scaling behaviour with Nehalem cores. e.g. a specific DFT (Density Functional Theory) code we use maxes out at 2 or 4 cores instead of 8, i.e. runs on 8 cores are actually slower than runs on 2 or 4 cores (depending on setup). That just doesn't make sense to me; we must be doing something wrong. And no, it isn't simply bad parallelization of this code, since we have run it on AMDs and there performance does increase with core count on a single server.

(b) We usually set up Torque/PBS/Maui to also allow partial-server requests, i.e. somebody can ask for just 4 cores on a server. The other four cores can go to another job or stay empty. The question is: with hyperthreading, isn't this compartmentalization lost? Could userA, who got 4 cores, end up leeching off the other 4 cores too? Or am I wrong?

> Which MPI do you use?

OpenMPI.

> IIRR, you have Gigabit Ethernet, right? (not Infiniband)

Yes, that's right. No Infiniband.

> If you use OpenMPI, you can set the processor affinity,
> i.e. bind each MPI process to one "processor" (which was once
> a CPU, then became a core, and now is probably a virtual
> processor associated to the hyperthreaded Nehalem core).
> In my experience (and other people's also) this improves
> performance.

Yup, good point. I have done this with Barcelonas (AMD) and got a 5% boost. Let me try it with the Nehalems too.

> It is possible that this is the result of not setting
> processor affinity.
> The Linux scheduler may not switch processes
> across cores/processors efficiently.

So let me double-check my understanding: on a Nehalem, is setting processor affinity akin to disabling hyperthreading, or are these two independent concepts?
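For what it's worth, a sketch of how I'd wire the affinity flag into a partial-node Torque request. The application name and core counts are placeholders, and mpi_paffinity_alone is the Open MPI 1.2/1.3-era MCA parameter, so adjust for your version:

```shell
#!/bin/bash
# Hypothetical Torque/PBS job script sketch -- ./dft_app and the
# core counts are placeholders, not a real application.
#PBS -l nodes=1:ppn=4        # partial-server request: 4 cores on one node
#PBS -l walltime=01:00:00

cd "$PBS_O_WORKDIR"

# Show which logical CPU IDs are hyperthread siblings of each other;
# on a hyperthreaded Nehalem each physical core lists two IDs (e.g. "0,8").
cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | sort -u

# Bind each MPI rank to one processor so the Linux scheduler
# doesn't migrate ranks between cores/hardware threads.
mpirun --mca mpi_paffinity_alone 1 -np 4 ./dft_app
```

Note that binding ranks this way pins each process to a logical processor but does not disable hyperthreading itself; the sibling hardware threads still exist and the OS can still schedule other work on them.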
> (Not sure you actually have 24GB or 16GB, though.
> You didn't say how much memory you bought.)

I am running two tests: machineA has 24 GB and machineB has 16 GB. But other things change too: machineA has the X5550 whereas machineB has the E5520. I'll post the results once I have them for the Nehalems!

Thanks again, Gus. All very helpful.

--
Rahul
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
