In message from Craig Tierney <craig.tier...@noaa.gov> (Tue, 11 Aug 2009 11:40:03 -0600):

Rahul Nabar wrote:
On Mon, Aug 10, 2009 at 12:48 PM, Bruno Coutinho <couti...@dcc.ufmg.br> wrote:

This is often caused by cache competition or memory bandwidth saturation. If it were cache competition, going from 4 to 6 threads would have made it worse. Since the code got faster with DDR3-1600 and much slower on the Xeon 5400, it is memory bandwidth bound.
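
A quick way to test that diagnosis is a STREAM-style triad loop: if aggregate MB/s stops rising as you add threads, the sockets' memory bandwidth is saturated. A rough OpenMP sketch (file name, array size, and build line are illustrative only, not tuned):

/* STREAM-triad-style bandwidth probe.
 * Build:  gcc -O2 -fopenmp triad.c -o triad
 * Run with OMP_NUM_THREADS=1,2,4,8 and compare the MB/s figures.
 */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N (64L * 1024 * 1024)  /* ~512 MB per array, far beyond any cache */

int main(void)
{
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));
    if (!a || !b || !c) return 1;

    /* First-touch initialization in parallel, so each thread's pages
     * land on its own NUMA node. */
#pragma omp parallel for
    for (long i = 0; i < N; i++) { a[i] = 0.0; b[i] = 1.0; c[i] = 2.0; }

    double t0 = omp_get_wtime();
#pragma omp parallel for
    for (long i = 0; i < N; i++)
        a[i] = b[i] + 3.0 * c[i];       /* triad: 2 loads + 1 store */
    double t1 = omp_get_wtime();

    /* 3 arrays of N doubles cross the memory bus per sweep. */
    double mbytes = 3.0 * N * sizeof(double) / 1e6;
    printf("%d threads: %.0f MB/s\n",
           omp_get_max_threads(), mbytes / (t1 - t0));
    return 0;
}

If the 8-thread figure is no better than the 4-thread figure, the cores are starved for memory no matter what affinity you set.
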
Tweaking CPU affinity to keep threads from jumping between cores will not help much, as the big bottleneck is memory bandwidth. For this code, CPU affinity will only help on NUMA machines, by keeping memory accesses in local memory. If the machine has enough bandwidth to feed the cores, it will scale.
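
For what it's worth, the pinning itself is only a few lines on Linux; a glibc-specific sketch (pinning to core 0 is just an example):

/* Pin the calling thread to one core so it cannot migrate away
 * from the memory it first-touched.  Linux/glibc specific.
 * Build: gcc -O2 pin.c -o pin
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

static int pin_to_core(int core)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(core, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask);  /* pid 0 = this thread */
}

int main(void)
{
    if (pin_to_core(0) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("pinned to core 0\n");
    return 0;
}

For an MPI code like VASP it is usually easier to let the launcher or numactl do this, e.g. numactl --cpunodebind=0 --membind=0 ./vasp, which binds both the CPUs and the memory of the process to node 0.
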
Exactly! But I thought that was the big advance with Nehalem: that it removed the CPU<->cache<->RAM bottleneck. So if the code scaled on the AMD Barcelona, it should continue to scale on Nehalem, right?

I'm posting a copy of my scaling plot here in case it helps:
http://dl.getdropbox.com/u/118481/nehalem_scaling.jpg

To remove as many confounding factors as possible, this particular Nehalem plot was produced with the following settings:

Hyperthreading OFF
24GB memory, i.e. 6 banks of 4GB (the optimum memory configuration)
X5550

Even if we attribute the bizarre performance of the 4-core case to the Turbo effect, what is most confusing is how the 8-core data point could be so much slower than the corresponding 8-core point on an old AMD Barcelona. Something's wrong here that I just do not understand. BTW, any other VASP users here? Anybody have any Nehalem experience?

Rahul,
What are you doing to ensure that you have both memory and processor
affinity enabled?
Craig
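
One way to answer that from inside a job is to read back the mask the kernel actually applied (the memory-policy side can be checked with numactl --show). A small sketch:

/* Print the cores the current process is allowed to run on, to
 * verify that processor affinity really took effect.
 * Build: gcc -O2 showmask.c -o showmask
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t mask;
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_getaffinity");
        return 1;
    }
    printf("allowed cores:");
    for (int i = 0; i < CPU_SETSIZE; i++)
        if (CPU_ISSET(i, &mask))
            printf(" %d", i);
    printf("\n");
    return 0;
}
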
As I mentioned in the "numactl & SuSE 11.1" thread, some kernels behave incorrectly on Nehalem (bad /sys/devices/system/node directory content). This bug is present, in particular, in the default OpenSuSE 11 kernels (2.6.27.7-9 and 2.6.29-6) and, as was written in the corresponding thread discussion, in the FC11 2.6.29 kernel.
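
That is easy to check from userland: on a healthy dual-socket Nehalem you should see node0 and node1 under /sys/devices/system/node, and numactl --hardware should agree. A trivial listing sketch:

/* List the NUMA nodes the kernel exposes.  On an affected kernel
 * the nodeN entries are wrong or missing.
 * Build: gcc -O2 nodes.c -o nodes
 */
#include <dirent.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    DIR *d = opendir("/sys/devices/system/node");
    if (!d) {
        perror("/sys/devices/system/node");
        return 1;
    }
    struct dirent *e;
    while ((e = readdir(d)) != NULL)
        if (strncmp(e->d_name, "node", 4) == 0)
            printf("%s\n", e->d_name);
    closedir(d);
    return 0;
}
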
I found that in that situation, disabling NUMA in the BIOS only increased STREAM throughput. Therefore I think this problem of Rahul's is not due to BIOS settings. Unfortunately, I have no data on VASP itself.

It would be interesting to know whether anybody has kernels that work correctly with Nehalem in the NUMA sense. AFAIK the older 2.6 kernels (from SuSE 10.3) work OK, but I haven't checked. Maybe an error in NUMA support is the reason for Rahul's problem?

Mikhail
--
üÔÏ ÓÏÏÂÝÅÎÉÅ ÂÙÌÏ ÐÒÏ×ÅÒÅÎÏ ÎÁ ÎÁÌÉÞÉÅ × ÎÅÍ ×ÉÒÕÓÏ×
É ÉÎÏÇÏ ÏÐÁÓÎÏÇÏ ÓÏÄÅÒÖÉÍÏÇÏ ÐÏÓÒÅÄÓÔ×ÏÍ
MailScanner, É ÍÙ ÎÁÄÅÅÍÓÑ
ÞÔÏ ÏÎÏ ÎÅ ÓÏÄÅÒÖÉÔ ×ÒÅÄÏÎÏÓÎÏÇÏ ËÏÄÁ.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf