Re: [Beowulf] bizarre scaling behavior on a Nehalem

2009-08-10 Thread Rahul Nabar
On Mon, Aug 10, 2009 at 12:48 PM, Bruno Coutinho wrote: > This is often caused by cache competition or memory bandwidth saturation. > If it was cache competition, rising from 4 to 6 threads would make it worse. > As the code became faster with DDR3-1600 and much slower with Xeon 5400, > this code i

RE: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Tom Elken
> Well, as there are only 8 "real" cores, running a computationally > intensive process across 16 should *definitely* do worse than across 8. Not typically. At the SPEC website there are quite a few SPEC MPI2007 (which is an average across 13 HPC applications) results on Nehalem. Summary: IBM,

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Craig Tierney
Joshua Baker-LePain wrote: > On Mon, 10 Aug 2009 at 11:43am, Rahul Nabar wrote > >> On Mon, Aug 10, 2009 at 7:41 AM, Mark Hahn wrote: (a) I am seeing strange scaling behaviours with Nehalem cores. e.g. A specific DFT (Density Functional Theory) code we use is maxing out performance at

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Bill Broadley
Joshua Baker-LePain wrote: > Well, as there are only 8 "real" cores, running a computationally > intensive process across 16 should *definitely* do worse than across 8. I've seen many cases where that isn't true. The P4 rarely justified turning on HT because throughput would often be lower. Wit

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Joshua Baker-LePain
On Mon, 10 Aug 2009 at 3:02pm, Rahul Nabar wrote On Mon, Aug 10, 2009 at 2:09 PM, Joshua Baker-LePain wrote: Well, as there are only 8 "real" cores, running a computationally intensive process across 16 should *definitely* do worse than across 8. However, it's not so surprising that you're seei

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Rahul Nabar
On Mon, Aug 10, 2009 at 2:09 PM, Joshua Baker-LePain wrote: > Well, as there are only 8 "real" cores, running a computationally intensive > process across 16 should *definitely* do worse than across 8. However, it's > not so surprising that you're seeing peak performance with 2-4 threads. >  Nehale

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Gus Correa
Joshua Baker-LePain wrote: On Mon, 10 Aug 2009 at 11:43am, Rahul Nabar wrote On Mon, Aug 10, 2009 at 7:41 AM, Mark Hahn wrote: (a) I am seeing strange scaling behaviours with Nehalem cores. e.g. A specific DFT (Density Functional Theory) code we use is maxing out performance at 2, 4 cpus instead

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Joshua Baker-LePain
On Mon, 10 Aug 2009 at 11:43am, Rahul Nabar wrote On Mon, Aug 10, 2009 at 7:41 AM, Mark Hahn wrote: (a) I am seeing strange scaling behaviours with Nehalem cores. e.g. A specific DFT (Density Functional Theory) code we use is maxing out performance at 2, 4 cpus instead of 8. i.e. runs on 8 cores a

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Renato Callado Borges
On Mon, Aug 10, 2009 at 08:33:27AM -0400, Mark Hahn wrote: >> Is there a way of finding out within Linux if Hyperthreading is on or >> not? > > in /proc/cpuinfo, I believe it's as simple as siblings > cpu cores. > that is, I'm guessing one of your nehalem's shows as having 8 siblings > and 4 cpu cor

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Mark Hahn
this is on the machine which reports 16 cores, right? I'm guessing that the kernel is compiled without numa and/or ht, so enumerates virtual cpus first. that would mean that when otherwise idle, a 2-core proc will get virtual cores within the same physical core. and that your 8c test is merely
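
To see which logical CPUs share a physical core (the situation guessed at above), one option is to group the entries of /proc/cpuinfo by their "physical id" and "core id" fields. The following is only an illustrative Python sketch, not something posted in the thread; the function name core_map is made up for this example, and it assumes the kernel exposes the "processor", "physical id" and "core id" fields.

```python
from collections import defaultdict


def core_map(cpuinfo_text):
    """Map (physical id, core id) -> list of logical CPU numbers.

    cpuinfo_text is the content of /proc/cpuinfo. Entries lacking any of
    the three fields are skipped (assumption: an x86 Linux kernel that
    reports topology information).
    """
    cores = defaultdict(list)
    # /proc/cpuinfo separates logical CPUs with blank lines.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if {"processor", "physical id", "core id"} <= fields.keys():
            key = (int(fields["physical id"]), int(fields["core id"]))
            cores[key].append(int(fields["processor"]))
    return dict(cores)
```

On a Linux machine this could be called as core_map(open("/proc/cpuinfo").read()). Any core listed with two logical CPUs has Hyperthreading siblings, and the numbering shows whether low-numbered CPUs land on the same physical core.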

Re: [Beowulf] numactl & SuSE11.1

2009-08-10 Thread Mikhail Kuzminsky
I'm sorry for my mistake: the problem is on Nehalem Xeon under SuSE 11.1, but w/kernel 2.6.27.7-9 (w/Supermicro X8DT mobo). For Opteron 2350 w/SuSE 10.3 (w/ the older 2.6.22.5-31 kernel; I erroneously inserted this string in my previous message) numactl works OK (w/Tyan mobo). NUMA is enabled in BIO

[Beowulf] bizarre scaling behavior on a Nehalem

2009-08-10 Thread Rahul Nabar
A while ago Tiago Marques had provided some benchmarking info in a thread ( http://www.beowulf.org/archive/2009-May/025739.html ) and some recent tests that I've been doing made me interested in this snippet again: >One of the codes, VASP, is very bandwidth limited and loves to run in a >number of

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Rahul Nabar
On Mon, Aug 10, 2009 at 7:41 AM, Mark Hahn wrote: >> (a) I am seeing strange scaling behaviours with Nehalem cores. e.g. A >> specific DFT (Density Functional Theory) code we use is maxing out >> performance at 2, 4 cpus instead of 8. i.e. runs on 8 cores are >> actually slower than 2 and 4 cores (dep

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Rahul Nabar
On Mon, Aug 10, 2009 at 7:33 AM, Mark Hahn wrote: > in /proc/cpuinfo, I believe it's as simple as siblings > cpu cores. > that is, I'm guessing one of your nehalem's shows as having 8 siblings > and 4 cpu cores. Yes. That works. Also looking at the "physical id" helps. I was confused by the ht fla

[Beowulf] sun x4100's with infiniband

2009-08-10 Thread Michael Di Domenico
just cause i've posted it everywhere else, figured i'd make one last ditch effort and see if anyone on this list might know the answer... I have several Sun x4100 servers with Infiniband which appear to be running at 200MB/sec instead of 800MB/sec. It's a freshly reformatted cluster converting fro

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Mark Hahn
Googling for 'dmidecode Hyper Thread' I found this 2004 article: the info in /proc/cpuinfo has definitely changed since 2004. ___ Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsub

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Mark Hahn
(a) I am seeing strange scaling behaviours with Nehalem cores. e.g. A specific DFT (Density Functional Theory) code we use is maxing out performance at 2, 4 cpus instead of 8. i.e. runs on 8 cores are actually slower than 2 and 4 cores (depending on setup) this is on the machine which reports 16 co

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-10 Thread Mark Hahn
> Is there a way of finding out within Linux if Hyperthreading is on or not? in /proc/cpuinfo, I believe it's as simple as siblings > cpu cores. that is, I'm guessing one of your nehalem's shows as having 8 siblings and 4 cpu cores.
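
The "siblings > cpu cores" check suggested above can be sketched in a few lines of Python. This is a minimal illustration, not code from the thread; the function names parse_cpuinfo and hyperthreading_enabled are invented here, and the sketch assumes the kernel reports both the "siblings" and "cpu cores" fields (entries without them are ignored).

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo text into one dict of fields per logical CPU."""
    cpus = []
    # Logical CPUs are separated by blank lines; fields look like "key : value".
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields:
            cpus.append(fields)
    return cpus


def hyperthreading_enabled(text):
    """Return True if any entry reports more siblings than cpu cores."""
    for cpu in parse_cpuinfo(text):
        if "siblings" in cpu and "cpu cores" in cpu:
            if int(cpu["siblings"]) > int(cpu["cpu cores"]):
                return True
    return False
```

On a Linux machine, hyperthreading_enabled(open("/proc/cpuinfo").read()) would return True for the case described in the message (8 siblings, 4 cpu cores). Taking the text as an argument rather than reading the file inside the function keeps the check easy to try on saved cpuinfo dumps from other nodes.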

[Beowulf] Re: [Paraview] VTK under ParaView

2009-08-10 Thread Tomislav Maric
Jérôme wrote: > Hi, > > ParaView comes with its own VTK sources. You can find in the source tree > : ./Paraview3/VTK > The VTK binaries will be put in the ParaView binary tree : ./ParaViewBin/bin > > Obviously, the paths depend on your calling way, and on your CMake settings > > Hope that helps