On Mon, Aug 10, 2009 at 12:48 PM, Bruno Coutinho wrote:
> This is often caused by cache competition or memory bandwidth saturation.
> If it were cache competition, going from 4 to 6 threads would make it worse.
> As the code became faster with DDR3-1600 and much slower on the Xeon 5400,
> this code i
> Well, as there are only 8 "real" cores, running a computationally
> intensive process across 16 should *definitely* do worse than across 8.
Not typically.
At the SPEC website there are quite a few SPEC MPI2007 (which is an average
across 13 HPC applications) results on Nehalem.
Summary:
IBM,
Joshua Baker-LePain wrote:
> On Mon, 10 Aug 2009 at 11:43am, Rahul Nabar wrote
>
>> On Mon, Aug 10, 2009 at 7:41 AM, Mark Hahn wrote:
(a) I am seeing strange scaling behaviours with Nehalem cores. E.g. a
specific DFT (Density Functional Theory) code we use is maxing out
performance at
Joshua Baker-LePain wrote:
> Well, as there are only 8 "real" cores, running a computationally
> intensive process across 16 should *definitely* do worse than across 8.
I've seen many cases where that isn't true. The P4 rarely justified turning
on HT because throughput would often be lower. Wit
On Mon, 10 Aug 2009 at 3:02pm, Rahul Nabar wrote
On Mon, Aug 10, 2009 at 2:09 PM, Joshua Baker-LePain wrote:
> Well, as there are only 8 "real" cores, running a computationally intensive
> process across 16 should *definitely* do worse than across 8. However, it's
> not so surprising that you're seeing peak performance with 2-4 threads.
> Nehale
On Mon, Aug 10, 2009 at 08:33:27AM -0400, Mark Hahn wrote:
>> Is there a way of finding out within Linux if Hyperthreading is on or
>> not?
>
> in /proc/cpuinfo, I believe it's as simple as siblings > cpu cores.
> that is, I'm guessing one of your Nehalems shows as having 8 siblings
> and 4 cpu cores.
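The siblings vs. cpu cores check Mark describes can be sketched in a few lines. The cpuinfo excerpt below is fabricated for illustration (one HT-enabled Nehalem-style socket: 4 physical cores, 8 siblings), not taken from the machines in this thread:

```python
# Sketch of the "siblings > cpu cores" hyperthreading check.
# SAMPLE_CPUINFO is a fabricated /proc/cpuinfo excerpt for one socket.
SAMPLE_CPUINFO = """\
processor   : 0
physical id : 0
siblings    : 8
cpu cores   : 4
"""

def ht_enabled(cpuinfo_text):
    fields = {}
    for line in cpuinfo_text.splitlines():
        if ":" in line:
            key, _, val = line.partition(":")
            fields[key.strip()] = val.strip()
    # HT is on when a physical package reports more sibling
    # (logical) CPUs than physical cores.
    return int(fields["siblings"]) > int(fields["cpu cores"])

print(ht_enabled(SAMPLE_CPUINFO))  # True
```

On a real box you would feed it `open("/proc/cpuinfo").read()` instead of the sample string.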
this is on the machine which reports 16 cores, right? I'm guessing
that the kernel is compiled without numa and/or ht, so enumerates virtual
cpus first. that would mean that when otherwise idle, a 2-core
proc will get virtual cores within the same physical core. and that your 8c
test is merely
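If the kernel does hand a small job two hyperthreads of the same physical core, one workaround is to pin the process to chosen logical CPUs yourself. A minimal Linux-only sketch using Python's os.sched_setaffinity (illustration only; a real HPC job would more likely use taskset, numactl, or the MPI launcher's binding options -- and note the affinity mask uses *logical* cpu ids, whose mapping to physical cores is exactly the enumeration question above):

```python
import os

# Linux-only: pin the current process to one logical CPU,
# verify the mask took effect, then restore the original mask.
orig = os.sched_getaffinity(0)      # remember the original cpu mask
target = min(orig)                  # pick one allowed logical cpu id
os.sched_setaffinity(0, {target})   # pin to that cpu only
assert os.sched_getaffinity(0) == {target}
os.sched_setaffinity(0, orig)       # restore the original mask
```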
I'm sorry for my mistake:
the problem is on Nehalem Xeon under SuSE 11.1, but w/kernel
2.6.27.7-9 (w/Supermicro X8DT mobo). For Opteron 2350 w/SuSE 10.3 (w/
older 2.6.22.5-31 - I erroneously inserted this string in my
previous message) numactl works OK (w/Tyan mobo).
NUMA is enabled in BIOS
A while ago Tiago Marques had provided some benchmarking info in a
thread ( http://www.beowulf.org/archive/2009-May/025739.html ) and
some recent tests that I've been doing made me interested in this
snippet again:
>One of the codes, VASP, is very bandwidth limited and loves to run in a
>number of
On Mon, Aug 10, 2009 at 7:41 AM, Mark Hahn wrote:
>> (a) I am seeing strange scaling behaviours with Nehalem cores. E.g. a
>> specific DFT (Density Functional Theory) code we use is maxing out
>> performance at 2, 4 cpus instead of 8. i.e. runs on 8 cores are
>> actually slower than 2 and 4 cores (dep
On Mon, Aug 10, 2009 at 7:33 AM, Mark Hahn wrote:
> in /proc/cpuinfo, I believe it's as simple as siblings > cpu cores.
> that is, I'm guessing one of your Nehalems shows as having 8 siblings
> and 4 cpu cores.
Yes. That works. Also looking at the "physical id" helps.
I was confused by the ht flag
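The "physical id" field (together with "core id") is what lets you work out which logical CPUs are hyperthread siblings of each other. A small sketch, again on a fabricated /proc/cpuinfo excerpt (one dual-core HT socket, siblings enumerated after the physical cores):

```python
from collections import defaultdict

# Fabricated excerpt: logical cpus 0,2 share core 0; 1,3 share core 1.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 0

processor : 3
physical id : 0
core id : 1
"""

def siblings_by_core(cpuinfo_text):
    """Map (physical id, core id) -> list of logical cpu numbers."""
    cores = defaultdict(list)
    cur = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():            # blank line ends one cpu record
            if cur:
                cores[(cur["physical id"], cur["core id"])].append(cur["processor"])
            cur = {}
            continue
        key, _, val = line.partition(":")
        cur[key.strip()] = int(val)
    if cur:                             # flush the last record
        cores[(cur["physical id"], cur["core id"])].append(cur["processor"])
    return dict(cores)

print(siblings_by_core(SAMPLE))
# {(0, 0): [0, 2], (0, 1): [1, 3]}
```

Any (physical id, core id) pair that maps to more than one logical cpu is a set of hyperthread siblings.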
just 'cause I've posted it everywhere else, figured I'd make one last
ditch effort and see if anyone on this list might know the answer...
I have several Sun x4100 servers with InfiniBand which appear to be
running at 200MB/sec instead of 800MB/sec. It's a freshly reformatted
cluster converting fro
Googling for 'dmidecode Hyper Thread' I found this 2004 article;
the info in /proc/cpuinfo has definitely changed since 2004.
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe)
(a) I am seeing strange scaling behaviours with Nehalem cores. E.g. a
specific DFT (Density Functional Theory) code we use is maxing out
performance at 2, 4 cpus instead of 8. i.e. runs on 8 cores are
actually slower than 2 and 4 cores (depending on setup)
this is on the machine which reports 16 cores
Jérôme wrote:
> Hi,
>
> ParaView comes with its own VTK sources. You can find them in the source
> tree: ./Paraview3/VTK
> The VTK binaries will be put in the ParaView binary tree: ./ParaViewBin/bin
>
> Obviously, the paths depend on how you invoke it and on your CMake settings
>
> Hope that helps