Re: [Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread Joe Landman
Rahul Nabar wrote: On Tue, Aug 11, 2009 at 11:16 AM, Joe Landman wrote: There is a cost to going cheap. This cost is time, and loss of productivity. If your time (your students' time) is free, and you don't need to pay for consequences (loss of grants, loss of revenue, loss of productivity, ...

Re: [Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread Rahul Nabar
On Tue, Aug 11, 2009 at 11:16 AM, Joe Landman wrote: > > There is a cost to going cheap.  This cost is time, and loss of > productivity.  If your time (your students' time) is free, and you don't need > to pay for consequences (loss of grants, loss of revenue, loss of > productivity, ...) in delayed

Re: [Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread Robert G. Brown
On Tue, 11 Aug 2009, Joe Landman wrote: There is a cost to going cheap. This cost is time, and loss of productivity. If your time (your students' time) is free, and you don't need to pay for consequences (loss of grants, loss of revenue, loss of productivity, ...) in delayed delivery of result

Re: [Beowulf] bizarre scaling behavior on a Nehalem

2009-08-11 Thread Bruno Coutinho
2009/8/11 Rahul Nabar > On Tue, Aug 11, 2009 at 5:57 PM, Bruno Coutinho > wrote: > > Nehalem and Barcelona have the following cache architecture: > > > > L1 cache: 64KB (32KB data, 32KB instruction), per core > > L2 cache: Barcelona: 512KB, Nehalem: 256KB, per core > > L3 cache: Barcelona: 2MB, N

Re: [Beowulf] bizarre scaling behavior on a Nehalem

2009-08-11 Thread Rahul Nabar
On Tue, Aug 11, 2009 at 5:57 PM, Bruno Coutinho wrote: > Nehalem and Barcelona have the following cache architecture: > > L1 cache: 64KB (32KB data, 32KB instruction), per core > L2 cache: Barcelona: 512KB, Nehalem: 256KB, per core > L3 cache: Barcelona: 2MB, Nehalem: 8MB, shared among all cores.
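
A quick way to confirm the per-core cache sizes on a given node is to ask glibc, or to look under /sys/devices/system/cpu/cpu0/cache/. The sketch below assumes Linux with glibc's _SC_LEVEL*_CACHE_SIZE extensions to sysconf(); where the C library does not report a level, the call returns 0 or -1. The file name is just an example.

/* Print the cache sizes discussed above; a minimal sketch assuming
 * Linux/glibc.  Build with e.g.: gcc -O2 cachesize.c */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    printf("L1d: %ld bytes\n", sysconf(_SC_LEVEL1_DCACHE_SIZE));
    printf("L2 : %ld bytes\n", sysconf(_SC_LEVEL2_CACHE_SIZE));
    printf("L3 : %ld bytes\n", sysconf(_SC_LEVEL3_CACHE_SIZE));
    return 0;
}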

Re: [Beowulf] bizarre scaling behavior on a Nehalem

2009-08-11 Thread Bruno Coutinho
2009/8/11 Rahul Nabar > On Tue, Aug 11, 2009 at 12:06 PM, Bill Broadley > wrote: > > Looks to me like you fit in the barcelona 512KB L2 cache (and get good > > scaling) and do not fit in the nehalem 256KB L2 cache (and get poor > scaling). > > Thanks Bill! I never realized that the L2 cache of th

Re: [Beowulf] Repeated Dell SC1435 crash / hang. How to get the vendor to resolve the issue when 20% of the servers fail in first year?

2009-08-11 Thread Rahul Nabar
On Thu, Apr 9, 2009 at 11:35 AM, Douglas J. Trainor wrote: > Rahul, > > I think Greg et al. are correct.  Does your SC1435 have a Delta Electronics > switching power supply?  I bet you have a 600 watt Delta. > > Intel recently had problems with outsourced 350 watt "FHJ350WPS" switching > power supp

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-11 Thread Rahul Nabar
On Tue, Aug 11, 2009 at 12:19 PM, Mikhail Kuzminsky wrote: > If these results are for HyperThreading "ON", it may not be too strange > because of "virtual core" competition. > > But if these results are with HyperThreading switched off, it's strange. > I usually have good DFT scaling w/ the number of cores

Re: [Beowulf] bizarre scaling behavior on a Nehalem

2009-08-11 Thread Craig Tierney
Rahul Nabar wrote: > On Tue, Aug 11, 2009 at 12:40 PM, Craig Tierney wrote: >> What are you doing to ensure that you have both memory and processor >> affinity enabled? >> > > All I was using now was the flag: > > --mca mpi_paffinity_alone 1 > > Is there anything else I ought to be doing as well

Re: [Beowulf] bizarre scaling behavior on a Nehalem

2009-08-11 Thread Rahul Nabar
On Tue, Aug 11, 2009 at 12:06 PM, Bill Broadley wrote: > Looks to me like you fit in the barcelona 512KB L2 cache (and get good > scaling) and do not fit in the nehalem 256KB L2 cache (and get poor scaling). Thanks Bill! I never realized that the L2 cache of the Nehalem is actually smaller than th
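
A crude way to see the cache cliff Bill is describing is a working-set sweep: time the same update loop over arrays of growing size and watch the per-element cost jump once the data no longer fits in the 256 KB (Nehalem) or 512 KB (Barcelona) per-core L2. The sketch below is only illustrative; the sizes and repeat counts are arbitrary, and hardware prefetch plus compiler flags will smooth the transitions.

/* Working-set sweep: one read + one write per element, constant total
 * volume per size so each step takes comparable time.  A sketch only.
 * Build with e.g.: gcc -O2 -std=gnu99 sweep.c -lrt */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + 1e-9 * ts.tv_nsec;
}

int main(void)
{
    for (size_t kb = 32; kb <= 16384; kb *= 2) {
        size_t n = kb * 1024 / sizeof(double);
        double *a = malloc(n * sizeof *a);
        for (size_t i = 0; i < n; i++) a[i] = 1.0;

        /* Touch a constant total volume (256 MB) at every size. */
        long reps = (256L * 1024 * 1024) / (long)(kb * 1024);
        double t = now();
        for (long r = 0; r < reps; r++)
            for (size_t i = 0; i < n; i++)
                a[i] += 1.0;
        t = now() - t;

        printf("%6zu KB working set: %.2f ns per element update\n",
               kb, 1e9 * t / ((double)reps * (double)n));
        free(a);
    }
    return 0;
}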

Re: [Beowulf] bizarre scaling behavior on a Nehalem

2009-08-11 Thread Rahul Nabar
On Tue, Aug 11, 2009 at 12:40 PM, Craig Tierney wrote: > What are you doing to ensure that you have both memory and processor > affinity enabled? > All I was using now was the flag: --mca mpi_paffinity_alone 1 Is there anything else I ought to be doing as well? -- Rahul
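
Beyond the mpi_paffinity_alone flag, one simple sanity check is to have every rank report which CPU it is actually running on and how many CPUs remain in its affinity mask; with pinning working, each rank should typically report a single allowed CPU. The sketch below assumes Linux/glibc (sched_getcpu, sched_getaffinity) and any MPI implementation; the file name is just an example.

/* Report per-rank placement; a minimal sketch, not specific to Open MPI.
 * Build with e.g.: mpicc -O2 -std=gnu99 whereami.c -o whereami
 * Run with e.g.:   mpirun --mca mpi_paffinity_alone 1 -np 8 ./whereami */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    cpu_set_t mask;
    CPU_ZERO(&mask);
    sched_getaffinity(0, sizeof(mask), &mask);   /* 0 = this process */

    printf("rank %d: on cpu %d, %d cpu(s) in affinity mask\n",
           rank, sched_getcpu(), CPU_COUNT(&mask));

    MPI_Finalize();
    return 0;
}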

Re: [Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread David Mathog
Joe Landman wrote: > I am arguing for commodity systems. But some gear is just plain junk. > Not all switches are created equal. Some inexpensive switches do a far > better job than some of the expensive ones. Some brand name machines > are wholly inappropriate as compute nodes, yet they ar

Re: [Beowulf] numactl & SuSE11.1

2009-08-11 Thread Mikhail Kuzminsky
It's interesting that for this hardware & software configuration, disabling NUMA in the BIOS gives higher STREAM results than with NUMA enabled. I.e. for NUMA "off": 8723/8232/10388/10317 MB/s; for NUMA "on": 5620/5217/6795/6767 MB/s (both with OMP_NUM_THREADS=1 and the ifort 11.1 compiler). The
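
Whether NUMA helps or hurts a STREAM-style run depends heavily on where the pages end up. The sketch below shows the usual first-touch trick with OpenMP, so that with NUMA enabled each thread's pages are allocated on its own node; it is only a triad-style illustration, not the real STREAM benchmark, and the array size is arbitrary. Comparing runs under numactl --membind=0 and numactl --interleave=all is another quick way to see the placement effect.

/* Triad-style loop with parallel first-touch initialization.  A sketch
 * only, not STREAM.  Build with e.g.: gcc -O2 -std=gnu99 -fopenmp triad.c */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N (1L << 25)                 /* 32M doubles, ~256 MB per array */

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);

    /* First touch in parallel: pages land on the node of the touching thread. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++) { a[i] = 0.0; b[i] = 1.0; c[i] = 2.0; }

    double t = omp_get_wtime();
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++)
        a[i] = b[i] + 3.0 * c[i];
    t = omp_get_wtime() - t;

    /* Two reads and one write of 8 bytes per element. */
    printf("triad: %.0f MB/s\n", 3.0 * N * sizeof(double) / t / 1e6);

    free(a); free(b); free(c);
    return 0;
}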

Re: [Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread Joe Landman
Daniel Pfenniger wrote: There is a cost to *EVERYTHING* Well, not really surprising. The point is to be quantitative, not subjective (fear, etc.). Each solution has a cost and alert people will choose the best one for them, not for the vendor. Sadly, not always (choosing the best one f

Re: [Beowulf] bizarre scaling behavior on a Nehalem

2009-08-11 Thread Craig Tierney
Rahul Nabar wrote: > On Mon, Aug 10, 2009 at 12:48 PM, Bruno Coutinho wrote: >> This is often caused by cache competition or memory bandwidth saturation. >> If it was cache competition, rising from 4 to 6 threads would make it worse. >> As the code became faster with DDR3-1600 and much slower with

Re: [Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread Daniel Pfenniger
Joe Landman wrote: Gerry Creager wrote: Daniel Pfenniger wrote: Douglas Eadline wrote: [...] This article sounds unbalanced and self-serving. I thought it read a bit like a chronicle of my recent experiences. Mine were not so bad, so I found the tone too pessimistic. I think that this

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-11 Thread Mikhail Kuzminsky
In message from Rahul Nabar (Sun, 9 Aug 2009 22:42:25 -0500): (a) I am seeing strange scaling behaviours with Nehalem cores. E.g. a specific DFT (Density Functional Theory) code we use is maxing out performance at 2 or 4 cores instead of 8, i.e. runs on 8 cores are actually slower than on 2 and 4 cores (
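
One way to tell memory-bandwidth saturation apart from other scaling problems is to run a deliberately memory-bound kernel at increasing thread counts on the same node and watch where the aggregate rate stops growing; if it flattens at 2-4 threads, the DFT code is likely hitting the same wall. The sketch below assumes OpenMP, and the array size and thread counts are arbitrary illustrative choices.

/* Memory-bound kernel swept over thread counts; a sketch only.
 * Build with e.g.: gcc -O2 -std=gnu99 -fopenmp bwscale.c */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N (1L << 26)                 /* 64M doubles, ~512 MB per array */

int main(void)
{
    double *x = malloc(N * sizeof *x);
    double *y = malloc(N * sizeof *y);

    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++) { x[i] = 1.0; y[i] = 2.0; }

    int counts[] = { 1, 2, 4, 6, 8 };
    for (int k = 0; k < 5; k++) {
        int nt = counts[k];
        double t = omp_get_wtime();
        #pragma omp parallel for num_threads(nt) schedule(static)
        for (long i = 0; i < N; i++)
            y[i] = y[i] + 2.5 * x[i];      /* two reads + one write per element */
        t = omp_get_wtime() - t;
        printf("%d threads: %.0f MB/s\n",
               nt, 3.0 * N * sizeof(double) / t / 1e6);
    }

    free(x);
    free(y);
    return 0;
}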

Re: [Beowulf] bizarre scaling behavior on a Nehalem

2009-08-11 Thread Bill Broadley
Rahul Nabar wrote: > Exactly! But I thought this was the big advance with the Nehalem that > it has removed the CPU<->Cache<->RAM bottleneck. Not sure I'd say removed, but they have made a huge improvement. To the point where a single-socket Intel is better than a dual-socket Barcelona. > So if

Re: [Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread Gerry Creager
+1 Joe Landman wrote: Gerry Creager wrote: Daniel Pfenniger wrote: Douglas Eadline wrote: [...] This article sounds unbalanced and self-serving. I thought it read a bit like a chronicle of my recent experiences. I think that this article is fine, not unbalanced. What I like to point

Re: [Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread Tim Cutts
On 11 Aug 2009, at 3:38 pm, Daniel Pfenniger wrote: Douglas Eadline wrote: All, I posted this on ClusterMonkey the other week. It is actually derived from a white paper I wrote for SiCortex. I'm sure those on this list have some experience/opinions with these issues (and other cluster issues!)

Re: [Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread Joe Landman
Gerry Creager wrote: Daniel Pfenniger wrote: Douglas Eadline wrote: [...] This article sounds unbalanced and self-serving. I thought it read a bit like a chronicle of my recent experiences. I think that this article is fine, not unbalanced. What I like to point out to customers and par

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-11 Thread David N. Lombard
On Mon, Aug 10, 2009 at 01:02:51PM -0700, Rahul Nabar wrote: > On Mon, Aug 10, 2009 at 2:09 PM, Joshua Baker-LePain wrote: > > Well, as there are only 8 "real" cores, running a computationally intensive > > process across 16 should *definitely* do worse than across 8. Some workloads will benefit

Re: [Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread Gerry Creager
Daniel Pfenniger wrote: Douglas Eadline wrote: All, I posted this on ClusterMonkey the other week. It is actually derived from a white paper I wrote for SiCortex. I'm sure those on this list have some experience/opinions with these issues (and other cluster issues!) The True Cost of HPC Clus

Re: [Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread Daniel Pfenniger
Douglas Eadline wrote: All, I posted this on ClusterMonkey the other week. It is actually derived from a white paper I wrote for SiCortex. I'm sure those on this list have some experience/opinions with these issues (and other cluster issues!) The True Cost of HPC Cluster Ownership http://w

[Beowulf] The True Cost of HPC Cluster Ownership

2009-08-11 Thread Douglas Eadline
All, I posted this on ClusterMonkey the other week. It is actually derived from a white paper I wrote for SiCortex. I'm sure those on this list have some experience/opinions with these issues (and other cluster issues!) The True Cost of HPC Cluster Ownership http://www.clustermonkey.net//con

Re: [Beowulf] performance tweaks and optimum memory configs for a Nehalem

2009-08-11 Thread Håkon Bugge
On Aug 10, 2009, at 23:07, Tom Elken wrote: Summary: IBM, SGI and Platform have some comparisons on clusters with "SMT On" of running 1 rank for every core compared to running 2 ranks on every core. In general, at low core counts, like up to 32, there is about an 8% advantage for running