On Mon, May 11, 2009 at 6:22 PM, Gus Correa wrote:
> Oops, I misunderstood what you said.
> I see now. You are bonding channels on your nodes' dual GigE
> ports to double your bandwidth, particularly for MPI, right?
Yes. Each node has dual gigabit eth cards.
> I am curious about your result
All 64-bit machines with a dual-channel bonded Gigabit Ethernet
interconnect: Quad-Core AMD Opteron(tm) Processor 2354.
As others have said, 50% is a more likely HPL efficiency for a large GigE
cluster, but with your smallish cluster (24 nodes) and bonded channels,
you would probably get ...
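A rough illustration, not from the original emails: applying assumed HPL efficiencies to the Rpeak formula quoted later in this thread gives a feel for the window being discussed. The 2.3 GHz clock, dual-socket quad-core nodes, 24-node count, and the 65% "bonded GigE" figure are all assumptions for the sketch.

ghz = 2.3                 # clock in GHz (assumed, from the formula quoted below)
flops_per_cycle = 4       # Barcelona/Shanghai-class Opteron, double precision
cores_per_socket = 4
sockets_per_node = 2      # assumed dual-socket nodes
nodes = 24

rpeak_gflops = ghz * flops_per_cycle * cores_per_socket * sockets_per_node * nodes

for label, eff in [("large GigE cluster (~50%)", 0.50),
                   ("small bonded-GigE cluster (assumed ~65%)", 0.65),
                   ("high-bw, low-latency interconnect (~80%)", 0.80)]:
    print("%-45s Rmax ~ %7.1f GFLOPS" % (label, rpeak_gflops * eff))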
On Mon, May 11, 2009 at 06:58:24PM -0400, Gus Correa wrote:
> How far/close am I now to the maximum performance that can be achieved?
In this case you're only asking about measuring memory. Measure the
curve on your current system. You roughly know the shape of the
curve, so that will allow you to ...
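A minimal sketch (mine, not Greg's) of what "measure the curve" could look like in practice: assume HPL throughput behaves roughly like GFLOPS(N) = Rmax / (1 + c/N), which is linear in 1/N, and extrapolate the asymptote from a few measured points. The data points below are made-up placeholders.

import numpy as np

# Measured (problem size N, HPL GFLOPS) pairs -- placeholders, not real data.
N = np.array([5000.0, 10000.0, 20000.0, 40000.0])
gflops = np.array([400.0, 650.0, 820.0, 920.0])

# Model: 1/GFLOPS = 1/Rmax + (c/Rmax) * (1/N), a straight line in 1/N.
slope, intercept = np.polyfit(1.0 / N, 1.0 / gflops, 1)
rmax_estimate = 1.0 / intercept
print("extrapolated asymptotic Rmax ~ %.0f GFLOPS" % rmax_estimate)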
Greg Lindahl wrote:
On Mon, May 11, 2009 at 05:56:43PM -0400, Gus Correa wrote:
However, here is somebody that did an experiment with increasing
values of N, and his results suggest that performance increases
logarithmically with problem size (N), not linearly,
saturating when you get closer to the maximum ...
Ashley Pittman wrote:
On Mon, 2009-05-11 at 15:09 -0400, Gus Correa wrote:
Mark Hahn wrote:
I haven't checked the Top500 list in detail,
but I think you are right about 80% being fairly high.
(For big clusters perhaps?).
Other way around, maintaining a high efficiency rating at large node
counts is a very difficult ...
On Mon, May 11, 2009 at 05:56:43PM -0400, Gus Correa wrote:
> However, here is somebody that did an experiment with increasing
> values of N, and his results suggest that performance increases
> logarithmically with problem size (N), not linearly,
> saturating when you get closer to the maximum ...
On Mon, 2009-05-11 at 15:09 -0400, Gus Correa wrote:
> Mark Hahn wrote:
> I haven't checked the Top500 list in detail,
> but I think you are right about 80% being fairly high.
> (For big clusters perhaps?).
Other way around, maintaining a high efficiency rating at large node
counts is a very difficult ...
Rahul Nabar wrote:
On Mon, May 11, 2009 at 2:09 PM, Gus Correa wrote:
Of course even the HPL Rmax is
not likely to be reached by a real application,
with I/O, etc, etc.
Rahul and I may be better off testing Rmax with our real
programs.
I do know that these benchmarks can be somewhat unrealistic, and the ...
Hi Tom, Greg, Rahul, list
Tom Elken wrote:
On Behalf Of Rahul Nabar
Rmax/Rpeak = 0.83 seems a good guess based on one very similar system
on the Top500.
Thus I come up with a number of around 1.34 TeraFLOPS for my cluster
of 24 servers. Does the value seem a reasonable ballpark? Nothing too
accurate, but I do not want to be a ...
Rahul Nabar wrote:
On Mon, May 11, 2009 at 12:23 PM, Gus Correa wrote:
If you don't feel like running the HPL benchmark (it is fun,
but time-consuming) to get your actual Gigaflops
(Rmax in Top500 jargon),
you can look up the Rmax/Rpeak ratio on the Top500 list for clusters
with hardware similar to yours.
> On Behalf Of Rahul Nabar
>
> Rmax/Rpeak = 0.83 seems a good guess based on one very similar system
> on the Top500.
>
> Thus I come up with a number of around 1.34 TeraFLOPS for my cluster
> of 24 servers. Does the value seem a reasonable ballpark? Nothing too
> accurate, but I do not want to be a ...
> - Original Message -
> From: "Greg Lindahl"
>
> On Mon, May 11, 2009 at 02:30:31PM -0400, Mark Hahn wrote:
>
>> 80 is fairly high, and generally requires a high-bw, low-lat net.
>> gigabit, for instance, is normally noticeably lower, often not much
>> better than 50%. but yes, ...
On Mon, May 11, 2009 at 02:30:31PM -0400, Mark Hahn wrote:
> 80 is fairly high, and generally requires a high-bw, low-lat net.
> gigabit, for instance, is normally noticeably lower, often not much
> better than 50%. but yes, top500 linpack is basically just
> interconnect factor * peak, and so ...
On Mon, May 11, 2009 at 12:23 PM, Gus Correa wrote:
> If you don't feel like running the HPL benchmark (it is fun,
> but time-consuming) to get your actual Gigaflops
> (Rmax in Top500 jargon),
> you can look up the Rmax/Rpeak ratio on the Top500 list for clusters
> with hardware similar to yours. ...
2009/5/11 Mark Hahn :
>>> Excellent. Thanks Gus. That sort of estimate is exactly what I needed.
>>> I do have AMD Athlons.
>
> right - for PHBs, peak theoretical throughput is a reasonable approach,
> especially since it doesn't require any real work on your part. the only
> real magic is to find the flops-per-cycle multiplier ...
On Mon, May 11, 2009 at 2:09 PM, Gus Correa wrote:
> Of course even the HPL Rmax is
> not likely to be reached by a real application,
> with I/O, etc, etc.
> Rahul and I may be better off testing Rmax with our real
> programs.
>
I do know that these benchmarks can be somewhat unrealistic, and the ...
Mark Hahn wrote:
Excellent. Thanks Gus. That sort of estimate is exactly what I needed.
I do have AMD Athlons.
right - for PHBs, peak theoretical throughput is a reasonable approach,
especially since it doesn't require any real work on your part. the only
real magic is to find the flops-per-cycle multiplier ...
Excellent. Thanks Gus. That sort of estimate is exactly what I needed.
I do have AMD Athlons.
right - for PHBs, peak theoretical throughput is a reasonable approach,
especially since it doesn't require any real work on your part. the only
real magic is to find the flops-per-cycle multiplier for ...
Hi Tom, Rahul, list
Of course Tom is right about Barcelona and Shanghai being the first to
have 4 flops/cycle max.
(Is this what AMD calls "3rd generation Opterons"?)
In my first email I should have mentioned that I was talking about
Opteron "Shanghai" 2376.
I suppose 4 flops/cycle max is when ...
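Purely as an illustration of the "flops-per-cycle multiplier" point (my values, not from the thread; double-check against AMD documentation): peak double-precision flops per cycle per core differ between the older K8 Opterons/Athlons and the Barcelona/Shanghai parts mentioned above.

# Peak double-precision flops/cycle/core, as I understand the hardware.
DP_FLOPS_PER_CYCLE = {
    "Opteron/Athlon64 (K8, pre-Barcelona)": 2,     # 64-bit wide SSE2 datapath
    "Opteron Barcelona/Shanghai (Family 10h)": 4,  # 128-bit SSE: 2 adds + 2 muls per cycle
}

def per_core_rpeak_gflops(ghz, family):
    return ghz * DP_FLOPS_PER_CYCLE[family]

print(per_core_rpeak_gflops(2.3, "Opteron Barcelona/Shanghai (Family 10h)"))  # 9.2
print(per_core_rpeak_gflops(2.3, "Opteron/Athlon64 (K8, pre-Barcelona)"))     # 4.6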
Rahul Nabar wrote:
On Mon, May 11, 2009 at 12:23 PM, Gus Correa wrote:
Theoretical maximum Gflops (Rpeak in Top500 parlance), for instance,
on a cluster with AMD quad-core 2.3 GHz processors
is:
2.3 GHz x
4 floating point operations/cycle x
4 cores/CPU socket x
number of CPU sockets per node x
number of nodes
> On Behalf Of Rahul Nabar
> On Mon, May 11, 2009 at 12:23 PM, Gus Correa
> wrote:
> > Theoretical maximum Gflops (Rpeak in Top500 parlance), for instance,
> > on a cluster with AMD quad-core 2.3 GHz processors is:
> >
> > 2.3 GHz x
> > 4 floating point operations/cycle x
> > 4 cores/CPU socket x
>
On Mon, May 11, 2009 at 12:23 PM, Gus Correa wrote:
> Theoretical maximum Gflops (Rpeak in Top500 parlance), for instance,
> on a cluster with AMD quad-core 2.3 GHz processors
> is:
>
> 2.3 GHz x
> 4 floating point operations/cycle x
> 4 cores/CPU socket x
> number of CPU sockets per node x
> number of nodes
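The quoted factors translate directly into a one-liner; since the snippet is cut off, the sockets-per-node and node-count values below (2 and 24) are filled in as assumptions, and no efficiency factor is applied here.

def rpeak_gflops(ghz, flops_per_cycle, cores_per_socket, sockets_per_node, nodes):
    # Straight product of the factors listed in Gus's formula above.
    return ghz * flops_per_cycle * cores_per_socket * sockets_per_node * nodes

# Assumed dual-socket quad-core nodes, 24 of them:
print(rpeak_gflops(2.3, 4, 4, 2, 24))   # 1766.4 GFLOPS, i.e. ~1.77 TFLOPS Rpeak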
Rahul Nabar wrote:
I was recently asked to report the FLOPS capacity of our home-built
computing cluster. Never did that before. Some googling revealed that
LINPACK is one such benchmark. Any other options / suggestions?
I am not interested in a very precise value, just a general ballpark to
generate what-if scenarios ...
I was recently asked to report the FLOPS capacity of our home-built
computing cluster. Never did that before. Some googling revealed that
LINPACK is one such benchmark. Any other options / suggestions?
I am not interested in a very precise value, just a general ballpark to
generate what-if scenarios ...
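If the HPL route is taken, a commonly quoted rule of thumb (not from this thread) is to pick the problem size N so that the N x N double-precision matrix fills roughly 80% of the cluster's total memory; the 16 GB-per-node figure below is only an assumed example.

import math

nodes = 24
mem_per_node_bytes = 16 * 2**30        # assumed 16 GB per node
total_mem = nodes * mem_per_node_bytes

n = int(math.sqrt(0.80 * total_mem / 8))   # 8 bytes per double-precision element
nb = 232                                   # a typical HPL block size for Opterons
n -= n % nb                                # round N down to a multiple of NB
print("suggested HPL problem size N ~ %d (NB = %d)" % (n, nb))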
Mark Hahn wrote:
I had no idea that the MIPSPro compiler had been resurrected, so I was
interested to hear about the AMD effort.
I can't speak for its popularity in the past, but since I stumbled
across Open64 I've been eager to promote it and make it better. From
Google Summer of Code to recruit ...
I agree, and I also believe that good benchmark data
takes time and care, which is why some of this data
can be hard to come by; i.e., "I tried XYZ cc
on my app and things go faster, I'm done."
The problem is further compounded by multi-core.
Unless you benchmark all the CPU sockets, you
never really know ...
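A rough sketch (mine, not from the email) of that multi-core point: time a memory-bound kernel with 1, 2, 4, and 8 concurrent processes and watch per-core throughput fall as the sockets' memory controllers saturate. The array size and process counts are arbitrary assumptions.

import time
from multiprocessing import Pool

import numpy as np

ARRAY_MB = 256   # per-process working set; assumed large enough to defeat the caches

def triad(_):
    n = ARRAY_MB * 2**20 // 8
    a = np.zeros(n)
    b = np.ones(n)
    c = np.ones(n)
    t0 = time.time()
    reps = 5
    for _ in range(reps):
        a = b + 2.0 * c                      # STREAM-triad-like, memory bound
    elapsed = time.time() - t0
    return reps * 3 * n * 8 / elapsed / 1e9  # rough GB/s moved by this process

if __name__ == "__main__":
    for nproc in (1, 2, 4, 8):
        with Pool(nproc) as pool:
            rates = pool.map(triad, range(nproc))
        print("%d procs: %6.1f GB/s total, %5.1f GB/s per process"
              % (nproc, sum(rates), sum(rates) / nproc))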