The team HAD such a system nearly a month ago, at the world computer
chess championship 2006.
It was clocked at 3GHz and had 2 sockets (4 cores in total).
3GHz is 25% more than the 2.4GHz dual core Opterons I've got here,
and they get a 20% higher IPC.
So effectively it's 50% faster (1.25 x 1.20 = 1.50).
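To spell out that arithmetic: throughput scales roughly with clock times IPC, so the two gains multiply rather than add. A minimal sketch, where the function name `combined_speedup` is just made up for illustration and the inputs are the figures quoted above:

```python
def combined_speedup(clock_new_ghz, clock_old_ghz, ipc_gain):
    """Overall speedup when clock and IPC both improve.

    Assumes performance ~ clock * IPC, which ignores memory stalls
    and anything else that does not scale with the core.
    """
    clock_ratio = clock_new_ghz / clock_old_ghz
    return clock_ratio * (1.0 + ipc_gain)


# Figures from the text: 3GHz Woodcrest vs 2.4GHz Opteron, 20% higher IPC.
speedup = combined_speedup(3.0, 2.4, 0.20)
print(f"clock ratio: {3.0 / 2.4:.2f}x, overall speedup: {speedup:.2f}x")
```

This is only a back-of-the-envelope model; a real program will land somewhere below it as soon as RAM or the interconnect becomes the bottleneck.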
We already knew that it beat the competition by a wide margin in floating
point benchmarks that do not depend on a huge L3 cache (which most specfp
software somehow does), but chess is completely integer only and has a lot
of branches that constantly get mispredicted.
So getting a 20% higher IPC there than k8 is quite *impressive*.
Of course it's not clear how much of that 20% the DDRII contributes.
As it has the same latency, just more bandwidth, according to what I read
online, I'd say at most 2% of that is RAM.
The team using it is the Junior team: Amir Ban and Shay Bushinsky. The
latter is the programmer, the former is the former foreman of
www.m-systems.com.
You can find them in Tel Aviv somewhere.
The majority of database and server systems in the financial world are
2 sockets, not 4 sockets.
It's very fair to compare Woodcrest to K8, because the next generation chip
from AMD is K8L, and as that must use 0.065 technology it will under normal
circumstances take till 2008 or so to get sold in shops, whereas Woodcrest
can be ordered today from Dell (and hopefully soon gets delivered).
Basically that K8L chip must first tape out, then it takes another year to
produce it and get it in the shops. That's how it normally works.
So January 2008 would already be good.
Of course I hope AMD proves us wrong there!
But the combination of new process technology + moving from 3 to 4
instructions a cycle will of course give AMD massive problems and
headaches. Especially knowing the years of delay it took to introduce the
previous technology (0.09) when it was new.
In short, AMD will have to release some quad core K8 by the end of this
year AND clock it to 3GHz to be able to compete with Woodcrest.
Of course adding 2 more cores to K8 is simpler for AMD than designing a new
core that executes 4 instructions a cycle.
Right now what AMD is doing is simply putting in more watts. 125 watt is
just over the top IMHO.
That's even more than what Intel did to the Xeon P4 when it had failed (105
watt) and similar to Itanium (125 watt also).
That dual Opteron dual core 2.4GHz here is already nearly uncoolable.
Not to mention what happens when I've got my beowulf online here with 14-16
nodes!
Woodcrest has 3 floating point/SSE2 units so it can effectively execute 4
instructions a cycle (as about every other instruction is a memory
operation, and amazingly that doesn't get counted as a flop but in reality
MUST get executed), whereas some improved K8 at the end of this year will
do 3 instructions a cycle.
We didn't yet talk about what happens to K8 when it executes a vector path
instruction, which basically blocks all its execution units completely and
effectively eats some 7 to 8 cycles. Multiplying is 4 cycles, very
important for matrix calculations, and that's very nice on Opteron of
course.
Just the IPC difference (4 versus 3 instructions a cycle) is a 33%
advantage for Woodcrest at this moment, and Woodcrest is not more expensive
than the 265-285 series from AMD.
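For what it's worth, here is how that 33% relates to the 20% the Junior team actually saw. This is a rough sketch only: the helper name `peak_width_advantage` is invented for the example, and treating issue width as the upper bound on IPC is an assumption, not a measurement.

```python
def peak_width_advantage(wide_ipc, narrow_ipc):
    """Relative advantage of the wider core at its peak issue rate."""
    return wide_ipc / narrow_ipc - 1.0


potential = peak_width_advantage(4, 3)  # 4 vs 3 instructions a cycle
measured = 0.20                         # IPC gain reported for the chess program

print(f"potential advantage: {potential:.1%}")
print(f"measured advantage:  {measured:.1%}")
print(f"share of the potential seen so far: {measured / potential:.0%}")
```

So on these figures the chess code already gets a bit over half of the theoretical gap, before any recompile with a better compiler.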
Additionally Intel has the major advantage of having a good compiler
for it, which is really the big killer in performance.
I didn't even compile Diep with Intel C++ yet to test on Woodcrest. What
speed will it get then?
More or less than a 20% IPC difference to AMD?
I count on *more*.
The potential advantage of Woodcrest is therefore 33% for our software,
which hardly multiplies and executes integer code 99.9% of the time.
AMD wins back some because branches run faster on it, I guess.
It seems that certain mispredicted branches on AMD just eat about 5 cycles.
Vincent
----- Original Message -----
From: "Craig Tierney" <[EMAIL PROTECTED]>
To: "Vincent Diepeveen" <[EMAIL PROTECTED]>
Cc: "Kevin Ball" <[EMAIL PROTECTED]>; "Erik Paulson"
<[EMAIL PROTECTED]>; <beowulf@beowulf.org>; "Patrick Geoffray"
<[EMAIL PROTECTED]>
Sent: Wednesday, June 28, 2006 11:44 PM
Subject: Re: [Beowulf] Three notes from ISC 2006
Vincent Diepeveen wrote:
Woodcrest totally destroys everything in terms of raw cpu performance.
Not only does it clock nearly 25% higher; according to the Junior team,
who used such a system from HP (that's their normal sponsor) at the world
champs 2006, it was giving a 20% higher IPC for their program too.
What do you mean 'clock nearly 25% higher'? Higher than what? I had
the impression that the top clock rates of the Woodcrest are dropping
slightly from the Dempsey numbers. The machines were just announced; the
team you refer to has a system already?
Can you describe what program they were running? Was it FP intensive?
Woodcrest has added an additional 128-bit SSE2 register, but no additional
memory bandwidth to support it. Double the FP performance is nice, but in
practice I wonder if I will see even a 5% increase for my codes. The
change is nice for linpack (doubles node performance), but even the
efficiency drops somewhat.
That's 50% faster than 2.4Ghz dual core opteron.
Only for those who need latency to the RAM above cpu performance,
A64-single core with 16GB RAM at each node will be more interesting.
That's not many applications.
Of course if you buy something *today* the dual core Opteron is the
preferred node, as Woodcrest can't be bought in the shops yet.
If your software can work with gigabit ethernet then of course the price
per node of an A64 dual core with cheap RAM
and a cheap mainboard could be more interesting than a faster node that's
a little bit more expensive, using DDRII ram.
So the aspect of cost could be a concern.
At dual socket level however, the choice is simple. Woodcrest will outgun
AMD in a big way.
You are comparing Intel's latest processor (with little or no real
benchmarks) to AMD's last generation processor. The next generation AMD
will include an extra SSE2 register. AMD still scales much better from 1
socket to 2 socket (in general of course, your benchmarks may vary) and
Intel can't touch the > 2 socket market yet.
(Not an AMD fanboy, just someone who appreciates seeing performance of
real codes rather than arguing performance based on architecture and
press-releases.)
Craig
Add to that that the new socket from intel is like 125 watts TDP. That's
just not normal. That's wasting as much as itanium2!
Vincent
----- Original Message ----- From: "Kevin Ball" <[EMAIL PROTECTED]>
To: "Erik Paulson" <[EMAIL PROTECTED]>
Cc: <beowulf@beowulf.org>; "Patrick Geoffray" <[EMAIL PROTECTED]>
Sent: Wednesday, June 28, 2006 10:29 PM
Subject: Re: [Beowulf] Three notes from ISC 2006
On Wed, 2006-06-28 at 13:41, Erik Paulson wrote:
On Wed, Jun 28, 2006 at 04:25:40PM -0400, Patrick Geoffray wrote:
>
> I just hope this will be picked up by an academic that can convince
> vendors to donate. Tax break is usually a good incentive for that :-)
>
How much care should be given to the selection of the nodes? Performance
is a function of both the nodes and the interconnect - so while your
test cluster allows for direct comparisons of the interconnects, it's only
for a cluster of AMD processors, or for Intel processors.
Prior to Woodcrest, I would have said AMD 100%. Now? It's hard to say.
I think AMD nodes will still tend to do better at scaling and show
interconnects in a better light than Intel nodes, but Woodcrest
performance looks like it may be good enough to at least make things
competitive for all but the largest clusters.
I could imagine there would be academic sites that would host this
thing, and possibly even spring for the nodes, provided that the
interconnects were donated and they got to use it when it's not in
use (and probably had some promise that no more than X% of the time
would the cluster be in "benchmark" mode)
This is very possible... especially if the benchmarking results were
interesting enough to pull some papers out of.
-Kevin
-Erik, not legally authorized to volunteer the University of Wisconsin
to host any such thing.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf