Mark Hahn wrote:

kinda sucks, doesn't it? here's what I get for a not-new dual-275 with 8x1G PC3200 (I think):

Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:        5714.6837       0.0840       0.0840       0.0841
Scale:       5821.0766       0.0825       0.0825       0.0826
Add:         6437.8226       0.1119       0.1118       0.1120
Triad:       6414.2079       0.1123       0.1123       0.1124

Now that I am back from "da Yoo Pee" I can post some of my numbers. Here is our dual-core Opteron 275.
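(For reference, the runs below just vary the OpenMP thread count; a minimal sketch, assuming an OpenMP build of stream.c and a binary name of my own choosing:)

# vary the OpenMP thread count between runs (binary name assumed)
export OMP_NUM_THREADS=4
./stream

export OMP_NUM_THREADS=1
./stream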

4-threads

Function     Rate (MB/s)  Avg time   Min time  Max time
Copy:       9999.3465      0.0356      0.0320      0.0360
Scale:      8888.4147      0.0360      0.0360      0.0360
Add:        9230.2533      0.0542      0.0520      0.0560
Triad:      9230.2321      0.0538      0.0520      0.0560

1-thread

Function     Rate (MB/s)  Avg time   Min time  Max time
Copy:       4705.6130      0.0711      0.0680      0.0720
Scale:      4705.6130      0.0702      0.0680      0.0720
Add:        4615.1161      0.1067      0.1040      0.1080
Triad:      4444.1975      0.1080      0.1080      0.1080

These are from a PathScale-compiled binary. I see slightly higher single-thread numbers from PGI 6.1-2 compiled binaries, though I am not sure why. The 6.1-5/6 compiled binaries are worse :(
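(For the curious, the compile lines were along these lines; a sketch from memory, not necessarily my exact flags, using each compiler's usual OpenMP switch:)

# PathScale
pathcc -O3 -mp stream.c -o stream

# PGI
pgcc -fast -mp stream.c -o stream

# Intel 9.1
icc -O3 -openmp stream.c -o stream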

Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:       6666.9379      0.0454      0.0300      0.0500
Scale:      4000.0610      0.0567      0.0500      0.0600
Add:        4285.7330      0.0758      0.0700      0.0800
Triad:      4285.7330      0.0747      0.0700      0.0900

The same binary on a 2.66 GHz Woodcrest:

Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:       5000.1240      0.0427      0.0400      0.0500
Scale:      5000.1240      0.0452      0.0400      0.0500
Add:        5000.0445      0.0685      0.0600      0.0800
Triad:      5000.0445      0.0712      0.0600      0.0800

Intel 9.1-compiled version (64-bit)

1-thread

Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:        4447.6829       0.1440       0.1439       0.1445
Scale:       4613.8072       0.1388       0.1387       0.1390
Add:         4256.9431       0.2256       0.2255       0.2259
Triad:       4187.6605       0.2294       0.2292       0.2302


2-threads

Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:        7288.3813       0.0882       0.0878       0.0893
Scale:       7186.2381       0.0891       0.0891       0.0893
Add:         7085.0852       0.1357       0.1355       0.1365
Triad:       6916.0273       0.1389       0.1388       0.1392


3-threads

Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:        6589.2489       0.0989       0.0971       0.1001
Scale:       6528.4171       0.0988       0.0980       0.0997
Add:         6535.0076       0.1488       0.1469       0.1504
Triad:       6563.9202       0.1486       0.1463       0.1496


4-threads

Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:        6645.4125       0.0965       0.0963       0.0976
Scale:       6994.6233       0.0916       0.0915       0.0917
Add:         6373.0207       0.1508       0.1506       0.1509
Triad:       6710.7522       0.1432       0.1431       0.1433

I may have been Bill's 10 GB/s source, and that may have been a mixup on my part.

FWIW: the PathScale-compiled binaries on this machine give

Function     Rate (MB/s)  Avg time   Min time  Max time
Copy:       7272.4071      0.0453      0.0440      0.0480
Scale:      7272.2298      0.0462      0.0440      0.0480
Add:        5999.6258      0.0827      0.0800      0.0840
Triad:      5999.6302      0.0831      0.0800      0.0840

and the PGI-compiled ones give

Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:        6608.0161       0.0970       0.0969       0.0977
Scale:       4592.3298       0.1395       0.1394       0.1397
Add:         4259.8885       0.2262       0.2254       0.2269
Triad:       4244.0478       0.2269       0.2262       0.2273

These may be slightly different versions of the original STREAM source (note the column labels: "Avg time" in some runs versus "RMS time" in others), but the core measurements are the same.


On the Opteron 275 we have two memory nodes, each with multiple memory banks (a quick pinning experiment is sketched after the numactl output below).

[EMAIL PROTECTED]:~/stream> numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3
cpubind:
nodebind:
membind: 0 1

[EMAIL PROTECTED]:~/stream> numactl --hardware
available: 2 nodes (0-1)
node 0 size: 2015 MB
node 0 free: 1276 MB
node 1 size: 4025 MB
node 1 free: 2416 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10
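
A worthwhile experiment on a box like this is to pin the run and compare local, remote, and interleaved memory placement (a sketch; the binary name is my assumption):

# node 0 CPUs, node 0 (local) memory
numactl --cpunodebind=0 --membind=0 ./stream

# node 0 CPUs, node 1 (remote) memory
numactl --cpunodebind=0 --membind=1 ./stream

# interleave pages across both nodes
numactl --interleave=all ./stream

The local/remote gap should roughly track the 10/20 node distances above.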


On the Woodcrest, it looks like a single memory node.

[EMAIL PROTECTED]:~> numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3
cpubind:
nodebind:
membind: 0


[EMAIL PROTECTED]:~> numactl --hardware
available: 1 nodes (0-0)
node 0 size: 4017 MB
node 0 free: 2649 MB
node distances:
node   0
  0:  10


I have it on good authority that with the other chipset (we have a Blackford here) we should see higher numbers, though not exceeding the Opteron 275.

When I have time, I will investigate this more and write about it on my blog. FWIW, I am not seeing a clear performance picture emerging. I have heard speculation/rumor from others, but I prefer measurement, and my measurements, while consistent, are not exposing a nice and meaningful picture where I can say "yes, it's faster" or "no, it isn't".

What I can say is that Woodcrest is interesting. It just may be overhyped by a "compliant" media.



--

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
