Re: [Beowulf] bizarre scaling behavior on a Nehalem

Mikhail Kuzminsky Fri, 14 Aug 2009 08:19:00 -0700

In message from Bill Broadley <b...@cse.ucdavis.edu> (Thu, 13 Aug 2009 17:09:24 -0700):

Tom Elken wrote:
To add some details to what Christian says, the HPC Challenge version of
STREAM uses dynamic arrays and is hard to optimize. I don't know what's
best with current compiler versions, but you could try some of these that
were used in past HPCC submissions with your program, Bill:

Thanks for the heads up, I've checked the specbench.org compiler options for
hints on where to start with optimization flags, but I didn't know about the
dynamic stream.

Is the HPC challenge code open source?


Yes, they are open.

PathScale 2.2.1 on Opteron:
Base OPT flags: -O3 -OPT:Ofast:fold_reassociate=0
STREAMFLAGS=-O3 -OPT:Ofast:fold_reassociate=0 -OPT:alias=restrict:align_unsafe=on -CG:movnti=1

Alas my PathScale license expired and I believe with SiCortex's death (RIP)
I can't renew it.


Now I understand that I was wise :-)

(we purchased a perpetual academic license). BTW, does anybody know about the future of the PathScale compilers (if there will be one)?


Mikhail


I tried open64-4.2.2 with those flags and on a nehalem single socket:

$ opencc -O4 -fopenmp stream.c -o stream-open64 -static
$ opencc -O4 -fopenmp stream-malloc.c -o stream-open64-malloc -static

$ ./stream-open64
Total memory required = 457.8 MB.
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:       22061.4958       0.0145       0.0145       0.0146
Scale:      22228.4705       0.0144       0.0144       0.0145
Add:        20659.2638       0.0233       0.0232       0.0233
Triad:      20511.0888       0.0235       0.0234       0.0235

Dynamic:
$ ./stream-open64-malloc

Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:       14436.5155       0.0222       0.0222       0.0222
Scale:      14667.4821       0.0218       0.0218       0.0219
Add:        15739.7070       0.0305       0.0305       0.0305
Triad:      15770.7775       0.0305       0.0304       0.0305
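
As a cross-check on the tables above, the Rate column can be reproduced from the array length and the Min time column. The sketch below assumes N = 20,000,000 doubles per array, which is inferred from the reported 457.8 MB total (3 arrays x 8 bytes x N) rather than stated anywhere in this thread, and the name stream_rate_mbs is made up for illustration. STREAM counts 1 MB as 10^6 bytes; Copy and Scale touch two arrays per iteration, Add and Triad touch three.

```c
#include <assert.h>
#include <stddef.h>

/* Arrays read or written per iteration by each STREAM kernel. */
enum { COPY_ARRAYS = 2, SCALE_ARRAYS = 2, ADD_ARRAYS = 3, TRIAD_ARRAYS = 3 };

/* Hypothetical helper: bandwidth in MB/s (1 MB = 1e6 bytes) for a kernel
 * that touches `arrays` arrays of `n` doubles in `seconds` seconds. */
double stream_rate_mbs(int arrays, size_t n, double seconds)
{
    assert(arrays > 0 && seconds > 0.0);
    double bytes = (double)arrays * (double)sizeof(double) * (double)n;
    return 1.0e-6 * bytes / seconds;
}
```

With the assumed N and the static Copy time of 0.0145 s, stream_rate_mbs(COPY_ARRAYS, 20000000, 0.0145) comes out near 22069 MB/s, in line with the 22061 MB/s printed (the time column is rounded). The same arithmetic puts the malloc build at roughly two thirds of the static Copy rate.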

Intel C/C++ Compiler 10.1 on Harpertown CPUs:
Base OPT flags:  -O2 -xT -ansi-alias -ip -i-static
Intel recently used
Intel C/C++ Compiler 11.0.081 on Nehalem CPUs:
         -O2 -xSSE4.2 -ansi-alias -ip

and got good STREAM results in their HPCC submission on their Endeavor cluster.


$ icc -O2 -xSSE4.2 -ansi-alias -ip -openmp stream.c -o stream-icc
$ icc -O2 -xSSE4.2 -ansi-alias -ip -openmp stream-malloc.c -o stream-icc-malloc

$ ./stream-icc | grep ":"
STREAM version $Revision: 5.9 $
Copy:       14767.0512       0.0022       0.0022       0.0022
Scale:      14304.3513       0.0022       0.0022       0.0023
Add:        15503.3568       0.0031       0.0031       0.0031
Triad:      15613.9749       0.0031       0.0031       0.0031
$ ./stream-icc-malloc | grep ":"
STREAM version $Revision: 5.9 $
Copy:       14604.7582       0.0022       0.0022       0.0022
Scale:      14480.2814       0.0022       0.0022       0.0022
Add:        15414.3321       0.0031       0.0031       0.0031
Triad:      15738.4765       0.0031       0.0030       0.0031

So ICC does manage zero penalty, alas no faster than open64 with the penalty.

I'll attempt to track down the HPCC stream source code to see if their dynamic
arrays are any friendlier than mine (I just use malloc).

In any case many thanks for the pointer.

Oh, my dynamic tweak:
$ diff stream.c stream-malloc.c
43a44
> # include <stdlib.h>
97c98
< static double      a[N+OFFSET],
---
> /* static double        a[N+OFFSET],
99c100,102
<            c[N+OFFSET];
---
>                 c[N+OFFSET]; */
>
> double *a, *b, *c;
134a138,142
>
>     a=(double *)malloc(sizeof(double)*(N+OFFSET));
>     b=(double *)malloc(sizeof(double)*(N+OFFSET));
>     c=(double *)malloc(sizeof(double)*(N+OFFSET));
>
283c291,293
<
---
>     free(a);
>     free(b);
>     free(c);
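
One plausible reason a malloc'd build is harder to optimize (an assumption on my part, not something measured in this thread) is that the compiler can no longer prove that a, b and c never overlap, which is what flags such as -OPT:alias=restrict or -ansi-alias try to address. A minimal sketch of stating that promise in the source instead, using C99 restrict, might look like the following. The function names triad and triad_selftest are hypothetical, the OpenMP pragmas of the real stream.c are left out, and the allocations are checked for failure, which the tweak above does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Triad kernel: a[j] = b[j] + scalar * c[j].
 * The restrict qualifiers promise the compiler that the three arrays
 * do not overlap, so it is free to vectorize and reorder accesses. */
void triad(double *restrict a, const double *restrict b,
           const double *restrict c, double scalar, size_t n)
{
    for (size_t j = 0; j < n; j++)
        a[j] = b[j] + scalar * c[j];
}

/* Hypothetical self-test: allocate three arrays of n doubles, run the
 * kernel once and verify the result.
 * Returns 0 on success, -1 if an allocation failed, 1 on a wrong value. */
int triad_selftest(size_t n)
{
    assert(n > 0);
    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    int status = 0;

    if (a == NULL || b == NULL || c == NULL) {
        status = -1;
    } else {
        for (size_t j = 0; j < n; j++) {
            a[j] = 0.0;
            b[j] = 1.0;
            c[j] = 2.0;
        }
        triad(a, b, c, 3.0, n);
        for (size_t j = 0; j < n; j++) {
            if (a[j] != 7.0) {   /* 1.0 + 3.0 * 2.0 is exact in binary */
                status = 1;
                break;
            }
        }
    }

    free(a);   /* free(NULL) is a no-op, so this is safe on failure too */
    free(b);
    free(c);
    return status;
}
```

Whether this closes the gap seen with open64 would have to be measured; it only removes one possible obstacle to the compiler.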





_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf



