------- Comment #4 from whaley at cs dot utsa dot edu  2007-06-27 17:00 -------
Andrew,

>PowerPC970FX is not a direct descendent of Power5

Sorry, I completely misremembered this.  Since Power4 didn't suffer as badly
as Power5 (I think it lost maybe 10% rather than 50%), maybe the 970 will
also not die.

>so I think the case is the register allocator is messing up (which is already 
>known)

OK, can you point me to the bug report?  Is there some way to confirm this
is the problem, rather than the scheduling pass itself?

>The other thing is what options are you using to invoke GCC with?

My Makefile shows them.  The gcc3-derived flags are:
   -mcpu=power5 -mtune=power5 -O3 -m64
For gcc4, I get most of my performance back if I add:
   -fno-schedule-insns -fno-rerun-loop-opt
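
One way to see which of the two added flags accounts for the recovery is to
build dgemm_atlas.c once per on/off combination and time each binary. The
sketch below only prints the four candidate command lines; it does not run
the compiler. The flags and file name are taken from this report, while the
`variants` helper and the tag names are made up for illustration.

```python
from itertools import product

# Flags as they appear in the make output of this report.
DEFS = ["-DREPS=1000", "-DWALL"]
BASE = ["-mcpu=power5", "-mtune=power5", "-O3", "-m64"]
# The two flags added for gcc4.
EXTRA = ["-fno-schedule-insns", "-fno-rerun-loop-opt"]


def variants(cc="gcc"):
    """Yield (tag, argv) for every on/off combination of the EXTRA flags.

    The tag is a hypothetical label, useful for naming the resulting object.
    """
    for mask in product([False, True], repeat=len(EXTRA)):
        extra = [flag for flag, on in zip(EXTRA, mask) if on]
        tag = "+".join(flag.lstrip("-") for flag in extra) if extra else "base"
        yield tag, [cc, *DEFS, *BASE, *extra, "-c", "dgemm_atlas.c"]


for tag, argv in variants():
    print(f"{tag}: {' '.join(argv)}")
```

If only one of the single-flag builds recovers the gcc3 numbers, that narrows
down which pass to look at.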

I include below example output and arch info on the machine I created the
benchmark on (forgot to include it before, sorry).

Thanks,
Clint

r78n04 noibm122/TEST> uname -a
Linux r78n04 2.6.5-7.244-pseries64 #1 SMP Mon Dec 12 18:32:25 UTC 2005 ppc64
ppc64 ppc64 GNU/Linux

r78n04 noibm122/TEST> /usr/bin/gcc -v
Reading specs from /usr/lib/gcc-lib/powerpc-suse-linux/3.3.3/specs
Configured with: ../configure --enable-threads=posix --prefix=/usr
--with-local-prefix=/usr/local --infodir=/usr/share/info
--mandir=/usr/share/man --enable-languages=c,c++,f77,objc,java,ada
--disable-checking --libdir=/usr/lib --enable-libgcj
--with-gxx-include-dir=/usr/include/g++ --with-slibdir=/lib --with-system-zlib
--enable-shared --enable-__cxa_atexit --host=powerpc-suse-linux
--build=powerpc-suse-linux --target=powerpc-suse-linux
--enable-targets=powerpc64-suse-linux --enable-biarch
Thread model: posix
gcc version 3.3.3 (SuSE Linux)

r78n04 noibm122/TEST> gcc -v
Using built-in specs.
Target: powerpc64-unknown-linux-gnu
Configured with: ../configure --prefix=/home/whaley/local/linux
--enable-languages=c --with-gmp=/u/noibm122/local/linux
--with-mpfr-lib=/u/noibm122/local/linux/lib
--with-mpfr-include=/u/noibm122/local/linux/include
Thread model: posix
gcc version 4.2.0

r78n04 TEST/MMBENCH_PPC> make all
/usr/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -c
mmbench.c
/usr/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -c
dgemm_atlas.c
/usr/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -o
xdmm_gcc3 mmbench.o dgemm_atlas.o
rm -f *.o
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL
-mcpu=power5 -mtune=power5 -O3 -m64 -c mmbench.c
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL
-mcpu=power5 -mtune=power5 -O3 -m64 -c dgemm_atlas.c
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL
-mcpu=power5 -mtune=power5 -O3 -m64 -o xdmm_gcc4 mmbench.o dgemm_atlas.o
rm -f *.o
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL
-mcpu=power5 -mtune=power5 -O3 -m64 -c mmbench.c
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL
-mcpu=power5 -mtune=power5 -O3 -m64 -fno-schedule-insns -fno-rerun-loop-opt -c \
                dgemm_atlas.c
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL
-mcpu=power5 -mtune=power5 -O3 -m64 -o xdmm_gcc4_nosched mmbench.o
dgemm_atlas.o
rm -f *.o
echo "GCC 3.x performance:"
GCC 3.x performance:
./xdmm_gcc3
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========

atlasmm       40   1000       0.026     4998.24

echo "GCC 4.2 performance:"
GCC 4.2 performance:
./xdmm_gcc4
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========

atlasmm       40   1000       0.034     3806.35

echo "GCC 4.2 w/o scheduling performance:"
GCC 4.2 w/o scheduling performance:
./xdmm_gcc4_nosched
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========

atlasmm       40   1000       0.025     5044.53
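
To put the three runs on one scale, the sketch below expresses each gcc4
result relative to the gcc3 build, using the MFLOPS figures printed above. It
also checks the reported TIME column against the MFLOPS column under one
assumption that is not stated in this report: that each rep is a single
NB x NB x NB matrix multiply counted as 2*NB^3 flops. The function names are
illustrative only.

```python
# MFLOPS figures copied from the three runs above.
mflops = {
    "gcc3": 4998.24,
    "gcc4": 3806.35,
    "gcc4_nosched": 5044.53,
}


def relative_to_gcc3(name):
    """Throughput of `name` as a fraction of the GCC 3.3.3 build."""
    return mflops[name] / mflops["gcc3"]


for name in ("gcc4", "gcc4_nosched"):
    change = (relative_to_gcc3(name) - 1.0) * 100.0
    print(f"{name:14s} {change:+6.1f}% vs gcc3")

# ASSUMPTION (not from the report): one rep = one NB x NB x NB multiply,
# counted as 2*NB**3 floating-point operations.
NB, REPS = 40, 1000
total_flops = 2 * NB**3 * REPS


def implied_time(rate_mflops):
    """Seconds implied by a reported MFLOPS rate under the assumption above."""
    return total_flops / (rate_mflops * 1e6)


for name, rate in mflops.items():
    print(f"{name:14s} implied time {implied_time(rate):.3f} s")
```

On these numbers the plain gcc4 build loses a little under a quarter of the
gcc3 throughput, while the build without the first scheduling pass lands
within about 1% of gcc3.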


-- 

whaley at cs dot utsa dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |c


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523
