Hi all,
I built the latest LAPACK package with the latest gfortran and
profiled it to find the hotspots. The final binary spends a lot of
time in libm-2.7.so. After narrowing it down by binary search, I
traced this to the following source code (I attached the .f file):
(1) source code snippet, compiled with: gfortran -O3 zlange.f -S
      DOUBLE PRECISION FUNCTION ZLANGE( NORM, M, N, A, LDA, WORK )
      DO 40 J = 1, N
         SUM = ZERO
         DO 30 I = 1, M
            SUM = SUM + ABS( A( I, J ) )
   30    CONTINUE
         VALUE = MAX( VALUE, SUM )
   40 CONTINUE
(2) gfortran version, followed by the assembly code it generated for
the inner loop:
[EMAIL PROTECTED]:~/math/lapack-gfortran/SRC$ gfortran -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../trunk/configure
--prefix=/home/tianwei/gcc/trunk-install
--enable-languages=c,c++,fortran --disable-multilib
--disable-bootstrap
Thread model: posix
gcc version 4.4.0 20081106 (experimental) (GCC)
.L10:
movsd (%rbx), %xmm0
movsd 8(%rbx), %xmm1
movsd %xmm2, 16(%rsp)
call cabs
(3) the cabs definition in libm-2.7.so:
[EMAIL PROTECTED]:~/math/lapack-gfortran/SRC$ readelf -s /lib/libm-2.7.so | grep cabs
72: 0000000000030150 23 FUNC WEAK DEFAULT 12 cabsf@@GLIBC_2.2.5
78: 0000000000038310 5 FUNC WEAK DEFAULT 12 cabsl@@GLIBC_2.2.5
231: 0000000000025110 5 FUNC WEAK DEFAULT 12 cabs@@GLIBC_2.2.5
(4) The performance gap for this intrinsic is about 15% on my x86
Core 2 desktop compared with other compilers.
(5) I searched the web and found that gfortran should support this
intrinsic. Can anyone give me some suggestions for this problem?
Thanks very much.
Tianwei
--
Sheng, Tianwei
Inst. of High Performance Computing
Dept. of Computer Sci. & Tech.
Tsinghua Univ.