http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59486
Bug ID: 59486
Summary: math functions take more cycles after running any
Intel AVX function
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: kayan4096 at gmail dot com
Created attachment 31427
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31427&action=edit
the .i file
When any 256-bit AVX instruction is used, the "AVX upper state" becomes
"dirty", which results in a performance hit to all subsequent legacy (non-VEX
SSE) library calls, such as the libm math functions.
This is documented in the Intel Optimization Manual.
GCC should clear the upper halves of the YMM registers (for example with
vzeroupper) after using AVX.
For the attached foo.i, the results we get are:
round res 31999997 total cycles 224725952 CPI 22
round res 31999997 total cycles 1900864520 CPI 190
while the expected results are:
round res 31999997 total cycles 224725952 CPI 22
round res 31999997 total cycles 224725952 CPI 22
This is also described here:
http://stackoverflow.com/questions/20545539/math-functions-takes-more-cycles-after-running-any-intel-avx-function
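Until the compiler does this automatically, the cleanup requested above can be
done by hand. The following is only a minimal sketch and is not taken from the
attached foo.i: the names sum8_avx and sum8 are hypothetical, and it assumes a
GCC or Clang recent enough to accept the target("avx") function attribute and
__builtin_cpu_supports. It calls the _mm256_zeroupper() intrinsic (which emits
vzeroupper) as the last AVX operation, so that code called afterwards starts
with a clean upper state.

```c
#include <immintrin.h>
#include <math.h>

/* Hypothetical helper: sums 8 floats with one 256-bit AVX load, then clears
   the upper halves of the YMM registers before returning to the caller. */
__attribute__((target("avx")))
static float sum8_avx(const float *v)
{
    __m256 x  = _mm256_loadu_ps(v);             /* 256-bit AVX load          */
    __m128 lo = _mm256_castps256_ps128(x);      /* lower four floats         */
    __m128 hi = _mm256_extractf128_ps(x, 1);    /* upper four floats         */
    __m128 s  = _mm_add_ps(lo, hi);             /* four partial sums         */
    float out[4];
    _mm_storeu_ps(out, s);
    float total = out[0] + out[1] + out[2] + out[3];
    _mm256_zeroupper();                         /* emit vzeroupper           */
    return total;
}

/* Hypothetical wrapper: uses the AVX path only when the CPU reports AVX
   support, otherwise falls back to a plain scalar loop. */
float sum8(const float *v)
{
    if (__builtin_cpu_supports("avx"))
        return sum8_avx(v);

    float total = 0.0f;
    for (int i = 0; i < 8; i++)
        total += v[i];
    return total;
}
```

A caller would invoke sum8() and then call round() or another libm function
as usual. Whether this removes the slowdown shown above depends on the CPU and
on how the library was built, so it should be measured rather than assumed.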
"gcc -v -save-temps -Wall -mavx -lm foo.c" output:
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
--disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC)
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/cc1 -E -quiet -v foo.c -mavx
-mtune=generic -Wall -fpch-preprocess -o foo.i
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-redhat-linux/4.4.6/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../x86_64-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
/usr/local/include
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/include
/usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/cc1 -fpreprocessed foo.i -quiet
-dumpbase foo.c -mavx -mtune=generic -auxbase foo -Wall -version -o foo.s
GNU C (GCC) version 4.4.6 20110731 (Red Hat 4.4.6-3) (x86_64-redhat-linux)
compiled by GNU C version 4.4.6 20110731 (Red Hat 4.4.6-3), GMP version
4.3.1, MPFR version 2.4.1.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 11bca756726d0c8e79657fd5bb53575a
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
as -V -Qy -msse2avx -o foo.o foo.s
GNU assembler version 2.20.51.0.2 (x86_64-redhat-linux) using BFD version
version 2.20.51.0.2-5.28.el6 20091009
COMPILER_PATH=/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/:/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/:/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/
LIBRARY_PATH=/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-mavx' '-mtune=generic'
/usr/libexec/gcc/x86_64-redhat-linux/4.4.6/collect2 --eh-frame-hdr --build-id
-m elf_x86_64 --hash-style=gnu -dynamic-linker /lib64/ld-linux-x86-64.so.2
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/crt1.o
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/crti.o
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/crtbegin.o
-L/usr/lib/gcc/x86_64-redhat-linux/4.4.6
-L/usr/lib/gcc/x86_64-redhat-linux/4.4.6
-L/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64 -L/lib/../lib64
-L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../.. -lm foo.o
-lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s
--no-as-needed /usr/lib/gcc/x86_64-redhat-linux/4.4.6/crtend.o
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/crtn.o