http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59697
Bug ID: 59697
Summary: Function attribute __target__("no-avx") works on Windows/mingw but fails on Linux.
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: oystein at gnubg dot org

Created attachment 31754
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31754&action=edit
Example source code file

Hi,

Last week I tried to compile a file with one function using SSE instructions and another using AVX (see invalid bug report #59657). I have now tried to solve this with function attributes:

    static void calculate_sse(float *data, float scale, int size ) __attribute__ ((__target__ ("no-avx")));
    static void calculate_avx(float *data, float scale, int size );

(Full example code attached)

When I compile this on my mingw system, the produced code is as expected, and the program runs fine on my non-AVX Windows machine.

Compiled with:

    gcc -Wall -O3 -g -mavx sse_test.c -o sse_test

    [10:52:28,97 C:\APPL\ssetest]# gcc -v
    Using built-in specs.
    COLLECT_GCC=gcc
    COLLECT_LTO_WRAPPER=c:/appl/mingw/bin/../libexec/gcc/mingw32/4.7.2/lto-wrapper.exe
    Target: mingw32
    Configured with: ../gcc-4.7.2/configure --enable-languages=c,c++,ada,fortran,objc,obj-c++ --disable-sjlj-exceptions --with-dwarf2 --enable-shared --enable-libgomp --disable-win32-registry --enable-libstdcxx-debug --disable-build-poststage1-with-cxx --enable-version-specific-runtime-libs --build=mingw32 --prefix=/mingw
    Thread model: win32
    gcc version 4.7.2 (GCC)

Disassembly of the calculate_sse function looks like this:

    (gdb) disassemble calculate_sse
    Dump of assembler code for function calculate_sse:
       0x0040138c <+0>:     mov    0xc(%esp),%eax
       0x00401390 <+4>:     sar    $0x2,%eax
       0x00401393 <+7>:     lea    -0x1(%eax),%edx
       0x00401396 <+10>:    test   %eax,%eax
       0x00401398 <+12>:    je     0x4013be <calculate_sse+50>
       0x0040139a <+14>:    mov    0x4(%esp),%eax
       0x0040139e <+18>:    movss  0x8(%esp),%xmm0
       0x004013a4 <+24>:    shufps $0x0,%xmm0,%xmm0
       0x004013a8 <+28>:    movaps %xmm0,%xmm1
       0x004013ab <+31>:    nop
       0x004013ac <+32>:    movaps (%eax),%xmm0
       0x004013af <+35>:    mulps  %xmm1,%xmm0
       0x004013b2 <+38>:    movaps %xmm0,(%eax)
       0x004013b5 <+41>:    add    $0x10,%eax
       0x004013b8 <+44>:    dec    %edx
       0x004013b9 <+45>:    cmp    $0xffffffff,%edx
       0x004013bc <+48>:    jne    0x4013ac <calculate_sse+32>
       0x004013be <+50>:    ret
    End of assembler dump.

All nice and as expected on Windows/mingw!

I then tried the same code with the same function attribute on my Arch Linux laptop (also a non-AVX CPU). But on this system the calculate_sse function is compiled with AVX instructions despite the function attribute.
I compile with the same command line:

    gcc -Wall -O3 -g -mavx sse_test.c -o sse_test

    [oystein@oysteins-laptop ~]$ gcc --version
    gcc (GCC) 4.8.2 20131219 (prerelease)

Produced disassembly of the calculate_sse function:

    (gdb) disassemble
    Dump of assembler code for function calculate_sse:
       0x08048440 <+0>:     mov    0xc(%esp),%ecx
       0x08048444 <+4>:     mov    0x4(%esp),%eax
       0x08048448 <+8>:     sar    $0x2,%ecx
       0x0804844b <+11>:    test   %ecx,%ecx
       0x0804844d <+13>:    lea    -0x1(%ecx),%edx
       0x08048450 <+16>:    je     0x8048474 <calculate_sse+52>
    => 0x08048452 <+18>:    vbroadcastss 0x8(%esp),%xmm1
       0x08048459 <+25>:    lea    0x0(%esi,%eiz,1),%esi
       0x08048460 <+32>:    vmulps (%eax),%xmm1,%xmm0
       0x08048464 <+36>:    sub    $0x1,%edx
       0x08048467 <+39>:    add    $0x10,%eax
       0x0804846a <+42>:    vmovaps %xmm0,-0x10(%eax)
       0x0804846f <+47>:    cmp    $0xffffffff,%edx
       0x08048472 <+50>:    jne    0x8048460 <calculate_sse+32>
       0x08048474 <+52>:    repz ret
    End of assembler dump.

I'm always unsure what to expect, but at least I did not expect the behavior to differ between the two systems. Well, it might not be Windows versus Linux that makes the difference, but rather the gcc version (4.7.x versus 4.8.x). What do I know?

Thanks,
-Øystein
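
For readers without the attachment, here is a minimal sketch of the kind of setup described above. It is not the attached sse_test.c: the loop bodies, the TARGET_NO_AVX and SKETCH_X86 macros, and the calculate() dispatcher are all assumptions made for illustration, and size is taken to be the number of floats. The dispatcher relies on __builtin_cpu_supports, which GCC provides from 4.8 onwards, so it would not build with the 4.7.2 mingw toolchain mentioned in the report.

```c
/* Hypothetical helper macros (not from the report): the target attribute is
   only applied on x86 builds with GCC or Clang; elsewhere it expands to nothing. */
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
#define SKETCH_X86 1
#define TARGET_NO_AVX __attribute__((__target__("no-avx")))
#else
#define SKETCH_X86 0
#define TARGET_NO_AVX
#endif

/* Same declarations as in the report, with the attribute behind a macro. */
static void calculate_sse(float *data, float scale, int size) TARGET_NO_AVX;
static void calculate_avx(float *data, float scale, int size);

/* Intended to be compiled without AVX, even when the file is built with -mavx. */
static void calculate_sse(float *data, float scale, int size)
{
    for (int i = 0; i < size; i++)
        data[i] *= scale;
}

/* Inherits the command-line options, so -mavx lets the compiler use AVX here. */
static void calculate_avx(float *data, float scale, int size)
{
    for (int i = 0; i < size; i++)
        data[i] *= scale;
}

/* Hypothetical dispatcher: pick the AVX variant only if the CPU reports AVX. */
static void calculate(float *data, float scale, int size)
{
#if SKETCH_X86
    if (__builtin_cpu_supports("avx")) {
        calculate_avx(data, scale, size);
        return;
    }
#endif
    calculate_sse(data, scale, size);
}
```

Built with the command line from the report (gcc -Wall -O3 -g -mavx), the thing to check is the disassembly of calculate_sse: it should contain no v-prefixed instructions such as vmulps. Whether that holds depends on the compiler version, which is exactly the question raised above.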