http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59697
Bug ID: 59697
Summary: Function attribute __target__ ("no-avx") works on
Windows/mingw but fails on Linux.
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: oystein at gnubg dot org
Created attachment 31754
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31754&action=edit
Example source code file
Hi,
Last week I tried to compile a file in which one function should use SSE
instructions and another AVX (see invalid bug report #59657). I have now tried
to solve this by using function attributes:
static void calculate_sse(float *data, float scale, int size ) __attribute__
((__target__ ("no-avx")));
static void calculate_avx(float *data, float scale, int size );
(Full example code attached)
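For readers without the attachment, the layout described above can be sketched
roughly as follows. This is not the attached sse_test.c: the plain scalar loop
bodies, the TARGET_NO_AVX / TARGET_AVX / HAVE_X86_TARGET_ATTR macros and the
scale_array dispatcher are all assumptions made for illustration, and the
attributes are repeated on the definitions instead of appearing only on the
declarations. The sketch relies on the GCC builtins __builtin_cpu_init and
__builtin_cpu_supports, which exist on x86 from GCC 4.8 onwards.

```c
#include <stddef.h>

/* Assumption: GCC-compatible compiler on x86. Elsewhere the hypothetical
 * macros below expand to nothing and only the baseline path is used. */
#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
#define TARGET_NO_AVX __attribute__((__target__("no-avx")))
#define TARGET_AVX    __attribute__((__target__("avx")))
#define HAVE_X86_TARGET_ATTR 1
#else
#define TARGET_NO_AVX
#define TARGET_AVX
#define HAVE_X86_TARGET_ATTR 0
#endif

/* Declarations in the style used in the report. */
static void calculate_sse(float *data, float scale, int size) TARGET_NO_AVX;
static void calculate_avx(float *data, float scale, int size) TARGET_AVX;

/* Must not contain AVX instructions, even when the file is built with -mavx. */
TARGET_NO_AVX
static void calculate_sse(float *data, float scale, int size)
{
    int i;
    for (i = 0; i < size; i++)
        data[i] *= scale;
}

/* May be compiled with AVX instructions. */
TARGET_AVX
static void calculate_avx(float *data, float scale, int size)
{
    int i;
    for (i = 0; i < size; i++)
        data[i] *= scale;
}

/* Hypothetical dispatcher: choose the variant at run time so that the
 * AVX code is never executed on a CPU without AVX. */
void scale_array(float *data, float scale, int size)
{
    int use_avx = 0;
#if HAVE_X86_TARGET_ATTR
    __builtin_cpu_init();
    use_avx = __builtin_cpu_supports("avx") != 0;
#endif
    if (use_avx)
        calculate_avx(data, scale, size);
    else
        calculate_sse(data, scale, size);
}
```

If the attribute is honoured, the disassembly of calculate_sse should contain
no v-prefixed (VEX-encoded) instructions such as vmulps or vmovaps, whatever
-m options are given on the command line.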
When I compile this on my mingw system, the generated code is as expected, and
the program runs fine on my non-AVX Windows machine.
Compiled with:
gcc -Wall -O3 -g -mavx sse_test.c -o sse_test
[10:52:28,97 C:\APPL\ssetest]# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=c:/appl/mingw/bin/../libexec/gcc/mingw32/4.7.2/lto-wrapper.exe
Target: mingw32
Configured with: ../gcc-4.7.2/configure
--enable-languages=c,c++,ada,fortran,objc,obj-c++
--disable-sjlj-exceptions --with-dwarf2 --enable-shared --enable-libgomp
--disable-win32-registry --enable-libstdcxx-debug
--disable-build-poststage1-with-cxx --enable-version-specific-runtime-libs
--build=mingw32 --prefix=/mingw
Thread model: win32
gcc version 4.7.2 (GCC)
Disassembly of the calculate_sse function looks like this:
(gdb) disassemble calculate_sse
Dump of assembler code for function calculate_sse:
0x0040138c <+0>: mov 0xc(%esp),%eax
0x00401390 <+4>: sar $0x2,%eax
0x00401393 <+7>: lea -0x1(%eax),%edx
0x00401396 <+10>: test %eax,%eax
0x00401398 <+12>: je 0x4013be <calculate_sse+50>
0x0040139a <+14>: mov 0x4(%esp),%eax
0x0040139e <+18>: movss 0x8(%esp),%xmm0
0x004013a4 <+24>: shufps $0x0,%xmm0,%xmm0
0x004013a8 <+28>: movaps %xmm0,%xmm1
0x004013ab <+31>: nop
0x004013ac <+32>: movaps (%eax),%xmm0
0x004013af <+35>: mulps %xmm1,%xmm0
0x004013b2 <+38>: movaps %xmm0,(%eax)
0x004013b5 <+41>: add $0x10,%eax
0x004013b8 <+44>: dec %edx
0x004013b9 <+45>: cmp $0xffffffff,%edx
0x004013bc <+48>: jne 0x4013ac <calculate_sse+32>
0x004013be <+50>: ret
End of assembler dump.
All nice and as expected on Windows/mingw!
I then tried the same code with the same function attribute on my Arch Linux
laptop (also a non-AVX CPU). On this system, however, the calculate_sse function
is compiled with AVX instructions despite the function attribute.
I compile with the same command line:
gcc -Wall -O3 -g -mavx sse_test.c -o sse_test
[oystein@oysteins-laptop ~]$ gcc --version
gcc (GCC) 4.8.2 20131219 (prerelease)
Resulting disassembly of the calculate_sse function:
(gdb) disassemble
Dump of assembler code for function calculate_sse:
0x08048440 <+0>: mov 0xc(%esp),%ecx
0x08048444 <+4>: mov 0x4(%esp),%eax
0x08048448 <+8>: sar $0x2,%ecx
0x0804844b <+11>: test %ecx,%ecx
0x0804844d <+13>: lea -0x1(%ecx),%edx
0x08048450 <+16>: je 0x8048474 <calculate_sse+52>
=> 0x08048452 <+18>: vbroadcastss 0x8(%esp),%xmm1
0x08048459 <+25>: lea 0x0(%esi,%eiz,1),%esi
0x08048460 <+32>: vmulps (%eax),%xmm1,%xmm0
0x08048464 <+36>: sub $0x1,%edx
0x08048467 <+39>: add $0x10,%eax
0x0804846a <+42>: vmovaps %xmm0,-0x10(%eax)
0x0804846f <+47>: cmp $0xffffffff,%edx
0x08048472 <+50>: jne 0x8048460 <calculate_sse+32>
0x08048474 <+52>: repz ret
End of assembler dump.
I'm never quite sure what to expect, but at least I did not expect the behavior
to differ between the two systems. It might not be Windows versus Linux that
makes the difference, but rather the GCC version (4.7.x versus 4.8.x). What do
I know?
Thanks,
-Øystein