https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71990

            Bug ID: 71990
           Summary: Function multiversioning prohibits inlining
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sgunderson at bigfoot dot com
  Target Milestone: ---

Hi,

I'm trying to write a library that uses F16C instructions in certain places.
Since they're not universally available (and ld.so hardware capabilities
seem to have been long abandoned), I've tried to use function multiversioning
for it. However, combining it with inlining doesn't seem to work; a very
simplified example:

klump:~> /usr/lib/gcc-snapshot/bin/g++ -v                 
Using built-in specs.                  
COLLECT_GCC=/usr/lib/gcc-snapshot/bin/g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc-snapshot/libexec/gcc/x86_64-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 20160707-1'
--with-bugurl=file:///usr/share/doc/gcc-snapshot/README.Bugs
--enable-languages=c,ada,c++,java,go,fortran,objc,obj-c++
--prefix=/usr/lib/gcc-snapshot --enable-shared --enable-linker-build-id
--disable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-libmpx
--enable-plugin --with-system-zlib --disable-browser-plugin
--enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-7-snap-amd64/jre
--enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-7-snap-amd64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-7-snap-amd64
--with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--enable-objc-gc --enable-multiarch --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--disable-werror --enable-checking=yes --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.0.0 20160707 (experimental) [trunk revision 238117] (Debian
20160707-1) 

klump:~> cat test.cc                     
#include <stdio.h>

__attribute__ ((target("default")))
inline int foo()
{
        return 0;
}

__attribute__ ((target("avx")))
inline int foo()
{
        return 1;
}

int bar()
{
        int sum = 0;
        for (int i = 0; i < 100; ++i) {
                sum += foo();
        }
        return sum;
}

int main(void)
{
        printf("%d\n", bar());
}

klump:~> /usr/lib/gcc-snapshot/bin/g++ -O2 -o test test.cc
klump:~> nm --demangle test | egrep 'foo|bar' 
0000000000400c40 i _Z3foov.ifunc()
0000000000400bf0 T bar()
0000000000400c20 W foo()
0000000000400c30 W foo() [clone .avx]
0000000000400c40 W foo() [clone .resolver]

Of course, in reality, my foo() would do something more complicated, like call
_cvtss_sh() or similar; this is a toy example. But it illustrates that the
function multiversioning blocks inlining.

If I compile with -mavx, the entire multiversioning goes away (only the AVX
version is emitted), so I hoped that I could use target cloning on bar():

__attribute__ ((target_clones("avx", "default")))
int bar()
{
    // same code...

but unfortunately, no. There's a bar() clone for AVX emitted, but it still
calls the resolving function for foo(); no inlining.

So I can't find any usable way of using this feature when the architecture
switch is in inlined functions (in my case, conversion to/from fp16).
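
For comparison, here is a sketch of a manual workaround that does not use
function multiversioning at all. It is not from the report. The names
foo_default, foo_avx, bar_default and bar_avx are hypothetical, and it assumes
a single run-time check with __builtin_cpu_supports() at the bar() level is
acceptable. Each foo variant is called only from a caller with the same target
options, so the inliner is at least permitted to inline it; whether it does
depends on the optimization level.

```cpp
// Hypothetical names throughout; a sketch, not the multiversioning feature.
#if defined(__x86_64__) || defined(__i386__)
#define HAVE_X86_DISPATCH 1
#endif

// Baseline variant, compiled with the default target options.
static inline int foo_default()
{
        return 0;
}

#ifdef HAVE_X86_DISPATCH
// AVX variant under a distinct name, so no ifunc resolver is involved.
__attribute__ ((target("avx")))
static inline int foo_avx()
{
        return 1;
}
#endif

// Caller and callee share the default target options.
static int bar_default()
{
        int sum = 0;
        for (int i = 0; i < 100; ++i) {
                sum += foo_default();
        }
        return sum;
}

#ifdef HAVE_X86_DISPATCH
// Caller and callee are both compiled for AVX.
__attribute__ ((target("avx")))
static int bar_avx()
{
        int sum = 0;
        for (int i = 0; i < 100; ++i) {
                sum += foo_avx();
        }
        return sum;
}
#endif

// One run-time check per call to bar(), instead of one indirect call per
// loop iteration.
int bar()
{
#ifdef HAVE_X86_DISPATCH
        if (__builtin_cpu_supports("avx")) {
                return bar_avx();
        }
#endif
        return bar_default();
}
```

Calling bar() from main() as in test.cc should print 100 on a CPU with AVX
and 0 otherwise. The obvious cost is that the loop body has to be duplicated
by hand, which is what target_clones was supposed to avoid.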
