https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95778

            Bug ID: 95778
           Summary: target_clones indirection eliminates requires noinline
           Product: gcc
           Version: 10.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yyc1992 at gmail dot com
  Target Milestone: ---

Compiling

```
static __attribute__((noinline,target_clones("default,avx2"))) int f2(int *p)
{
    asm volatile ("" :: "r"(p) : "memory");
    return *p;
}

__attribute__((target_clones("default,avx2"))) int g2(int *p)
{
    return f2(p);
}
```

with `-fPIC -O3` generates


```
g2.avx2.0:
        jmp     f2.avx2.0
```

However, if either of the two `noinline` attributes is removed, the generated
code becomes

```
g2.avx2.0:
        jmp     f2@PLT
```

which cannot be eliminated later; see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95776

I think this optimization should be possible, and possible without LTO. That
makes it a slightly different bug from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95776, even though fixing that one
and turning on LTO could partially fix this one.

Also, in this case `f2` should be inlinable into `g2`. However, no combination
of `inline`, `always_inline`, and `flatten` that I have tested achieves that,
even though when both functions are marked with `noinline` gcc clearly knows
which clone calls which, so it should have no problem inlining.
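
For comparison, here is a rough sketch of a hand-written equivalent that does
get the body inlined into each variant. It is not part of the original test
case and is only one possible workaround: the names `f2_body`, `g2_default`,
`g2_avx2` and `g2_manual` are made up for illustration, the dispatch is a
per-call branch on `__builtin_cpu_supports` rather than an ifunc resolved once
at load time, and it assumes an x86 target for the AVX2 path.

```c
/* Shared body, written once. always_inline asks the compiler to inline it
   into each ISA-specific wrapper below, so no call to it should survive. */
static inline __attribute__((always_inline)) int f2_body(int *p)
{
    __asm__ volatile ("" :: "r"(p) : "memory");
    return *p;
}

#if defined(__x86_64__) || defined(__i386__)
/* Hand-written counterpart of the "avx2" clone (hypothetical name). */
static __attribute__((target("avx2"))) int g2_avx2(int *p)
{
    return f2_body(p);
}
#endif

/* Hand-written counterpart of the "default" clone (hypothetical name). */
static int g2_default(int *p)
{
    return f2_body(p);
}

/* Manual dispatch: the CPU check runs on every call, unlike the ifunc
   resolver that target_clones generates. */
int g2_manual(int *p)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2"))
        return g2_avx2(p);
#endif
    return g2_default(p);
}
```

This loses the main benefit of `target_clones` (a single definition with
automatic dispatch), which is why doing the same thing inside the compiler
would be preferable.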
