https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95778
Bug ID: 95778
Summary: target_clones indirection eliminates requires noinline
Product: gcc
Version: 10.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: yyc1992 at gmail dot com
Target Milestone: ---

Compiling

```
static __attribute__((noinline,target_clones("default,avx2"))) int f2(int *p)
{
    asm volatile ("" :: "r"(p) : "memory");
    return *p;
}

__attribute__((target_clones("default,avx2"))) int g2(int *p)
{
    return f2(p);
}
```

with `-fPIC -O3` generates

```
g2.avx2.0:
        jmp     f2.avx2.0
```

However, if either of the two `noinline` attributes is removed, the generated code becomes

```
g2.avx2.0:
        jmp     f2@PLT
```

which cannot be eliminated later (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95776).

I think this optimization should be possible, and possible without LTO. That makes this a slightly different bug from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95776, even though fixing that one and turning on LTO could partially fix this one.

Also, in this case `f2` should be inlinable into `g2`. However, no combination of `inline`, `always_inline`, and `flatten` that I've tested achieves that, even though gcc clearly knows which clone calls which when both functions are marked `noinline`, so it should have no problem inlining.
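For comparison, here is a minimal sketch of one possible workaround. It is not a fix for the behavior reported above, and it changes the test case: the callee loses its own `target_clones` attribute and becomes a plain `always_inline` helper, so each clone of `g2` should get its own inlined copy compiled with that clone's target options. The names `f2_body` and `CLONES` are made up for this illustration, and the preprocessor guard is an assumption about where `target_clones("default,avx2")` is usable (x86-64 with a glibc-style ifunc-capable toolchain).

```c
#include <stdio.h>  /* any libc header; on glibc this also defines __GLIBC__ */

/* Assumption: target_clones with "avx2" only makes sense on x86-64 with
 * ifunc support, so fall back to a plain function everywhere else. */
#if defined(__GNUC__) && defined(__x86_64__) && defined(__GLIBC__)
#define CLONES __attribute__((target_clones("default,avx2")))
#else
#define CLONES
#endif

/* Hypothetical helper: same body as f2 in the report, but with no
 * target_clones of its own, so it can be inlined into every clone of g2. */
static inline __attribute__((always_inline)) int f2_body(int *p)
{
    __asm__ volatile ("" :: "r"(p) : "memory");
    return *p;
}

/* Only the externally visible entry point is cloned and dispatched. */
CLONES int g2(int *p)
{
    return f2_body(p);
}
```

With this layout the only runtime dispatch is on the call to `g2` itself. The cost is that `f2_body` is no longer available as a separately dispatched function, which is exactly what the report asks the compiler to handle automatically.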