On Wed, Apr 9, 2025 at 8:54 AM Ard Biesheuvel <a...@kernel.org> wrote:
>
> On Wed, 9 Apr 2025 at 16:46, H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Wed, Apr 9, 2025 at 1:53 AM Ard Biesheuvel <ardb+...@google.com> wrote:
> > >
> > > From: Ard Biesheuvel <a...@kernel.org>
> > >
> > > Commit bde21de1205 ("i386: Honour -mdirect-extern-access when calling
> > > __fentry__") updated the logic that emits mcount() / __fentry__() calls
> > > into function prologues when profiling is enabled, to avoid GOT-based
> > > indirect calls when a direct call would suffice.
> > >
> > > There are two problems with that change:
> > > - it relies on -mdirect-extern-access rather than -fno-plt to decide
> > >   whether or not a direct [PLT based] call is appropriate;
> > > - for the PLT case, it falls through to x86_print_call_or_nop(), which
> > >   does not emit the @PLT suffix, resulting in the wrong relocation to be
> > >   used (R_X86_64_PC32 instead of R_X86_64_PLT32)
> > >
> > > Fix this by testing flag_plt instead of ix86_direct_extern_access, and
> > > updating x86_print_call_or_nop() to take flag_pic and flag_plt into
> > > account. This also ensures that -mnop-mcount works as expected when
> > > emitting the PLT based profiling calls.
> > >
> > > Note that only 64-bit codegen is affected by this change or by the
> > > commit referenced above; -m32 will yield 'call *mcount@GOT()' as before.
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386
> > >
> > > Signed-off-by: Ard Biesheuvel <a...@kernel.org>
> > >
> > > gcc/ChangeLog:
> > >
> > >         PR target/119386
> > >         * config/i386/i386.cc (x86_print_call_or_nop): Add @PLT suffix
> > >         where appropriate.
> > >         (x86_function_profiler): Fall through to x86_print_call_or_nop()
> > >         for PIC codegen when flag_plt is set.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         PR target/119386
> > >         * gcc.target/i386/pr119386-1.c: New test.
> > >         * gcc.target/i386/pr119386-2.c: New test.
> > > ---
> > >  gcc/config/i386/i386.cc                    |  8 +++++++-
> > >  gcc/testsuite/gcc.target/i386/pr119386-1.c | 11 +++++++++++
> > >  gcc/testsuite/gcc.target/i386/pr119386-2.c | 12 ++++++++++++
> > >  3 files changed, 30 insertions(+), 1 deletion(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr119386-1.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr119386-2.c
> > >
> > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > index be5e27fc391..0b238c3dddc 100644
> > > --- a/gcc/config/i386/i386.cc
> > > +++ b/gcc/config/i386/i386.cc
> > > @@ -23154,6 +23154,12 @@ x86_print_call_or_nop (FILE *file, const char *target)
> > >    if (flag_nop_mcount || !strcmp (target, "nop"))
> > >      /* 5 byte nop: nopl 0(%[re]ax,%[re]ax,1) */
> > >      fprintf (file, "1:" ASM_BYTE "0x0f, 0x1f, 0x44, 0x00, 0x00\n");
> > > +  else if (!TARGET_PECOFF && flag_pic)
> > > +    {
> > > +      gcc_assert (flag_plt);
> > > +
> > > +      fprintf (file, "1:\tcall\t%s@PLT\n", target);
> > > +    }
> > >    else
> > >      fprintf (file, "1:\tcall\t%s\n", target);
> > >  }
> > > @@ -23317,7 +23323,7 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
> > >           break;
> > >         case CM_SMALL_PIC:
> > >         case CM_MEDIUM_PIC:
> > > -         if (!ix86_direct_extern_access)
> > > +         if (!flag_plt)
> > >             {
> > >               if (ASSEMBLER_DIALECT == ASM_INTEL)
> > >                 fprintf (file, "1:\tcall\t[QWORD PTR %s@GOTPCREL[rip]]\n",
> > > diff --git a/gcc/testsuite/gcc.target/i386/pr119386-1.c b/gcc/testsuite/gcc.target/i386/pr119386-1.c
> > > new file mode 100644
> > > index 00000000000..174d00f1e27
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/pr119386-1.c
> > > @@ -0,0 +1,11 @@
> > > +/* PR target/119386 */
> > > +/* { dg-do compile { target *-*-linux* } } */
> > > +/* { dg-options "-O2 -fpic -pg" } */
> > > +/* { dg-final { scan-assembler "call\[ \t\]mcount@PLT" { target { ! ia32 } } } } */
> > > +/* { dg-final { scan-assembler "call\[ \t\]\\*mcount@GOT\\(" { target ia32 } } } */
> >
> > This is wrong for ia32, which should also generate "call mcount@PLT".
> >
>
> But it hasn't done that for a long time - it is hard to figure out
> from the Git history how long but at least 20 years IIUC
>
> So do you think this change should fix IA-32 as well? Note that the
> issue is about emitting 'call mcount' on 64-bit where 'call
> mcount@PLT' is needed, not about changing the indirect GOT based call
> to a PLT call.

Try this:

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 0c7cf5f827f..20059b775b9 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -23358,7 +23358,9 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
                "\tleal\t%sP%d@GOTOFF(%%ebx), %%" PROFILE_COUNT_REGISTER "\n",
                LPREFIX, labelno);
 #endif
-      if (ASSEMBLER_DIALECT == ASM_INTEL)
+      if (flag_plt)
+        x86_print_call_or_nop (file, mcount_name);
+      else if (ASSEMBLER_DIALECT == ASM_INTEL)
         fprintf (file, "1:\tcall\t[DWORD PTR %s@GOT[ebx]]\n", mcount_name);
       else
         fprintf (file, "1:\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
diff --git a/gcc/testsuite/gcc.target/i386/pr119386-1.c b/gcc/testsuite/gcc.target/i386/pr119386-1.c
index 174d00f1e27..56e44c89859 100644
--- a/gcc/testsuite/gcc.target/i386/pr119386-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr119386-1.c
@@ -1,8 +1,7 @@
 /* PR target/119386 */
 /* { dg-do compile { target *-*-linux* } } */
 /* { dg-options "-O2 -fpic -pg" } */
-/* { dg-final { scan-assembler "call\[ \t\]mcount@PLT" { target { ! ia32 } } } } */
-/* { dg-final { scan-assembler "call\[ \t\]\\*mcount@GOT\\(" { target ia32 } } } */
+/* { dg-final { scan-assembler "call\[ \t\]mcount@PLT" } } */
 
 int
 main ()

-- 
H.J.
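
For readers following the thread, here is a minimal standalone sketch of the decision the two patches appear to implement for the profiling call in PIC code. It is only a model of my reading of the diffs, not GCC code: `call_form`, `profile_opts` and `profile_call_form` are hypothetical names, and it deliberately ignores the code model, TARGET_PECOFF, the assembler dialect and the __fentry__ variant.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in GCC.  */
enum call_form
{
  CALL_NOP,     /* 5-byte nop emitted in place of the call */
  CALL_DIRECT,  /* call mcount */
  CALL_PLT,     /* call mcount@PLT */
  CALL_GOT      /* indirect call through the GOT entry */
};

struct profile_opts
{
  bool pic;         /* models flag_pic (-fpic / -fPIC) */
  bool plt;         /* models flag_plt (false with -fno-plt) */
  bool nop_mcount;  /* models flag_nop_mcount (-mnop-mcount) */
};

/* Assumed ordering: the GOT case is handled by the caller before the
   shared call-or-nop helper is reached, as in the quoted hunks.  */
enum call_form
profile_call_form (struct profile_opts o)
{
  /* PIC without a PLT: the profiler emits the GOT-based call itself.  */
  if (o.pic && !o.plt)
    return CALL_GOT;

  /* Everything else goes through the call-or-nop helper.  */
  if (o.nop_mcount)
    return CALL_NOP;

  if (o.pic)
    {
      /* Mirrors the gcc_assert (flag_plt) in the first patch.  */
      assert (o.plt);
      return CALL_PLT;
    }

  return CALL_DIRECT;
}
```

Under this model, `-fpic -pg` maps to `CALL_PLT` for both -m64 and -m32 once the second patch is applied, which is what the single remaining scan-assembler line in pr119386-1.c checks, while `-fpic -fno-plt -pg` keeps the GOT-based call.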