On Wed, Apr 9, 2025 at 8:54 AM Ard Biesheuvel <a...@kernel.org> wrote:
>
> On Wed, 9 Apr 2025 at 16:46, H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Wed, Apr 9, 2025 at 1:53 AM Ard Biesheuvel <ardb+...@google.com> wrote:
> > >
> > > From: Ard Biesheuvel <a...@kernel.org>
> > >
> > > Commit bde21de1205 ("i386: Honour -mdirect-extern-access when calling
> > > __fentry__") updated the logic that emits mcount() / __fentry__() calls
> > > into function prologues when profiling is enabled, to avoid GOT-based
> > > indirect calls when a direct call would suffice.
> > >
> > > There are two problems with that change:
> > > - it relies on -mdirect-extern-access rather than -fno-plt to decide
> > >   whether or not a direct [PLT based] call is appropriate;
> > > > - for the PLT case, it falls through to x86_print_call_or_nop(), which
> > > >   does not emit the @PLT suffix, so the wrong relocation is used
> > > >   (R_X86_64_PC32 instead of R_X86_64_PLT32).
> > >
> > > Fix this by testing flag_plt instead of ix86_direct_extern_access, and
> > > updating x86_print_call_or_nop() to take flag_pic and flag_plt into
> > > account. This also ensures that -mnop-mcount works as expected when
> > > emitting the PLT based profiling calls.
> > >
> > > Note that only 64-bit codegen is affected by this change or by the
> > > commit referenced above; -m32 will yield 'call *mcount@GOT()' as before.
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386
> > >
> > > Signed-off-by: Ard Biesheuvel <a...@kernel.org>
> > >
> > > gcc/ChangeLog:
> > >
> > >         PR target/119386
> > >         * config/i386/i386.cc (x86_print_call_or_nop): Add @PLT suffix
> > >         where appropriate.
> > >         (x86_function_profiler): Fall through to x86_print_call_or_nop()
> > >         for PIC codegen when flag_plt is set.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         PR target/119386
> > >         * gcc.target/i386/pr119386-1.c: New test.
> > >         * gcc.target/i386/pr119386-2.c: New test.
> > > ---
> > >  gcc/config/i386/i386.cc                    |  8 +++++++-
> > >  gcc/testsuite/gcc.target/i386/pr119386-1.c | 11 +++++++++++
> > >  gcc/testsuite/gcc.target/i386/pr119386-2.c | 12 ++++++++++++
> > >  3 files changed, 30 insertions(+), 1 deletion(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr119386-1.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr119386-2.c
> > >
> > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > index be5e27fc391..0b238c3dddc 100644
> > > --- a/gcc/config/i386/i386.cc
> > > +++ b/gcc/config/i386/i386.cc
> > > @@ -23154,6 +23154,12 @@ x86_print_call_or_nop (FILE *file, const char 
> > > *target)
> > >    if (flag_nop_mcount || !strcmp (target, "nop"))
> > >      /* 5 byte nop: nopl 0(%[re]ax,%[re]ax,1) */
> > >      fprintf (file, "1:" ASM_BYTE "0x0f, 0x1f, 0x44, 0x00, 0x00\n");
> > > +  else if (!TARGET_PECOFF && flag_pic)
> > > +    {
> > > +      gcc_assert (flag_plt);
> > > +
> > > +      fprintf (file, "1:\tcall\t%s@PLT\n", target);
> > > +    }
> > >    else
> > >      fprintf (file, "1:\tcall\t%s\n", target);
> > >  }
> > > @@ -23317,7 +23323,7 @@ x86_function_profiler (FILE *file, int labelno 
> > > ATTRIBUTE_UNUSED)
> > >               break;
> > >             case CM_SMALL_PIC:
> > >             case CM_MEDIUM_PIC:
> > > -             if (!ix86_direct_extern_access)
> > > +             if (!flag_plt)
> > >                 {
> > >                   if (ASSEMBLER_DIALECT == ASM_INTEL)
> > >                     fprintf (file, "1:\tcall\t[QWORD PTR 
> > > %s@GOTPCREL[rip]]\n",
> > > diff --git a/gcc/testsuite/gcc.target/i386/pr119386-1.c 
> > > b/gcc/testsuite/gcc.target/i386/pr119386-1.c
> > > new file mode 100644
> > > index 00000000000..174d00f1e27
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/pr119386-1.c
> > > @@ -0,0 +1,11 @@
> > > +/* PR target/119386 */
> > > +/* { dg-do compile { target *-*-linux* } } */
> > > +/* { dg-options "-O2 -fpic -pg" } */
> > > > +/* { dg-final { scan-assembler "call\[ \t\]mcount@PLT" { target { ! ia32 } } } } */
> > > > +/* { dg-final { scan-assembler "call\[ \t\]\\*mcount@GOT\\(" { target ia32 } } } */
> >
> > This is wrong for ia32, which should also generate "call mcount@PLT".
> >
>
> But it hasn't done that for a long time. It is hard to tell from the
> Git history exactly how long, but at least 20 years IIUC.
>
> So do you think this change should fix IA-32 as well? Note that the
> issue is about emitting 'call mcount' on 64-bit where 'call
> mcount@PLT' is needed, not about changing the indirect GOT based call
> to a PLT call.

Try this:

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 0c7cf5f827f..20059b775b9 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -23358,7 +23358,9 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
  "\tleal\t%sP%d@GOTOFF(%%ebx), %%" PROFILE_COUNT_REGISTER "\n",
  LPREFIX, labelno);
 #endif
-      if (ASSEMBLER_DIALECT == ASM_INTEL)
+      if (flag_plt)
+        x86_print_call_or_nop (file, mcount_name);
+      else if (ASSEMBLER_DIALECT == ASM_INTEL)
         fprintf (file, "1:\tcall\t[DWORD PTR %s@GOT[ebx]]\n", mcount_name);
       else
         fprintf (file, "1:\tcall\t*%s@GOT(%%ebx)\n", mcount_name);
diff --git a/gcc/testsuite/gcc.target/i386/pr119386-1.c b/gcc/testsuite/gcc.target/i386/pr119386-1.c
index 174d00f1e27..56e44c89859 100644
--- a/gcc/testsuite/gcc.target/i386/pr119386-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr119386-1.c
@@ -1,8 +1,7 @@
 /* PR target/119386 */
 /* { dg-do compile { target *-*-linux* } } */
 /* { dg-options "-O2 -fpic -pg" } */
-/* { dg-final { scan-assembler "call\[ \t\]mcount@PLT" { target { ! ia32 } } } } */
-/* { dg-final { scan-assembler "call\[ \t\]\\*mcount@GOT\\(" { target ia32 } } } */
+/* { dg-final { scan-assembler "call\[ \t\]mcount@PLT" } } */

 int
 main ()
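
For reference, the selection logic with both changes applied can be modelled roughly as below. This is a standalone sketch, not GCC code: struct profile_opts, print_call_or_nop() and profiler_call() are hypothetical stand-ins for the global flags and for x86_print_call_or_nop() / x86_function_profiler(). It covers AT&T syntax only, ignores TARGET_PECOFF and the code-model switch, and the 64-bit @GOTPCREL(%rip) spelling is assumed by analogy with the Intel-syntax line in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for GCC's global option state.  */
struct profile_opts
{
  bool is_64bit;   /* models TARGET_64BIT */
  bool pic;        /* models flag_pic */
  bool plt;        /* models flag_plt; false under -fno-plt */
  bool nop_mcount; /* models flag_nop_mcount */
};

/* Models x86_print_call_or_nop() with the @PLT branch added.
   Writes the instruction text (without the "1:" label) into BUF.  */
void
print_call_or_nop (char *buf, size_t len, const struct profile_opts *o,
                   const char *target)
{
  if (o->nop_mcount || !strcmp (target, "nop"))
    snprintf (buf, len, "nop");
  else if (o->pic)
    {
      /* Mirrors gcc_assert (flag_plt): callers must take the GOT
         path themselves when PLT calls are disabled.  */
      assert (o->plt);
      snprintf (buf, len, "call\t%s@PLT", target);
    }
  else
    snprintf (buf, len, "call\t%s", target);
}

/* Models the PIC decision in x86_function_profiler(): an indirect
   GOT call only under -fno-plt, otherwise a direct (PLT) call.  */
void
profiler_call (char *buf, size_t len, const struct profile_opts *o,
               const char *mcount_name)
{
  if (o->pic && !o->plt)
    {
      if (o->is_64bit)
        /* Assumed AT&T spelling of the 64-bit GOT call.  */
        snprintf (buf, len, "call\t*%s@GOTPCREL(%%rip)", mcount_name);
      else
        snprintf (buf, len, "call\t*%s@GOT(%%ebx)", mcount_name);
    }
  else
    print_call_or_nop (buf, len, o, mcount_name);
}
```

In this model, -fpic -pg yields "call mcount@PLT" for both -m64 and -m32, which is what the updated pr119386-1.c scans for, and only -fno-plt selects the GOT-based call.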

-- 
H.J.
