On Tue, May 9, 2023 at 10:58 AM Ard Biesheuvel <a...@kernel.org> wrote:
>
> The small and medium PIC code models generate profiling calls that
> always load the address of __fentry__() via the GOT, even if
> -mdirect-extern-access is in effect.
>
> This deviates from the behavior with respect to other external
> references, and results in a longer opcode that relies on linker
> relaxation to eliminate the GOT load. In this particular case, the
> transformation replaces an indirect 'CALL *__fentry__@GOTPCREL(%rip)'
> with either 'CALL __fentry__; NOP' or 'NOP; CALL __fentry__', where the
> NOP is a 1 byte NOP that preserves the 6 byte length of the sequence.
>
> This is problematic for the Linux kernel, which generally relies on
> -mdirect-extern-access and hidden visibility to eliminate GOT based
> symbol references in code generated with -fpie/-fpic, without having to
> depend on linker relaxation.
>
> The Linux kernel relies on code patching to replace these opcodes with
> NOPs at runtime, and this is complicated code that we'd prefer not to
> complicate even more by adding support for patching both 5 and 6 byte
> sequences as well as parsing the instruction stream to decide which
> variant of CALL+NOP we are dealing with.
>
> So let's honour -mdirect-extern-access, and only load the address of
> __fentry__ via the GOT if direct references to external symbols are not
> permitted.
>
> Note that the GOT reference in question is in fact a data reference: we
> explicitly load the address of __fentry__ from the GOT, which amounts to
> eager binding, rather than emitting a PLT call that could bind eagerly,
> lazily or directly at link time.
>
> gcc/ChangeLog:
>
>         * config/i386/i386.cc (x86_function_profiler): Take
>           ix86_direct_extern_access into account when generating calls
>           to __fentry__()

HJ, is the patch OK with you?

Uros.

>
> Cc: H.J. Lu <hjl.to...@gmail.com>
> Cc: Jakub Jelinek <ja...@redhat.com>
> Cc: Richard Biener <rguent...@suse.de>
> Cc: Uros Bizjak <ubiz...@gmail.com>
> Cc: Hou Wenlong <houwenlong....@antgroup.com>
> ---
>  gcc/config/i386/i386.cc | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index b1d08ecdb3d44729..69b183abb4318b0a 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -21836,8 +21836,12 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
>               break;
>             case CM_SMALL_PIC:
>             case CM_MEDIUM_PIC:
> -             fprintf (file, "1:\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
> -             break;
> +             if (!ix86_direct_extern_access)
> +               {
> +                 fprintf (file, "1:\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
> +                 break;
> +               }
> +             /* fall through */
>             default:
>               x86_print_call_or_nop (file, mcount_name);
>               break;
> --
> 2.39.2
>
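
For readers following along, the encodings mentioned in the commit message
can be told apart roughly as sketched below. This is only an illustration
under stated assumptions, not the kernel's actual patching code: the enum
and function names are hypothetical, and the only facts relied upon are the
standard x86-64 encodings (E8 rel32 for a direct CALL, FF 15 disp32 for an
indirect CALL through a RIP-relative slot, 90 for the 1 byte NOP).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the call-site shapes named in the commit message. */
enum fentry_site {
    SITE_UNKNOWN,
    SITE_CALL_DIRECT,   /* e8 rel32        : 5 bytes, direct call          */
    SITE_CALL_GOT,      /* ff 15 disp32    : 6 bytes, call via GOT slot    */
    SITE_CALL_THEN_NOP, /* e8 rel32 90     : 6 bytes, relaxed, NOP last    */
    SITE_NOP_THEN_CALL  /* 90 e8 rel32     : 6 bytes, relaxed, NOP first   */
};

/*
 * Classify the bytes at a profiling call site (illustrative only).
 * Note the ambiguity: "e8 rel32" followed by 0x90 may equally be a
 * 5 byte call followed by an unrelated NOP, so a byte-level check
 * like this cannot be fully reliable on its own.
 */
enum fentry_site classify_fentry_site(const uint8_t *p, size_t len)
{
    if (len >= 6 && p[0] == 0xff && p[1] == 0x15)
        return SITE_CALL_GOT;
    if (len >= 6 && p[0] == 0x90 && p[1] == 0xe8)
        return SITE_NOP_THEN_CALL;
    if (len >= 6 && p[0] == 0xe8 && p[5] == 0x90)
        return SITE_CALL_THEN_NOP;
    if (len >= 5 && p[0] == 0xe8)
        return SITE_CALL_DIRECT;
    return SITE_UNKNOWN;
}

/* Small usage example: an unrelaxed GOT call is recognised as such. */
void example_usage(void)
{
    const uint8_t got_call[6] = { 0xff, 0x15, 0x00, 0x00, 0x00, 0x00 };
    assert(classify_fentry_site(got_call, sizeof got_call) == SITE_CALL_GOT);
}
```

With the patch applied and -mdirect-extern-access in effect, only the
5 byte direct form should be emitted, so none of the 6 byte cases would
need to be handled by the patching code.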
