On Tue, May 9, 2023 at 10:58 AM Ard Biesheuvel <a...@kernel.org> wrote:
>
> The small and medium PIC code models generate profiling calls that
> always load the address of __fentry__() via the GOT, even if
> -mdirect-extern-access is in effect.
>
> This deviates from the behavior with respect to other external
> references, and results in a longer opcode that relies on linker
> relaxation to eliminate the GOT load. In this particular case, the
> transformation replaces an indirect 'CALL *__fentry__@GOTPCREL(%rip)'
> with either 'CALL __fentry__; NOP' or 'NOP; CALL __fentry__', where the
> NOP is a 1 byte NOP that preserves the 6 byte length of the sequence.
>
> This is problematic for the Linux kernel, which generally relies on
> -mdirect-extern-access and hidden visibility to eliminate GOT based
> symbol references in code generated with -fpie/-fpic, without having to
> depend on linker relaxation.
>
> The Linux kernel relies on code patching to replace these opcodes with
> NOPs at runtime, and this is complicated code that we'd prefer not to
> complicate even more by adding support for patching both 5 and 6 byte
> sequences as well as parsing the instruction stream to decide which
> variant of CALL+NOP we are dealing with.
>
> So let's honour -mdirect-extern-access, and only load the address of
> __fentry__ via the GOT if direct references to external symbols are not
> permitted.
>
> Note that the GOT reference in question is in fact a data reference: we
> explicitly load the address of __fentry__ from the GOT, which amounts to
> eager binding, rather than emitting a PLT call that could bind eagerly,
> lazily or directly at link time.
>
> gcc/ChangeLog:
>
>         * config/i386/i386.cc (x86_function_profiler): Take
>         ix86_direct_extern_access into account when generating calls
>         to __fentry__()

HJ, is the patch OK with you?

Uros.

>
> Cc: H.J. Lu <hjl.to...@gmail.com>
> Cc: Jakub Jelinek <ja...@redhat.com>
> Cc: Richard Biener <rguent...@suse.de>
> Cc: Uros Bizjak <ubiz...@gmail.com>
> Cc: Hou Wenlong <houwenlong....@antgroup.com>
> ---
>  gcc/config/i386/i386.cc | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index b1d08ecdb3d44729..69b183abb4318b0a 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -21836,8 +21836,12 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
>           break;
>         case CM_SMALL_PIC:
>         case CM_MEDIUM_PIC:
> -         fprintf (file, "1:\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
> -         break;
> +         if (!ix86_direct_extern_access)
> +           {
> +             fprintf (file, "1:\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
> +             break;
> +           }
> +         /* fall through */
>         default:
>           x86_print_call_or_nop (file, mcount_name);
>           break;
> --
> 2.39.2
>
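
For anyone skimming the thread, the selection logic in the hunk above can be modelled outside of GCC roughly as follows. This is only a sketch, not the actual i386.cc code: the enum, `uses_got_load()` and `emit_profiler_call()` are hypothetical names made up for illustration, and the direct form is a simplification of what x86_print_call_or_nop() emits (it ignores the NOP variant).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the code models handled in the hunk;
   these are not GCC's real enum values.  */
enum code_model { MODEL_OTHER, MODEL_SMALL_PIC, MODEL_MEDIUM_PIC };

/* True if the profiling call must load the target address from the GOT:
   only for the small/medium PIC models, and only when direct references
   to external symbols are not permitted.  */
static bool
uses_got_load (enum code_model cm, bool direct_extern_access)
{
  return (cm == MODEL_SMALL_PIC || cm == MODEL_MEDIUM_PIC)
         && !direct_extern_access;
}

/* Format the profiling call into BUF (at most LEN bytes) and return
   snprintf's result.  The direct form assumes a plain CALL.  */
static int
emit_profiler_call (char *buf, size_t len, enum code_model cm,
                    bool direct_extern_access, const char *mcount_name)
{
  assert (buf != NULL && mcount_name != NULL);

  if (uses_got_load (cm, direct_extern_access))
    /* 6-byte indirect call through the GOT entry.  */
    return snprintf (buf, len, "1:\tcall\t*%s@GOTPCREL(%%rip)\n",
                     mcount_name);

  /* 5-byte direct call; no linker relaxation needed.  */
  return snprintf (buf, len, "1:\tcall\t%s\n", mcount_name);
}
```

With this model, calling `emit_profiler_call()` for `MODEL_SMALL_PIC` with `direct_extern_access` set yields the plain `call __fentry__` form, which is the fixed-length sequence the kernel's patching code wants to see.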