https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119279

            Bug ID: 119279
           Summary: Specifying frame pointer dependency in inline asm
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jpoimboe at redhat dot com
                CC: ak at gcc dot gnu.org, hjl.tools at gmail dot com,
                    hpa at zytor dot com, jakub at gcc dot gnu.org,
                    peterz at infradead dot org, pinskia at gcc dot gnu.org,
                    rguenth at gcc dot gnu.org, torva...@linux-foundation.org,
                    ubizjak at gmail dot com
  Target Milestone: ---
            Target: x86_64-*-*

Created attachment 60749
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60749&action=edit
Reduced test case (gcc -O2 -fno-omit-frame-pointer -fno-optimize-sibling-calls
-c mmap.i -o mmap.o)

[ This was discussed previously in bug 117311, which was a documentation
request for existing (but unsupported) behavior.  This bug here is not for
documenting existing behavior, but rather a request to define a supported
solution for specifying a frame pointer dependency in inline asm, whatever that
looks like. ]

In the Linux kernel on x86 it's common for inline asm to have a "call" to a
thunk which saves registers before calling out to another function.  If the
inline asm is in a leaf function, or gets emitted before the function prologue,
the frame pointer doesn't get set up before the call.

The current unsupported workaround used throughout the kernel is to make the
stack pointer an in/out constraint:

  register unsigned long current_stack_pointer asm("%rsp");
  #define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)

We've also experimented with making __builtin_frame_address(0) an input
constraint, which also seems to work:

  #define ASM_CALL_CONSTRAINT "r" (__builtin_frame_address(0))

Unfortunately these are unsupported hacks which just happen to work.  It would
be much better to have a supported solution.
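For illustration, here is a minimal user-space sketch of how the two
workarounds above get attached to an asm statement that contains a call.
Everything except the two constraint expressions is an assumption made for
the example: demo_thunk, call_thunk_sp(), call_thunk_fp(), the CALL_THUNK
macro and the _SP/_FP macro suffixes are hypothetical names, and the red-zone
adjustment is only there because user-space code (unlike the kernel) is not
normally built with -mno-red-zone.  It shows where the operand goes, not a
supported way to get the frame pointer set up.

```c
#include <assert.h>

#if defined(__x86_64__) && defined(__ELF__) && defined(__GNUC__)

/* Hypothetical thunk: increments the counter whose address is in %rdi. */
__asm__(
    ".pushsection .text\n"
    ".type demo_thunk, @function\n"
    "demo_thunk:\n"
    "    incq (%rdi)\n"
    "    ret\n"
    ".size demo_thunk, .-demo_thunk\n"
    ".popsection\n"
);

/* Workaround 1: stack pointer as an in/out operand ("rsp" spelled without
 * the '%' prefix here; GCC accepts either spelling). */
register unsigned long current_stack_pointer __asm__("rsp");
#define ASM_CALL_CONSTRAINT_SP "+r" (current_stack_pointer)

/* Workaround 2: frame address as an input operand. */
#define ASM_CALL_CONSTRAINT_FP "r" (__builtin_frame_address(0))

/* Assumption for user space only: step over the 128-byte red zone so the
 * return address pushed by "call" cannot overwrite locals of a leaf
 * function.  %rsp has its original value again when the asm ends. */
#define CALL_THUNK              \
    "sub $128, %%rsp\n\t"       \
    "call demo_thunk\n\t"       \
    "add $128, %%rsp"

static void call_thunk_sp(unsigned long *counter)
{
    __asm__ volatile(CALL_THUNK
                     : ASM_CALL_CONSTRAINT_SP      /* output list */
                     : "D" (counter)
                     : "memory", "cc");
}

static void call_thunk_fp(unsigned long *counter)
{
    __asm__ volatile(CALL_THUNK
                     : /* no outputs */
                     : "D" (counter), ASM_CALL_CONSTRAINT_FP  /* input list */
                     : "memory", "cc");
}

#else

/* Fallback for other targets so the sketch still builds; no inline asm. */
static void call_thunk_sp(unsigned long *counter) { ++*counter; }
static void call_thunk_fp(unsigned long *counter) { ++*counter; }

#endif
```

Note that the first form has to go in the output operand list (it is "+r"),
while the second goes in the input list, so the two are not drop-in
replacements for each other at every call site.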

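The attached test case results in the following, where the inline asm call at
offset 0 is emitted before the frame pointer is set up at offsets 0x10-0x13: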

0000000000000000 <pfn_modify_allowed>:
   0:   e8 00 00 00 00          call   5 <pfn_modify_allowed+0x5>
                        1: R_X86_64_PLT32       __SCT__preempt_schedule-0x4
   5:   48 83 3d 00 00 00 00 00         cmpq   $0x0,0x0(%rip)        # d <pfn_modify_allowed+0xd>
                        8: R_X86_64_PC32        pfn_modify_allowed_pfn-0x5
   d:   75 01                   jne    10 <pfn_modify_allowed+0x10>
   f:   c3                      ret
  10:   55                      push   %rbp
  11:   31 c0                   xor    %eax,%eax
  13:   48 89 e5                mov    %rsp,%rbp
  16:   e8 00 00 00 00          call   1b <pfn_modify_allowed+0x1b>
                        17: R_X86_64_PLT32      capable-0x4
  1b:   5d                      pop    %rbp
  1c:   c3                      ret
