https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82432

            Bug ID: 82432
           Summary: Missed constant propagation of return values of
                    non-inlined static functions
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: peter at cordes dot ca
  Target Milestone: ---

static __attribute((noinline)) 
  int get_constant() { /* optionally stuff with side effects */
                       return 42; }
        movl    $42, %eax
        ret

Consider the case where this is large enough to not inline (even without an
attribute), but still returns a constant, e.g. a success/fail status that we
can prove is always success, or where the current implementation always
returns success but the callers still check.

int call_constant() { return 10 - get_constant(); }

        call    get_constant()
        movl    $10, %edx
        subl    %eax, %edx
        movl    %edx, %eax
        ret

The function didn't inline, so we still have to call it, but its return value
is a compile-time constant.

   call  get_constant
   mov $(10-42), %eax
   ret

would be a better way to compile this.  It potentially breaks a data dependency
chain and saves instructions, and it enables further constant propagation if
the caller isn't trivial and does more with the return value.
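
A source-level way to picture the proposed transformation, as a minimal
sketch.  The names side_effects and call_constant_folded are made up for
illustration, and the second caller is hand-written to show what the optimizer
could do; it is not output from any compiler.

```cpp
#include <cassert>

// Hypothetical stand-in for "stuff with side effects" in the callee.
static int side_effects = 0;

__attribute__((noinline)) static int get_constant() {
    ++side_effects;
    return 42;
}

// The caller as written in the source.
int call_constant() { return 10 - get_constant(); }

// Hand-written equivalent of the desired code: the call is kept for its side
// effects, but the result no longer depends on the returned register.
int call_constant_folded() {
    const int r = get_constant();
    assert(r == 42);  // the fact the compiler would have to prove
    (void)r;          // avoid an unused-variable warning under NDEBUG
    return 10 - 42;
}
```

Both callers return the same value and trigger the side effect exactly once per
call; only the second has a result that is independent of the call.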

For return values passed by hidden pointer, it avoids store-forwarding latency:
the caller can use the known constant instead of reloading what the callee
just stored.  If we want the value in memory, we can use the copy the callee
put there.  If we made a .clone version that uses a custom calling convention,
we could have the callee skip storing the return value if it's constant for
all callers.  (Hmm, checking this could cost a lot of compile time, especially
with LTO.  The simpler version is to only optimize it away for small objects
that are really constant, not merely constant because of constant propagation
from one caller's args.)


One useful case is returning a std::optional<>.  Even if the .value() is
unknown, it might be known that there *is* a value, so the caller doesn't have
to check the `bool` member.

libstdc++'s optional<T> is not trivially copyable even if T is, so
optional<int> is returned via hidden pointer.  (libc++ does make it trivially
copyable, so it is returned packed into a register on x86-64, but clang still
checks the return value when the call doesn't inline:
https://stackoverflow.com/a/46546636/224132)
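
To make the trivially-copyable distinction concrete, here is a small sketch
using two hypothetical optional-like structs (OptInt and OptIntNonTrivial are
made-up names, not either library's definition).  The assumption is the usual
Itanium C++ ABI rule on x86-64: a small trivially copyable class can be
returned in registers, while one with a non-trivial destructor or copy
constructor is returned through a hidden pointer.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical optional-like aggregate: trivially copyable and small.
struct OptInt {
    int value;
    bool has_value;
};
static_assert(std::is_trivially_copyable<OptInt>::value,
              "eligible for return in registers");
static_assert(sizeof(OptInt) <= 8, "fits in one 64-bit register");

// Same members, but the user-provided destructor makes it non-trivial.
struct OptIntNonTrivial {
    int value;
    bool has_value;
    ~OptIntNonTrivial() {}
};
static_assert(!std::is_trivially_copyable<OptIntNonTrivial>::value,
              "would be returned via hidden pointer");

__attribute__((noinline)) static OptInt get_opt_int() {
    return OptInt{7, true};
}

// Checked read, analogous to optional::value().
int read_value(OptInt o) {
    assert(o.has_value);
    return o.value;
}

int use_opt_int() { return read_value(get_opt_int()); }
```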

int baz() {
    return 1 + get_std_optional_int().value();
}
        subq    $24, %rsp
        leaq    8(%rsp), %rdi
        call    get_std_optional_int()
        cmpb    $0, 12(%rsp)
        je      .L98
        movl    8(%rsp), %eax
        addq    $24, %rsp
        addl    $1, %eax
        ret
baz() [clone .cold.49]:
.L98:
        call    abort

The call site obviously gets simpler if we don't have to check the return
value.
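
In source terms, the difference is roughly the one between .value() and
operator*.  This is a sketch assuming C++17 and a made-up body for
get_std_optional_int() that always returns an engaged optional with a
non-constant value; next_id and baz_unchecked are hypothetical names.

```cpp
#include <cassert>
#include <optional>

// Hypothetical state so that the returned value is not a constant.
static int next_id = 0;

// Assumed implementation: always engaged, value unknown at compile time.
__attribute__((noinline)) static std::optional<int> get_std_optional_int() {
    return ++next_id;
}

// As in the report: value() has to test the flag before reading.
int baz() { return 1 + get_std_optional_int().value(); }

// What the caller could become once "always has a value" is known:
// operator* reads the payload without testing the flag.
int baz_unchecked() {
    const std::optional<int> r = get_std_optional_int();
    assert(r.has_value());  // the invariant; compiled out under NDEBUG
    return 1 + *r;
}
```

The second version still needs the stack slot for the hidden-pointer return;
it only drops the compare and the cold abort path.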

But we still have to provide storage space unless we make a
nonstandard-calling-convention clone of get_std_optional_int() which ideally
returns in %eax and %edx.  (Returning small objects packed less tightly into
multiple registers would probably be a win in general for non-constant return
values, if we want to start cloning static functions and discarding the
ABI-compliant definition, or if we have LTO or whole-program optimization, as
this post argues: https://stackoverflow.com/a/46549978/224132)
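
The two-register layout can be approximated by hand within the standard ABI,
which gives a rough idea of what such a clone would look like to the caller.
This is only a sketch: WideOptInt, get_wide_opt_int and baz_wide are
hypothetical names, and it relies on the x86-64 System V rule that a 16-byte
struct made of two integer eightbytes is returned in %rax and %rdx (other
ABIs, e.g. Windows x64, differ).

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical loosely packed optional<int>: one field per register.
struct WideOptInt {
    long value;      // expected in %rax on x86-64 System V
    long has_value;  // expected in %rdx on x86-64 System V
};
static_assert(std::is_trivially_copyable<WideOptInt>::value,
              "must be trivial to be returned in registers");
static_assert(sizeof(WideOptInt) == 2 * sizeof(long), "no padding expected");

__attribute__((noinline)) static WideOptInt get_wide_opt_int() {
    return WideOptInt{41, 1};
}

// The caller tests one register and uses the other; no stack slot and no
// shift/mask to unpack the flag from the same register as the value.
int baz_wide() {
    const WideOptInt r = get_wide_opt_int();
    assert(r.has_value);
    return 1 + static_cast<int>(r.value);
}
```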
