https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82432
Bug ID: 82432
Summary: Missed constant propagation of return values of non-inlined static functions
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: peter at cordes dot ca
Target Milestone: ---

    static __attribute((noinline))
    int get_constant() {
        /* optionally stuff with side effects */
        return 42;
    }

        movl    $42, %eax
        ret

    // Consider the case where this is large enough to not inline (even without
    // an attribute), but still returns a constant. e.g. a success/fail status
    // that we can prove is always success, or just the current implementation
    // always returns success but the callers still check.

    int call_constant() { return 10 - get_constant(); }

        call    get_constant()
        movl    $10, %edx
        subl    %eax, %edx
        movl    %edx, %eax
        ret

Even though the function didn't inline so we still have to call it, its return value is a compile-time constant.

        call    get_constant
        mov     $(10-42), %eax
        ret

would be a better way to compile this. It potentially breaks a data dependency chain, and saves instructions. And enables further constprop if the caller isn't trivial and does more with the return value.

For return values passed by hidden pointer, it avoids store-forwarding latency. If we want the value in memory, we can use the copy the callee put there. If we made a .clone version that uses a custom calling convention, we could have the callee skip storing the return value if it's constant for all callers. (Hmm, checking this could cost a lot of compile time, especially with LTO. The simpler version is to only optimize it away for small objects that are really constant, not just from constant propagation from one caller's args.)

One useful case is returning a std::optional<>. Even if the .value() is unknown, it might be known that there *is* a value, so the caller doesn't have to check the `bool` member.
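
To make the std::optional case concrete, here is a minimal source-level sketch. The body of get_std_optional_int() is an assumption (the report does not show it), and baz_unchecked() and check_baz() are hypothetical names added for illustration. baz_unchecked() is only a hand-written picture of what the caller could reduce to if the compiler knew the optional is always engaged; it is not something GCC currently produces from baz().

```cpp
#include <cassert>
#include <optional>

// Assumed stand-in for the report's get_std_optional_int(): kept out of line,
// but every return path yields an engaged optional.
static __attribute__((noinline)) std::optional<int> get_std_optional_int() {
    /* optionally stuff with side effects */
    return 41;
}

// As written in the report: value() has to test the engaged flag and take the
// error path if the optional is empty.
int baz() {
    return 1 + get_std_optional_int().value();
}

// Hand-written equivalent of the desired result: still makes the call, but
// reads the payload without testing the flag (operator* does not check).
int baz_unchecked() {
    std::optional<int> r = get_std_optional_int();
    return 1 + *r;
}

// Sanity check for the sketch: both callers compute the same result.
void check_baz() {
    assert(baz() == 42);
    assert(baz_unchecked() == 42);
}
```

The two callers return the same value whenever the optional is engaged; the difference is only whether the call site carries the compare-and-branch on the flag.
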
libstdc++'s optional<T> is not trivially-copyable even if T is, so it returns via hidden pointer for optional<int>. (libc++ does make it trivially-copyable, so it returns packed into a register on x86-64, but clang also still checks the return value when it doesn't inline. https://stackoverflow.com/a/46546636/224132)

    int baz() {
        return 1 + get_std_optional_int().value();
    }

        subq    $24, %rsp
        leaq    8(%rsp), %rdi
        call    get_std_optional_int()
        cmpb    $0, 12(%rsp)
        je      .L98
        movl    8(%rsp), %eax
        addq    $24, %rsp
        addl    $1, %eax
        ret
    baz() [clone .cold.49]:
    .L98:
        call    abort

This obviously simplifies the call site some if we don't have to check the return value. But we still have to provide storage space unless we make a nonstandard-calling-convention clone of get_std_optional_int() which ideally returns in %eax and %edx.

(Returning small objects packed less tightly into multiple registers would probably be a win in general for non-constant return values, if we want to start cloning static functions and discarding the ABI-compliant definition. Or with LTO or whole-program, as this post argues: https://stackoverflow.com/a/46549978/224132)
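
As a rough illustration of the register-return point, the sketch below uses a hand-rolled, trivially copyable struct in place of optional<int>. OptInt, get_opt_int() and baz_packed() are hypothetical names, not part of the report. Under the x86-64 System V ABI a small trivially copyable aggregate like this is returned packed into a register rather than through a hidden pointer. That is the standard ABI-compliant behaviour, not the looser %eax/%edx clone proposed above, which no source-level construct can request.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for optional<int>: payload plus an engaged flag.
// Typically 8 bytes on x86-64 (4-byte int, 1-byte bool, 3 bytes of padding).
struct OptInt {
    int  value;
    bool engaged;
};

// Trivially copyable, so the calling convention may return it in registers.
static_assert(std::is_trivially_copyable<OptInt>::value,
              "OptInt must be trivially copyable to be returned in registers");

// Kept out of line, like the function in the report; always engaged.
static __attribute__((noinline)) OptInt get_opt_int() {
    return {41, true};
}

// The caller still tests the flag, because the compiler does not propagate
// the "always engaged" fact across the non-inlined call.
int baz_packed() {
    OptInt r = get_opt_int();
    return r.engaged ? 1 + r.value : -1;
}

// Sanity check for the sketch.
void check_baz_packed() {
    assert(baz_packed() == 42);
}
```

With this layout the caller no longer needs to reserve stack space for the return value, but the flag test remains until the compiler can prove the flag is constant across the call.
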