https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120395

            Bug ID: 120395
           Summary: Calls to std::__is_constant_evaluated() hurt codegen
                    at -O0
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

void x(int);

[[gnu::always_inline]] inline bool
is_constant_evaluated()
{ return __builtin_is_constant_evaluated(); }

struct Iter
{
    typedef int value_type;

    int& operator*() const;
    Iter& operator++();
    bool operator!=(const Iter&) const;
};

void f(Iter first, Iter last)
{
    if (__is_trivial(Iter::value_type))
        if (!is_constant_evaluated())
            return;
    for (; first != last; ++first)
        x(*first);
}

At -O1 this is fine, the code compiles away completely as expected. But at -O0
the always_inline function seems to not be inlined, or to act as though it
isn't.

Even though the function f is not constexpr, so can never be constant
evaluated, the generated code includes the loop, which should be dead code:

f(Iter, Iter):
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     eax, 0
        xor     eax, 1
        test    al, al
        jne     .L7
        jmp     .L5
.L6:
        lea     rax, [rbp-1]
        mov     rdi, rax
        call    Iter::operator*() const
        mov     eax, DWORD PTR [rax]
        mov     edi, eax
        call    x(int)
        lea     rax, [rbp-1]
        mov     rdi, rax
        call    Iter::operator++()
.L5:
        lea     rdx, [rbp-2]
        lea     rax, [rbp-1]
        mov     rsi, rdx
        mov     rdi, rax
        call    Iter::operator!=(Iter const&) const
        test    al, al
        jne     .L6
        jmp     .L1
.L7:
        nop
.L1:
        leave
        ret


If we comment out the if (!is_constant_evaluated()) check then the code is
ideal:

f(Iter, Iter):
        push    rbp
        mov     rbp, rsp
        nop
        pop     rbp
        ret

(again, this is for -O0 so it's not a single instruction like at -O1 but it's
what I expect for -O0).

If we replace the always_inline function and just call
__builtin_is_constant_evaluated() directly, the code is ideal. So why does an
always_inline function that calls the builtin not produce the same code as the
builtin? Using the builtin directly isn't an option for the library code.
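
For reference, the direct-builtin variant described here can be sketched as below. This is a minimal, self-contained version of the reproducer above, not the library code: the name f_direct, the x_calls counter and the pointer member in Iter are hypothetical additions so that the example links and its behaviour can be observed at run time.

```cpp
// Hypothetical counter (not in the report) so calls to x() are observable.
static int x_calls = 0;
void x(int) { ++x_calls; }

struct Iter
{
    typedef int value_type;

    int* p;  // hypothetical member so the operators can be defined

    int& operator*() const { return *p; }
    Iter& operator++() { ++p; return *this; }
    bool operator!=(const Iter& other) const { return p != other.p; }
};

// Same shape as f above, but calling the builtin directly instead of
// going through the always_inline wrapper.
void f_direct(Iter first, Iter last)
{
    if (__is_trivial(Iter::value_type))
        if (!__builtin_is_constant_evaluated())
            return;  // always taken at run time, since f_direct is not constexpr
    for (; first != last; ++first)
        x(*first);
}
```

At run time the early return is always taken, so x is never called. The wrapper version behaves the same way, so the difference reported here is only in the code generated at -O0.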

Is there anything we can do to remove this overhead for
std::__is_constant_evaluated() in libstdc++? Newer C++ standards require us to
put those checks in more and more places, but I didn't realise the effect on
codegen could be so significant.

I'd really prefer not to have to replace the always_inline function with a
macro that expands directly to the builtin (or to `if consteval` for C++23).
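
A rough sketch of what such a macro workaround might look like, for illustration only. The macro name HYPOTHETICAL_IS_CONSTANT_EVALUATED and the functions pick and at_runtime are made up for this example and are not libstdc++ names; the C++23 `if consteval` form is not shown.

```cpp
// Hypothetical macro name. It expands directly to the builtin, so no
// function call (inlined or not) is involved.
#define HYPOTHETICAL_IS_CONSTANT_EVALUATED() __builtin_is_constant_evaluated()

// Hypothetical helper: 1 when constant evaluated, 2 otherwise.
constexpr int pick()
{
    return HYPOTHETICAL_IS_CONSTANT_EVALUATED() ? 1 : 2;
}

// A static_assert condition is manifestly constant evaluated.
static_assert(pick() == 1, "constant evaluation takes the first branch");

// A call from a non-constexpr function is an ordinary run-time call.
int at_runtime() { return pick(); }
```

The obvious cost is the one objected to above: a macro has no scope and gives up the type checking and overload behaviour of a real function.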

(As a separate issue, I've noticed that std::is_constant_evaluated() isn't
marked always_inline, which should be fixed so it's at least on a par with
std::__is_constant_evaluated(), even if that has bad codegen for -O0.)
