https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117742

            Bug ID: 117742
           Summary: Inefficient code for __builtin_clear_padding
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nate at thatsmathematics dot com
  Target Milestone: ---

With all versions of `gcc -O3` on godbolt, the following C++ code

struct A {
    char c;
    int l;
};

A clear_padding(A x) {
    __builtin_clear_padding(&x);
    return x;
}

compiles on aarch64 to

clear_padding(A):
        sub     sp, sp, #16
        strh    wzr, [sp]
        strb    wzr, [sp, 2]
        ldr     x1, [sp]
        add     sp, sp, 16
        bfi     x0, x1, 8, 24
        ret

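where for some reason it writes zeros to the stack and then loads them back to
insert into the struct.  This round trip through memory is wholly unnecessary:
the padding occupies bytes 1-3 of the struct, i.e. bits 8-31 of x0, so the
function body could be the single instruction
`and x0, x0, #0xffffffff000000ff` (followed by `ret`).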
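
To make the expected result concrete, here is a minimal sketch of the same
operation written by hand as a 64-bit mask.  It assumes a little-endian target
where `sizeof(A) == 8` and `l` sits at offset 4 (true for aarch64 and x86-64
Linux, but not guaranteed by the standard).  The helper names
`clear_padding_by_mask`, `object_bits` and `example` are made up for this
illustration and are not part of GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct A {
    char c;
    int l;
};

// Layout assumptions for this sketch; other ABIs may differ.
static_assert(sizeof(A) == 8, "expected an 8-byte struct");
static_assert(offsetof(A, l) == 4, "expected padding in bytes 1..3");
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ != __ORDER_LITTLE_ENDIAN__
#error "the mask below assumes a little-endian target"
#endif

// Hypothetical helper: zero the padding bytes of *p with one AND.
void clear_padding_by_mask(A* p) {
    std::uint64_t bits;
    std::memcpy(&bits, p, sizeof bits);
    bits &= 0xffffffff000000ffULL;  // keep c (bits 0-7) and l (bits 32-63)
    std::memcpy(p, &bits, sizeof bits);
}

// Hypothetical helper: read the object representation as one integer.
std::uint64_t object_bits(const A& x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Usage: start from an all-ones object representation, then clear padding.
void example() {
    A x;
    const std::uint64_t ones = ~std::uint64_t{0};
    std::memcpy(&x, &ones, sizeof x);
    clear_padding_by_mask(&x);
    assert(object_bits(x) == 0xffffffff000000ffULL);
}
```

Only the three padding bytes change; `c` and `l` keep their values, which is
what a single `and` on the register holding the struct would achieve.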

This also shows up in places like `std::atomic<A>::compare_exchange_weak()`
which need to clear padding.
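
For context, a minimal sketch of the kind of user code affected is a
compare-and-swap loop on `std::atomic<A>`.  Whether and where the library
calls `__builtin_clear_padding` depends on the standard library version and
language mode, so treat that part as an assumption; the helper name
`exchange_via_cas` is made up for this illustration.

```cpp
#include <atomic>

struct A {
    char c;
    int l;
};

// Hypothetical helper: atomically replace the stored value with `desired`
// and return the previous value.  When compare_exchange_weak fails it
// reloads `expected` with the current contents, so the loop simply retries.
A exchange_via_cas(std::atomic<A>& target, A desired) {
    A expected = target.load();
    while (!target.compare_exchange_weak(expected, desired)) {
        // retry with the refreshed `expected`
    }
    return expected;
}
```

Each iteration of such a loop may clear padding in `expected` before the
comparison, so any extra stores and loads there sit on a hot path.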

Other target architectures are similar.
