https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117742
Bug ID: 117742
Summary: Inefficient code for __builtin_clear_padding
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: nate at thatsmathematics dot com
Target Milestone: ---
With every GCC version available on godbolt, at `-O3`, the following C++ code
struct A {
    char c;
    int l;
};
A clear_padding(A x) {
    __builtin_clear_padding(&x);
    return x;
}
compiles on aarch64 to
clear_padding(A):
        sub     sp, sp, #16
        strh    wzr, [sp]
        strb    wzr, [sp, 2]
        ldr     x1, [sp]
        add     sp, sp, 16
        bfi     x0, x1, 8, 24
        ret
Here the compiler stores zeros to the stack and then loads them back in order
to insert them into the struct. The round trip through memory is unnecessary:
the whole function body could be the single instruction
`and x0, x0, #0xffffffff000000ff` followed by `ret`.
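To make the claimed equivalence concrete, here is a minimal sketch of the same operation written as an explicit mask. The helper name `clear_padding_by_mask` is hypothetical and is not part of GCC. The sketch assumes a little-endian target where `A` occupies 8 bytes with `l` at offset 4, so the padding is bytes 1..3 (bits 8..31 of the register holding the struct). I have not checked what code GCC generates for it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct A {
    char c;
    int l;
};

// Layout assumptions: true on the usual aarch64 and x86-64 ABIs,
// not guaranteed by the language.
static_assert(sizeof(A) == sizeof(std::uint64_t), "expects an 8-byte struct");
static_assert(offsetof(A, l) == 4, "expects padding in bytes 1..3");

// Hypothetical helper, for illustration only: zero the padding of `x`
// with one 64-bit AND instead of __builtin_clear_padding.
void clear_padding_by_mask(A &x) {
    // The mask below is only correct for little-endian byte order.
    const std::uint32_t probe = 1;
    unsigned char low_byte;
    std::memcpy(&low_byte, &probe, 1);
    assert(low_byte == 1 && "mask assumes little-endian byte order");

    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    bits &= 0xffffffff000000ffULL;  // keep c (bits 0..7) and l (bits 32..63)
    std::memcpy(&x, &bits, sizeof bits);
}
```

The mask keeps exactly the bits that `bfi x0, x1, 8, 24` leaves untouched in the output above, which is why a single `and` would suffice.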
This also shows up in places like `std::atomic<A>::compare_exchange_weak()`
which need to clear padding.
Other target architectures are similar.