https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118276

--- Comment #6 from Ben FrantzDale <benfrantzdale at gmail dot com> ---
I think I understand. You are saying that gcc wants to (or must?) zero out the
entire struct in the trivial case, which includes `S() = default;`, but with
`S() noexcept {}` it winds up on a code path where it knows it doesn't have to
zero the padding. That leads to the xmm version, which turned out faster on
the quick-bench.com CPU?

This is backed up by the fact that if I do
```
    S() noexcept {
        std::memset(this, 0, sizeof(*this));
    }
```
I get the `stosq` version, and if I leave c and x uninitialized but `memset`
only those two members to zero, I still get the xmm version. https://godbolt.org/z/bbnc8a66E
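
For readers without the Godbolt links open, here is a minimal sketch of the three variants being compared. The member layout (`char c` followed by `double x`, so there is padding between them) and the names `SDefault`, `SEmpty`, `SMemset`, and the `make_*` helpers are assumptions for illustration only; the real `S` is in the links. As far as I understand the value-initialization rules, `T()` zero-initializes the whole object (padding included) before running a constructor that is not user-provided, while a user-provided constructor is simply called, which would explain why only the first and third variants have to clear every byte.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for S; the actual definition is in the Godbolt links.

// Variant 1: defaulted constructor. SDefault() zero-initializes the whole
// object first, padding included, then applies the member initializers.
struct SDefault {
    char c = 0;
    double x = 0;
    SDefault() = default;
};

// Variant 2: user-provided empty constructor. SEmpty() only runs the
// constructor, so the padding between c and x may be left untouched.
struct SEmpty {
    char c = 0;
    double x = 0;
    SEmpty() noexcept {}
};

// Variant 3: user-provided constructor that clears every byte explicitly.
// The void* cast only silences GCC's -Wclass-memaccess warning.
struct SMemset {
    char c;
    double x;
    SMemset() noexcept { std::memset(static_cast<void*>(this), 0, sizeof(*this)); }
};

// All three layouts are expected to contain padding after c.
static_assert(sizeof(SDefault) > sizeof(char) + sizeof(double), "expected padding");

// Functions whose generated code can be compared in a compiler explorer.
SDefault make_default() { return SDefault(); }
SEmpty make_empty() { return SEmpty(); }
SMemset make_memset() { return SMemset(); }

// The observable member values are the same in every variant; only the
// treatment of the padding bytes (and hence the codegen) may differ.
void check_members_zero() {
    SDefault a = make_default();
    SEmpty b = make_empty();
    SMemset m = make_memset();
    assert(a.c == 0 && a.x == 0.0);
    assert(b.c == 0 && b.x == 0.0);
    assert(m.c == 0 && m.x == 0.0);
}
```

Compiling the three `make_*` functions at `-O2` and comparing the output is one way to see whether a given compiler picks a full-object clear or member-wise stores for each variant; which instructions actually appear depends on the compiler version and target.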



So, is this inconsistency a bug, or just that the optimizer honestly thinks
it's faster to use `stosq` (which may be true on some CPUs)?

FWIW Clang generates the XMM version for all cases:
https://godbolt.org/z/EjGEcd4rv
