https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112666
Bug ID: 112666
Summary: Missed optimization: Value initialization zero-initializes members with user-defined constructor
Product: gcc
Version: 11.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: paisanafc at gmail dot com
Target Milestone: ---

Looking for "memset" calls in the generated assembly, it seems that gcc is zero-initializing class members that have user-defined constructors and so shouldn't need to be zero-initialized. I share below the example benchmark and a godbolt link for convenience (https://godbolt.org/z/158q6sfen). I used the benchmark library because I didn't know an easy way to reproduce the effect of `benchmark::DoNotOptimize`. I hope that's ok.

---
#include <benchmark/benchmark.h>
#include <array>

struct A {
  A() = default;
  ~A() {
    benchmark::DoNotOptimize(c); // avoid inlining
  }
  std::array<char, 50000> member;
  char c;
};

struct B {
  B() {} // user-defined ctor
  ~B() {
    benchmark::DoNotOptimize(c); // avoid inlining
  }
  std::array<char, 50000> member;
  char c;
};

struct C {
  // no user-defined ctor
  B b;
  int dummy;
};

// The benchmark code:

static void ACreation(benchmark::State& state) {
  for (auto _ : state) {
    A a{};
    benchmark::DoNotOptimize(a);
  }
}
BENCHMARK(ACreation);

static void BCreation(benchmark::State& state) {
  for (auto _ : state) {
    B b{};
    benchmark::DoNotOptimize(b);
  }
}
BENCHMARK(BCreation);

static void CCreation(benchmark::State& state) {
  for (auto _ : state) {
    C c{};
    benchmark::DoNotOptimize(c);
  }
}
BENCHMARK(CCreation);

BENCHMARK_MAIN();
---

When I run this with https://github.com/google/benchmark, I get the following results (with g++ 11.4 and above):

-----------------------------------------------------
Benchmark           Time             CPU   Iterations
-----------------------------------------------------
ACreation         736 ns          736 ns       933741
BCreation        3.62 ns         3.62 ns    191180154
CCreation         755 ns          754 ns       944906

The struct "C", which is just "B" plus an int, is much slower to initialize than "B" when value initialization (via {}) is used. However, my understanding of the C++ standard is that members with a user-defined default constructor do not need to be zero-initialized in this situation.

Looking at the godbolt assembly output, I see that both `A a{}` and `C c{}` generate a call to memset, while `B b{}` doesn't. Clang, on the other hand, seems to initialize C almost as fast as B.

This potentially missed optimization in gcc is particularly nasty for structs with large embedded storage (e.g. structs that contain C-arrays, std::arrays, or static_vectors).
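
For anyone who wants to inspect the assembly without the benchmark library, here is a minimal sketch of a reduced test case. It makes a few assumptions: `escape` is a hypothetical stand-in for `benchmark::DoNotOptimize`, built on the GCC/Clang extended-asm idiom of an empty statement with a "memory" clobber, and `create_b` / `create_c` are illustrative names. The destructors from the original example are omitted, so this reduced form may not reproduce the memset on every compiler version.

```cpp
#include <array>

// Hypothetical stand-in for benchmark::DoNotOptimize (GCC/Clang only).
// The empty asm statement receives the object's address and clobbers
// memory, so the compiler must assume the object is read and modified.
inline void escape(void* p) {
    asm volatile("" : : "g"(p) : "memory");
}

struct B {
    B() {}  // user-provided default ctor: members are left uninitialized
    std::array<char, 50000> member;
    char c;
};

struct C {  // no user-declared ctor, so C is an aggregate
    B b;
    int dummy;
};

// Value-initializes a B; B's own constructor runs, no zeroing is required.
void create_b() {
    B b{};
    escape(&b);
}

// Initializes a C with empty braces; only `dummy` has to end up zero.
int create_c() {
    C c{};
    escape(&c);
    return c.dummy;
}
```

Compiling this with something like `g++ -O2 -S` and searching the output of `create_c` for `memset` should show whether the 50000-byte member is being cleared in addition to `dummy`.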