https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81501
--- Comment #1 from Julian Andres Klode <j...@jak-linux.org> --- To quantify the performance overhead, I added empty constructors and destructors marked noinline, compiled the code with g++ and clang++, and then ran a loop of 100000000 iterations over the function. The clang++ code took 6 nanoseconds per iteration and the g++ code 8 nanoseconds, which is 33% worse. The overhead is probably negligible after all, but this seems like a sensible and maybe easy optimization to do.
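
A rough sketch of how such a measurement could be set up, for anyone who wants to reproduce it. This is not the code used for the numbers above: `Probe`, `function_under_test`, `ns_per_iteration` and `relative_slowdown` are hypothetical names, and the body of `function_under_test` is only a stand-in for the function from the bug report. The empty `asm volatile("")` statements are an added assumption, there to keep the compiler from discarding calls to the otherwise empty noinline functions.

```cpp
#include <chrono>

// Hypothetical type with an empty, non-inlined constructor and destructor.
struct Probe {
    __attribute__((noinline)) Probe() { asm volatile(""); }
    __attribute__((noinline)) ~Probe() { asm volatile(""); }
};

// Stand-in for the function being measured (assumption: the real test
// case from the bug report would go here instead).
__attribute__((noinline)) void function_under_test() {
    Probe p;
}

// Calls function_under_test() `iterations` times and returns the
// average wall-clock time per call in nanoseconds.
double ns_per_iteration(long iterations) {
    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        function_under_test();
    }
    auto end = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = end - start;
    return elapsed.count() / static_cast<double>(iterations);
}

// Relative slowdown of `slower` compared to `baseline`,
// e.g. 8 ns against 6 ns gives (8 - 6) / 6, roughly 0.33.
double relative_slowdown(double baseline, double slower) {
    return (slower - baseline) / baseline;
}
```

Calling `ns_per_iteration(100000000)` from `main` in a binary built once with g++ and once with clang++ at the same optimization level gives the two per-iteration figures, and `relative_slowdown` turns them into a percentage. Absolute numbers will differ by machine and compiler version.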