https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113258
--- Comment #3 from Nicholas Miell <nmiell at gmail dot com> ---
Windows and macOS (and Solaris) have sensible dynamic linker semantics where the pre-C++17 code and the post-C++17 code use two different C++ runtimes, and furthermore the application can use tcmalloc while the GPU driver uses whatever allocator it feels like.

GCC has dealt with similar issues before, cf. bug 68210.

One way to fix this issue would probably be to cause the libstdc++-supplied std::align_val_t variants to always invoke the pre-C++17 operators, since that is the implicit ABI contract.
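
To make the suggested fix concrete, here is a minimal sketch of how an over-aligned request could be served purely through the pre-C++17 ::operator new(std::size_t) and ::operator delete(void*). This is not libstdc++'s actual implementation. The helper names aligned_new_via_plain_new and aligned_delete_via_plain_delete are hypothetical, and the alignment is passed as a plain std::size_t (an aligned operator new would pass static_cast<std::size_t>(al) for its std::align_val_t argument). The approach shown is over-allocation with a saved back-pointer, which is only one of several possible schemes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical helper: serve an aligned request using only the unaligned,
// pre-C++17 ::operator new(std::size_t), so that a replaced plain
// operator new (e.g. one provided by tcmalloc) sees every allocation.
void* aligned_new_via_plain_new(std::size_t size, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);  // must be a power of two

    // Extra room: worst-case padding plus one saved pointer.
    const std::size_t extra = align - 1 + sizeof(void*);
    if (size > static_cast<std::size_t>(-1) - extra)
        throw std::bad_alloc();  // size + extra would overflow

    void* raw = ::operator new(size + extra);

    // First address past the saved-pointer slot, rounded up to `align`.
    const std::uintptr_t start =
        reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    const std::uintptr_t aligned =
        (start + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);

    // Save the original pointer just below the aligned block. memcpy is
    // used because that slot may itself be misaligned when align is small.
    unsigned char* user = reinterpret_cast<unsigned char*>(aligned);
    std::memcpy(user - sizeof(void*), &raw, sizeof raw);
    return user;
}

// Hypothetical counterpart: recover the original pointer and release it
// through the unaligned, pre-C++17 ::operator delete(void*).
void aligned_delete_via_plain_delete(void* p) noexcept {
    if (p == nullptr) return;
    void* raw = nullptr;
    std::memcpy(&raw, static_cast<unsigned char*>(p) - sizeof(void*), sizeof raw);
    ::operator delete(raw);
}
```

With this scheme, a pointer obtained from the aligned path must be released through the matching aligned path, since the address handed to the caller is generally not the address the underlying allocator returned.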