https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #3 from Matthias Kretz (Vir) <mkretz at gcc dot gnu.org> ---
The stdx::simd implementation in this area is old and was tuned mainly for correctness. I can rewrite the split and concat implementation to use __builtin_shufflevector, which was not available in GCC when I originally implemented it. Doing so would resolve this issue.

How do you want to handle this? It would certainly be nice if the compiler could optimize this in the same way Clang does. Should I try to come up with a testcase that doesn't need stdx::simd and then improve stdx::simd independently?