https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98461
--- Comment #8 from Denis Yaroshevskiy <denis.yaroshevskij at gmail dot com> ---
Thank you for the fast fix. I can already see that the code is in trunk and works for both 256-bit and 128-bit registers.

256: https://godbolt.org/z/5sT48f
128: https://godbolt.org/z/Exo3d9

I am a bit confused as to why the 128-bit codegen differs between the two cases, but both seem reasonable at a glance.