https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120687
--- Comment #3 from Robin Dapp <rdapp at gcc dot gnu.org> ---
Yeah, for 8 elements we still have a mode but beyond 8 we at least cannot do a segment access anymore. Then we try with even/odd or interleaved permutations. I kind of wonder why the cost model doesn't reject them, that's probably the saner thing to do for now. On x86 at least, not vectorizing such a loop was the more performant way last I checked.
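For illustration only, here is a minimal sketch of the kind of loop being discussed: a grouped access whose group size is 9, one more than the 8 fields a RISC-V vector segment load (vlseg2 through vlseg8) can cover. The function name `sum_groups` and the group size are assumptions made for this example; it is not the testcase from the PR.

```c
#include <stddef.h>

/* Hypothetical example: each output element consumes a group of 9
   adjacent input elements.  With a group size of 8 a single segment
   load could fetch all fields; with 9 the vectorizer has to fall back
   to permutations (or give up), which is the case the comment is about.  */
void sum_groups(const int *restrict in, int *restrict out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const int *g = in + 9 * i;   /* start of the i-th group */
        out[i] = g[0] + g[1] + g[2] + g[3] + g[4]
               + g[5] + g[6] + g[7] + g[8];
    }
}
```

Compiling something like this with `-O3 -march=rv64gcv -fopt-info-vec` should show whether the loop gets vectorized on a given GCC revision; the result depends on the compiler version and cost model in use.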