https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102404
--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Freddie Witherden from comment #4)
> Created attachment 51485 [details]
> Clang assembly.

This seems to be because the GCC loop vectorizer currently does not support
mixing different vector sizes, and here the index vector is 256-bit.
Changing the trip count to 16 successfully generates zmm code.
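
For illustration only, here is a minimal sketch of the kind of loop being discussed. It is not the reporter's test case: the function names, the `double` element type, and the 32-bit `int` index type are all assumptions, chosen so that 8 indices occupy 256 bits while 8 data elements occupy 512 bits. With a trip count of 16, the 16 indices fill a full 512-bit vector, which is the situation the comment says vectorizes with zmm.

```c
/* Hypothetical reduced example; names and element types are assumptions. */

/* Trip count 8: 8 doubles = 512 bits of data, but 8 ints = 256 bits of
 * indices, so the data and index vectors would have different sizes. */
void gather8(double *restrict out, const double *restrict in,
             const int *restrict idx)
{
    for (int i = 0; i < 8; i++)
        out[i] = in[idx[i]];
}

/* Trip count 16: 16 ints = 512 bits of indices, matching the width of a
 * zmm register (the 16 doubles then span two 512-bit vectors). */
void gather16(double *restrict out, const double *restrict in,
              const int *restrict idx)
{
    for (int i = 0; i < 16; i++)
        out[i] = in[idx[i]];
}
```

Something like `gcc -O3 -march=skylake-avx512 -mprefer-vector-width=512 -S` should be enough to compare the assembly of the two functions; the exact flags used in the report are not stated here, so treat these as a guess.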