https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111796
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCN handles this fine using simdlen(64).

  <bb 3> [local count: 939524096]:
  # ivtmp.31_10 = PHI <ivtmp.31_11(3), ivtmp.31_4(2)>
  vectp_x.27_17 = (vector(64) int *) ivtmp.31_10;
  vect__4.24_14 = MEM <vector(64) int> [(int *)vectp_x.27_17];
  vect__5.25_15 = (vector(64) short int) vect__4.24_14;
  vect__6.26_16 = foo.simdclone.0 (vect__4.24_14, vect__5.25_15);
  MEM <vector(64) int> [(int *)vectp_x.27_17] = vect__6.26_16;
  ivtmp.31_11 = ivtmp.31_10 + 256;
  if (_8 != ivtmp.31_11)
    goto <bb 3>; [85.71%]
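
For readers without the testcase at hand, the following is a minimal sketch of the kind of source that could give a loop of this shape. It is reconstructed from the dump only and is not the testcase attached to the PR: the body of foo, the array size N, and the helper name run are assumptions. The dump only shows that an int vector is loaded from x, narrowed to short, passed together with the int vector to a 64-lane clone of foo, and stored back to x.

```c
#define N 1024              /* assumed size, not taken from the PR */

int x[N];

/* Hypothetical body: the dump only shows the signature (int, short) -> int.
   simdlen(64) requests a clone that processes 64 lanes per call.  */
#pragma omp declare simd simdlen(64) notinbranch
int foo(int a, short b)
{
    return a + b;
}

/* Each iteration passes x[i] once as int and once narrowed to short,
   matching the (vector(64) short int) conversion in the dump.  */
void run(void)
{
    #pragma omp simd
    for (int i = 0; i < N; i++)
        x[i] = foo(x[i], x[i]);
}
```

The induction variable step of 256 in the dump is consistent with this: 64 lanes times 4 bytes per int gives 256 bytes per vector iteration. Without -fopenmp or -fopenmp-simd the pragmas are ignored and the loop simply runs as scalar code.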