https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70130
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
            Status |NEW                           |ASSIGNED
          Assignee |unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
The preprocessed source hints at the

  for (j=0; j < MB_BLOCK_SIZE; j++)
    {
      for (i=0; i < MB_BLOCK_SIZE; i++)
        {
          img->mprr_2[VERT_PRED_16][j][i]=s[i][0]; // store vertical prediction
          img->mprr_2[HOR_PRED_16 ][j][i]=s[j][1]; // store horizontal prediction
          img->mprr_2[DC_PRED_16  ][j][i]=s0;      // store DC prediction
        }
    }

loop instead.  Where I see

  bb3:
    vect__39.11_16 = MEM[(int *)vectp_s.13_15];
    vect__39.14_17 = __builtin_altivec_mask_for_load (vectp_s.13_13);
  ...
  bb4:
    # vect__39.15_18 = PHI <vect__39.18_24(8), vect__39.11_16(3)>
    # vectp_s.16_20 = PHI <vectp_s.16_22(8), vectp_s.17_19(3)>
    vectp_s.16_23 = vectp_s.16_20 & -16B;
    vect__39.18_24 = MEM[(int *)vectp_s.16_23];
    vect__39.19_1 = REALIGN_LOAD <vect__39.15_18, vect__39.18_24, vect__39.14_17>;
    vectp_s.16_314 = vectp_s.16_20 + 18446744073709551608;   (oops)
  ...
    vectp_s.16_22 = vectp_s.16_314 + 16;
  ...
    if (ivtmp_210 < 15)
      goto <bb 8>;
    else
      goto <bb 7>;

  <bb 8>:
    goto <bb 4>;

The IV adjustment marked (oops) breaks the realign-optimized handling, I believe.  It adjusts for the group gap (it subtracts 8 from the pointer) and thus invalidates both the mask and the previous load value.
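To see why a mid-loop pointer adjustment is fatal here, the realign-load scheme can be modeled in scalar C.  This is a hypothetical sketch, not the vectorizer's actual code: `mask_for_load` and `realign_load` are illustrative stand-ins for `__builtin_altivec_mask_for_load` and `REALIGN_LOAD`, with a "vector" modeled as four ints.  The mask is computed once from the initial pointer, so it is only valid while the pointer's misalignment stays constant across iterations:

```c
#include <stdint.h>

enum { VF = 4 };                       /* ints per 16-byte vector */

/* Misalignment of p in ints, computed ONCE before the loop --
   the role of __builtin_altivec_mask_for_load in the dump.  */
static unsigned mask_for_load(const int *p)
{
  return ((uintptr_t)p & 15u) / sizeof(int);
}

/* REALIGN_LOAD <lo, hi, mask>: select VF consecutive ints starting
   at offset `mask` within the concatenation lo|hi of two aligned
   vector loads.  */
static void realign_load(int out[VF], const int lo[VF],
                         const int hi[VF], unsigned mask)
{
  for (unsigned i = 0; i < VF; i++)
    out[i] = (mask + i < VF) ? lo[mask + i] : hi[mask + i - VF];
}
```

If the loop then subtracts 8 bytes from the pointer for the group gap, its misalignment changes, so the mask computed in the preheader no longer selects the right lanes, and the carried "previous aligned load" no longer covers the bytes preceding the new pointer.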