https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
   Target Milestone|---                         |14.0

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #5)
> For example on full-1.c int8_t type:
>
>   <bb 3> [local count: 75161909]:
>   # vectp_a_int8_t.4_18 = PHI <vectp_a_int8_t.4_17(5), &a_int8_t(2)>
>   # vectp_b_int8_t.8_8 = PHI <vectp_b_int8_t.8_7(5), &b_int8_t(2)>
>   # vectp_c_int8_t.14_26 = PHI <vectp_c_int8_t.14_27(5), &c_int8_t(2)>
>   # ivtmp_29 = PHI <ivtmp_30(5), 0(2)>
>   # loop_len_16 = PHI <_34(5), 16(2)>
>   vect__1.6_13 = .LEN_LOAD (vectp_a_int8_t.4_18, 8B, loop_len_16, 0);
>   vect__2.7_12 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__1.6_13);
>   vect__3.10_22 = .LEN_LOAD (vectp_b_int8_t.8_8, 8B, loop_len_16, 0);
>   vect__4.11_23 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__3.10_22);
>   vect__5.12_24 = vect__2.7_12 + vect__4.11_23;
>   vect__6.13_25 = VIEW_CONVERT_EXPR<vector(16) signed char>(vect__5.12_24);
>   .LEN_STORE (vectp_c_int8_t.14_26, 8B, loop_len_16, vect__6.13_25, 0);
>   vectp_a_int8_t.4_17 = vectp_a_int8_t.4_18 + 16;
>   vectp_b_int8_t.8_7 = vectp_b_int8_t.8_8 + 16;
>   vectp_c_int8_t.14_27 = vectp_c_int8_t.14_26 + 16;
>   ivtmp_30 = ivtmp_29 + 16;
>   _32 = MIN_EXPR <ivtmp_30, 127>;
>   _33 = 127 - _32;
>   _34 = MIN_EXPR <_33, 16>;
>   if (ivtmp_30 <= 126)

With this exit condition niter analysis can work.

>     goto <bb 5>; [85.71%]
>   else
>     goto <bb 4>; [14.29%]
>
> vs.
>
>   <bb 3> [local count: 75161909]:
>   # vectp_a_int8_t.4_18 = PHI <vectp_a_int8_t.4_17(5), &a_int8_t(2)>
>   # vectp_b_int8_t.8_8 = PHI <vectp_b_int8_t.8_7(5), &b_int8_t(2)>
>   # vectp_c_int8_t.14_26 = PHI <vectp_c_int8_t.14_27(5), &c_int8_t(2)>
>   # ivtmp_29 = PHI <ivtmp_30(5), 127(2)>
>   loop_len_16 = MIN_EXPR <ivtmp_29, 16>;
>   vect__1.6_13 = .LEN_LOAD (vectp_a_int8_t.4_18, 8B, loop_len_16, 0);
>   vect__2.7_12 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__1.6_13);
>   vect__3.10_22 = .LEN_LOAD (vectp_b_int8_t.8_8, 8B, loop_len_16, 0);
>   vect__4.11_23 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__3.10_22);
>   vect__5.12_24 = vect__2.7_12 + vect__4.11_23;
>   vect__6.13_25 = VIEW_CONVERT_EXPR<vector(16) signed char>(vect__5.12_24);
>   .LEN_STORE (vectp_c_int8_t.14_26, 8B, loop_len_16, vect__6.13_25, 0);
>   vectp_a_int8_t.4_17 = vectp_a_int8_t.4_18 + 16;
>   vectp_b_int8_t.8_7 = vectp_b_int8_t.8_8 + 16;
>   vectp_c_int8_t.14_27 = vectp_c_int8_t.14_26 + 16;
>   ivtmp_30 = ivtmp_29 - loop_len_16;
>   if (ivtmp_30 != 0)

While here it will fail because ivtmp_30 isn't affine - it doesn't decrement
by an invariant amount but instead by MIN <ivtmp_29, 16>.  Note this will
not only pessimize niter analysis but all analyses relying on SCEV (for
uses of this IV!).

The decrement is essentially saturating to zero so we might be able to
special-case this in niter analysis - but still I don't see how to
generally handle this in SCEV.

If we know that niter will fit into a signed IV we could rewrite the
exit test to ivtmp_30 > 0 and decrement by constant 16.  Alternatively
one can test the pre-decrement value, in the above case
if (ivtmp_29 >= 16) which isn't ideal for IV coalescing later but it
also allows ivtmp_30 = ivtmp_29 - 16; here.