https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
   Target Milestone|---                         |14.0

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #5)
> For example on full-1.c int8_t type:
> 
>   <bb 3> [local count: 75161909]:
>   # vectp_a_int8_t.4_18 = PHI <vectp_a_int8_t.4_17(5), &a_int8_t(2)>
>   # vectp_b_int8_t.8_8 = PHI <vectp_b_int8_t.8_7(5), &b_int8_t(2)>
>   # vectp_c_int8_t.14_26 = PHI <vectp_c_int8_t.14_27(5), &c_int8_t(2)>
>   # ivtmp_29 = PHI <ivtmp_30(5), 0(2)>
>   # loop_len_16 = PHI <_34(5), 16(2)>
>   vect__1.6_13 = .LEN_LOAD (vectp_a_int8_t.4_18, 8B, loop_len_16, 0);
>   vect__2.7_12 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__1.6_13);
>   vect__3.10_22 = .LEN_LOAD (vectp_b_int8_t.8_8, 8B, loop_len_16, 0);
>   vect__4.11_23 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__3.10_22);
>   vect__5.12_24 = vect__2.7_12 + vect__4.11_23;
>   vect__6.13_25 = VIEW_CONVERT_EXPR<vector(16) signed char>(vect__5.12_24);
>   .LEN_STORE (vectp_c_int8_t.14_26, 8B, loop_len_16, vect__6.13_25, 0);
>   vectp_a_int8_t.4_17 = vectp_a_int8_t.4_18 + 16;
>   vectp_b_int8_t.8_7 = vectp_b_int8_t.8_8 + 16;
>   vectp_c_int8_t.14_27 = vectp_c_int8_t.14_26 + 16;
>   ivtmp_30 = ivtmp_29 + 16;
>   _32 = MIN_EXPR <ivtmp_30, 127>;
>   _33 = 127 - _32;
>   _34 = MIN_EXPR <_33, 16>;
>   if (ivtmp_30 <= 126)

With this exit condition niter analysis can work.

>     goto <bb 5>; [85.71%]
>   else
>     goto <bb 4>; [14.29%]
> 
> vs.
> 
>   <bb 3> [local count: 75161909]:
>   # vectp_a_int8_t.4_18 = PHI <vectp_a_int8_t.4_17(5), &a_int8_t(2)>
>   # vectp_b_int8_t.8_8 = PHI <vectp_b_int8_t.8_7(5), &b_int8_t(2)>
>   # vectp_c_int8_t.14_26 = PHI <vectp_c_int8_t.14_27(5), &c_int8_t(2)>
>   # ivtmp_29 = PHI <ivtmp_30(5), 127(2)>
>   loop_len_16 = MIN_EXPR <ivtmp_29, 16>;
>   vect__1.6_13 = .LEN_LOAD (vectp_a_int8_t.4_18, 8B, loop_len_16, 0);
>   vect__2.7_12 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__1.6_13);
>   vect__3.10_22 = .LEN_LOAD (vectp_b_int8_t.8_8, 8B, loop_len_16, 0);
>   vect__4.11_23 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__3.10_22);
>   vect__5.12_24 = vect__2.7_12 + vect__4.11_23;
>   vect__6.13_25 = VIEW_CONVERT_EXPR<vector(16) signed char>(vect__5.12_24);
>   .LEN_STORE (vectp_c_int8_t.14_26, 8B, loop_len_16, vect__6.13_25, 0);
>   vectp_a_int8_t.4_17 = vectp_a_int8_t.4_18 + 16;
>   vectp_b_int8_t.8_7 = vectp_b_int8_t.8_8 + 16;
>   vectp_c_int8_t.14_27 = vectp_c_int8_t.14_26 + 16;
>   ivtmp_30 = ivtmp_29 - loop_len_16;
>   if (ivtmp_30 != 0)

While here it will fail because ivtmp_30 isn't affine: it doesn't
decrement by an invariant amount but by MIN <ivtmp_29, 16>.

Note this pessimizes not only niter analysis but every analysis relying
on SCEV (for uses of this IV!).
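
To make the difference between the two IV shapes concrete, here is a
minimal scalar C sketch of both loops. It is only an illustration, not
vectorizer output: the function names, VF and min_sz are made up for
this example, and the loads/stores are left out so only the IV and the
exit test remain. It assumes n > 0 and n small enough that iv + 16
does not wrap.

```c
#include <stddef.h>

#define VF 16 /* bytes per vector iteration, as in the dumps (assumed name) */

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* First dump: the IV counts up by the invariant VF, so it is affine
   ({0, +, 16}) and the exit test compares it against a constant.  */
size_t iters_count_up(size_t n)
{
    size_t iv = 0, len = min_sz(n, VF), iters = 0;
    do {
        /* .LEN_LOAD / .LEN_STORE of 'len' bytes would happen here */
        iters++;
        iv += VF;                             /* ivtmp_30 = ivtmp_29 + 16 */
        len = min_sz(n - min_sz(iv, n), VF);  /* _32, _33, _34 */
    } while (iv <= n - 1);                    /* ivtmp_30 <= 126 for n = 127 */
    (void) len;
    return iters;
}

/* Second dump: the IV counts down by MIN (iv, VF).  The step depends on
   the IV itself, so the evolution is not affine.  *last_step receives
   the final decrement to show that it differs from VF.  */
size_t iters_count_down(size_t n, size_t *last_step)
{
    size_t iv = n, iters = 0;
    do {
        size_t len = min_sz(iv, VF);          /* loop_len_16 */
        iters++;
        iv -= len;                            /* ivtmp_30 = ivtmp_29 - loop_len_16 */
        *last_step = len;
    } while (iv != 0);                        /* ivtmp_30 != 0 */
    return iters;
}
```

For n = 127 both functions run 8 iterations, but the count-down form
takes a final step of 15 instead of 16, which is exactly what keeps
the IV from having an invariant step.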

The decrement is essentially saturating to zero, so we might be able to
special-case this in niter analysis, but I still don't see how to
handle this generally in SCEV.  If we know that niter will fit into
a signed IV, we could rewrite the exit test to ivtmp_30 > 0 and decrement
by the constant 16.  Alternatively one can test the pre-decrement value,
in the above case

  if (ivtmp_29 >= 16)

which isn't ideal for IV coalescing later but it also allows

  ivtmp_30 = ivtmp_29 - 16;

here.
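
The two rewrites suggested above can be sketched in scalar C as
follows. Again this is only an illustration under assumptions: the
function names and VF are hypothetical, the memory operations are
omitted, and n is assumed to be positive (and, for the signed variant,
to fit into a long).

```c
#include <stddef.h>

#define VF 16 /* bytes per vector iteration, as in the dumps (assumed name) */

/* Rewrite 1: signed IV, decrement by the constant VF, exit test "> 0".
   The IV is affine ({n, +, -16}); it may go negative on the last
   iteration, which is why it has to be signed.  */
long iters_signed(long n)
{
    long iv = n, iters = 0;
    do {
        long len = iv < VF ? iv : VF;   /* loop_len = MIN (iv, 16) */
        (void) len;                     /* loads/stores would use len */
        iters++;
        iv -= VF;                       /* ivtmp_30 = ivtmp_29 - 16 */
    } while (iv > 0);                   /* ivtmp_30 > 0 */
    return iters;
}

/* Rewrite 2: unsigned IV, exit test on the pre-decrement value.
   The decrement is only executed when iv >= VF, so it cannot wrap.
   Note: with ">=" an n that is an exact multiple of VF gets one extra
   iteration with len == 0; for n = 127 this does not occur.  */
size_t iters_predec(size_t n)
{
    size_t iv = n, iters = 0;
    for (;;) {
        size_t len = iv < VF ? iv : VF; /* loop_len = MIN (iv, 16) */
        (void) len;
        iters++;
        if (!(iv >= VF))                /* if (ivtmp_29 >= 16) */
            break;
        iv -= VF;                       /* ivtmp_30 = ivtmp_29 - 16 */
    }
    return iters;
}
```

With n = 127 both variants run 8 iterations, the same count as the
original loops, while stepping the IV by a constant in every
iteration.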
