[Bug target/109971] [14 regression] Several powerpc64 vector test cases fail after r14-1242-gf574e2dfae7905

rguenth at gcc dot gnu.org via Gcc-bugs Thu, 25 May 2023 23:22:07 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
   Target Milestone|---                         |14.0

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #5)
> For example on full-1.c int8_t type:
> 
>   <bb 3> [local count: 75161909]:
>   # vectp_a_int8_t.4_18 = PHI <vectp_a_int8_t.4_17(5), &a_int8_t(2)>
>   # vectp_b_int8_t.8_8 = PHI <vectp_b_int8_t.8_7(5), &b_int8_t(2)>
>   # vectp_c_int8_t.14_26 = PHI <vectp_c_int8_t.14_27(5), &c_int8_t(2)>
>   # ivtmp_29 = PHI <ivtmp_30(5), 0(2)>
>   # loop_len_16 = PHI <_34(5), 16(2)>
>   vect__1.6_13 = .LEN_LOAD (vectp_a_int8_t.4_18, 8B, loop_len_16, 0);
>   vect__2.7_12 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__1.6_13);
>   vect__3.10_22 = .LEN_LOAD (vectp_b_int8_t.8_8, 8B, loop_len_16, 0);
>   vect__4.11_23 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__3.10_22);
>   vect__5.12_24 = vect__2.7_12 + vect__4.11_23;
>   vect__6.13_25 = VIEW_CONVERT_EXPR<vector(16) signed char>(vect__5.12_24);
>   .LEN_STORE (vectp_c_int8_t.14_26, 8B, loop_len_16, vect__6.13_25, 0);
>   vectp_a_int8_t.4_17 = vectp_a_int8_t.4_18 + 16;
>   vectp_b_int8_t.8_7 = vectp_b_int8_t.8_8 + 16;
>   vectp_c_int8_t.14_27 = vectp_c_int8_t.14_26 + 16;
>   ivtmp_30 = ivtmp_29 + 16;
>   _32 = MIN_EXPR <ivtmp_30, 127>;
>   _33 = 127 - _32;
>   _34 = MIN_EXPR <_33, 16>;
>   if (ivtmp_30 <= 126)

With this exit condition niter analysis can work.

>     goto <bb 5>; [85.71%]
>   else
>     goto <bb 4>; [14.29%]
> 
> vs.
> 
>   <bb 3> [local count: 75161909]:
>   # vectp_a_int8_t.4_18 = PHI <vectp_a_int8_t.4_17(5), &a_int8_t(2)>
>   # vectp_b_int8_t.8_8 = PHI <vectp_b_int8_t.8_7(5), &b_int8_t(2)>
>   # vectp_c_int8_t.14_26 = PHI <vectp_c_int8_t.14_27(5), &c_int8_t(2)>
>   # ivtmp_29 = PHI <ivtmp_30(5), 127(2)>
>   loop_len_16 = MIN_EXPR <ivtmp_29, 16>;
>   vect__1.6_13 = .LEN_LOAD (vectp_a_int8_t.4_18, 8B, loop_len_16, 0);
>   vect__2.7_12 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__1.6_13);
>   vect__3.10_22 = .LEN_LOAD (vectp_b_int8_t.8_8, 8B, loop_len_16, 0);
>   vect__4.11_23 = VIEW_CONVERT_EXPR<vector(16) unsigned char>(vect__3.10_22);
>   vect__5.12_24 = vect__2.7_12 + vect__4.11_23;
>   vect__6.13_25 = VIEW_CONVERT_EXPR<vector(16) signed char>(vect__5.12_24);
>   .LEN_STORE (vectp_c_int8_t.14_26, 8B, loop_len_16, vect__6.13_25, 0);
>   vectp_a_int8_t.4_17 = vectp_a_int8_t.4_18 + 16;
>   vectp_b_int8_t.8_7 = vectp_b_int8_t.8_8 + 16;
>   vectp_c_int8_t.14_27 = vectp_c_int8_t.14_26 + 16;
>   ivtmp_30 = ivtmp_29 - loop_len_16;
>   if (ivtmp_30 != 0)

Here, however, niter analysis fails because ivtmp_30 isn't affine: it
decrements not by an invariant amount but by MIN_EXPR <ivtmp_29, 16>.

Note this will not only pessimize niter analysis but all analyses relying
on SCEV (for uses of this IV!).
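
To make the non-affine step concrete, here is a small C sketch that replays
the second loop's IV update in scalar code.  The helper name
step_is_invariant and its parameters are made up for illustration and are
not anything in GCC; it only mirrors the two statements that matter from the
dump.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC code: replay the decrementing IV from the
   second dump and report whether every step equals vf.  */
static bool step_is_invariant(unsigned n, unsigned vf)
{
    assert(n > 0 && vf > 0);
    unsigned iv = n;                       /* ivtmp_29, starts at n */
    bool invariant = true;
    do {
        unsigned len = iv < vf ? iv : vf;  /* loop_len_16 = MIN_EXPR <ivtmp_29, 16> */
        if (len != vf)
            invariant = false;             /* this step differs from vf */
        iv -= len;                         /* ivtmp_30 = ivtmp_29 - loop_len_16 */
    } while (iv != 0);                     /* if (ivtmp_30 != 0) */
    return invariant;
}
```

For n = 127 and vf = 16 the last step is 15 rather than 16, so the helper
returns false; only when n is a multiple of vf is every step the same.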

The decrement essentially saturates to zero, so we might be able to
special-case this in niter analysis, but I still don't see how to
handle it generally in SCEV.  If we know that niter fits into a
signed IV, we could rewrite the exit test to ivtmp_30 > 0 and decrement
by the constant 16.  Alternatively, one can test the pre-decrement value,
in the above case

  if (ivtmp_29 >= 16)

which isn't ideal for IV coalescing later but it also allows

  ivtmp_30 = ivtmp_29 - 16;

here.
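
As a rough check of the alternatives above, the following C sketch counts
iterations for the three exit-test shapes.  All function names are
hypothetical and this is scalar code, not what the vectorizer emits; the
signed variant assumes n fits in a long.

```c
#include <assert.h>

/* Shape from the dump: saturating decrement, exit test ivtmp_30 != 0.  */
static unsigned trips_saturating(unsigned n, unsigned vf)
{
    assert(n > 0 && vf > 0);
    unsigned trips = 0, iv = n;
    do {
        unsigned len = iv < vf ? iv : vf;  /* MIN_EXPR <ivtmp_29, vf> */
        iv -= len;
        trips++;
    } while (iv != 0);
    return trips;
}

/* Signed IV, constant decrement, exit test ivtmp_30 > 0.  */
static unsigned trips_signed(long n, long vf)
{
    assert(n > 0 && vf > 0);
    unsigned trips = 0;
    long iv = n;
    do {
        iv -= vf;                          /* affine: step is always vf */
        trips++;
    } while (iv > 0);
    return trips;
}

/* Unsigned IV, constant decrement, exit test on the pre-decrement value.  */
static unsigned trips_pretest(unsigned n, unsigned vf)
{
    assert(n > 0 && vf > 0);
    unsigned trips = 0, iv = n;
    for (;;) {
        unsigned prev = iv;                /* ivtmp_29 */
        iv = prev - vf;                    /* may wrap on the exiting iteration */
        trips++;
        if (!(prev >= vf))                 /* if (ivtmp_29 >= 16) continue */
            break;
    }
    return trips;
}
```

With n = 127 and vf = 16 all three shapes run 8 iterations.  In this sketch,
when n is an exact multiple of vf (for example 128) the >= form of the
pre-decrement test runs one additional iteration whose length would be zero,
so a real implementation would need to account for that case.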
