Jakub Jelinek <ja...@redhat.com> writes:
> Hi!
>
> The following testcase is miscompiled on aarch64-linux since r11-5160.
> Given
>   <bb 3> [local count: 955630225]:
>   # i_22 = PHI <i_20(6), 0(5)>
>   # r_23 = PHI <r_19(6), 0(5)>
> ...
>   a.0_5 = (unsigned char) a_15;
>   _6 = (int) a.0_5;
>   b.1_7 = (unsigned char) b_17;
>   _8 = (int) b.1_7;
>   c_18 = _6 - _8;
>   _9 = ABS_EXPR <c_18>;
>   r_19 = _9 + r_23;
> ...
> where SSA_NAMEs 15/17 have signed char type, 5/7 unsigned char and the rest
> is int, we first pattern recognize c_18 as
> patt_34 = (a.0_5) w- (b.1_7);
> which is still correct, as 5/7 are unsigned char subtracted in a wider type,
> but then vect_recog_sad_pattern turns it into
> SAD_EXPR <a_15, b_17, r_23>
> which is incorrect, because 15/17 are signed char and so it is a
> sum of absolute signed differences rather than an unsigned sum of
> absolute unsigned differences.
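>
> For instance, with 8-bit char, a_15 = -128 and b_17 = 1 (the values the new
> testcase below uses), the two interpretations diverge - a minimal sketch of
> the arithmetic, not GCC code:
>
>   signed char a = -128, b = 1;
>   /* What the original GIMPLE computes: zero-extend, then subtract.  */
>   int good = __builtin_abs ((int) (unsigned char) a
>                             - (int) (unsigned char) b);  /* 128 - 1 -> 127 */
>   /* What SAD_EXPR on the signed char 15/17 would compute: sign-extend,
>      then subtract.  */
>   int bad = __builtin_abs ((int) a - (int) b);           /* -128 - 1 -> 129 */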
> The reason why this happens is that vect_recog_sad_pattern calls
> vect_widened_op_tree with MINUS_EXPR, WIDEN_MINUS_EXPR on the
> patt_34 = (a.0_5) w- (b.1_7); statement's vinfo and vect_widened_op_tree
> calls vect_look_through_possible_promotion on the operands of the
> WIDEN_MINUS_EXPR, which looks through the further casts.
> vect_look_through_possible_promotion has careful code to stop when there
> would be nested casts that need to be preserved, but the problem here
> is that the WIDEN_*_EXPR operation itself has an implicit cast on the
> operands already - here, for WIDEN_MINUS_EXPR, the unsigned char
> 5/7 SSA_NAMEs are widened to unsigned short before the subtraction,
> and vect_look_through_possible_promotion obviously isn't told about that.
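>
> Written out in C (a sketch only; the names merely mirror the SSA names in
> the dump above):
>
>   int
>   chain (signed char a15, signed char b17)
>   {
>     unsigned char a05 = (unsigned char) a15;  /* the explicit cast in the IL */
>     unsigned char b07 = (unsigned char) b17;
>     /* patt_34 = (a05) w- (b07); zero-extends its operands implicitly, so
>        there is no separate cast statement here for
>        vect_look_through_possible_promotion to stop at - it walks straight
>        back to the signed char 15/17.  */
>     return (unsigned short) a05 - (unsigned short) b07;
>   }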
>
> Now, I think when we see those WIDEN_{MULT,MINUS,PLUS}_EXPR codes, we must
> have looked through possible promotions already when creating them, so
> calling vect_look_through_possible_promotion again isn't really needed; all
> we need to do is arrange what that function would do if the operand isn't
> the result of any cast.  The other option would be to let
> vect_look_through_possible_promotion know about the implicit promotion from
> the WIDEN_*_EXPR, but I'm afraid that would be much harder.
>
> Bootstrapped/regtested on aarch64-linux, x86_64-linux, i686-linux and
> powerpc64le-linux, ok for trunk?
>
> 2023-02-08  Jakub Jelinek  <ja...@redhat.com>
>
>       PR tree-optimization/108692
>       * tree-vect-patterns.cc (vect_widened_op_tree): If rhs_code is
>       widened_code which is different from code, don't call
>       vect_look_through_possible_promotion but instead just check op is
>       SSA_NAME with integral type for which vect_is_simple_use is true
>       and call set_op on this_unprom.

OK, thanks.

(I think the INTEGRAL_TYPE_P is redundant, but it can't hurt.)

Richard

>
>       * gcc.dg/pr108692.c: New test.
>
> --- gcc/tree-vect-patterns.cc.jj      2023-01-02 09:32:45.635949342 +0100
> +++ gcc/tree-vect-patterns.cc 2023-02-07 15:27:33.214608837 +0100
> @@ -601,7 +601,25 @@ vect_widened_op_tree (vec_info *vinfo, s
>         if (shift_p && i == 1)
>           return 0;
>  
> -       if (!vect_look_through_possible_promotion (vinfo, op, this_unprom))
> +       if (rhs_code != code)
> +         {
> +           /* If rhs_code is widened_code, don't look through further
> +              possible promotions, there is a promotion already embedded
> +              in the WIDEN_*_EXPR.  */
> +           if (TREE_CODE (op) != SSA_NAME
> +               || !INTEGRAL_TYPE_P (TREE_TYPE (op)))
> +             return 0;
> +
> +           stmt_vec_info def_stmt_info;
> +           gimple *def_stmt;
> +           vect_def_type dt;
> +           if (!vect_is_simple_use (op, vinfo, &dt, &def_stmt_info,
> +                                    &def_stmt))
> +             return 0;
> +           this_unprom->set_op (op, dt, NULL);
> +         }
> +       else if (!vect_look_through_possible_promotion (vinfo, op,
> +                                                       this_unprom))
>           return 0;
>  
>         if (TYPE_PRECISION (this_unprom->type) == TYPE_PRECISION (type))
> --- gcc/testsuite/gcc.dg/pr108692.c.jj        2023-02-07 15:47:20.329076264 +0100
> +++ gcc/testsuite/gcc.dg/pr108692.c   2023-02-07 15:46:15.623031983 +0100
> @@ -0,0 +1,31 @@
> +/* PR tree-optimization/108692 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +__attribute__((noipa)) int
> +foo (signed char *x, signed char *y, int n)
> +{
> +  int i, r = 0;
> +  signed char a, b;
> +  for (i = 0; i < n; i++)
> +    {
> +      a = x[i];
> +      b = y[i];
> +      int c = (unsigned char) a - (unsigned char) b;
> +      r = r + (c < 0 ? -c : c);
> +    }
> +  return r;
> +}
> +
> +int
> +main ()
> +{
> +  signed char x[64] = {}, y[64] = {};
> +  if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4)
> +    return 0;
> +  x[32] = -128;
> +  y[32] = 1;
> +  if (foo (x, y, 64) != 127)
> +    __builtin_abort ();
> +  return 0;
> +}
>
>       Jakub
