https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118727
--- Comment #17 from Tamar Christina <tnfchris at gcc dot gnu.org> --- (In reply to Xi Ruoyao from comment #16) > (In reply to Tamar Christina from comment #15) > > (In reply to Xi Ruoyao from comment #13) > > > For example for the original gcc.dg/pr108692.c: > > > > > > a.0_4 = (unsigned char) a_14; > > > _5 = (int) a.0_4; > > > b.1_6 = (unsigned char) b_16; > > > _7 = (int) b.1_6; > > > c_17 = _5 - _7; > > > _8 = ABS_EXPR <c_17>; > > > r_18 = _8 + r_23; > > > > > > becomes: > > > > > > patt_31 = .ABD (a.0_4, b.1_6); > > > > > > so when we get the ABD we already stripped the redundant (int) conversion > > > here, note that the (unsigned char) conversion is really needed. And for > > > my > > > twisted test case: > > > > > > a.0_4 = (unsigned char) a_14; > > > _5 = (unsigned int) a.0_4; > > > _6 = (unsigned int) b_16; > > > _7 = _5 - _6; > > > c_17 = (int) _7; > > > _8 = ABS_EXPR <c_17>; > > > > > > becomes: > > > > > > patt_38 = (unsigned short) a.0_4; > > > patt_35 = (signed short) b_16; > > > patt_36 = (signed short) patt_38; > > > patt_33 = .ABD (patt_36, patt_35); > > > > > > To me this is already the "optimal" way. Correct me if I'm wrong, but > > > anyway even if there's a redundant conversion it should be stripped in > > > vect_recog_abd_pattern instead of vect_recog_sad_pattern. > > > > No, that's not correct. SAD_EXPR is the most optimal version here as > > SAD_EXPR performs double widening. It removes the need for the intermediate > > vector widening. > > > > a.0_4 = (unsigned char) a_14; > > patt_b_16 = (unsigned char) b_16 > > patt_41 = SAD_EXPR <a.0_4, patt_b_16, r_23>; > > > > is the most optimal form, as it does the operation on bytes rather than > > short. > > I don't think it's correct... In the twisted test case we have something > like > > r += abs((unsigned int)(unsigned char) a - (signed int)(signed char) b) > > So if a is 255 and b is 128, we should add r with abs(255 - (-128)) = 383, > but with this form we'll add it with 127. The operation you posted above is an unsigned operation, in the original expression the subtract is done on zero extended bytes. > > > so when we get the ABD we already stripped the redundant (int) conversion > > > here, note that the (unsigned char) conversion is really needed. And for > > > my > > > twisted test case: > > > > > > a.0_4 = (unsigned char) a_14; > > > _5 = (unsigned int) a.0_4; > > > _6 = (unsigned int) b_16; > > > _7 = _5 - _6; > > > c_17 = (int) _7; (In reply to Xi Ruoyao from comment #16) > (In reply to Tamar Christina from comment #15) > > (In reply to Xi Ruoyao from comment #13) > > > For example for the original gcc.dg/pr108692.c: > > > > > > a.0_4 = (unsigned char) a_14; > > > _5 = (int) a.0_4; > > > b.1_6 = (unsigned char) b_16; > > > _7 = (int) b.1_6; > > > c_17 = _5 - _7; > > > _8 = ABS_EXPR <c_17>; > > > r_18 = _8 + r_23; > > > > > > becomes: > > > > > > patt_31 = .ABD (a.0_4, b.1_6); > > > > > > so when we get the ABD we already stripped the redundant (int) conversion > > > here, note that the (unsigned char) conversion is really needed. And for > > > my > > > twisted test case: > > > > > > a.0_4 = (unsigned char) a_14; > > > _5 = (unsigned int) a.0_4; > > > _6 = (unsigned int) b_16; > > > _7 = _5 - _6; > > > c_17 = (int) _7; > > > _8 = ABS_EXPR <c_17>; > > > > > > becomes: > > > > > > patt_38 = (unsigned short) a.0_4; > > > patt_35 = (signed short) b_16; > > > patt_36 = (signed short) patt_38; > > > patt_33 = .ABD (patt_36, patt_35); > > > > > > To me this is already the "optimal" way. Correct me if I'm wrong, but > > > anyway even if there's a redundant conversion it should be stripped in > > > vect_recog_abd_pattern instead of vect_recog_sad_pattern. > > > > No, that's not correct. SAD_EXPR is the most optimal version here as > > SAD_EXPR performs double widening. It removes the need for the intermediate > > vector widening. > > > > a.0_4 = (unsigned char) a_14; > > patt_b_16 = (unsigned char) b_16 > > patt_41 = SAD_EXPR <a.0_4, patt_b_16, r_23>; > > > > is the most optimal form, as it does the operation on bytes rather than > > short. > > I don't think it's correct... In the twisted test case we have something > like > > r += abs((unsigned int)(unsigned char) a - (signed int)(signed char) b) > > So if a is 255 and b is 128, we should add r with abs(255 - (-128)) = 383, > but with this form we'll add it with 127. Ah right, because b didn't have an explicit cast to unsigned char first and it's unsigned in this case. I was however arguing about: int foo (unsigned char *x, unsigned char *y, int n) { int i, r = 0; unsigned char a, b; for (i = 0; i < n; i++) { a = x[i]; b = y[i]; int c = (unsigned char) a - (unsigned char) b; r = r + (c < 0 ? -c : c); } return r; } where vect_recog_over_widening_pattern will split the zero extension into: /app/example.c:6:17: note: Splitting statement: _4 = (int) a_12; /app/example.c:6:17: note: into pattern statements: patt_43 = (unsigned short) a_12; /app/example.c:6:17: note: and: patt_42 = (int) patt_43; and so SAD_EXPR needs to recover from the intermediate split. However you're right that ABD already does this correctly because vect_recog_absolute_difference handles it through vect_widened_op_tree. So indeed I agree that the calls are redundant for vect_recog_sad_pattern