[Bug tree-optimization/115843] [14/15 Regression] 531.deepsjeng_r fails to verify with -O3 -march=znver4 --param vect-partial-vector-usage=2

rguenth at gcc dot gnu.org via Gcc-bugs Mon, 15 Jul 2024 04:50:42 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115843


--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
t.c:9:17: note:  misalignment for fully-masked loop: 15

so in the first iteration only the last element should be active.  But

  # loop_mask_58 = PHI <_100(10), { 0, 0, 0, 0, 0, 0, 0, 0 }(2)>
  # loop_mask_57 = PHI <_101(10), { 0, 0, 0, 0, 0, 0, 0, 0 }(2)>

is then wrong and

  # vect_vec_iv_.6_46 = PHI <_47(10), { -15, -14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0 }(2)>
  _47 = vect_vec_iv_.6_46 + { 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16 };
  vect__1.7_49 = vect_vec_iv_.6_46 & { 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7 };

are the values for { 1, 2, ... } thus the next iteration (not
relevant for this particular induction use).
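
To make the expected first-iteration state concrete, here is a minimal C
sketch, not the vectorizer's code. It assumes 16 lanes (matching the
16-element IV above) and that the two 8-element loop masks cover the low
and high halves of those lanes; the function names and the low/high
mapping are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define LANES 16 /* assumed: matches the 16-element vector IV above */

/* A lane of the first iteration is active only once the 'misalign'
   skipped leading elements have been passed.  */
bool first_iter_lane_active (int misalign, int lane)
{
  assert (misalign >= 0 && misalign < LANES);
  assert (lane >= 0 && lane < LANES);
  return lane >= misalign;
}

/* Scalar IV value carried by a lane in the first iteration: the vector
   IV starts 'misalign' elements before the first real iteration.  */
int first_iter_iv (int misalign, int lane)
{
  return lane - misalign;
}

/* Fill the two 8-lane halves of the initial loop mask (hypothetical
   low/high split of the 16 lanes).  */
void initial_loop_masks (int misalign, bool lo[LANES / 2],
                         bool hi[LANES / 2])
{
  for (int i = 0; i < LANES / 2; i++)
    {
      lo[i] = first_iter_lane_active (misalign, i);
      hi[i] = first_iter_lane_active (misalign, i + LANES / 2);
    }
}
```

With a misalignment of 15 the only set entry is the last lane of the high
half, so an all-zero initial value for both masks cannot be right.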

There are then un(loop-)masked uses of the mask derived from vect__1.7_49
in

  vect_patt_15.25_84 = VEC_COND_EXPR <mask_patt_23.15_66, vect_patt_20.18_72, vect__6.22_79>;

but ultimately the loop mask is applied in its uses via .MASK_STORE.
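
As a rough scalar illustration of why the select itself does not need the
loop mask, consider the following C sketch. It only emulates the semantics
described above; vec_select and masked_store are hypothetical helper names,
not GCC internals, and 16 lanes is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define LANES 16 /* assumed vector length */

/* Emulates VEC_COND_EXPR: a lane-wise select that ignores the loop mask,
   so inactive lanes may compute meaningless values.  */
void vec_select (const bool cond[LANES], const int a[LANES],
                 const int b[LANES], int out[LANES])
{
  for (int i = 0; i < LANES; i++)
    out[i] = cond[i] ? a[i] : b[i];
}

/* Emulates a masked store: only lanes enabled in loop_mask reach memory,
   which discards whatever the inactive lanes computed.  */
void masked_store (int *dst, const bool loop_mask[LANES],
                   const int val[LANES])
{
  for (int i = 0; i < LANES; i++)
    if (loop_mask[i])
      dst[i] = val[i];
}
```

The result is only correct if the loop mask handed to the store is itself
correct.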

I have a fix.
