This patch fixes a wrong-code bug on amdgcn in which excess set bits in a
uniform vector mask enable extra lanes that were supposed to be unused,
and whose contents are therefore undefined.
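To illustrate the failure mode (a standalone sketch with a made-up lane
count, not code from the patch): a uniform "true" boolean element is
expanded by sign-extension, which sets every bit of the scalar mask
register, and on GCN the bits beyond the vector's real element count
enable live lanes. Clearing them is just an AND with (1 << nunits) - 1,
which is what the patch arranges during expansion:

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical 4-lane boolean vector whose mask lives in a 64-bit
       scalar register.  */
    int
    main (void)
    {
      int elt = 1;                       /* uniform "true" element */
      uint64_t mask = -(uint64_t) elt;   /* sign-extend: all 64 bits set */
      unsigned nunits = 4;               /* the vector only has 4 lanes */
      uint64_t fixed = mask & ((UINT64_C (1) << nunits) - 1);

      printf ("sign-extended: %#llx\n", (unsigned long long) mask);
      printf ("masked:        %#llx\n", (unsigned long long) fixed);
      return 0;
    }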
Richi suggested an alternative approach involving narrower types and
then a zero-extend to the actual mask type. This solved the problem for
the specific test case that I had, but I'm not sure if it would work
with V2 and V4 modes (not that I've observed bad behaviour from them
anyway, but still). There were some other caveats involving "two-lane
division" that I don't fully understand, so I went with the simpler
implementation.
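For reference, my reading of that alternative, as a hypothetical C sketch
rather than anything resembling the GCC implementation: build the mask in
a type whose precision equals the lane count, so sign-extension cannot
produce excess bits, then zero-extend into the wide mask register:

    #include <stdint.h>

    /* Hypothetical 8-lane case, chosen so the narrow type's width
       matches the lane count exactly.  */
    uint64_t
    uniform_v8_mask (int elt)
    {
      uint8_t narrow = (uint8_t) -(int8_t) elt;  /* all 8 lane bits, or none */
      return (uint64_t) narrow;                  /* zero-extend: bits 8..63 stay clear */
    }

(The sketch cheats by picking a lane count that matches uint8_t exactly.)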
This patch does have the disadvantage of an additional "and" instruction
in the non-constant case even for machines that don't need it. I'm not
sure how to fix that without an additional target hook. (If GCC could
use the 64-lane vectors more effectively without the assistance of
artificially reduced sizes then this problem wouldn't exist.)
OK to commit?
Andrew
vect: Don't set excess bits in uniform masks
AVX ignores any excess bits in the mask, but AMD GCN magically uses a larger
vector than was intended (the smaller sizes are "fake"), leading to wrong code.
gcc/ChangeLog:
* expr.cc (store_constructor): Add "and" operation to uniform mask
generation.
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 4220cbd9f8f..fb4609f616e 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -7440,7 +7440,7 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size,
break;
}
/* Use sign-extension for uniform boolean vectors with
- integer modes. */
+ integer modes. Effectively "vec_duplicate" for bitmasks. */
if (!TREE_SIDE_EFFECTS (exp)
&& VECTOR_BOOLEAN_TYPE_P (type)
&& SCALAR_INT_MODE_P (mode)
@@ -7449,7 +7449,22 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size,
{
rtx op0 = force_reg (TYPE_MODE (TREE_TYPE (elt)),
expand_normal (elt));
- convert_move (target, op0, 0);
+    rtx tmp = gen_reg_rtx (mode);
+    convert_move (tmp, op0, 0);
+
+    if (known_ne (TYPE_VECTOR_SUBPARTS (type),
+                  GET_MODE_PRECISION (mode)))
+      {
+        /* Ensure no excess bits are set.
+           GCN needs this, AVX does not.  */
+        tmp = expand_binop (mode, and_optab, tmp,
+                            GEN_INT ((HOST_WIDE_INT_1U
+                                      << TYPE_VECTOR_SUBPARTS (type)
+                                         .to_constant ()) - 1),
+                            target, true, OPTAB_DIRECT);
+      }
+    if (tmp != target)
+      emit_move_insn (target, tmp);
break;
}