On Wed, 19 Apr 2023 at 16:17, Richard Biener <richard.guent...@gmail.com> wrote: > > On Wed, Apr 19, 2023 at 11:21 AM Prathamesh Kulkarni > <prathamesh.kulka...@linaro.org> wrote: > > > > On Tue, 11 Apr 2023 at 19:36, Prathamesh Kulkarni > > <prathamesh.kulka...@linaro.org> wrote: > > > > > > On Tue, 11 Apr 2023 at 14:17, Richard Biener <richard.guent...@gmail.com> > > > wrote: > > > > > > > > On Wed, Apr 5, 2023 at 10:39 AM Prathamesh Kulkarni via Gcc-patches > > > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > > > > > Hi, > > > > > For the following test: > > > > > > > > > > svint32_t f(svint32_t v) > > > > > { > > > > > return svrev_s32 (svrev_s32 (v)); > > > > > } > > > > > > > > > > We generate 2 rev instructions instead of nop: > > > > > f: > > > > > rev z0.s, z0.s > > > > > rev z0.s, z0.s > > > > > ret > > > > > > > > > > The attached patch tries to fix that by trying to recognize the > > > > > following > > > > > pattern in match.pd: > > > > > v1 = VEC_PERM_EXPR (v0, v0, mask) > > > > > v2 = VEC_PERM_EXPR (v1, v1, mask) > > > > > --> > > > > > v2 = v0 > > > > > if mask is { nelts - 1, nelts - 2, nelts - 3, ... } > > > > > > > > > > Code-gen with patch: > > > > > f: > > > > > ret > > > > > > > > > > Bootstrap+test passes on aarch64-linux-gnu, and SVE bootstrap in > > > > > progress. > > > > > Does it look OK for stage-1 ? > > > > > > > > I didn't look at the patch but > > > > tree-ssa-forwprop.cc:simplify_permutation should > > > > handle two consecutive permutes with the > > > > is_combined_permutation_identity > > > > which might need tweaking for VLA vectors > > > Hi Richard, > > > Thanks for the suggestions. The attached patch modifies > > > is_combined_permutation_identity > > > to recognize the above pattern. > > > Does it look OK ? > > > Bootstrap+test in progress on aarch64-linux-gnu and x86_64-linux-gnu. > > Hi, > > ping https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615502.html > > Can you instead of def_stmt pass in a bool whether rhs1 is equal to rhs2 > and amend the function comment accordingly, say, > > tem = VEC_PERM <op0, op1, MASK1>; > res = VEC_PERM <tem, tem, MASK2>; > > SAME_P specifies whether op0 and op1 compare equal. */ > > + if (def_stmt) > + gcc_checking_assert (is_gimple_assign (def_stmt) > + && gimple_assign_rhs_code (def_stmt) == > VEC_PERM_EXPR); > this is then unnecessary > > mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, > mask2); > + > + /* For VLA masks, check for the following pattern: > + v1 = VEC_PERM_EXPR (v0, v0, mask) > + v2 = VEC_PERM_EXPR (v1, v1, mask) > + --> > + v2 = v0 > > you are not using 'mask' so please defer fold_ternary until after your > special-case. > > + if (operand_equal_p (mask1, mask2, 0) > + && !VECTOR_CST_NELTS (mask1).is_constant () > + && def_stmt > + && operand_equal_p (gimple_assign_rhs1 (def_stmt), > + gimple_assign_rhs2 (def_stmt), 0)) > + { > + vec_perm_builder builder; > + if (tree_to_vec_perm_builder (&builder, mask1)) > + { > + poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1)); > + vec_perm_indices sel (builder, 1, nelts); > + if (sel.series_p (0, 1, nelts - 1, -1)) > + return 1; > + } > + return 0; > > I'm defering to Richard whether this is the correct way to check for a vector > reversing mask (I wonder how constructing such mask is even possible) Hi Richard, Thanks for the suggestions, I have updated the patch accordingly.
The following hunk from svrev_impl::fold() constructs mask in reverse: /* Permute as { nelts - 1, nelts - 2, nelts - 3, ... }. */ poly_int64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (f.lhs)); vec_perm_builder builder (nelts, 1, 3); for (int i = 0; i < 3; ++i) builder.quick_push (nelts - i - 1); return fold_permute (f, builder); To see if mask chooses elements in reverse, I borrowed it from function comment for series_p in vec-perm-indices.cc: /* Return true if index OUT_BASE + I * OUT_STEP selects input element IN_BASE + I * IN_STEP. For example, the call to test whether a permute reverses a vector of N elements would be: series_p (0, 1, N - 1, -1) which would return true for { N - 1, N - 2, N - 3, ... }. */ Thanks, Prathamesh > > Richard. > > > Thanks, > > Prathamesh > > > > > > Thanks, > > > Prathamesh > > > > > > > > Richard. > > > > > > > > > > > > > > Thanks, > > > > > Prathamesh
gcc/ChangeLog: * tree-ssa-forwprop.cc (is_combined_permutation_identity): New parameter same_p. Try to simplify two successive VEC_PERM_EXPRs with single operand and same mask, where mask chooses elements in reverse order. gcc/testesuite/ChangeLog: * gcc.target/aarch64/sve/acle/general/rev-1.c: New test. diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c new file mode 100644 index 00000000000..e57ee67d716 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/rev-1.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-tree-optimized" } */ + +#include <arm_sve.h> + +svint32_t f(svint32_t v) +{ + return svrev_s32 (svrev_s32 (v)); +} + +/* { dg-final { scan-tree-dump "return v_1\\(D\\)" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */ diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc index 9b567440ba4..ebd4a368ae9 100644 --- a/gcc/tree-ssa-forwprop.cc +++ b/gcc/tree-ssa-forwprop.cc @@ -2528,11 +2528,16 @@ simplify_bitfield_ref (gimple_stmt_iterator *gsi) return true; } -/* Determine whether applying the 2 permutations (mask1 then mask2) - gives back one of the input. */ +/* For the following sequence: + tem = VEC_PERM_EXPR <op0, op1, mask1> + res = VEC_PERM_EXPR <tem, tem, mask2> + + Determine whether applying the 2 permutations (mask1 then mask2) + gives back one of the input. SAME_P specifies whether op0 + and op1 compare equal. */ static int -is_combined_permutation_identity (tree mask1, tree mask2) +is_combined_permutation_identity (tree mask1, tree mask2, bool same_p) { tree mask; unsigned HOST_WIDE_INT nelts, i, j; @@ -2541,6 +2546,29 @@ is_combined_permutation_identity (tree mask1, tree mask2) gcc_checking_assert (TREE_CODE (mask1) == VECTOR_CST && TREE_CODE (mask2) == VECTOR_CST); + + /* For VLA masks, check for the following pattern: + v1 = VEC_PERM_EXPR (v0, v0, mask1) + v2 = VEC_PERM_EXPR (v1, v1, mask2) + --> + v2 = v0 + if mask1 == mask2 == {nelts - 1, nelts - 2, ...}. */ + + if (operand_equal_p (mask1, mask2, 0) + && !VECTOR_CST_NELTS (mask1).is_constant () + && same_p) + { + vec_perm_builder builder; + if (tree_to_vec_perm_builder (&builder, mask1)) + { + poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask1)); + vec_perm_indices sel (builder, 1, nelts); + if (sel.series_p (0, 1, nelts - 1, -1)) + return 1; + } + return 0; + } + mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2); if (mask == NULL_TREE || TREE_CODE (mask) != VECTOR_CST) return 0; @@ -2629,7 +2657,9 @@ simplify_permutation (gimple_stmt_iterator *gsi) op3 = gimple_assign_rhs3 (def_stmt); if (TREE_CODE (op3) != VECTOR_CST) return 0; - ident = is_combined_permutation_identity (op3, op2); + bool same_p = operand_equal_p (gimple_assign_rhs1 (def_stmt), + gimple_assign_rhs2 (def_stmt), 0); + ident = is_combined_permutation_identity (op3, op2, same_p); if (!ident) return 0; orig = (ident == 1) ? gimple_assign_rhs1 (def_stmt)