On Mon, Jun 7, 2021 at 11:10 AM Richard Biener <richard.guent...@gmail.com> wrote:
>
> On Mon, Jun 7, 2021 at 7:59 PM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Mon, Jun 7, 2021 at 4:19 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > On Mon, Jun 7, 2021 at 12:12 AM Richard Sandiford
> > > <richard.sandif...@arm.com> wrote:
> > > >
> > > > "H.J. Lu" <hjl.to...@gmail.com> writes:
> > > > > Update vec_duplicate to allow it to fail, so that a backend can
> > > > > allow broadcasting an integer constant to a vector only when a
> > > > > broadcast instruction is available.
> > > >
> > > > I'm not sure why we need this to fail though.  Once the optab is
> > > > defined for target X, the optab should handle all duplicates for
> > > > target X, even if there are different strategies it can use.
> > > >
> > > > AIUI the case you want to make conditional is the constant case.
> > > > I guess the first question is: why don't we simplify those
> > > > CONSTRUCTORs to VECTOR_CSTs in gimple?  I'm surprised we still see
> > > > the constant case as a constructor here.
> > >
> > > The particular testcase for vec_duplicate is gcc.dg/pr100239.c.
> > >
> > > > If we can't rely on that happening, then would it work to change:
> > > >
> > > >   /* Try using vec_duplicate_optab for uniform vectors.  */
> > > >   if (!TREE_SIDE_EFFECTS (exp)
> > > >       && VECTOR_MODE_P (mode)
> > > >       && eltmode == GET_MODE_INNER (mode)
> > > >       && ((icode = optab_handler (vec_duplicate_optab, mode))
> > > >           != CODE_FOR_nothing)
> > > >       && (elt = uniform_vector_p (exp)))
> > > >
> > > > to something like:
> > > >
> > > >   /* Try using vec_duplicate_optab for uniform vectors.  */
> > > >   if (!TREE_SIDE_EFFECTS (exp)
> > > >       && VECTOR_MODE_P (mode)
> > > >       && eltmode == GET_MODE_INNER (mode)
> > > >       && (elt = uniform_vector_p (exp)))
> > > >     {
> > > >       if (TREE_CODE (elt) == INTEGER_CST
> > > >           || TREE_CODE (elt) == POLY_INT_CST
> > > >           || TREE_CODE (elt) == REAL_CST
> > > >           || TREE_CODE (elt) == FIXED_CST)
> > > >         {
> > > >           rtx src = gen_const_vec_duplicate (mode,
> > > >                                              expand_normal (node));
> > > >           emit_move_insn (target, src);
> > > >           break;
> > > >         }
> > > >       …
> > > >     }
> > >
> > > I will give it a try.
> >
> > I can confirm that veclower leaves us with an unfolded constant CTOR.
> > If you file a PR to remind me I'll fix that.
>
> The attached untested patch fixes this for the testcase.
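For readers following along: the suggestion above keys off uniform_vector_p, which hands back the repeated element when every lane of a vector is the same, so the caller can emit a single broadcast instead of building the vector lane by lane. A toy scalar sketch of that check (the name uniform_p and the flat-array representation are illustrative only, not GCC's tree-level API):

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy analogue of GCC's uniform_vector_p: if every lane holds the
   same value, report that value so the caller can use one broadcast
   (vec_duplicate) instead of constructing each lane separately.  */
static bool
uniform_p (const short *lanes, size_t n, short *elt)
{
  if (n == 0)
    return false;
  for (size_t i = 1; i < n; i++)
    if (lanes[i] != lanes[0])
      return false;	/* Mixed lanes: fall back to piecewise.  */
  *elt = lanes[0];
  return true;
}
```

In the proposed change, the constant-element cases (INTEGER_CST and friends) are handled first with gen_const_vec_duplicate, so only non-constant uniform vectors reach the vec_duplicate optab.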
Here is the patch + the testcase.

-- 
H.J.
From aac56894719b59e552b493c970946225ed8c27f6 Mon Sep 17 00:00:00 2001
From: Richard Biener <rguent...@suse.de>
Date: Mon, 7 Jun 2021 20:08:13 +0200
Subject: [PATCH] middle-end/100951 - make sure to generate VECTOR_CST in
 lowering

When vector lowering creates piecewise ops make sure to create
VECTOR_CSTs instead of CONSTRUCTORs when possible.

gcc/

2021-06-07  Richard Biener  <rguent...@suse.de>

	PR middle-end/100951
	* tree-vect-generic.c (): Build a VECTOR_CST
	if all elements are constant.

gcc/testsuite/

2021-06-07  H.J. Lu  <hjl.to...@gmail.com>

	PR middle-end/100951
	* gcc.target/i386/pr100951.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr100951.c | 15 +++++++++++
 gcc/tree-vect-generic.c                  | 34 +++++++++++++++++++++---
 2 files changed, 45 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100951.c

diff --git a/gcc/testsuite/gcc.target/i386/pr100951.c b/gcc/testsuite/gcc.target/i386/pr100951.c
new file mode 100644
index 00000000000..16d8bafa663
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100951.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -march=x86-64" } */
+
+typedef short __attribute__((__vector_size__ (8 * sizeof (short)))) V;
+V v, w;
+
+void
+foo (void)
+{
+  w = __builtin_shuffle (v != v, 0 < (V) {}, (V) {192} >> 5);
+}
+
+/* { dg-final { scan-assembler-not "punpcklwd" } } */
+/* { dg-final { scan-assembler-not "pshufd" } } */
+/* { dg-final { scan-assembler-times "pxor\[\\t \]%xmm\[0-9\]+, %xmm\[0-9\]+" 1 } } */
diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index d9c0ac9de7e..5f3f9fa005e 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -328,16 +328,22 @@ expand_vector_piecewise (gimple_stmt_iterator *gsi, elem_op_func f,
   if (!ret_type)
     ret_type = type;
   vec_alloc (v, (nunits + delta - 1) / delta);
+  bool constant_p = true;
   for (i = 0; i < nunits;
        i += delta, index = int_const_binop (PLUS_EXPR, index, part_width))
     {
       tree result = f (gsi, inner_type, a, b, index,
		       part_width, code, ret_type);
+      if (!CONSTANT_CLASS_P (result))
+	constant_p = false;
       constructor_elt ce = {NULL_TREE, result};
       v->quick_push (ce);
     }
 
-  return build_constructor (ret_type, v);
+  if (constant_p)
+    return build_vector_from_ctor (ret_type, v);
+  else
+    return build_constructor (ret_type, v);
 }
 
 /* Expand a vector operation to scalars with the freedom to use
@@ -1105,6 +1111,7 @@ expand_vector_condition (gimple_stmt_iterator *gsi, bitmap dce_ssa_names)
 
   int nunits = nunits_for_known_piecewise_op (type);
   vec_alloc (v, nunits);
+  bool constant_p = true;
   for (int i = 0; i < nunits; i++)
     {
       tree aa, result;
@@ -1129,6 +1136,8 @@ expand_vector_condition (gimple_stmt_iterator *gsi, bitmap dce_ssa_names)
       else
	aa = tree_vec_extract (gsi, cond_type, a, width, index);
       result = gimplify_build3 (gsi, COND_EXPR, inner_type, aa, bb, cc);
+      if (!CONSTANT_CLASS_P (result))
+	constant_p = false;
       constructor_elt ce = {NULL_TREE, result};
       v->quick_push (ce);
       index = int_const_binop (PLUS_EXPR, index, width);
@@ -1138,7 +1147,10 @@ expand_vector_condition (gimple_stmt_iterator *gsi, bitmap dce_ssa_names)
	comp_index = int_const_binop (PLUS_EXPR, comp_index, comp_width);
     }
 
-  constr = build_constructor (type, v);
+  if (constant_p)
+    constr = build_vector_from_ctor (type, v);
+  else
+    constr = build_constructor (type, v);
   gimple_assign_set_rhs_from_tree (gsi, constr);
   update_stmt (gsi_stmt (*gsi));
 
@@ -1578,6 +1590,7 @@ lower_vec_perm (gimple_stmt_iterator *gsi)
	     "vector shuffling operation will be expanded piecewise");
 
   vec_alloc (v, elements);
+  bool constant_p = true;
   for (i = 0; i < elements; i++)
     {
       si = size_int (i);
@@ -1639,10 +1652,15 @@ lower_vec_perm (gimple_stmt_iterator *gsi)
	      t = v0_val;
	    }
 
+      if (!CONSTANT_CLASS_P (t))
+	constant_p = false;
       CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, t);
     }
 
-  constr = build_constructor (vect_type, v);
+  if (constant_p)
+    constr = build_vector_from_ctor (vect_type, v);
+  else
+    constr = build_constructor (vect_type, v);
   gimple_assign_set_rhs_from_tree (gsi, constr);
   update_stmt (gsi_stmt (*gsi));
 }
@@ -2014,6 +2032,7 @@ expand_vector_conversion (gimple_stmt_iterator *gsi)
     }
 
   vec_alloc (v, (nunits + delta - 1) / delta * 2);
+  bool constant_p = true;
   for (i = 0; i < nunits;
        i += delta, index = int_const_binop (PLUS_EXPR, index, part_width))
     {
@@ -2024,12 +2043,19 @@ expand_vector_conversion (gimple_stmt_iterator *gsi)
				       index);
	  tree result = gimplify_build1 (gsi, code1, cretd_type, a);
	  constructor_elt ce = { NULL_TREE, result };
+	  if (!CONSTANT_CLASS_P (ce.value))
+	    constant_p = false;
	  v->quick_push (ce);
	  ce.value = gimplify_build1 (gsi, code2, cretd_type, a);
+	  if (!CONSTANT_CLASS_P (ce.value))
+	    constant_p = false;
	  v->quick_push (ce);
	}
 
-      new_rhs = build_constructor (ret_type, v);
+      if (constant_p)
+	new_rhs = build_vector_from_ctor (ret_type, v);
+      else
+	new_rhs = build_constructor (ret_type, v);
       g = gimple_build_assign (lhs, new_rhs);
       gsi_replace (gsi, g, false);
       return;
-- 
2.31.1
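For reference, the new testcase's shuffle is fully constant-foldable: v != v and 0 < (V) {} are both all-zero vectors, so whatever lanes the mask (V) {192} >> 5 selects, the result is the zero vector. That is why, with the patch, veclower can emit a single VECTOR_CST and the scan-assembler checks expect just one pxor. A portable sketch of the two-operand shuffle semantics (shuffle2 is a hypothetical helper written for illustration, not a GCC function):

```c
#include <stdlib.h>

/* What __builtin_shuffle (op0, op1, mask) computes for two 8-lane
   short vectors: output lane i takes lane mask[i] (mod 16) from the
   16 lanes of op0 followed by op1.  In pr100951.c both operands are
   all-zero, so the result is the zero vector regardless of the mask.  */
static void
shuffle2 (const short *op0, const short *op1, const short *mask,
	  short *out, int n)
{
  for (int i = 0; i < n; i++)
    {
      int idx = mask[i] & (2 * n - 1);	/* index modulo 2*n lanes */
      out[i] = idx < n ? op0[idx] : op1[idx - n];
    }
}
```

With op0 = v != v (all zeros, since every lane compares equal), op1 = 0 < (V) {} (all zeros, since 0 < 0 is false), and mask = {6, 0, 0, 0, 0, 0, 0, 0} (that is, (V) {192} >> 5), every selected lane is zero.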