https://gcc.gnu.org/g:d2f9159cfe7ea904e6476cabefea0c6ac9532e29
commit r15-4802-gd2f9159cfe7ea904e6476cabefea0c6ac9532e29 Author: Tamar Christina <tamar.christ...@arm.com> Date: Thu Oct 31 12:50:23 2024 +0000 middle-end: Lower all gconds during vector pattern matching [PR117176] I have been taking a look at boolean handing once more in the vectorizer. There are two situation to consider: 1. when the boolean being created are created from comparing data inputs then for the resulting vector boolean we need to know the vector type and the precision. In this case, when we have an operation such as NOT on the data element, this has to be lowered to XOR because the truncation to the vector precision needs to be explicit. 2. when the boolean being created comes from another boolean operation, then we don't need to lower NOT, as the precision doesn't change. We don't do any lowering for these (as denoted in check_bool_pattern) and instead the precision is copied from the element feeding the boolean statement during VF analysis. For early break gcond lowering in order to correctly handle the second scenario above we punted the lowering of VECT_SCALAR_BOOLEAN_TYPE_P comparisons that were already in the right shape. e.g. e != 0 where e is a boolean does not need any lowering. The issue however is that the statement feeding e may need to be lowered in the case where it's a data expression. This patch changes a bit how we do the lowering. We now always emit an additional compare. e.g. if the input is; if (e != 0) where is a boolean we would punt on thi before, but now we generate f = e != 0 if (f != 0) We then use the same infrastructre as recog_bool to ask it to lower f, and in doing so handle and boolean conversions that need to be lowered. Because we now guarantee that f is an internal def we can also simplify the SLP building code. When e is a boolean, the precision we build for f needs to reflect the precision of the operation feeding e. To get this value we use integer_type_for_mask the same way recog_bool does, and if it's defined (e.g. we have a data conversions somewhere) we pass that precision on instead. This gets us the correct VF on the newly lowered boolean expressions. gcc/ChangeLog: PR tree-optimization/117176 * tree-vect-patterns.cc (vect_recog_gcond_pattern): Lower all gconds. * tree-vect-slp.cc (vect_analyze_slp): No longer check for in vect def. gcc/testsuite/ChangeLog: PR tree-optimization/117176 * gcc.dg/vect/vect-early-break_130-pr117176.c: New test. Diff: --- .../gcc.dg/vect/vect-early-break_130-pr117176.c | 21 ++++++++++++ gcc/tree-vect-patterns.cc | 19 ++++++----- gcc/tree-vect-slp.cc | 39 +++++----------------- 3 files changed, 40 insertions(+), 39 deletions(-) diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c new file mode 100644 index 000000000000..841dcce284dd --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-add-options vect_early_break } */ +/* { dg-require-effective-target vect_early_break } */ +/* { dg-require-effective-target vect_int } */ + +struct ColorSpace { + int componentCt; +}; + +struct Psnr { + double psnr[3]; +}; + +int f(struct Psnr psnr, struct ColorSpace colorSpace) { + int i, hitsTarget = 1; + + for (i = 1; i < colorSpace.componentCt && hitsTarget; ++i) + hitsTarget = !(psnr.psnr[i] < 1); + + return hitsTarget; +} diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 945e7d2dc45d..a708234304fe 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -5426,17 +5426,19 @@ vect_recog_gcond_pattern (vec_info *vinfo, if (VECTOR_TYPE_P (scalar_type)) return NULL; - if (code == NE_EXPR - && zerop (rhs) - && VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type)) - return NULL; + /* If the input is a boolean then try to figure out the precision that the + vector type should use. We cannot use the scalar precision as this would + later mismatch. This is similar to what recog_bool does. */ + if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type)) + { + if (tree stype = integer_type_for_mask (lhs, vinfo)) + scalar_type = stype; + } - tree vecitype = get_vectype_for_scalar_type (vinfo, scalar_type); - if (vecitype == NULL_TREE) + tree vectype = get_mask_type_for_scalar_type (vinfo, scalar_type); + if (vectype == NULL_TREE) return NULL; - tree vectype = truth_type_for (vecitype); - tree new_lhs = vect_recog_temp_ssa_var (boolean_type_node, NULL); gimple *new_stmt = gimple_build_assign (new_lhs, code, lhs, rhs); append_pattern_def_seq (vinfo, stmt_vinfo, new_stmt, vectype, scalar_type); @@ -5667,7 +5669,6 @@ vect_recog_bool_pattern (vec_info *vinfo, return NULL; } - /* A helper for vect_recog_mask_conversion_pattern. Build conversion of MASK to a type suitable for masking VECTYPE. Built statement gets required vectype and is appended to diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc index 21fd4c8dc09e..97c362d24f8d 100644 --- a/gcc/tree-vect-slp.cc +++ b/gcc/tree-vect-slp.cc @@ -4911,37 +4911,16 @@ vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size, from them. It's highly likely that the resulting SLP tree here if both arguments have a def will be incompatible, but we rely on it being split later on. */ - if (auto varg = loop_vinfo->lookup_def (args0)) - { - vec<stmt_vec_info> stmts; - vec<tree> remain = vNULL; - stmts.create (1); - stmts.quick_push (vect_stmt_to_vectorize (varg)); + auto varg = loop_vinfo->lookup_def (args0); + vec<stmt_vec_info> stmts; + vec<tree> remain = vNULL; + stmts.create (1); + stmts.quick_push (vect_stmt_to_vectorize (varg)); - vect_build_slp_instance (vinfo, slp_inst_kind_gcond, - stmts, roots, remain, - max_tree_size, &limit, - bst_map, NULL, force_single_lane); - } - else - { - /* Create a new SLP instance. */ - slp_instance new_instance = XNEW (class _slp_instance); - vec<tree> ops; - ops.create (1); - ops.quick_push (args0); - slp_tree invnode = vect_create_new_slp_node (ops); - SLP_TREE_DEF_TYPE (invnode) = vect_external_def; - SLP_INSTANCE_TREE (new_instance) = invnode; - SLP_INSTANCE_LOADS (new_instance) = vNULL; - SLP_INSTANCE_ROOT_STMTS (new_instance) = roots; - SLP_INSTANCE_REMAIN_DEFS (new_instance) = vNULL; - SLP_INSTANCE_KIND (new_instance) = slp_inst_kind_gcond; - new_instance->reduc_phis = NULL; - new_instance->cost_vec = vNULL; - new_instance->subgraph_entries = vNULL; - vinfo->slp_instances.safe_push (new_instance); - } + vect_build_slp_instance (vinfo, slp_inst_kind_gcond, + stmts, roots, remain, + max_tree_size, &limit, + bst_map, NULL, force_single_lane); } /* Find and create slp instances for inductions that have been forced