> On 29.08.2025 at 15:21, Tamar Christina <[email protected]> wrote:
>
>
>>
>> -----Original Message-----
>> From: Richard Biener <[email protected]>
>> Sent: Friday, August 29, 2025 12:40 PM
>> To: [email protected]
>> Cc: RISC-V CI <[email protected]>; Tamar Christina
>> <[email protected]>
>> Subject: [PATCH 2/4] Separate reduction info and associate it with SLP nodes
>>
>> The following splits out reduction related information from
>> stmt_vec_info, retaining (and duplicating) parts used by scalar
>> cycle analysis. The data is then associated with SLP nodes
>> forming reduction cycles and accessible via info_for_reduction.
>> The data is created at SLP discovery time as we look at it even
>> pre-vectorizable_reduction analysis, but most of the data is
>> only populated by the latter. There is no reduction info with
>> nested cycles that are not part of an outer reduction.
>> In the process this adds cycle info to each SLP tree, notably
>> the reduc-idx and a way to identify the reduction info.
>>
>> Cleanup possibilities will be realized in a later patch of the
>> series. This patch is going to be squashed with the first.
>
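The scheme in the patch description above can be pictured as a side table owned by the loop and indexed by an id stored on each SLP node. What follows is a minimal sketch of that idea, not the GCC implementation: `reduc_info_model`, `slp_node_model`, `loop_info_model` and the three helper functions are hypothetical stand-ins for `vect_reduc_info_s`, `_slp_tree::cycle_info`, `_loop_vec_info::reduc_infos` and the discovery/lookup code, and they carry only the fields needed to show the indexing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for vect_reduc_info_s (only two fields modelled).
struct reduc_info_model {
  int reduc_type = 0;
  bool force_single_cycle = false;
};

// Hypothetical stand-in for an SLP node and its cycle_info member.
struct slp_node_model {
  int cycle_id = -1;   // index into the loop's descriptor table, or -1
  int reduc_idx = -1;  // operand index that continues the cycle, or -1
  std::vector<slp_node_model *> children;
};

// Hypothetical stand-in for the loop-level owner of all descriptors.
struct loop_info_model {
  std::vector<std::unique_ptr<reduc_info_model>> reduc_infos;
};

// On reaching a reduction PHI: allocate a descriptor and record its index.
inline void create_reduc_info (loop_info_model &loop, slp_node_model &phi)
{
  phi.cycle_id = static_cast<int> (loop.reduc_infos.size ());
  loop.reduc_infos.push_back (std::make_unique<reduc_info_model> ());
}

// For a node whose operand REDUC_IDX lies on the cycle: inherit the
// cycle id from that child.
inline void link_to_cycle (slp_node_model &node, int reduc_idx)
{
  node.reduc_idx = reduc_idx;
  node.cycle_id
    = node.children[static_cast<std::size_t> (reduc_idx)]->cycle_id;
}

// Lookup: null when the node is not on any reduction cycle.
inline reduc_info_model *
info_for (loop_info_model &loop, const slp_node_model &node)
{
  if (node.cycle_id == -1)
    return nullptr;
  return loop.reduc_infos[static_cast<std::size_t> (node.cycle_id)].get ();
}
```

In this model every node on one cycle resolves to the same descriptor, and a node with no cycle id yields a null pointer, which is why callers that can see such nodes need a null check before reading the descriptor.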
> This one breaks the aarch64 build:
>
> /opt/buildAgent/work/505bfdd4dad8af3d/gcc/config/aarch64/aarch64.cc: In
> function 'bool aarch64_force_single_cycle(vec_info*, stmt_vec_info)':
> /opt/buildAgent/work/505bfdd4dad8af3d/gcc/config/aarch64/aarch64.cc:17778:41:
> error: invalid conversion from 'vec_info*' to 'loop_vec_info' {aka
> '_loop_vec_info*'} [-fpermissive]
> 17778 | auto reduc_info = info_for_reduction (vinfo, stmt_info);
> | ^~~~~
> | |
> | vec_info*
> /opt/buildAgent/work/505bfdd4dad8af3d/gcc/config/aarch64/aarch64.cc:17778:48:
> error: cannot convert 'stmt_vec_info' {aka '_stmt_vec_info*'} to 'slp_tree'
> {aka '_slp_tree*'}
> 17778 | auto reduc_info = info_for_reduction (vinfo, stmt_info);
> | ^~~~~~~~~
> | |
> | stmt_vec_info {aka
> _stmt_vec_info*}
>
> This is because info_for_reduction no longer takes the stmt_info. If I'm reading this
> right, for AArch64 we'd need:
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 1cdd5a26a83..2831ef38948 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -17770,12 +17770,13 @@ aarch64_adjust_stmt_cost (vec_info *vinfo,
> vect_cost_for_stmt kind,
>
> with the single accumulator being read and written multiple times. */
> static bool
> -aarch64_force_single_cycle (vec_info *vinfo, stmt_vec_info stmt_info)
> +aarch64_force_single_cycle (vec_info *vinfo, slp_tree node)
> {
> + stmt_vec_info stmt_info = SLP_TREE_REPRESENTATIVE (node);
> if (!STMT_VINFO_REDUC_DEF (stmt_info))
> return false;
>
> - auto reduc_info = info_for_reduction (vinfo, stmt_info);
> + auto reduc_info = info_for_reduction (vinfo, node);
> return VECT_REDUC_INFO_FORCE_SINGLE_CYCLE (reduc_info);
> }
>
> @@ -17803,7 +17804,7 @@ aarch64_vector_costs::count_ops (unsigned int count,
> vect_cost_for_stmt kind,
> = aarch64_in_loop_reduction_latency (m_vinfo, node,
> stmt_info, m_vec_flags);
> if (m_costing_for_scalar
> - || aarch64_force_single_cycle (m_vinfo, stmt_info))
> + || aarch64_force_single_cycle (m_vinfo, node))
> /* ??? Ideally we'd use a tree to reduce the copies down to 1 vector,
> and then accumulate that, but at the moment the loop-carried
> dependency includes all copies. */
>
> Should I try again with this change?
Yes please.
Richard
> Thanks,
> Tamar
>
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>>
>> * tree-vectorizer.h (_slp_tree::cycle_info): New member.
>> (SLP_TREE_REDUC_IDX): Likewise.
>> (vect_reduc_info_s): Move/copy data from ...
>> (_stmt_vec_info): ... here.
>> (_loop_vec_info::reduc_infos): New member.
>> (info_for_reduction): Adjust to take SLP node.
>> (vect_reduc_type): Adjust.
>> (vect_is_reduction): Add overload for SLP node.
>> * tree-vectorizer.cc (vec_info::new_stmt_vec_info):
>> Do not initialize removed members.
>> (vec_info::free_stmt_vec_info): Do not release them.
>> * tree-vect-stmts.cc (vectorizable_condition): Adjust.
>> * tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize
>> cycle info.
>> (vect_build_slp_tree_2): Compute SLP reduc_idx and store
>> it. Create, populate and propagate reduction info.
>> (vect_print_slp_tree): Print cycle info.
>> (vect_analyze_slp_reduc_chain): Set cycle info on the
>> manually added conversion node.
>> (vect_optimize_slp_pass::start_choosing_layouts): Adjust.
>> * tree-vect-loop.cc (_loop_vec_info::~_loop_vec_info):
>> Release reduction infos.
>> (info_for_reduction): Get the reduction info from
>> the vector in the loop_vinfo.
>> (vect_create_epilog_for_reduction): Adjust.
>> (vectorizable_reduction): Likewise.
>> (vect_transform_reduction): Likewise.
>> (vect_transform_cycle_phi): Likewise, and deal with nested
>> cycles that are not part of a double reduction having no reduction info.
>> ---
>> gcc/tree-vect-loop.cc | 82 +++++++---------------
>> gcc/tree-vect-slp.cc | 79 +++++++++++++++++++--
>> gcc/tree-vect-stmts.cc | 3 +-
>> gcc/tree-vectorizer.cc | 12 ----
>> gcc/tree-vectorizer.h | 154 +++++++++++++++++++++++------------------
>> 5 files changed, 190 insertions(+), 140 deletions(-)
>>
>> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
>> index a0e77bdced6..db63891955b 100644
>> --- a/gcc/tree-vect-loop.cc
>> +++ b/gcc/tree-vect-loop.cc
>> @@ -955,6 +955,8 @@ _loop_vec_info::~_loop_vec_info ()
>> delete scan_map;
>> delete scalar_costs;
>> delete vector_costs;
>> + for (auto reduc_info : reduc_infos)
>> + delete reduc_info;
>>
>> /* When we release an epiloge vinfo that we do not intend to use
>> avoid clearing AUX of the main loop which should continue to
>> @@ -5135,46 +5137,12 @@ get_initial_defs_for_reduction (loop_vec_info
>> loop_vinfo,
>> vect_emit_reduction_init_stmts (loop_vinfo, reduc_info, ctor_seq);
>> }
>>
>> -/* For a statement STMT_INFO taking part in a reduction operation return
>> - the stmt_vec_info the meta information is stored on. */
>> -
>> -static vect_reduc_info
>> -info_for_reduction (vec_info *vinfo, stmt_vec_info stmt_info, bool create)
>> -{
>> - stmt_info = vect_orig_stmt (stmt_info);
>> - gcc_assert (STMT_VINFO_REDUC_DEF (stmt_info));
>> - if (!is_a <gphi *> (stmt_info->stmt)
>> - || !VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
>> - stmt_info = STMT_VINFO_REDUC_DEF (stmt_info);
>> - gphi *phi = as_a <gphi *> (stmt_info->stmt);
>> - if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
>> - {
>> - if (gimple_phi_num_args (phi) == 1)
>> - stmt_info = STMT_VINFO_REDUC_DEF (stmt_info);
>> - }
>> - else if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
>> - {
>> - stmt_vec_info info = vinfo->lookup_def (vect_phi_initial_value (phi));
>> - if (info && STMT_VINFO_DEF_TYPE (info) == vect_double_reduction_def)
>> - stmt_info = info;
>> - }
>> - if (create)
>> - stmt_info->is_reduc_info = true;
>> - else
>> - gcc_assert (stmt_info->is_reduc_info);
>> - return vect_reduc_info (stmt_info);
>> -}
>> -
>> -vect_reduc_info
>> -info_for_reduction (vec_info *vinfo, stmt_vec_info stmt_info)
>> -{
>> - return info_for_reduction (vinfo, stmt_info, false);
>> -}
>> -
>> vect_reduc_info
>> -create_info_for_reduction (vec_info *vinfo, stmt_vec_info stmt_info)
>> +info_for_reduction (loop_vec_info loop_vinfo, slp_tree node)
>> {
>> - return info_for_reduction (vinfo, stmt_info, true);
>> + if (node->cycle_info.id == -1)
>> + return NULL;
>> + return loop_vinfo->reduc_infos[node->cycle_info.id];
>> }
>>
>> /* See if LOOP_VINFO is an epilogue loop whose main loop had a reduction that
>> @@ -5434,7 +5402,7 @@ vect_create_epilog_for_reduction (loop_vec_info
>> loop_vinfo,
>> slp_instance slp_node_instance,
>> edge loop_exit)
>> {
>> - vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
>> + vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>> /* For double reductions we need to get at the inner loop reduction
>> stmt which has the meta info attached. Our stmt_info is that of the
>> loop-closed PHI of the inner loop which we remember as
>> @@ -6920,7 +6888,7 @@ vectorizable_lane_reducing (loop_vec_info loop_vinfo,
>> stmt_vec_info stmt_info,
>> || STMT_VINFO_REDUC_IDX (stmt_info) < 0)
>> return false;
>>
>> - vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
>> + vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>>
>> /* Lane-reducing pattern inside any inner loop of LOOP_VINFO is not
>> recoginized. */
>> @@ -7083,8 +7051,8 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>> && STMT_VINFO_DEF_TYPE (stmt_info) != vect_nested_cycle)
>> return false;
>>
>> - /* The stmt we store reduction analysis meta on. */
>> - vect_reduc_info reduc_info = create_info_for_reduction (loop_vinfo, stmt_info);
>> + /* The reduction meta. */
>> + vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>>
>> if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
>> {
>> @@ -7160,7 +7128,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>> slp_tree vdef_slp = slp_node_instance->root;
>> /* For double-reductions we start SLP analysis at the inner loop LC PHI
>> which is the def of the outer loop live stmt. */
>> - if (STMT_VINFO_DEF_TYPE (reduc_info.fixme ()) == vect_double_reduction_def)
>> + if (VECT_REDUC_INFO_DEF_TYPE (reduc_info) == vect_double_reduction_def)
>> vdef_slp = SLP_TREE_CHILDREN (vdef_slp)[0];
>> while (reduc_def != PHI_RESULT (reduc_def_phi))
>> {
>> @@ -7399,8 +7367,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>> }
>> }
>>
>> - enum vect_reduction_type reduction_type = STMT_VINFO_REDUC_TYPE (phi_info);
>> - VECT_REDUC_INFO_TYPE (reduc_info) = reduction_type;
>> + enum vect_reduction_type reduction_type = VECT_REDUC_INFO_TYPE (reduc_info);
>> /* If we have a condition reduction, see if we can simplify it further. */
>> if (reduction_type == COND_REDUCTION)
>> {
>> @@ -7514,7 +7481,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>>
>> if (nested_cycle)
>> {
>> - gcc_assert (STMT_VINFO_DEF_TYPE (reduc_info.fixme ())
>> + gcc_assert (VECT_REDUC_INFO_DEF_TYPE (reduc_info)
>> == vect_double_reduction_def);
>> double_reduc = true;
>> }
>> @@ -7554,7 +7521,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>> (and also the same tree-code) when generating the epilog code and
>> when generating the code inside the loop. */
>>
>> - code_helper orig_code = STMT_VINFO_REDUC_CODE (phi_info);
>> + code_helper orig_code = VECT_REDUC_INFO_CODE (reduc_info);
>>
>> /* If conversion might have created a conditional operation like
>> IFN_COND_ADD already. Use the internal code for the following checks.
>> */
>> @@ -8031,12 +7998,13 @@ vect_transform_reduction (loop_vec_info
>> loop_vinfo,
>> class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>> unsigned vec_num;
>>
>> - vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
>> + vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>>
>> if (nested_in_vect_loop_p (loop, stmt_info))
>> {
>> loop = loop->inner;
>> - gcc_assert (STMT_VINFO_DEF_TYPE (reduc_info.fixme ()) == vect_double_reduction_def);
>> + gcc_assert (VECT_REDUC_INFO_DEF_TYPE (reduc_info)
>> + == vect_double_reduction_def);
>> }
>>
>> gimple_match_op op;
>> @@ -8382,10 +8350,10 @@ vect_transform_cycle_phi (loop_vec_info
>> loop_vinfo,
>> nested_cycle = true;
>> }
>>
>> - vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
>> -
>> - if (VECT_REDUC_INFO_TYPE (reduc_info) == EXTRACT_LAST_REDUCTION
>> - || VECT_REDUC_INFO_TYPE (reduc_info) == FOLD_LEFT_REDUCTION)
>> + vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>> + if (reduc_info
>> + && (VECT_REDUC_INFO_TYPE (reduc_info) == EXTRACT_LAST_REDUCTION
>> + || VECT_REDUC_INFO_TYPE (reduc_info) == FOLD_LEFT_REDUCTION))
>> /* Leave the scalar phi in place. */
>> return true;
>>
>> @@ -8393,7 +8361,7 @@ vect_transform_cycle_phi (loop_vec_info loop_vinfo,
>>
>> /* Check whether we should use a single PHI node and accumulate
>> vectors to one before the backedge. */
>> - if (VECT_REDUC_INFO_FORCE_SINGLE_CYCLE (reduc_info))
>> + if (reduc_info && VECT_REDUC_INFO_FORCE_SINGLE_CYCLE (reduc_info))
>> vec_num = 1;
>>
>> /* Create the destination vector */
>> @@ -8408,7 +8376,8 @@ vect_transform_cycle_phi (loop_vec_info loop_vinfo,
>> /* Optimize: if initial_def is for REDUC_MAX smaller than the base
>> and we can't use zero for induc_val, use initial_def. Similarly
>> for REDUC_MIN and initial_def larger than the base. */
>> - if (VECT_REDUC_INFO_TYPE (reduc_info) == INTEGER_INDUC_COND_REDUCTION)
>> + if (reduc_info
>> + && VECT_REDUC_INFO_TYPE (reduc_info) == INTEGER_INDUC_COND_REDUCTION)
>> {
>> gcc_assert (SLP_TREE_LANES (slp_node) == 1);
>> tree initial_def = vect_phi_initial_value (phi);
>> @@ -8486,6 +8455,7 @@ vect_transform_cycle_phi (loop_vec_info loop_vinfo,
>> vec_initial_defs.quick_push (vec_initial_def);
>> }
>>
>> + if (reduc_info)
>> if (auto *accumulator = VECT_REDUC_INFO_REUSED_ACCUMULATOR (reduc_info))
>> {
>> tree def = accumulator->reduc_input;
>> @@ -10280,7 +10250,7 @@ vectorizable_live_operation (vec_info *vinfo,
>> stmt_vec_info stmt_info,
>> if (SLP_INSTANCE_KIND (slp_node_instance) == slp_inst_kind_reduc_group
>> && slp_index != 0)
>> return true;
>> - vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
>> + vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>> if (VECT_REDUC_INFO_TYPE (reduc_info) == FOLD_LEFT_REDUCTION
>> || VECT_REDUC_INFO_TYPE (reduc_info) == EXTRACT_LAST_REDUCTION)
>> return true;
>> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
>> index 733f4ece724..5236eac5a42 100644
>> --- a/gcc/tree-vect-slp.cc
>> +++ b/gcc/tree-vect-slp.cc
>> @@ -126,6 +126,8 @@ _slp_tree::_slp_tree ()
>> this->avoid_stlf_fail = false;
>> SLP_TREE_VECTYPE (this) = NULL_TREE;
>> SLP_TREE_REPRESENTATIVE (this) = NULL;
>> + this->cycle_info.id = -1;
>> + this->cycle_info.reduc_idx = -1;
>> SLP_TREE_REF_COUNT (this) = 1;
>> this->failed = NULL;
>> this->max_nunits = 1;
>> @@ -2735,6 +2737,7 @@ out:
>>
>> stmt_info = stmts[0];
>>
>> + int reduc_idx = -1;
>> int gs_scale = 0;
>> tree gs_base = NULL_TREE;
>>
>> @@ -2826,6 +2829,33 @@ out:
>> continue;
>> }
>>
>> + /* See which SLP operand a reduction chain continues on. We want
>> + to chain even PHIs but not backedges. */
>> + if (VECTORIZABLE_CYCLE_DEF (oprnd_info->first_dt)
>> + || STMT_VINFO_REDUC_IDX (oprnd_info->def_stmts[0]) != -1)
>> + {
>> + if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
>> + {
>> + if (oprnd_info->first_dt == vect_double_reduction_def)
>> + reduc_idx = i;
>> + }
>> + else if (is_a <gphi *> (stmt_info->stmt)
>> + && gimple_phi_num_args
>> + (as_a <gphi *> (stmt_info->stmt)) != 1)
>> + ;
>> + else if (STMT_VINFO_REDUC_IDX (stmt_info) == -1
>> + && STMT_VINFO_DEF_TYPE (stmt_info) != vect_double_reduction_def)
>> + ;
>> + else if (reduc_idx == -1)
>> + reduc_idx = i;
>> + else
>> + /* For .COND_* reduction operations the else value can be the
>> + same as one of the operation operands. The other def
>> + stmts have been moved, so we can't check easily. Check
>> + it's a call at least. */
>> + gcc_assert (is_a <gcall *> (stmt_info->stmt));
>> + }
>> +
>> /* When we have a masked load with uniform mask discover this
>> as a single-lane mask with a splat permute. This way we can
>> recognize this as a masked load-lane by stripping the splat. */
>> @@ -3157,6 +3187,41 @@ fail:
>> SLP_TREE_CHILDREN (node).splice (children);
>> SLP_TREE_GS_SCALE (node) = gs_scale;
>> SLP_TREE_GS_BASE (node) = gs_base;
>> + if (reduc_idx != -1)
>> + {
>> + gcc_assert (STMT_VINFO_REDUC_IDX (stmt_info) != -1
>> + || STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
>> + || STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def);
>> + SLP_TREE_REDUC_IDX (node) = reduc_idx;
>> + node->cycle_info.id = SLP_TREE_CHILDREN (node)[reduc_idx]->cycle_info.id;
>> + }
>> + /* When reaching the reduction PHI, create a vect_reduc_info. */
>> + else if ((STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
>> + || STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
>> + && is_a <gphi *> (STMT_VINFO_STMT (stmt_info)))
>> + {
>> + loop_vec_info loop_vinfo = as_a <loop_vec_info> (vinfo);
>> + gcc_assert (STMT_VINFO_REDUC_IDX (stmt_info) == -1);
>> + node->cycle_info.id = loop_vinfo->reduc_infos.length ();
>> + vect_reduc_info reduc_info = new vect_reduc_info_s ();
>> + loop_vinfo->reduc_infos.safe_push (reduc_info);
>> + stmt_vec_info reduc_phi = stmt_info;
>> + /* ??? For double reductions vect_is_simple_reduction stores the
>> + reduction type and code on the inner loop header PHI. */
>> + if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
>> + {
>> + use_operand_p use_p;
>> + gimple *use_stmt;
>> + bool res = single_imm_use (gimple_phi_result (stmt_info->stmt),
>> + &use_p, &use_stmt);
>> + gcc_assert (res);
>> + reduc_phi = loop_vinfo->lookup_stmt (use_stmt);
>> + }
>> + VECT_REDUC_INFO_DEF_TYPE (reduc_info) = STMT_VINFO_DEF_TYPE (stmt_info);
>> + VECT_REDUC_INFO_TYPE (reduc_info) = STMT_VINFO_REDUC_TYPE (reduc_phi);
>> + VECT_REDUC_INFO_CODE (reduc_info) = STMT_VINFO_REDUC_CODE (reduc_phi);
>> + VECT_REDUC_INFO_FN (reduc_info) = IFN_LAST;
>> + }
>> return node;
>> }
>>
>> @@ -3185,8 +3250,12 @@ vect_print_slp_tree (dump_flags_t dump_kind,
>> dump_location_t loc,
>> SLP_TREE_REF_COUNT (node));
>> if (SLP_TREE_VECTYPE (node))
>> dump_printf (metadata, " %T", SLP_TREE_VECTYPE (node));
>> - dump_printf (metadata, "%s\n",
>> + dump_printf (metadata, "%s",
>> node->avoid_stlf_fail ? " (avoid-stlf-fail)" : "");
>> + if (node->cycle_info.id != -1 || node->cycle_info.reduc_idx != -1)
>> + dump_printf (metadata, " cycle %d, link %d", node->cycle_info.id,
>> + node->cycle_info.reduc_idx);
>> + dump_printf (metadata, "\n");
>> if (SLP_TREE_DEF_TYPE (node) == vect_internal_def)
>> {
>> if (SLP_TREE_PERMUTE_P (node))
>> @@ -4241,6 +4310,8 @@ vect_analyze_slp_reduc_chain (vec_info *vinfo,
>> TREE_TYPE
>> (gimple_assign_lhs (scalar_def)),
>> group_size);
>> + SLP_TREE_REDUC_IDX (conv) = 0;
>> + conv->cycle_info.id = node->cycle_info.id;
>> SLP_TREE_CHILDREN (conv).quick_push (node);
>> SLP_INSTANCE_TREE (new_instance) = conv;
>> /* We also have to fake this conversion stmt as SLP reduction
>> @@ -6719,11 +6790,9 @@ vect_optimize_slp_pass::start_choosing_layouts ()
>> {
>> stmt_vec_info stmt_info
>> = SLP_TREE_REPRESENTATIVE (SLP_INSTANCE_TREE (instance));
>> - /* ??? vectorizable_reduction did not run yet, scalar cycle
>> - detection sets reduc_code. Either that or SLP discovery
>> - should create a reduction info. */
>> vect_reduc_info reduc_info
>> - = create_info_for_reduction (m_vinfo, stmt_info);
>> + = info_for_reduction (as_a <loop_vec_info> (m_vinfo),
>> + SLP_INSTANCE_TREE (instance));
>> if (needs_fold_left_reduction_p (TREE_TYPE
>> (gimple_get_lhs (stmt_info->stmt)),
>> VECT_REDUC_INFO_CODE (reduc_info)))
>> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
>> index 2248b361e1c..77a03ed4a7b 100644
>> --- a/gcc/tree-vect-stmts.cc
>> +++ b/gcc/tree-vect-stmts.cc
>> @@ -11572,7 +11572,8 @@ vectorizable_condition (vec_info *vinfo,
>> every stmt, use the conservative default setting then. */
>> if (STMT_VINFO_REDUC_DEF (vect_orig_stmt (stmt_info)))
>> {
>> - vect_reduc_info reduc_info = info_for_reduction (vinfo, stmt_info);
>> + vect_reduc_info reduc_info
>> + = info_for_reduction (loop_vinfo, slp_node);
>> reduction_type = VECT_REDUC_INFO_TYPE (reduc_info);
>> nested_cycle_p = nested_in_vect_loop_p (LOOP_VINFO_LOOP
>> (loop_vinfo),
>> stmt_info);
>> diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc
>> index 726383f487c..d7dc30bbeac 100644
>> --- a/gcc/tree-vectorizer.cc
>> +++ b/gcc/tree-vectorizer.cc
>> @@ -724,16 +724,6 @@ vec_info::new_stmt_vec_info (gimple *stmt)
>> STMT_VINFO_SLP_VECT_ONLY (res) = false;
>> STMT_VINFO_SLP_VECT_ONLY_PATTERN (res) = false;
>>
>> - /* To be moved. */
>> - res->reduc_epilogue_adjustment = NULL_TREE;
>> - res->force_single_cycle = false;
>> - res->reduc_fn = IFN_LAST;
>> - res->reduc_initial_values = vNULL;
>> - res->reduc_scalar_results = vNULL;
>> - res->reduc_vectype = NULL_TREE;
>> - res->induc_cond_initial_val = NULL_TREE;
>> - res->reused_accumulator = NULL;
>> -
>> if (is_a <loop_vec_info> (this)
>> && gimple_code (stmt) == GIMPLE_PHI
>> && is_loop_header_bb_p (gimple_bb (stmt)))
>> @@ -794,8 +784,6 @@ vec_info::free_stmt_vec_info (stmt_vec_info stmt_info)
>> release_ssa_name (lhs);
>> }
>>
>> - stmt_info->reduc_initial_values.release ();
>> - stmt_info->reduc_scalar_results.release ();
>> free (stmt_info);
>> }
>>
>> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
>> index 90d862f2987..260cb2ddd3e 100644
>> --- a/gcc/tree-vectorizer.h
>> +++ b/gcc/tree-vectorizer.h
>> @@ -310,6 +310,13 @@ struct _slp_tree {
>> code generation. */
>> stmt_vec_info representative;
>>
>> + struct {
>> + /* SLP cycle the node resides in, or -1. */
>> + int id;
>> + /* The SLP operand index with the edge on the SLP cycle, or -1. */
>> + int reduc_idx;
>> + } cycle_info;
>> +
>> /* Load permutation relative to the stores, NULL if there is no
>> permutation. */
>> load_permutation_t load_permutation;
>> @@ -446,6 +453,7 @@ public:
>> #define SLP_TREE_TYPE(S) (S)->type
>> #define SLP_TREE_GS_SCALE(S) (S)->gs_scale
>> #define SLP_TREE_GS_BASE(S) (S)->gs_base
>> +#define SLP_TREE_REDUC_IDX(S) (S)->cycle_info.reduc_idx
>> #define SLP_TREE_PERMUTE_P(S) ((S)->code ==
>> VEC_PERM_EXPR)
>>
>> inline vect_memory_access_type
>> @@ -816,26 +824,70 @@ typedef auto_vec<std::pair<data_reference*, tree> >
>> drs_init_vec;
>>
>> /* Abstraction around info on reductions which is still in stmt_vec_info
>> but will be duplicated or moved elsewhere. */
>> -class vect_reduc_info
>> +class vect_reduc_info_s
>> {
>> public:
>> - explicit vect_reduc_info (stmt_vec_info s) : i (s) {}
>> - stmt_vec_info fixme () const { return i; }
>> -private:
>> - stmt_vec_info i;
>> + /* The def type of the main reduction PHI, vect_reduction_def or
>> + vect_double_reduction_def. */
>> + enum vect_def_type def_type;
>> +
>> + /* The reduction type as detected by
>> + vect_is_simple_reduction and vectorizable_reduction. */
>> + enum vect_reduction_type reduc_type;
>> +
>> + /* The original scalar reduction code, to be used in the epilogue. */
>> + code_helper reduc_code;
>> +
>> + /* A vector internal function we should use in the epilogue. */
>> + internal_fn reduc_fn;
>> +
>> + /* For loop reduction with multiple vectorized results (ncopies > 1), a
>> + lane-reducing operation participating in it may not use all of those
>> + results, this field specifies result index starting from which any
>> + following lane-reducing operation would be assigned to. */
>> + unsigned int reduc_result_pos;
>> +
>> + /* Whether we force a single cycle PHI during reduction vectorization. */
>> + bool force_single_cycle;
>> +
>> + /* The vector type for performing the actual reduction operation. */
>> + tree reduc_vectype;
>> +
>> + /* For INTEGER_INDUC_COND_REDUCTION, the initial value to be used. */
>> + tree induc_cond_initial_val;
>> +
>> + /* If not NULL the value to be added to compute final reduction value. */
>> + tree reduc_epilogue_adjustment;
>> +
>> + /* If non-null, the reduction is being performed by an epilogue loop
>> + and we have decided to reuse this accumulator from the main loop. */
>> + struct vect_reusable_accumulator *reused_accumulator;
>> +
>> + /* If the vector code is performing N scalar reductions in parallel,
>> + this variable gives the initial scalar values of those N reductions.
>> */
>> + auto_vec<tree> reduc_initial_values;
>> +
>> + /* If the vector code is performing N scalar reductions in parallel, this
>> + variable gives the vectorized code's final (scalar) result for each of
>> + those N reductions. In other words, REDUC_SCALAR_RESULTS[I] replaces
>> + the original scalar code's loop-closed SSA PHI for reduction number I.
>> */
>> + auto_vec<tree> reduc_scalar_results;
>> };
>>
>> -#define VECT_REDUC_INFO_TYPE(I) ((I).fixme ()->reduc_type)
>> -#define VECT_REDUC_INFO_CODE(I) ((I).fixme ()->reduc_code)
>> -#define VECT_REDUC_INFO_FN(I) ((I).fixme ()->reduc_fn)
>> -#define VECT_REDUC_INFO_SCALAR_RESULTS(I) ((I).fixme ()->reduc_scalar_results)
>> -#define VECT_REDUC_INFO_INITIAL_VALUES(I) ((I).fixme ()->reduc_initial_values)
>> -#define VECT_REDUC_INFO_REUSED_ACCUMULATOR(I) ((I).fixme ()->reused_accumulator)
>> -#define VECT_REDUC_INFO_INDUC_COND_INITIAL_VAL(I) ((I).fixme ()->induc_cond_initial_val)
>> -#define VECT_REDUC_INFO_EPILOGUE_ADJUSTMENT(I) ((I).fixme ()->reduc_epilogue_adjustment)
>> -#define VECT_REDUC_INFO_VECTYPE(I) ((I).fixme ()->reduc_vectype)
>> -#define VECT_REDUC_INFO_FORCE_SINGLE_CYCLE(I) ((I).fixme ()->force_single_cycle)
>> -#define VECT_REDUC_INFO_RESULT_POS(I) ((I).fixme ()->reduc_result_pos)
>> +typedef class vect_reduc_info_s *vect_reduc_info;
>> +
>> +#define VECT_REDUC_INFO_DEF_TYPE(I) ((I)->def_type)
>> +#define VECT_REDUC_INFO_TYPE(I) ((I)->reduc_type)
>> +#define VECT_REDUC_INFO_CODE(I) ((I)->reduc_code)
>> +#define VECT_REDUC_INFO_FN(I) ((I)->reduc_fn)
>> +#define VECT_REDUC_INFO_SCALAR_RESULTS(I) ((I)->reduc_scalar_results)
>> +#define VECT_REDUC_INFO_INITIAL_VALUES(I) ((I)->reduc_initial_values)
>> +#define VECT_REDUC_INFO_REUSED_ACCUMULATOR(I) ((I)->reused_accumulator)
>> +#define VECT_REDUC_INFO_INDUC_COND_INITIAL_VAL(I) ((I)->induc_cond_initial_val)
>> +#define VECT_REDUC_INFO_EPILOGUE_ADJUSTMENT(I) ((I)->reduc_epilogue_adjustment)
>> +#define VECT_REDUC_INFO_VECTYPE(I) ((I)->reduc_vectype)
>> +#define VECT_REDUC_INFO_FORCE_SINGLE_CYCLE(I) ((I)->force_single_cycle)
>> +#define VECT_REDUC_INFO_RESULT_POS(I) ((I)->reduc_result_pos)
>>
>> /* Information about a reduction accumulator from the main loop that could
>> conceivably be reused as the input to a reduction in an epilogue loop. */
>> @@ -902,6 +954,10 @@ public:
>> the main loop, this edge is the one that skips the epilogue. */
>> edge skip_this_loop_edge;
>>
>> + /* Reduction descriptors of this loop. Referenced from SLP nodes
>> + by index. */
>> + auto_vec<vect_reduc_info> reduc_infos;
>> +
>> /* The vectorized form of a standard reduction replaces the original
>> scalar code's final result (a loop-closed SSA PHI) with the result
>> of a vector-to-scalar reduction operation. After vectorization,
>> @@ -1517,62 +1573,22 @@ public:
>> /* For both loads and stores. */
>> unsigned simd_lane_access_p : 3;
>>
>> - /* For INTEGER_INDUC_COND_REDUCTION, the initial value to be used. */
>> - tree induc_cond_initial_val;
>> -
>> - /* If not NULL the value to be added to compute final reduction value. */
>> - tree reduc_epilogue_adjustment;
>> -
>> /* On a reduction PHI the reduction type as detected by
>> - vect_is_simple_reduction and vectorizable_reduction. */
>> + vect_is_simple_reduction. */
>> enum vect_reduction_type reduc_type;
>>
>> - /* The original reduction code, to be used in the epilogue. */
>> + /* On a reduction PHI, the original reduction code as detected by
>> + vect_is_simple_reduction. */
>> code_helper reduc_code;
>> - /* An internal function we should use in the epilogue. */
>> - internal_fn reduc_fn;
>>
>> - /* On a stmt participating in the reduction the index of the operand
>> + /* On a stmt participating in a reduction the index of the operand
>> on the reduction SSA cycle. */
>> int reduc_idx;
>>
>> - /* On a reduction PHI the def returned by vect_force_simple_reduction.
>> - On the def returned by vect_force_simple_reduction the
>> - corresponding PHI. */
>> + /* On a reduction PHI the def returned by vect_is_simple_reduction.
>> + On the def returned by vect_is_simple_reduction the corresponding PHI.
>> */
>> stmt_vec_info reduc_def;
>>
>> - /* The vector type for performing the actual reduction. */
>> - tree reduc_vectype;
>> -
>> - /* For loop reduction with multiple vectorized results (ncopies > 1), a
>> - lane-reducing operation participating in it may not use all of those
>> - results, this field specifies result index starting from which any
>> - following land-reducing operation would be assigned to. */
>> - unsigned int reduc_result_pos;
>> -
>> - /* If IS_REDUC_INFO is true and if the vector code is performing
>> - N scalar reductions in parallel, this variable gives the initial
>> - scalar values of those N reductions. */
>> - vec<tree> reduc_initial_values;
>> -
>> - /* If IS_REDUC_INFO is true and if the vector code is performing
>> - N scalar reductions in parallel, this variable gives the vectorized
>> code's
>> - final (scalar) result for each of those N reductions. In other words,
>> - REDUC_SCALAR_RESULTS[I] replaces the original scalar code's loop-closed
>> - SSA PHI for reduction number I. */
>> - vec<tree> reduc_scalar_results;
>> -
>> - /* Only meaningful if IS_REDUC_INFO. If non-null, the reduction is
>> - being performed by an epilogue loop and we have decided to reuse
>> - this accumulator from the main loop. */
>> - vect_reusable_accumulator *reused_accumulator;
>> -
>> - /* Whether we force a single cycle PHI during reduction vectorization. */
>> - bool force_single_cycle;
>> -
>> - /* Whether on this stmt reduction meta is recorded. */
>> - bool is_reduc_info;
>> -
>> /* If nonzero, the lhs of the statement could be truncated to this
>> many bits without affecting any users of the result. */
>> unsigned int min_output_precision;
>> @@ -2674,8 +2690,7 @@ extern tree vect_gen_loop_len_mask (loop_vec_info,
>> gimple_stmt_iterator *,
>> unsigned int, tree, tree, unsigned int,
>> unsigned int);
>> extern gimple_seq vect_gen_len (tree, tree, tree, tree);
>> -extern vect_reduc_info info_for_reduction (vec_info *, stmt_vec_info);
>> -extern vect_reduc_info create_info_for_reduction (vec_info *, stmt_vec_info);
>> +extern vect_reduc_info info_for_reduction (loop_vec_info, slp_tree);
>> extern bool reduction_fn_for_scalar_code (code_helper, internal_fn *);
>>
>> /* Drive for loop transformation stage. */
>> @@ -2891,7 +2906,14 @@ vect_is_store_elt_extraction (vect_cost_for_stmt
>> kind, stmt_vec_info stmt_info)
>> inline bool
>> vect_is_reduction (stmt_vec_info stmt_info)
>> {
>> - return STMT_VINFO_REDUC_IDX (stmt_info) >= 0;
>> + return STMT_VINFO_REDUC_IDX (stmt_info) != -1;
>> +}
>> +
>> +/* Return true if SLP_NODE represents part of a reduction. */
>> +inline bool
>> +vect_is_reduction (slp_tree slp_node)
>> +{
>> + return SLP_TREE_REDUC_IDX (slp_node) != -1;
>> }
>>
>> /* If STMT_INFO describes a reduction, return the vect_reduction_type
>> @@ -2905,7 +2927,7 @@ vect_reduc_type (vec_info *vinfo, slp_tree node)
>> if (STMT_VINFO_REDUC_DEF (stmt_info))
>> {
>> vect_reduc_info reduc_info
>> - = info_for_reduction (loop_vinfo, stmt_info);
>> + = info_for_reduction (loop_vinfo, node);
>> return int (VECT_REDUC_INFO_TYPE (reduc_info));
>> }
>> }
>> --
>> 2.43.0
>
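A note on the first aarch64 diagnostic quoted above: it concerns the first argument, since the new `info_for_reduction` is declared to take a `loop_vec_info` and C++ does not implicitly convert a base-class pointer to a derived-class pointer. The sketch below illustrates that rule with hypothetical stand-in types (`vec_info_model`, `loop_vec_info_model`, `bb_vec_info_model`) and standard `dynamic_cast`. It is only an illustration of the language rule and makes no claim about what the final aarch64 change looks like; the patch itself uses GCC's own `as_a <loop_vec_info> (...)` helper in `tree-vect-slp.cc` for the same kind of conversion.

```cpp
#include <cassert>

// Hypothetical stand-ins for vec_info and two of its subclasses.
struct vec_info_model { virtual ~vec_info_model () = default; };
struct loop_vec_info_model : vec_info_model { int n_reductions = 0; };
struct bb_vec_info_model : vec_info_model {};

// An entry point that, like the new info_for_reduction, requires the
// loop-specific type rather than the base type.
inline int count_reductions (loop_vec_info_model *loop_vinfo)
{
  return loop_vinfo->n_reductions;
}

// A caller holding only the base pointer has to downcast explicitly.
inline int count_reductions_checked (vec_info_model *vinfo)
{
  // count_reductions (vinfo);  // rejected: base pointer to derived pointer
  if (auto *loop_vinfo = dynamic_cast<loop_vec_info_model *> (vinfo))
    return count_reductions (loop_vinfo);
  return -1;  // not a loop vinfo, so there is nothing to look up
}
```

Whether a checked or an asserting downcast is the better fit depends on whether the caller can ever be reached with a non-loop vinfo, which this sketch does not try to answer.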