> On 29.08.2025 at 15:21, Tamar Christina <[email protected]> wrote:
> 
> 
>> 
>> -----Original Message-----
>> From: Richard Biener <[email protected]>
>> Sent: Friday, August 29, 2025 12:40 PM
>> To: [email protected]
>> Cc: RISC-V CI <[email protected]>; Tamar Christina
>> <[email protected]>
>> Subject: [PATCH 2/4] Separate reduction info and associate it with SLP nodes
>> 
>> The following splits out reduction related information from
>> stmt_vec_info, retaining (and duplicating) parts used by scalar
>> cycle analysis.  The data is then associated with SLP nodes
>> forming reduction cycles and accessible via info_for_reduction.
>> The data is created at SLP discovery time as we look at it even
>> pre-vectorizable_reduction analysis, but most of the data is
>> only populated by the latter.  There is no reduction info with
>> nested cycles that are not part of an outer reduction.
>> In the process this adds cycle info to each SLP tree, notably
>> the reduc-idx and a way to identify the reduction info.
>> 
>> Cleanup possibilities will be realized in a later patch of the
>> series.  This patch is going to be squashed with the first.
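
The indexing scheme described above can be sketched in isolation. This is only a simplified model, not the GCC code: every name below (reduc_info_model, slp_node_model, loop_info_model, lookup_reduc_info, start_cycle, link_to_cycle) is a hypothetical stand-in, and std::vector/std::unique_ptr replace GCC's own containers. It assumes, as the description says, that descriptors live in a per-loop vector, that SLP nodes refer to them by index, and that nodes outside any reduction cycle carry the index -1.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for vect_reduc_info_s; the real class has more fields.
struct reduc_info_model {
  int reduc_type = 0;
  bool force_single_cycle = false;
};

// Hypothetical stand-in for an SLP node and its cycle_info member.
struct slp_node_model {
  int cycle_id = -1;   // index of the reduction descriptor, or -1
  int reduc_idx = -1;  // child index the reduction cycle continues on, or -1
  std::vector<slp_node_model *> children;
};

// Hypothetical stand-in for the loop info that owns all descriptors.
struct loop_info_model {
  std::vector<std::unique_ptr<reduc_info_model>> reduc_infos;
};

// Models the new lookup: nodes that are not on a cycle yield nullptr.
inline reduc_info_model *
lookup_reduc_info (loop_info_model &loop, const slp_node_model &node)
{
  if (node.cycle_id == -1)
    return nullptr;
  return loop.reduc_infos[node.cycle_id].get ();
}

// Models discovery reaching a reduction PHI: allocate a fresh descriptor
// and remember its index on the node.
inline void
start_cycle (loop_info_model &loop, slp_node_model &phi)
{
  phi.cycle_id = static_cast<int> (loop.reduc_infos.size ());
  loop.reduc_infos.push_back (std::make_unique<reduc_info_model> ());
}

// Models propagation: a node on the cycle inherits the descriptor index
// from the child the cycle continues on.
inline void
link_to_cycle (slp_node_model &node, int reduc_idx)
{
  node.reduc_idx = reduc_idx;
  node.cycle_id = node.children[reduc_idx]->cycle_id;
}
```

In this model every node on the same cycle resolves to the same descriptor, while callers that may see nodes outside a cycle have to check the lookup result for a null pointer before using it.
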
> 
> This one breaks the aarch64 build
> 
> /opt/buildAgent/work/505bfdd4dad8af3d/gcc/config/aarch64/aarch64.cc: In function 'bool aarch64_force_single_cycle(vec_info*, stmt_vec_info)':
> /opt/buildAgent/work/505bfdd4dad8af3d/gcc/config/aarch64/aarch64.cc:17778:41: error: invalid conversion from 'vec_info*' to 'loop_vec_info' {aka '_loop_vec_info*'} [-fpermissive]
> 17778 |   auto reduc_info = info_for_reduction (vinfo, stmt_info);
>      |                                         ^~~~~
>      |                                         |
>      |                                         vec_info*
> /opt/buildAgent/work/505bfdd4dad8af3d/gcc/config/aarch64/aarch64.cc:17778:48: error: cannot convert 'stmt_vec_info' {aka '_stmt_vec_info*'} to 'slp_tree' {aka '_slp_tree*'}
> 17778 |   auto reduc_info = info_for_reduction (vinfo, stmt_info);
>      |                                                ^~~~~~~~~
>      |                                                |
>      |                                                stmt_vec_info {aka _stmt_vec_info*}
> 
> The build fails because info_for_reduction no longer takes the stmt_info.
> If I'm reading this right, for AArch64 we'd need:
> 
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 1cdd5a26a83..2831ef38948 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -17770,12 +17770,13 @@ aarch64_adjust_stmt_cost (vec_info *vinfo, vect_cost_for_stmt kind,
> 
>    with the single accumulator being read and written multiple times.  */
> static bool
> -aarch64_force_single_cycle (vec_info *vinfo, stmt_vec_info stmt_info)
> +aarch64_force_single_cycle (vec_info *vinfo, slp_tree node)
> {
> +  stmt_vec_info stmt_info = SLP_TREE_REPRESENTATIVE (node);
>   if (!STMT_VINFO_REDUC_DEF (stmt_info))
>     return false;
> 
> -  auto reduc_info = info_for_reduction (vinfo, stmt_info);
> +  auto reduc_info = info_for_reduction (vinfo, node);
>   return VECT_REDUC_INFO_FORCE_SINGLE_CYCLE (reduc_info);
> }
> 
> @@ -17803,7 +17804,7 @@ aarch64_vector_costs::count_ops (unsigned int count, vect_cost_for_stmt kind,
>        = aarch64_in_loop_reduction_latency (m_vinfo, node,
>                                             stmt_info, m_vec_flags);
>       if (m_costing_for_scalar
> -         || aarch64_force_single_cycle (m_vinfo, stmt_info))
> +         || aarch64_force_single_cycle (m_vinfo, node))
>        /* ??? Ideally we'd use a tree to reduce the copies down to 1 vector,
>           and then accumulate that, but at the moment the loop-carried
>           dependency includes all copies.  */
> 
> Should I try again with this change?

Yes please.

Richard 

> Thanks,
> Tamar
> 
>> 
>> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>> 
>>    * tree-vectorizer.h (_slp_tree::cycle_info): New member.
>>    (SLP_TREE_REDUC_IDX): Likewise.
>>    (vect_reduc_info_s): Move/copy data from ...
>>    (_stmt_vec_info): ... here.
>>    (_loop_vec_info::reduc_infos): New member.
>>    (info_for_reduction): Adjust to take SLP node.
>>    (vect_reduc_type): Adjust.
>>    (vect_is_reduction): Add overload for SLP node.
>>    * tree-vectorizer.cc (vec_info::new_stmt_vec_info):
>>    Do not initialize removed members.
>>    (vec_info::free_stmt_vec_info): Do not release them.
>>    * tree-vect-stmts.cc (vectorizable_condition): Adjust.
>>    * tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize
>>    cycle info.
>>    (vect_build_slp_tree_2): Compute SLP reduc_idx and store
>>    it.  Create, populate and propagate reduction info.
>>    (vect_print_slp_tree): Print cycle info.
>>    (vect_analyze_slp_reduc_chain): Set cycle info on the
>>    manually added conversion node.
>>    (vect_optimize_slp_pass::start_choosing_layouts): Adjust.
>>    * tree-vect-loop.cc (_loop_vec_info::~_loop_vec_info):
>>    Release reduction infos.
>>    (info_for_reduction): Get the reduction info from
>>    the vector in the loop_vinfo.
>>    (vect_create_epilog_for_reduction): Adjust.
>>    (vectorizable_reduction): Likewise.
>>    (vect_transform_reduction): Likewise.
>>    (vect_transform_cycle_phi): Likewise.  Handle nested cycles
>>    that are not part of a double reduction and thus have no
>>    reduction info.
>> ---
>> gcc/tree-vect-loop.cc  |  82 +++++++---------------
>> gcc/tree-vect-slp.cc   |  79 +++++++++++++++++++--
>> gcc/tree-vect-stmts.cc |   3 +-
>> gcc/tree-vectorizer.cc |  12 ----
>> gcc/tree-vectorizer.h  | 154 +++++++++++++++++++++++------------------
>> 5 files changed, 190 insertions(+), 140 deletions(-)
>> 
>> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
>> index a0e77bdced6..db63891955b 100644
>> --- a/gcc/tree-vect-loop.cc
>> +++ b/gcc/tree-vect-loop.cc
>> @@ -955,6 +955,8 @@ _loop_vec_info::~_loop_vec_info ()
>>   delete scan_map;
>>   delete scalar_costs;
>>   delete vector_costs;
>> +  for (auto reduc_info : reduc_infos)
>> +    delete reduc_info;
>> 
>>   /* When we release an epiloge vinfo that we do not intend to use
>>      avoid clearing AUX of the main loop which should continue to
>> @@ -5135,46 +5137,12 @@ get_initial_defs_for_reduction (loop_vec_info
>> loop_vinfo,
>>     vect_emit_reduction_init_stmts (loop_vinfo, reduc_info, ctor_seq);
>> }
>> 
>> -/* For a statement STMT_INFO taking part in a reduction operation return
>> -   the stmt_vec_info the meta information is stored on.  */
>> -
>> -static vect_reduc_info
>> -info_for_reduction (vec_info *vinfo, stmt_vec_info stmt_info, bool create)
>> -{
>> -  stmt_info = vect_orig_stmt (stmt_info);
>> -  gcc_assert (STMT_VINFO_REDUC_DEF (stmt_info));
>> -  if (!is_a <gphi *> (stmt_info->stmt)
>> -      || !VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
>> -    stmt_info = STMT_VINFO_REDUC_DEF (stmt_info);
>> -  gphi *phi = as_a <gphi *> (stmt_info->stmt);
>> -  if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
>> -    {
>> -      if (gimple_phi_num_args (phi) == 1)
>> -    stmt_info = STMT_VINFO_REDUC_DEF (stmt_info);
>> -    }
>> -  else if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
>> -    {
>> -      stmt_vec_info info = vinfo->lookup_def (vect_phi_initial_value (phi));
>> -      if (info && STMT_VINFO_DEF_TYPE (info) == vect_double_reduction_def)
>> -    stmt_info = info;
>> -    }
>> -  if (create)
>> -    stmt_info->is_reduc_info = true;
>> -  else
>> -    gcc_assert (stmt_info->is_reduc_info);
>> -  return vect_reduc_info (stmt_info);
>> -}
>> -
>> -vect_reduc_info
>> -info_for_reduction (vec_info *vinfo, stmt_vec_info stmt_info)
>> -{
>> -  return info_for_reduction (vinfo, stmt_info, false);
>> -}
>> -
>> vect_reduc_info
>> -create_info_for_reduction (vec_info *vinfo, stmt_vec_info stmt_info)
>> +info_for_reduction (loop_vec_info loop_vinfo, slp_tree node)
>> {
>> -  return info_for_reduction (vinfo, stmt_info, true);
>> +  if (node->cycle_info.id == -1)
>> +    return NULL;
>> +  return loop_vinfo->reduc_infos[node->cycle_info.id];
>> }
>> 
>> /* See if LOOP_VINFO is an epilogue loop whose main loop had a reduction that
>> @@ -5434,7 +5402,7 @@ vect_create_epilog_for_reduction (loop_vec_info
>> loop_vinfo,
>>                  slp_instance slp_node_instance,
>>                  edge loop_exit)
>> {
>> -  vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
>> +  vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>>   /* For double reductions we need to get at the inner loop reduction
>>      stmt which has the meta info attached.  Our stmt_info is that of the
>>      loop-closed PHI of the inner loop which we remember as
>> @@ -6920,7 +6888,7 @@ vectorizable_lane_reducing (loop_vec_info loop_vinfo,
>> stmt_vec_info stmt_info,
>>       || STMT_VINFO_REDUC_IDX (stmt_info) < 0)
>>     return false;
>> 
>> -  vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
>> +  vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>> 
>>   /* Lane-reducing pattern inside any inner loop of LOOP_VINFO is not
>>      recoginized.  */
>> @@ -7083,8 +7051,8 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>>       && STMT_VINFO_DEF_TYPE (stmt_info) != vect_nested_cycle)
>>     return false;
>> 
>> -  /* The stmt we store reduction analysis meta on.  */
>> -  vect_reduc_info reduc_info = create_info_for_reduction (loop_vinfo, stmt_info);
>> +  /* The reduction meta.  */
>> +  vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>> 
>>   if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
>>     {
>> @@ -7160,7 +7128,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>>   slp_tree vdef_slp = slp_node_instance->root;
>>   /* For double-reductions we start SLP analysis at the inner loop LC PHI
>>      which is the def of the outer loop live stmt.  */
>> -  if (STMT_VINFO_DEF_TYPE (reduc_info.fixme ()) == vect_double_reduction_def)
>> +  if (VECT_REDUC_INFO_DEF_TYPE (reduc_info) == vect_double_reduction_def)
>>     vdef_slp = SLP_TREE_CHILDREN (vdef_slp)[0];
>>   while (reduc_def != PHI_RESULT (reduc_def_phi))
>>     {
>> @@ -7399,8 +7367,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>>    }
>>     }
>> 
>> -  enum vect_reduction_type reduction_type = STMT_VINFO_REDUC_TYPE (phi_info);
>> -  VECT_REDUC_INFO_TYPE (reduc_info) = reduction_type;
>> +  enum vect_reduction_type reduction_type = VECT_REDUC_INFO_TYPE (reduc_info);
>>   /* If we have a condition reduction, see if we can simplify it further.  */
>>   if (reduction_type == COND_REDUCTION)
>>     {
>> @@ -7514,7 +7481,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>> 
>>   if (nested_cycle)
>>     {
>> -      gcc_assert (STMT_VINFO_DEF_TYPE (reduc_info.fixme ())
>> +      gcc_assert (VECT_REDUC_INFO_DEF_TYPE (reduc_info)
>>          == vect_double_reduction_def);
>>       double_reduc = true;
>>     }
>> @@ -7554,7 +7521,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>>           (and also the same tree-code) when generating the epilog code and
>>           when generating the code inside the loop.  */
>> 
>> -  code_helper orig_code = STMT_VINFO_REDUC_CODE (phi_info);
>> +  code_helper orig_code = VECT_REDUC_INFO_CODE (reduc_info);
>> 
>>   /* If conversion might have created a conditional operation like
>>      IFN_COND_ADD already.  Use the internal code for the following checks.  */
>> @@ -8031,12 +7998,13 @@ vect_transform_reduction (loop_vec_info
>> loop_vinfo,
>>   class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>>   unsigned vec_num;
>> 
>> -  vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
>> +  vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>> 
>>   if (nested_in_vect_loop_p (loop, stmt_info))
>>     {
>>       loop = loop->inner;
>> -      gcc_assert (STMT_VINFO_DEF_TYPE (reduc_info.fixme ()) == vect_double_reduction_def);
>> +      gcc_assert (VECT_REDUC_INFO_DEF_TYPE (reduc_info)
>> +          == vect_double_reduction_def);
>>     }
>> 
>>   gimple_match_op op;
>> @@ -8382,10 +8350,10 @@ vect_transform_cycle_phi (loop_vec_info
>> loop_vinfo,
>>       nested_cycle = true;
>>     }
>> 
>> -  vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
>> -
>> -  if (VECT_REDUC_INFO_TYPE (reduc_info) == EXTRACT_LAST_REDUCTION
>> -      || VECT_REDUC_INFO_TYPE (reduc_info) == FOLD_LEFT_REDUCTION)
>> +  vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>> +  if (reduc_info
>> +      && (VECT_REDUC_INFO_TYPE (reduc_info) == EXTRACT_LAST_REDUCTION
>> +      || VECT_REDUC_INFO_TYPE (reduc_info) == FOLD_LEFT_REDUCTION))
>>     /* Leave the scalar phi in place.  */
>>     return true;
>> 
>> @@ -8393,7 +8361,7 @@ vect_transform_cycle_phi (loop_vec_info loop_vinfo,
>> 
>>   /* Check whether we should use a single PHI node and accumulate
>>      vectors to one before the backedge.  */
>> -  if (VECT_REDUC_INFO_FORCE_SINGLE_CYCLE (reduc_info))
>> +  if (reduc_info && VECT_REDUC_INFO_FORCE_SINGLE_CYCLE (reduc_info))
>>     vec_num = 1;
>> 
>>   /* Create the destination vector  */
>> @@ -8408,7 +8376,8 @@ vect_transform_cycle_phi (loop_vec_info loop_vinfo,
>>   /* Optimize: if initial_def is for REDUC_MAX smaller than the base
>>      and we can't use zero for induc_val, use initial_def.  Similarly
>>      for REDUC_MIN and initial_def larger than the base.  */
>> -  if (VECT_REDUC_INFO_TYPE (reduc_info) == INTEGER_INDUC_COND_REDUCTION)
>> +  if (reduc_info
>> +      && VECT_REDUC_INFO_TYPE (reduc_info) == INTEGER_INDUC_COND_REDUCTION)
>>     {
>>       gcc_assert (SLP_TREE_LANES (slp_node) == 1);
>>       tree initial_def = vect_phi_initial_value (phi);
>> @@ -8486,6 +8455,7 @@ vect_transform_cycle_phi (loop_vec_info loop_vinfo,
>>       vec_initial_defs.quick_push (vec_initial_def);
>>     }
>> 
>> +  if (reduc_info)
>>   if (auto *accumulator = VECT_REDUC_INFO_REUSED_ACCUMULATOR (reduc_info))
>>     {
>>       tree def = accumulator->reduc_input;
>> @@ -10280,7 +10250,7 @@ vectorizable_live_operation (vec_info *vinfo,
>> stmt_vec_info stmt_info,
>>       if (SLP_INSTANCE_KIND (slp_node_instance) == slp_inst_kind_reduc_group
>>      && slp_index != 0)
>>    return true;
>> -      vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, stmt_info);
>> +      vect_reduc_info reduc_info = info_for_reduction (loop_vinfo, slp_node);
>>       if (VECT_REDUC_INFO_TYPE (reduc_info) == FOLD_LEFT_REDUCTION
>>      || VECT_REDUC_INFO_TYPE (reduc_info) == EXTRACT_LAST_REDUCTION)
>>    return true;
>> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
>> index 733f4ece724..5236eac5a42 100644
>> --- a/gcc/tree-vect-slp.cc
>> +++ b/gcc/tree-vect-slp.cc
>> @@ -126,6 +126,8 @@ _slp_tree::_slp_tree ()
>>   this->avoid_stlf_fail = false;
>>   SLP_TREE_VECTYPE (this) = NULL_TREE;
>>   SLP_TREE_REPRESENTATIVE (this) = NULL;
>> +  this->cycle_info.id = -1;
>> +  this->cycle_info.reduc_idx = -1;
>>   SLP_TREE_REF_COUNT (this) = 1;
>>   this->failed = NULL;
>>   this->max_nunits = 1;
>> @@ -2735,6 +2737,7 @@ out:
>> 
>>   stmt_info = stmts[0];
>> 
>> +  int reduc_idx = -1;
>>   int gs_scale = 0;
>>   tree gs_base = NULL_TREE;
>> 
>> @@ -2826,6 +2829,33 @@ out:
>>      continue;
>>    }
>> 
>> +      /* See which SLP operand a reduction chain continues on.  We want
>> +     to chain even PHIs but not backedges.  */
>> +      if (VECTORIZABLE_CYCLE_DEF (oprnd_info->first_dt)
>> +      || STMT_VINFO_REDUC_IDX (oprnd_info->def_stmts[0]) != -1)
>> +    {
>> +      if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle)
>> +        {
>> +          if (oprnd_info->first_dt == vect_double_reduction_def)
>> +        reduc_idx = i;
>> +        }
>> +      else if (is_a <gphi *> (stmt_info->stmt)
>> +           && gimple_phi_num_args
>> +            (as_a <gphi *> (stmt_info->stmt)) != 1)
>> +        ;
>> +      else if (STMT_VINFO_REDUC_IDX (stmt_info) == -1
>> +           && STMT_VINFO_DEF_TYPE (stmt_info) != vect_double_reduction_def)
>> +        ;
>> +      else if (reduc_idx == -1)
>> +        reduc_idx = i;
>> +      else
>> +        /* For .COND_* reduction operations the else value can be the
>> +           same as one of the operation operands.  The other def
>> +           stmts have been moved, so we can't check easily.  Check
>> +           it's a call at least.  */
>> +        gcc_assert (is_a <gcall *> (stmt_info->stmt));
>> +    }
>> +
>>       /* When we have a masked load with uniform mask discover this
>>     as a single-lane mask with a splat permute.  This way we can
>>     recognize this as a masked load-lane by stripping the splat.  */
>> @@ -3157,6 +3187,41 @@ fail:
>>   SLP_TREE_CHILDREN (node).splice (children);
>>   SLP_TREE_GS_SCALE (node) = gs_scale;
>>   SLP_TREE_GS_BASE (node) = gs_base;
>> +  if (reduc_idx != -1)
>> +    {
>> +      gcc_assert (STMT_VINFO_REDUC_IDX (stmt_info) != -1
>> +          || STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
>> +          || STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def);
>> +      SLP_TREE_REDUC_IDX (node) = reduc_idx;
>> +      node->cycle_info.id = SLP_TREE_CHILDREN (node)[reduc_idx]->cycle_info.id;
>> +    }
>> +  /* When reaching the reduction PHI, create a vect_reduc_info.  */
>> +  else if ((STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
>> +        || STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
>> +       && is_a <gphi *> (STMT_VINFO_STMT (stmt_info)))
>> +    {
>> +      loop_vec_info loop_vinfo = as_a <loop_vec_info> (vinfo);
>> +      gcc_assert (STMT_VINFO_REDUC_IDX (stmt_info) == -1);
>> +      node->cycle_info.id = loop_vinfo->reduc_infos.length ();
>> +      vect_reduc_info reduc_info = new vect_reduc_info_s ();
>> +      loop_vinfo->reduc_infos.safe_push (reduc_info);
>> +      stmt_vec_info reduc_phi = stmt_info;
>> +      /* ???  For double reductions vect_is_simple_reduction stores the
>> +     reduction type and code on the inner loop header PHI.  */
>> +      if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
>> +    {
>> +      use_operand_p use_p;
>> +      gimple *use_stmt;
>> +      bool res = single_imm_use (gimple_phi_result (stmt_info->stmt),
>> +                     &use_p, &use_stmt);
>> +      gcc_assert (res);
>> +      reduc_phi = loop_vinfo->lookup_stmt (use_stmt);
>> +    }
>> +      VECT_REDUC_INFO_DEF_TYPE (reduc_info) = STMT_VINFO_DEF_TYPE (stmt_info);
>> +      VECT_REDUC_INFO_TYPE (reduc_info) = STMT_VINFO_REDUC_TYPE (reduc_phi);
>> +      VECT_REDUC_INFO_CODE (reduc_info) = STMT_VINFO_REDUC_CODE (reduc_phi);
>> +      VECT_REDUC_INFO_FN (reduc_info) = IFN_LAST;
>> +    }
>>   return node;
>> }
>> 
>> @@ -3185,8 +3250,12 @@ vect_print_slp_tree (dump_flags_t dump_kind,
>> dump_location_t loc,
>>                     SLP_TREE_REF_COUNT (node));
>>   if (SLP_TREE_VECTYPE (node))
>>     dump_printf (metadata, " %T", SLP_TREE_VECTYPE (node));
>> -  dump_printf (metadata, "%s\n",
>> +  dump_printf (metadata, "%s",
>>           node->avoid_stlf_fail ? " (avoid-stlf-fail)" : "");
>> +  if (node->cycle_info.id != -1 || node->cycle_info.reduc_idx != -1)
>> +    dump_printf (metadata, " cycle %d, link %d", node->cycle_info.id,
>> +         node->cycle_info.reduc_idx);
>> +  dump_printf (metadata, "\n");
>>   if (SLP_TREE_DEF_TYPE (node) == vect_internal_def)
>>     {
>>       if (SLP_TREE_PERMUTE_P (node))
>> @@ -4241,6 +4310,8 @@ vect_analyze_slp_reduc_chain (vec_info *vinfo,
>>                           TREE_TYPE
>>                           (gimple_assign_lhs (scalar_def)),
>>                           group_size);
>> +          SLP_TREE_REDUC_IDX (conv) = 0;
>> +          conv->cycle_info.id = node->cycle_info.id;
>>          SLP_TREE_CHILDREN (conv).quick_push (node);
>>          SLP_INSTANCE_TREE (new_instance) = conv;
>>          /* We also have to fake this conversion stmt as SLP reduction
>> @@ -6719,11 +6790,9 @@ vect_optimize_slp_pass::start_choosing_layouts ()
>>       {
>>    stmt_vec_info stmt_info
>>      = SLP_TREE_REPRESENTATIVE (SLP_INSTANCE_TREE (instance));
>> -    /* ???  vectorizable_reduction did not run yet, scalar cycle
>> -       detection sets reduc_code.  Either that or SLP discovery
>> -       should create a reduction info.  */
>>    vect_reduc_info reduc_info
>> -      = create_info_for_reduction (m_vinfo, stmt_info);
>> +      = info_for_reduction (as_a <loop_vec_info> (m_vinfo),
>> +                SLP_INSTANCE_TREE (instance));
>>    if (needs_fold_left_reduction_p (TREE_TYPE
>>                       (gimple_get_lhs (stmt_info->stmt)),
>>                     VECT_REDUC_INFO_CODE (reduc_info)))
>> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
>> index 2248b361e1c..77a03ed4a7b 100644
>> --- a/gcc/tree-vect-stmts.cc
>> +++ b/gcc/tree-vect-stmts.cc
>> @@ -11572,7 +11572,8 @@ vectorizable_condition (vec_info *vinfo,
>>     every stmt, use the conservative default setting then.  */
>>       if (STMT_VINFO_REDUC_DEF (vect_orig_stmt (stmt_info)))
>>    {
>> -      vect_reduc_info reduc_info = info_for_reduction (vinfo, stmt_info);
>> +      vect_reduc_info reduc_info
>> +        = info_for_reduction (loop_vinfo, slp_node);
>>      reduction_type = VECT_REDUC_INFO_TYPE (reduc_info);
>>      nested_cycle_p = nested_in_vect_loop_p (LOOP_VINFO_LOOP (loop_vinfo),
>>                          stmt_info);
>> diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc
>> index 726383f487c..d7dc30bbeac 100644
>> --- a/gcc/tree-vectorizer.cc
>> +++ b/gcc/tree-vectorizer.cc
>> @@ -724,16 +724,6 @@ vec_info::new_stmt_vec_info (gimple *stmt)
>>   STMT_VINFO_SLP_VECT_ONLY (res) = false;
>>   STMT_VINFO_SLP_VECT_ONLY_PATTERN (res) = false;
>> 
>> -  /* To be moved.  */
>> -  res->reduc_epilogue_adjustment = NULL_TREE;
>> -  res->force_single_cycle = false;
>> -  res->reduc_fn = IFN_LAST;
>> -  res->reduc_initial_values = vNULL;
>> -  res->reduc_scalar_results = vNULL;
>> -  res->reduc_vectype = NULL_TREE;
>> -  res->induc_cond_initial_val = NULL_TREE;
>> -  res->reused_accumulator = NULL;
>> -
>>   if (is_a <loop_vec_info> (this)
>>       && gimple_code (stmt) == GIMPLE_PHI
>>       && is_loop_header_bb_p (gimple_bb (stmt)))
>> @@ -794,8 +784,6 @@ vec_info::free_stmt_vec_info (stmt_vec_info stmt_info)
>>    release_ssa_name (lhs);
>>     }
>> 
>> -  stmt_info->reduc_initial_values.release ();
>> -  stmt_info->reduc_scalar_results.release ();
>>   free (stmt_info);
>> }
>> 
>> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
>> index 90d862f2987..260cb2ddd3e 100644
>> --- a/gcc/tree-vectorizer.h
>> +++ b/gcc/tree-vectorizer.h
>> @@ -310,6 +310,13 @@ struct _slp_tree {
>>      code generation.  */
>>   stmt_vec_info representative;
>> 
>> +  struct {
>> +      /* SLP cycle the node resides in, or -1.  */
>> +      int id;
>> +      /* The SLP operand index with the edge on the SLP cycle, or -1.  */
>> +      int reduc_idx;
>> +  } cycle_info;
>> +
>>   /* Load permutation relative to the stores, NULL if there is no
>>      permutation.  */
>>   load_permutation_t load_permutation;
>> @@ -446,6 +453,7 @@ public:
>> #define SLP_TREE_TYPE(S)             (S)->type
>> #define SLP_TREE_GS_SCALE(S)             (S)->gs_scale
>> #define SLP_TREE_GS_BASE(S)             (S)->gs_base
>> +#define SLP_TREE_REDUC_IDX(S)             (S)->cycle_info.reduc_idx
>> #define SLP_TREE_PERMUTE_P(S)             ((S)->code == VEC_PERM_EXPR)
>> 
>> inline vect_memory_access_type
>> @@ -816,26 +824,70 @@ typedef auto_vec<std::pair<data_reference*, tree> >
>> drs_init_vec;
>> 
>> /* Abstraction around info on reductions which is still in stmt_vec_info
>>    but will be duplicated or moved elsewhere.  */
>> -class vect_reduc_info
>> +class vect_reduc_info_s
>> {
>> public:
>> -  explicit vect_reduc_info (stmt_vec_info s) : i (s) {}
>> -  stmt_vec_info fixme () const { return i; }
>> -private:
>> -  stmt_vec_info i;
>> +  /* The def type of the main reduction PHI, vect_reduction_def or
>> +     vect_double_reduction_def.  */
>> +  enum vect_def_type def_type;
>> +
>> +  /* The reduction type as detected by
>> +     vect_is_simple_reduction and vectorizable_reduction.  */
>> +  enum vect_reduction_type reduc_type;
>> +
>> +  /* The original scalar reduction code, to be used in the epilogue.  */
>> +  code_helper reduc_code;
>> +
>> +  /* A vector internal function we should use in the epilogue.  */
>> +  internal_fn reduc_fn;
>> +
>> +  /* For loop reduction with multiple vectorized results (ncopies > 1), a
>> +     lane-reducing operation participating in it may not use all of those
>> +     results, this field specifies result index starting from which any
>> +     following land-reducing operation would be assigned to.  */
>> +  unsigned int reduc_result_pos;
>> +
>> +  /* Whether we force a single cycle PHI during reduction vectorization.  */
>> +  bool force_single_cycle;
>> +
>> +  /* The vector type for performing the actual reduction operation.  */
>> +  tree reduc_vectype;
>> +
>> +  /* For INTEGER_INDUC_COND_REDUCTION, the initial value to be used.  */
>> +  tree induc_cond_initial_val;
>> +
>> +  /* If not NULL the value to be added to compute final reduction value.  */
>> +  tree reduc_epilogue_adjustment;
>> +
>> +  /* If non-null, the reduction is being performed by an epilogue loop
>> +     and we have decided to reuse this accumulator from the main loop.  */
>> +  struct vect_reusable_accumulator *reused_accumulator;
>> +
>> +  /* If the vector code is performing N scalar reductions in parallel,
>> +     this variable gives the initial scalar values of those N reductions.  */
>> +  auto_vec<tree> reduc_initial_values;
>> +
>> +  /* If the vector code is performing N scalar reductions in parallel, this
>> +     variable gives the vectorized code's final (scalar) result for each of
>> +     those N reductions.  In other words, REDUC_SCALAR_RESULTS[I] replaces
>> +     the original scalar code's loop-closed SSA PHI for reduction number I.  */
>> +  auto_vec<tree> reduc_scalar_results;
>> };
>> 
>> -#define VECT_REDUC_INFO_TYPE(I) ((I).fixme ()->reduc_type)
>> -#define VECT_REDUC_INFO_CODE(I) ((I).fixme ()->reduc_code)
>> -#define VECT_REDUC_INFO_FN(I) ((I).fixme ()->reduc_fn)
>> -#define VECT_REDUC_INFO_SCALAR_RESULTS(I) ((I).fixme ()->reduc_scalar_results)
>> -#define VECT_REDUC_INFO_INITIAL_VALUES(I) ((I).fixme ()->reduc_initial_values)
>> -#define VECT_REDUC_INFO_REUSED_ACCUMULATOR(I) ((I).fixme ()->reused_accumulator)
>> -#define VECT_REDUC_INFO_INDUC_COND_INITIAL_VAL(I) ((I).fixme ()->induc_cond_initial_val)
>> -#define VECT_REDUC_INFO_EPILOGUE_ADJUSTMENT(I) ((I).fixme ()->reduc_epilogue_adjustment)
>> -#define VECT_REDUC_INFO_VECTYPE(I) ((I).fixme ()->reduc_vectype)
>> -#define VECT_REDUC_INFO_FORCE_SINGLE_CYCLE(I) ((I).fixme ()->force_single_cycle)
>> -#define VECT_REDUC_INFO_RESULT_POS(I) ((I).fixme ()->reduc_result_pos)
>> +typedef class vect_reduc_info_s *vect_reduc_info;
>> +
>> +#define VECT_REDUC_INFO_DEF_TYPE(I) ((I)->def_type)
>> +#define VECT_REDUC_INFO_TYPE(I) ((I)->reduc_type)
>> +#define VECT_REDUC_INFO_CODE(I) ((I)->reduc_code)
>> +#define VECT_REDUC_INFO_FN(I) ((I)->reduc_fn)
>> +#define VECT_REDUC_INFO_SCALAR_RESULTS(I) ((I)->reduc_scalar_results)
>> +#define VECT_REDUC_INFO_INITIAL_VALUES(I) ((I)->reduc_initial_values)
>> +#define VECT_REDUC_INFO_REUSED_ACCUMULATOR(I) ((I)->reused_accumulator)
>> +#define VECT_REDUC_INFO_INDUC_COND_INITIAL_VAL(I) ((I)->induc_cond_initial_val)
>> +#define VECT_REDUC_INFO_EPILOGUE_ADJUSTMENT(I) ((I)->reduc_epilogue_adjustment)
>> +#define VECT_REDUC_INFO_VECTYPE(I) ((I)->reduc_vectype)
>> +#define VECT_REDUC_INFO_FORCE_SINGLE_CYCLE(I) ((I)->force_single_cycle)
>> +#define VECT_REDUC_INFO_RESULT_POS(I) ((I)->reduc_result_pos)
>> 
>> /* Information about a reduction accumulator from the main loop that could
>>    conceivably be reused as the input to a reduction in an epilogue loop.  */
>> @@ -902,6 +954,10 @@ public:
>>      the main loop, this edge is the one that skips the epilogue.  */
>>   edge skip_this_loop_edge;
>> 
>> +  /* Reduction descriptors of this loop.  Referenced to from SLP nodes
>> +     by index.  */
>> +  auto_vec<vect_reduc_info> reduc_infos;
>> +
>>   /* The vectorized form of a standard reduction replaces the original
>>      scalar code's final result (a loop-closed SSA PHI) with the result
>>      of a vector-to-scalar reduction operation.  After vectorization,
>> @@ -1517,62 +1573,22 @@ public:
>>   /* For both loads and stores.  */
>>   unsigned simd_lane_access_p : 3;
>> 
>> -  /* For INTEGER_INDUC_COND_REDUCTION, the initial value to be used.  */
>> -  tree induc_cond_initial_val;
>> -
>> -  /* If not NULL the value to be added to compute final reduction value.  */
>> -  tree reduc_epilogue_adjustment;
>> -
>>   /* On a reduction PHI the reduction type as detected by
>> -     vect_is_simple_reduction and vectorizable_reduction.  */
>> +     vect_is_simple_reduction.  */
>>   enum vect_reduction_type reduc_type;
>> 
>> -  /* The original reduction code, to be used in the epilogue.  */
>> +  /* On a reduction PHI, the original reduction code as detected by
>> +     vect_is_simple_reduction.  */
>>   code_helper reduc_code;
>> -  /* An internal function we should use in the epilogue.  */
>> -  internal_fn reduc_fn;
>> 
>> -  /* On a stmt participating in the reduction the index of the operand
>> +  /* On a stmt participating in a reduction the index of the operand
>>      on the reduction SSA cycle.  */
>>   int reduc_idx;
>> 
>> -  /* On a reduction PHI the def returned by vect_force_simple_reduction.
>> -     On the def returned by vect_force_simple_reduction the
>> -     corresponding PHI.  */
>> +  /* On a reduction PHI the def returned by vect_is_simple_reduction.
>> +     On the def returned by vect_is_simple_reduction the corresponding PHI.  */
>>   stmt_vec_info reduc_def;
>> 
>> -  /* The vector type for performing the actual reduction.  */
>> -  tree reduc_vectype;
>> -
>> -  /* For loop reduction with multiple vectorized results (ncopies > 1), a
>> -     lane-reducing operation participating in it may not use all of those
>> -     results, this field specifies result index starting from which any
>> -     following land-reducing operation would be assigned to.  */
>> -  unsigned int reduc_result_pos;
>> -
>> -  /* If IS_REDUC_INFO is true and if the vector code is performing
>> -     N scalar reductions in parallel, this variable gives the initial
>> -     scalar values of those N reductions.  */
>> -  vec<tree> reduc_initial_values;
>> -
>> -  /* If IS_REDUC_INFO is true and if the vector code is performing
>> -     N scalar reductions in parallel, this variable gives the vectorized code's
>> -     final (scalar) result for each of those N reductions.  In other words,
>> -     REDUC_SCALAR_RESULTS[I] replaces the original scalar code's loop-closed
>> -     SSA PHI for reduction number I.  */
>> -  vec<tree> reduc_scalar_results;
>> -
>> -  /* Only meaningful if IS_REDUC_INFO.  If non-null, the reduction is
>> -     being performed by an epilogue loop and we have decided to reuse
>> -     this accumulator from the main loop.  */
>> -  vect_reusable_accumulator *reused_accumulator;
>> -
>> -  /* Whether we force a single cycle PHI during reduction vectorization.  */
>> -  bool force_single_cycle;
>> -
>> -  /* Whether on this stmt reduction meta is recorded.  */
>> -  bool is_reduc_info;
>> -
>>   /* If nonzero, the lhs of the statement could be truncated to this
>>      many bits without affecting any users of the result.  */
>>   unsigned int min_output_precision;
>> @@ -2674,8 +2690,7 @@ extern tree vect_gen_loop_len_mask (loop_vec_info,
>> gimple_stmt_iterator *,
>>                    unsigned int, tree, tree, unsigned int,
>>                    unsigned int);
>> extern gimple_seq vect_gen_len (tree, tree, tree, tree);
>> -extern vect_reduc_info info_for_reduction (vec_info *, stmt_vec_info);
>> -extern vect_reduc_info create_info_for_reduction (vec_info *, stmt_vec_info);
>> +extern vect_reduc_info info_for_reduction (loop_vec_info, slp_tree);
>> extern bool reduction_fn_for_scalar_code (code_helper, internal_fn *);
>> 
>> /* Drive for loop transformation stage.  */
>> @@ -2891,7 +2906,14 @@ vect_is_store_elt_extraction (vect_cost_for_stmt
>> kind, stmt_vec_info stmt_info)
>> inline bool
>> vect_is_reduction (stmt_vec_info stmt_info)
>> {
>> -  return STMT_VINFO_REDUC_IDX (stmt_info) >= 0;
>> +  return STMT_VINFO_REDUC_IDX (stmt_info) != -1;
>> +}
>> +
>> +/* Return true if SLP_NODE represents part of a reduction.  */
>> +inline bool
>> +vect_is_reduction (slp_tree slp_node)
>> +{
>> +  return SLP_TREE_REDUC_IDX (slp_node) != -1;
>> }
>> 
>> /* If STMT_INFO describes a reduction, return the vect_reduction_type
>> @@ -2905,7 +2927,7 @@ vect_reduc_type (vec_info *vinfo, slp_tree node)
>>       if (STMT_VINFO_REDUC_DEF (stmt_info))
>>    {
>>      vect_reduc_info reduc_info
>> -        = info_for_reduction (loop_vinfo, stmt_info);
>> +        = info_for_reduction (loop_vinfo, node);
>>      return int (VECT_REDUC_INFO_TYPE (reduc_info));
>>    }
>>     }
>> --
>> 2.43.0
> 
