On Wed, May 17, 2023 at 9:19 AM Kewen.Lin <[email protected]> wrote:
>
> Hi Richi,
>
> on 2023/5/17 14:34, Richard Biener wrote:
> > On Wed, May 17, 2023 at 8:09 AM Kewen.Lin <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> This patch is to refactor the handlings for the case (index
> >> == count) in a loop of vect_transform_slp_perm_load_1, in
> >> order to prepare a subsequent adjustment on *nperm. This
> >> patch doesn't have any functional changes.
> >
> > The diff is impossible to be reviewed - can you explain the
> > refactoring you have done or also attach a patch more clearly
> > showing what you change?
>
> Sorry, I should have made it more clear.
> It mainly to combine these two hunks:
>
> if (index == count && !noop_p)
> {
> // A ...
> // ++*n_perms;
> }
>
> if (index == count)
> {
> if (!analyze_only)
> {
> if (!noop_p)
> // B1 ...
>
> // B2 ...
>
> for ...
> {
> if (!noop_p)
> // B3 building VEC_PERM_EXPR
> else
> // B4 building nothing (no uses for B2 and its seq)
> }
> }
> // B5
> }
>
> The former can be part of the latter, so it becomes to:
>
> if (index == count)
> {
> if (!noop_p)
> {
> // A ...
> // ++*n_perms;
>
> if (!analyze_only)
> {
> // B1 ...
> // B2 ...
> for ...
> // B3 building VEC_PERM_EXPR
> }
> }
> else if (!analyze_only)
> {
> // no B2 since no any further uses here.
> for ...
> // B4 building nothing
> }
> // B5 ...
> }
Ah, thanks - that made reviewing easy. 1/2 is OK for trunk.
Thanks,
Richard.
> But it's mainly the basic for the subsequent patch for consistent n_perms
> calculation,
> the patch 2/2 is to make it further become to:
>
> if (index == count)
> {
> if (!noop_p)
> {
> // A ...
>
> if (!analyze_only)
> // B1 ...
>
> // B2 ... (trivial computations during analyze_only or not)
>
> for ...
> {
> // ++*n_perms; (now n_perms is consistent with building
> VEC_PERM_EXPR)
> if (analyze_only)
> continue;
> // B3 building VEC_PERM_EXPR
> }
> }
> else if (!analyze_only)
> {
> // no B2 since no any further uses here.
> for ...
> // B4 building nothing
> }
> // B5 ...
> }
>
> BR,
> Kewen
>
>
> >
> >> Bootstrapped and regtested on x86_64-redhat-linux,
> >> aarch64-linux-gnu and powerpc64{,le}-linux-gnu.
> >>
> >> BR,
> >> Kewen
> >> -----
> >> gcc/ChangeLog:
> >>
> >> * tree-vect-slp.cc (vect_transform_slp_perm_load_1): Refactor the
> >> handling on the case index == count.
> >> ---
> >> gcc/tree-vect-slp.cc | 89 ++++++++++++++++++++++----------------------
> >> 1 file changed, 44 insertions(+), 45 deletions(-)
> >>
> >> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> >> index 3b7a21724ec..e5c9d7e766e 100644
> >> --- a/gcc/tree-vect-slp.cc
> >> +++ b/gcc/tree-vect-slp.cc
> >> @@ -8230,59 +8230,50 @@ vect_transform_slp_perm_load_1 (vec_info *vinfo,
> >> slp_tree node,
> >> noop_p = false;
> >> mask[index++] = mask_element;
> >>
> >> - if (index == count && !noop_p)
> >> + if (index == count)
> >> {
> >> - indices.new_vector (mask, second_vec_index == -1 ? 1 : 2,
> >> nunits);
> >> - if (!can_vec_perm_const_p (mode, mode, indices))
> >> + if (!noop_p)
> >> {
> >> - if (dump_p)
> >> + indices.new_vector (mask, second_vec_index == -1 ? 1 : 2,
> >> nunits);
> >> + if (!can_vec_perm_const_p (mode, mode, indices))
> >> {
> >> - dump_printf_loc (MSG_MISSED_OPTIMIZATION,
> >> - vect_location,
> >> - "unsupported vect permute { ");
> >> - for (i = 0; i < count; ++i)
> >> + if (dump_p)
> >> {
> >> - dump_dec (MSG_MISSED_OPTIMIZATION, mask[i]);
> >> - dump_printf (MSG_MISSED_OPTIMIZATION, " ");
> >> + dump_printf_loc (MSG_MISSED_OPTIMIZATION,
> >> vect_location,
> >> + "unsupported vect permute { ");
> >> + for (i = 0; i < count; ++i)
> >> + {
> >> + dump_dec (MSG_MISSED_OPTIMIZATION, mask[i]);
> >> + dump_printf (MSG_MISSED_OPTIMIZATION, " ");
> >> + }
> >> + dump_printf (MSG_MISSED_OPTIMIZATION, "}\n");
> >> }
> >> - dump_printf (MSG_MISSED_OPTIMIZATION, "}\n");
> >> + gcc_assert (analyze_only);
> >> + return false;
> >> }
> >> - gcc_assert (analyze_only);
> >> - return false;
> >> - }
> >>
> >> - ++*n_perms;
> >> - }
> >> + ++*n_perms;
> >>
> >> - if (index == count)
> >> - {
> >> - if (!analyze_only)
> >> - {
> >> - tree mask_vec = NULL_TREE;
> >> -
> >> - if (! noop_p)
> >> - mask_vec = vect_gen_perm_mask_checked (vectype, indices);
> >> + if (!analyze_only)
> >> + {
> >> + tree mask_vec = vect_gen_perm_mask_checked (vectype,
> >> indices);
> >>
> >> - if (second_vec_index == -1)
> >> - second_vec_index = first_vec_index;
> >> + if (second_vec_index == -1)
> >> + second_vec_index = first_vec_index;
> >>
> >> - for (unsigned int ri = 0; ri < nvectors_per_build; ++ri)
> >> - {
> >> - /* Generate the permute statement if necessary. */
> >> - tree first_vec = dr_chain[first_vec_index + ri];
> >> - tree second_vec = dr_chain[second_vec_index + ri];
> >> - gimple *perm_stmt;
> >> - if (! noop_p)
> >> + for (unsigned int ri = 0; ri < nvectors_per_build; ++ri)
> >> {
> >> - gassign *stmt = as_a <gassign *> (stmt_info->stmt);
> >> + /* Generate the permute statement if necessary. */
> >> + tree first_vec = dr_chain[first_vec_index + ri];
> >> + tree second_vec = dr_chain[second_vec_index + ri];
> >> + gassign *stmt = as_a<gassign *> (stmt_info->stmt);
> >> tree perm_dest
> >> = vect_create_destination_var (gimple_assign_lhs
> >> (stmt),
> >> vectype);
> >> perm_dest = make_ssa_name (perm_dest);
> >> - perm_stmt
> >> + gimple *perm_stmt
> >> = gimple_build_assign (perm_dest, VEC_PERM_EXPR,
> >> - first_vec, second_vec,
> >> - mask_vec);
> >> + first_vec, second_vec,
> >> mask_vec);
> >> vect_finish_stmt_generation (vinfo, stmt_info,
> >> perm_stmt,
> >> gsi);
> >> if (dce_chain)
> >> @@ -8290,15 +8281,23 @@ vect_transform_slp_perm_load_1 (vec_info *vinfo,
> >> slp_tree node,
> >> bitmap_set_bit (used_defs, first_vec_index + ri);
> >> bitmap_set_bit (used_defs, second_vec_index +
> >> ri);
> >> }
> >> +
> >> + /* Store the vector statement in NODE. */
> >> + SLP_TREE_VEC_STMTS (node) [vect_stmts_counter++]
> >> + = perm_stmt;
> >> }
> >> - else
> >> - {
> >> - /* If mask was NULL_TREE generate the requested
> >> - identity transform. */
> >> - perm_stmt = SSA_NAME_DEF_STMT (first_vec);
> >> - if (dce_chain)
> >> - bitmap_set_bit (used_defs, first_vec_index + ri);
> >> - }
> >> + }
> >> + }
> >> + else if (!analyze_only)
> >> + {
> >> + for (unsigned int ri = 0; ri < nvectors_per_build; ++ri)
> >> + {
> >> + tree first_vec = dr_chain[first_vec_index + ri];
> >> + /* If mask was NULL_TREE generate the requested
> >> + identity transform. */
> >> + gimple *perm_stmt = SSA_NAME_DEF_STMT (first_vec);
> >> + if (dce_chain)
> >> + bitmap_set_bit (used_defs, first_vec_index + ri);
> >>
> >> /* Store the vector statement in NODE. */
> >> SLP_TREE_VEC_STMTS (node)[vect_stmts_counter++] =
> >> perm_stmt;
> >> --
> >> 2.39.1
>
>