https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117929

            Bug ID: 117929
           Summary: SLP permute optimization does not take into account a
                    load/lane permute when verifying a layout change
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

Split out from PR117714.

It seems that vect_optimize_slp_pass::change_layout_cost, when trying to
transition the following load from layout 1 to layout 0:

slp-reduc-4.c:13:5: note: node 0x36cd760 (max_nunits=2, refcnt=1) vector(2) unsigned int
slp-reduc-4.c:13:5: note: op template: _9 = uc[_8];
slp-reduc-4.c:13:5: note:     stmt 0 _9 = uc[_8];
slp-reduc-4.c:13:5: note:     stmt 1 _11 = uc[_10];
slp-reduc-4.c:13:5: note:     stmt 2 _16 = uc[_15];
slp-reduc-4.c:13:5: note:     stmt 3 _14 = uc[_13];
slp-reduc-4.c:13:5: note:     stmt 4 _5 = uc[_4];
slp-reduc-4.c:13:5: note:     stmt 5 _3 = uc[_2];
slp-reduc-4.c:13:5: note:     stmt 6 _7 = uc[_6];
slp-reduc-4.c:13:5: note:     stmt 7 _12 = uc[_1];
slp-reduc-4.c:13:5: note:     load permutation { 7 6 5 4 3 2 1 0 }

asks whether the target can do a { 7 6 5 4 3 2 1 0 } permute (which SPARC
cannot do).  But it misses the fact that this permute would be merged
with the load permutation, cancelling it out?
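
To illustrate the cancellation, here is a minimal standalone sketch (not GCC
code).  It assumes layout 1 is the reversal { 7 6 5 4 3 2 1 0 } and that a
forward permute computes result[i] = values[perm[i]]; the helper names
apply_forward_permute, is_identity and merged_permute_is_identity are
hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a forward permute: result[i] = values[perm[i]].
std::vector<unsigned>
apply_forward_permute (const std::vector<unsigned> &perm,
                       const std::vector<unsigned> &values)
{
  assert (perm.size () == values.size ());
  std::vector<unsigned> result (values.size ());
  for (std::size_t i = 0; i < values.size (); ++i)
    result[i] = values[perm[i]];
  return result;
}

// True if PERM maps every lane to itself, i.e. no permute is needed.
bool
is_identity (const std::vector<unsigned> &perm)
{
  for (std::size_t i = 0; i < perm.size (); ++i)
    if (perm[i] != i)
      return false;
  return true;
}

// Assumption: layout 1 is the lane reversal.  The load permutation is the
// one shown in the dump above.
bool
merged_permute_is_identity ()
{
  const std::vector<unsigned> layout_perm = { 7, 6, 5, 4, 3, 2, 1, 0 };
  const std::vector<unsigned> load_perm = { 7, 6, 5, 4, 3, 2, 1, 0 };
  // Merge the layout change into the load permutation before asking
  // whether any permute remains.
  return is_identity (apply_forward_permute (layout_perm, load_perm));
}
```

Under these assumptions the merged permutation is the identity, so checking
the reversal on its own asks the target about a permute that would not be
emitted.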

In fact, with the following change (not entirely sure whether the
vect_slp_permute call should permute in the forward direction...), we
correctly reject layout 0 (given that the load permutation isn't supported)
but accept layout 1 (no permute needed).

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 9ad95104ec7..f870206b585 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -6003,7 +6003,15 @@ vect_optimize_slp_pass::change_layout_cost (slp_tree node,
   auto_vec<slp_tree, 1> children (1);
   children.quick_push (node);
   auto_lane_permutation_t perm (SLP_TREE_LANES (node));
-  if (from_layout_i > 0)
+  if (SLP_TREE_LOAD_PERMUTATION (node).exists () && from_layout_i > 0)
+    {
+      auto_load_permutation_t tmp_perm;
+      tmp_perm.safe_splice (SLP_TREE_LOAD_PERMUTATION (node));
+      vect_slp_permute (m_perms[from_layout_i], tmp_perm, false);
+      for (unsigned int i : tmp_perm)
+       perm.quick_push ({ 0, i });
+    }
+  else if (from_layout_i > 0)
     for (unsigned int i : m_perms[from_layout_i])
       perm.quick_push ({ 0, i });
   else

I think a similar issue would exist when the node to change is a
VEC_PERM_NODE with a lane permutation.

In general we seem to assume that layout 0 is OK wrt permutes?
