https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ah, I feared this would happen - this case seems to be because of a lot of
VEC_PERM nodes(?) which are not handled by the CSE process as well as the
two-operator nodes which lack SLP_TREE_SCALAR_STMTS (we'd need NULL elements
there, something I need to add anyway).

The bst_map serves as the "visited" map, but nodes not handled there would need
a separate "visited" set (but as said above, the plan is to reduce that set to zero).

I'll try to reproduce to confirm.  Usually a two-operator node shouldn't
be too bad since the next non-two-operator one will serve as 'visited' point
but in this graph we have several adjacent two-operator nodes without any
intermediate node handled by the bst-map processing code.  I can't reproduce
with -Ofast -march=znver2 though.
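
To illustrate why adjacent nodes that are not recorded as "visited" can blow
up the walk, here is a toy model.  It is not GCC code: Node, Graph, make_chain
and walk are made-up names, and has_key only stands in for "the node has
SLP_TREE_SCALAR_STMTS and so can be found again via the bst_map".  Each node
refers to the next one twice, which is assumed here as a simple way to get
shared sub-graphs.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_set>
#include <utility>
#include <vector>

// Toy stand-in for an SLP node (hypothetical, not GCC's slp_tree).
struct Node {
  bool has_key = false;          // models "can be recorded as visited"
  std::vector<Node*> children;
};

// Owns the nodes of a toy graph and remembers its root.
struct Graph {
  std::vector<std::unique_ptr<Node>> nodes;
  Node* root = nullptr;
};

// Build a chain of `depth` nodes in which every node references the
// next node twice, so each level is shared by two edges.
Graph make_chain(std::size_t depth, bool has_key) {
  Graph g;
  Node* next = nullptr;
  for (std::size_t i = 0; i < depth; ++i) {
    auto n = std::make_unique<Node>();
    n->has_key = has_key;
    if (next)
      n->children = {next, next};
    next = n.get();
    g.nodes.push_back(std::move(n));
  }
  g.root = next;
  return g;
}

// Walk the graph and return how many node visits did real work.
// Only nodes with has_key are remembered in `seen`; all other nodes
// are walked again every time they are reached.
std::size_t walk(Node* n, std::unordered_set<Node*>& seen) {
  if (n->has_key && !seen.insert(n).second)
    return 0;                    // already handled, stop here
  std::size_t visits = 1;
  for (Node* c : n->children)
    visits += walk(c, seen);
  return visits;
}
```

With a chain of 10 nodes, walk does 1023 visits when no node has a key and 10
visits when every node has one, which is the difference between an
exponential and a linear walk in this toy setup.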

The easiest way to remedy the situation is probably to allow VEC_PERM_EXPR
CSE when the node has SLP_TREE_SCALAR_STMTS as two-operator nodes have.
So I _think_ the following should fix this.  I'm going to test it (on x86-64).

Can you check whether that fixes the issue?

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 9465d94de1a..212d5f97f7d 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -6085,7 +6085,6 @@ static void
 vect_cse_slp_nodes (scalar_stmts_to_slp_tree_map_t *bst_map, slp_tree& node)
 {
   if (SLP_TREE_DEF_TYPE (node) == vect_internal_def
-      && SLP_TREE_CODE (node) != VEC_PERM_EXPR
       /* Besides some VEC_PERM_EXPR, two-operator nodes also
         lack scalar stmts and thus CSE doesn't work via bst_map.  Ideally
         we'd have sth that works for all internal and external nodes.  */
