Hi All,
This is a respin containing the requested changes.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* tree-vect-slp-patterns.c (vect_match_expression_p,
vect_check_lane_permute, vect_detect_pair_op,
vect_mark_stmts_as_in_pattern, class complex_pattern,
complex_pattern::validate_p, class complex_operations_pattern,
complex_operations_pattern::matches,
complex_operations_pattern::get_name,
complex_operations_pattern::build,
complex_operations_pattern::validate_p,
complex_operations_pattern::get_arity,
complex_operations_pattern::is_optab_supported_p,
complex_operations_pattern::get_ifn): New.
(slp_patterns): Add complex_operations_pattern.
> -----Original Message-----
> From: Gcc-patches <[email protected]> On Behalf Of Tamar
> Christina
> Sent: Monday, September 28, 2020 5:06 PM
> To: Richard Biener <[email protected]>
> Cc: nd <[email protected]>; [email protected]; [email protected]
> Subject: RE: [PATCH v2 5/16]middle-end: Add shared machinery for matching
> patterns involving complex numbers.
>
> Hi Richi,
>
> > -----Original Message-----
> > From: [email protected] <[email protected]> On
> > Behalf Of Richard Biener
> > Sent: Monday, September 28, 2020 2:22 PM
> > To: Tamar Christina <[email protected]>
> > Cc: [email protected]; nd <[email protected]>; [email protected]
> > Subject: Re: [PATCH v2 5/16]middle-end: Add shared machinery for
> > matching patterns involving complex numbers.
> >
> > On Fri, 25 Sep 2020, Tamar Christina wrote:
> >
> > > Hi All,
> > >
> > > This patch adds shared machinery for detecting patterns having to do
> > > with complex number operations. The class ComplexPattern provides
> > > helpers for matching and ultimately undoing the permutation in the
> > > tree by rebuilding the graph.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> >
> > I think you want to change all this to not look at individual
> > stmts:
> >
> > + vect_match_expression_p (slp_tree node, tree_code code, int base,
> > + int
> > idx,
> > + stmt_vec_info *op1, stmt_vec_info *op2)
> > + {
> > +
> > + vec<stmt_vec_info> scalar_stmts = SLP_TREE_SCALAR_STMTS (node);
> > +
> >
> > _all_ lanes are supposed to match the operation in
> > SLP_TREE_REPRESENTATIVE there's no need to do any operation-matching
> > on lane granularity.
> >
>
> That's fair, that flexibility seems like it indeed won't work since the
> statements are vectorized based on SLP_TREE_REPRESENTATIVE anyway. So
> I'll simplify it.
>
> > The only thing making a difference here is VEC_PERM_EXPR nodes where
> > again - there's no need to look at (eventually non-existant!)
> > SLP_TREE_SCALAR_STMTS but its SLP_TREE_REPRESENTATIVE.
> >
> > + vec<slp_tree> children = SLP_TREE_CHILDREN (node);
> > +
> > + /* If it's a VEC_PERM_EXPR we need to look one deeper.
> > VEC_PERM_EXPR
> > + only have one entry. So pick on. */
> > + if (node->code == VEC_PERM_EXPR)
> > + children = SLP_TREE_CHILDREN (children.last ());
> >
> > that's too simplistic ;)
> >
> > + *op1 = SLP_TREE_SCALAR_STMTS (children[0])[n];
> >
> > please make op1,op2 slp_tree, not stmt_vec_info.
> >
> > Where do you look at SLP_TREE_LANE_PERMUTATION at all? I think the
> > result of
> >
>
> Here I have to admit that I have/am a bit confused as to the relation
> between the different permute fields.
> LOAD permute is quite straight forward, LANE permute I think are
> shuffles/explicit permutes.
>
> But then I am lost as to the purpose of a VEC_PERM_EXPR nodes. Is it just a
> marker to indicate that some node below has a load permute somewhere?
>
> > + vect_detect_pair_op (int base, slp_tree node1, int offset1,
> > + slp_tree
> > node2,
> > + int offset2, vec<stmt_vec_info> *ops)
> >
> > could be simply the SLP_TREE_LANE_PERMUTATION? plus its two child
> > nodes?
>
> Right, if I understood correctly, on the two_operands case the lane permute
> would tell me whether it's + or -, and in the case of the non- two_operands
> cases I probably want to check that it's vNULL since any permute in the order
> changes how the instruction accepts the inputs?
>
> >
> > In the ComplexAddPattern patch I see
> >
> > + /* Correct the arguments after matching. */
> > + std::swap (this->m_vects[1], this->m_vects[3]);
> >
> > how's that necessary? The replacement SLP node should end up with
> > just a SLP_TREE_REPRESENTATIVE (the IFN function call).
> > That is, the only thing necessary is verification / matching of the
> > appropriate "interleaving" scheme imposed by
> SLP_TREE_LANE_PERMUTATION.
>
> But the order or the statements are important as those decide the
> LOAD_PERMUTATION that build_slp_tree creates.
>
> So in this case the original statement is
>
> stmt 0 _39 = _37 + _12;
> stmt 1 _6 = _38 - _36;
>
> {_12, _36} result in a LOAD_PERMUTATION of {1, 0} because of how the
> addition is done.
> So to undo the LOAD_PERMUTE it has to build the new children using
>
> stmt 0 _39 = {_37, _36};
> stmt 1 _6 = {_38, _12};
>
> So the scalar statements are purely a re-ordering to get it to rebuild the
> children correctly.
>
> Now ADD is simplistic, the more complicated cases like MUL and FMA use this
> property to Also rebuild the tree removing unneeded statements. This
> method returns these and stores them in the PatternMatch class, so I don't
> have to ask for them again later when replacing the nodes. Even for
> SLP_TREE_REPRESENTATIVE don't the arguments in the IFN call need to be in
> such way that they both in the same place in the load chain?
>
> > I guess things would go wrong if instead of effectively blending two
> > vectors we'd interleave two smaller vector type vectors? Say a 4-way
> > add and a 4- way sub interleaved into a 8-way addsub, using V4SF and V8SF
> vectors?
> >
>
> At this stage yes, most likely, but it should be rejected at the validate
> level.
>
> What is also currently rejected is when some of the definitions are external,
> which I think I should be able to handle. I'll fix that up with the rest of
> the
> changes.
>
>
> Thanks for the feedback!
>
> Regards,
> Tamar
>
> > Going to stop looking at the series at this point, one further note is
> > that all of the Complex*Patterns seem to share the initial match and
> > thus redundantly do a vect_detect_pair_op repeatedly on each node for
> > each pattern? I wonder if it could be commoned into a single pattern,
> > they all seem to share the initial mixed plus/minus, then we have the
> > multiplication on one or both operand cases.
> >
> > Richard.
> >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > * tree-vect-slp-patterns.c (complex_operation_t,class
> > ComplexPattern):
> > > New.
> > >
> > >
> >
> > --
> > Richard Biener <[email protected]>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > Nuernberg, Germany; GF: Felix Imend
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index b81cd6856ad4db5a0cda48ca214389f007e060da..65044a77d55e24cde6c663e81c11b66ad9650056 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -136,6 +136,27 @@ along with GCC; see the file COPYING3. If not see
To add a new pattern, implement the vect_pattern class and add the type to
slp_patterns. */
+/*******************************************************************************
+ * General helper types
+ ******************************************************************************/
+
+/* The COMPLEX_OPERATION enum denotes the possible pair of operations that can
+ be matched when looking for expressions that we are interested matching for
+ complex numbers addition and mla. */
+
+typedef enum _complex_operation : unsigned {
+ PLUS_PLUS,
+ MINUS_PLUS,
+ PLUS_MINUS,
+ MULT_MULT,
+ NEG_NEG,
+ CMPLX_NONE
+} complex_operation_t;
+
+/* A simple pair type used for ordering of combine operations. */
+
+typedef struct map_ { int a; int b; } map_t;
+
/*******************************************************************************
* Simple vector pattern matcher
******************************************************************************/
@@ -276,6 +297,348 @@ vect_simple_pattern_match::build ()
return call_stmt;
}
+/* Checks to see of the expression EXPR is a gimple assign with code CODE
+ and if this is the case the two operands of EXPR is returned in OP1 and
+ OP2.
+
+ If the matching and extraction is successful TRUE is returned otherwise
+ FALSE in which case the value of OP1 and OP2 will not have been touched.
+*/
+
+static bool
+vect_match_expression_p (slp_tree node, tree_code code)
+{
+ if (!SLP_TREE_REPRESENTATIVE (node))
+ return false;
+
+ gimple* expr = STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (node));
+ if (!is_gimple_assign (expr)
+ || gimple_expr_code (expr) != code)
+ return false;
+
+ return true;
+}
+
+/* Check if the given lane permute in PERMUTES matches an alternating sequence
+ of {P0 P1 P0 P1 ...}. This to account for unrolled loops. */
+static bool
+vect_check_lane_permute (lane_permutation_t &permutes,
+ unsigned p0, unsigned p1)
+{
+ unsigned val[2] = {p0, p1};
+ for (unsigned i = 0; i < permutes.length (); i++)
+ if (permutes[i].first != val[i % 2])
+ return false;
+
+ return true;
+}
+
+/* This function will match two gimple expressions STMT_0 and STMT_1 in
+ parallel and returns the pair operation that represents the two
+ expressions in the two statements. The statements are located in NODE1
+ and NODE2 at offset base + offset1 and base + offset2 respectively.
+
+ If match is successful then the corresponding complex_operation is
+ returned and the arguments to the two matched operations are returned in OPS.
+
+ If unsuccessful then CMPLX_NONE is returned and OPS is untouched.
+
+ e.g. the following gimple statements
+
+ stmt 0 _39 = _37 + _12;
+ stmt 1 _6 = _38 - _36;
+
+ will return PLUS_MINUS along with OPS containing {_37, _12, _38, _36}.
+*/
+
+static complex_operation_t
+vect_detect_pair_op (slp_tree node1, slp_tree node2, lane_permutation_t &lanes,
+ bool two_operands = true, vec<slp_tree> *ops = NULL)
+{
+ complex_operation_t result = CMPLX_NONE;
+#define CHECK_FOR(x, y) \
+ (vect_match_expression_p (node1, x) \
+ && vect_match_expression_p (node2, y))
+
+ if (CHECK_FOR (MINUS_EXPR, PLUS_EXPR)
+ && (!two_operands || vect_check_lane_permute (lanes, 0, 1)))
+ result = MINUS_PLUS;
+ else if (CHECK_FOR (PLUS_EXPR, MINUS_EXPR)
+ && (!two_operands || vect_check_lane_permute (lanes, 0, 1)))
+ result = PLUS_MINUS;
+ else if (CHECK_FOR (PLUS_EXPR, PLUS_EXPR))
+ result = PLUS_PLUS;
+ else if (CHECK_FOR (MULT_EXPR, MULT_EXPR))
+ result = MULT_MULT;
+ else if (CHECK_FOR (NEGATE_EXPR, NEGATE_EXPR))
+ result = NEG_NEG;
+#undef CHECK_FOR
+
+ if (result != CMPLX_NONE && ops != NULL)
+ {
+ ops->create (2);
+ ops->quick_push (node1);
+ ops->quick_push (node2);
+ }
+ return result;
+}
+
+/* Overload of vect_detect_pair_op where the statements are assumed to be
+ one after the other. This inspects node[base] and node[base+1]. */
+
+static complex_operation_t
+vect_detect_pair_op (slp_tree node, bool two_operands = true,
+ vec<slp_tree> *ops = NULL)
+{
+ if (!two_operands && SLP_TREE_CODE (node) == VEC_PERM_EXPR)
+ return CMPLX_NONE;
+
+ if (SLP_TREE_CHILDREN (node).length () != 2)
+ return CMPLX_NONE;
+
+ vec<slp_tree> children = SLP_TREE_CHILDREN (node);
+ lane_permutation_t &lanes = SLP_TREE_LANE_PERMUTATION (node);
+
+ return vect_detect_pair_op (children[0], children[1], lanes, two_operands,
+ ops);
+}
+
+/* This function marks every statement that is being replaced during the
+ the pattern matching as PURE. Normally when replacing a statement due
+ to a pattern we add the statement to the STMT_VINFO_PATTERN_DEF_SEQ of
+ the pattern that is replacing them. In this case however this won't
+ work as when doing the replacement we are changing the nodes that are
+ used by the statements. This means that when vectorized the SSA chain
+ is different than in the BB.
+
+ Declaring the statements as part of the sequence will then cause SSA
+ verification to fail as we may refer to statements that were not in the
+ original USE-DEF chain of the statement we are replacing.
+
+ The downside of this approach is that the statements will still be
+ seen as relevant and so we will still generate code for them and they
+ will be in the output, unconnected until DSE. We could mark them as
+ irrelevant, but that is only safe if there are no more uses of the node
+ in the SLP graph (So perhaps this should be done in free_slp_tree
+ instead of here. */
+
+static void
+vect_mark_stmts_as_in_pattern (hash_set<slp_tree> *cache, slp_tree node)
+{
+ if (cache->contains (node))
+ return;
+
+ unsigned i;
+ stmt_vec_info stmt_info;
+ slp_tree child;
+
+ cache->add (node);
+
+ FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
+ {
+ if (gimple_assign_load_p (STMT_VINFO_STMT (stmt_info)))
+ return;
+
+ STMT_SLP_TYPE (stmt_info) = pure_slp;
+ }
+
+ FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
+ vect_mark_stmts_as_in_pattern (cache, child);
+}
+
+
+/*******************************************************************************
+ * complex_pattern class
+ ******************************************************************************/
+
+/* The complex_pattern class contains common code for pattern matchers that work
+ on complex numbers. These provide functionality to allow de-construction and
+ validation of sequences depicting/transforming REAL and IMAG pairs. */
+
+class complex_pattern : public vect_pattern
+{
+ protected:
+ complex_pattern (slp_tree *node, vec_info *vinfo)
+ : vect_pattern (node, vinfo)
+ { }
+
+ /* Create and store a new vect_pattern_match object with the current match
+ that was found. */
+
+ void save_match ()
+ {
+ this->m_match
+ = new vect_simple_pattern_match (this->m_arity, this->m_ifn,
+ this->m_vinfo,
+ &SLP_TREE_CHILDREN (*this->m_node),
+ this->m_num_args);
+ }
+
+ public:
+ bool validate_p ();
+};
+
+/* The post transform and validation function for the complex number
+ patterns. This will re-arrange the tree and re-organize the nodes such
+ that they can be used by the complex number instructions that are to be
+ created. It does this by doing the following steps:
+
+ For each node that isn't already linear a new permute node is created which
+ when applied makes the node linear. If such permutation does not exist then
+ the function returns FALSE. For nodes that are already linear no work is
+ done. */
+
+bool
+complex_pattern::validate_p ()
+{
+ if (!this->m_match)
+ return false;
+
+ slp_tree node;
+ unsigned ix;
+ FOR_EACH_VEC_ELT (this->m_ops, ix, node)
+ {
+ auto_vec<slp_tree> nodes;
+ nodes.create (this->m_num_args);
+ slp_tree tmp = NULL;
+
+ unsigned i;
+ FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, tmp)
+ {
+ slp_tree vnode = NULL;
+ if (vect_slp_make_linear (node, tmp, &vnode))
+ nodes.quick_push (vnode);
+ else
+ {
+ nodes.release();
+ return false;
+ }
+ }
+
+ slp_tree new_node
+ = vect_create_new_slp_node (SLP_TREE_SCALAR_STMTS (node),
+ SLP_TREE_CHILDREN (node).length ());
+ SLP_TREE_VECTYPE (new_node) = SLP_TREE_VECTYPE (node);
+ SLP_TREE_LANE_PERMUTATION (new_node) = SLP_TREE_LANE_PERMUTATION (node);
+ SLP_TREE_CODE (new_node) = SLP_TREE_CODE (node);
+ SLP_TREE_REF_COUNT (new_node) = SLP_TREE_REF_COUNT (node);
+ SLP_TREE_REPRESENTATIVE (new_node) = SLP_TREE_REPRESENTATIVE (node);
+ SLP_TREE_CHILDREN (new_node).safe_splice (nodes);
+ SLP_TREE_LANES (new_node) = SLP_TREE_LANES (node);
+
+ SLP_TREE_CHILDREN (*this->m_node)[ix] = new_node;
+ }
+
+ return true;
+}
+
+/*******************************************************************************
+ * complex_operations_pattern class
+ ******************************************************************************/
+
+/* This function combines all the existing pattern matchers above into one class
+ that shares the functionality between them. */
+
+class complex_operations_pattern : public complex_pattern
+{
+ protected:
+ complex_operations_pattern (slp_tree *node, vec_info *vinfo)
+ : complex_pattern (node, vinfo)
+ { }
+
+ vect_pattern *m_patt = NULL;
+
+ public:
+ using complex_pattern::matches;
+ bool matches ();
+ const char* get_name ();
+ gcall *build ();
+ bool validate_p ();
+ uint8_t get_arity ();
+ bool is_optab_supported_p (tree, optimization_type);
+ internal_fn get_ifn ();
+
+ ~complex_operations_pattern ()
+ {
+ if (this->m_patt)
+ delete this->m_patt;
+ this->m_patt = NULL;
+ }
+
+ static vect_pattern* create (slp_tree *node, vec_info *vinfo)
+ {
+ return new complex_operations_pattern (node, vinfo);
+ }
+};
+
+bool
+complex_operations_pattern::matches ()
+{
+ complex_operation_t op
+ = vect_detect_pair_op (*this->m_node, true, &this->m_ops);
+
+ if (op == CMPLX_NONE)
+ return false;
+
+ this->m_patt = NULL;
+ return false;
+}
+
+const char*
+complex_operations_pattern::get_name ()
+{
+ if (!this->m_patt)
+ return "Complex Operations";
+
+ return this->m_patt->get_name ();
+}
+
+gcall *
+complex_operations_pattern::build ()
+{
+ if (!this->m_patt)
+ return NULL;
+
+ return this->m_patt->build ();
+}
+
+bool
+complex_operations_pattern::validate_p ()
+{
+ if (!this->m_patt)
+ return false;
+
+ return this->m_patt->validate_p ();
+}
+
+uint8_t
+complex_operations_pattern::get_arity ()
+{
+ if (!this->m_patt)
+ return 0;
+
+ return this->m_patt->get_arity ();
+}
+
+bool
+complex_operations_pattern::is_optab_supported_p (tree vectype,
+ optimization_type opt_type)
+{
+ if (!vectype || !this->m_patt)
+ return false;
+
+ return this->m_patt->is_optab_supported_p (vectype, opt_type);
+}
+
+internal_fn
+complex_operations_pattern::get_ifn ()
+{
+ if (!this->m_patt)
+ return IFN_LAST;
+
+ return this->m_patt->get_ifn ();
+}
+
/*******************************************************************************
* Pattern matching definitions
******************************************************************************/
@@ -287,6 +650,7 @@ vect_pattern_decl_t slp_patterns[]
order patterns from the largest to the smallest. Especially if they
overlap in what they can detect. */
+ SLP_PATTERN (complex_operations_pattern),
};
#undef SLP_PATTERN