Hi All, This is a respin containing the requested changes.
Ok for master? Thanks, Tamar gcc/ChangeLog: * tree-vect-slp-patterns.c (vect_match_expression_p, vect_check_lane_permute, vect_detect_pair_op, vect_mark_stmts_as_in_pattern, class complex_pattern, complex_pattern::validate_p, class complex_operations_pattern, complex_operations_pattern::matches, complex_operations_pattern::get_name, complex_operations_pattern::build, complex_operations_pattern::validate_p, complex_operations_pattern::get_arity, complex_operations_pattern::is_optab_supported_p, complex_operations_pattern::get_ifn): New. (slp_patterns): Add complex_operations_pattern. > -----Original Message----- > From: Gcc-patches <gcc-patches-boun...@gcc.gnu.org> On Behalf Of Tamar > Christina > Sent: Monday, September 28, 2020 5:06 PM > To: Richard Biener <rguent...@suse.de> > Cc: nd <n...@arm.com>; gcc-patches@gcc.gnu.org; o...@ucw.cz > Subject: RE: [PATCH v2 5/16]middle-end: Add shared machinery for matching > patterns involving complex numbers. > > Hi Richi, > > > -----Original Message----- > > From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de> On > > Behalf Of Richard Biener > > Sent: Monday, September 28, 2020 2:22 PM > > To: Tamar Christina <tamar.christ...@arm.com> > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz > > Subject: Re: [PATCH v2 5/16]middle-end: Add shared machinery for > > matching patterns involving complex numbers. > > > > On Fri, 25 Sep 2020, Tamar Christina wrote: > > > > > Hi All, > > > > > > This patch adds shared machinery for detecting patterns having to do > > > with complex number operations. The class ComplexPattern provides > > > helpers for matching and ultimately undoing the permutation in the > > > tree by rebuilding the graph. > > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > > > > > Ok for master? > > > > I think you want to change all this to not look at individual > > stmts: > > > > + vect_match_expression_p (slp_tree node, tree_code code, int base, > > + int > > idx, > > + stmt_vec_info *op1, stmt_vec_info *op2) > > + { > > + > > + vec<stmt_vec_info> scalar_stmts = SLP_TREE_SCALAR_STMTS (node); > > + > > > > _all_ lanes are supposed to match the operation in > > SLP_TREE_REPRESENTATIVE there's no need to do any operation-matching > > on lane granularity. > > > > That's fair, that flexibility seems like it indeed won't work since the > statements are vectorized based on SLP_TREE_REPRESENTATIVE anyway. So > I'll simplify it. > > > The only thing making a difference here is VEC_PERM_EXPR nodes where > > again - there's no need to look at (eventually non-existant!) > > SLP_TREE_SCALAR_STMTS but its SLP_TREE_REPRESENTATIVE. > > > > + vec<slp_tree> children = SLP_TREE_CHILDREN (node); > > + > > + /* If it's a VEC_PERM_EXPR we need to look one deeper. > > VEC_PERM_EXPR > > + only have one entry. So pick on. */ > > + if (node->code == VEC_PERM_EXPR) > > + children = SLP_TREE_CHILDREN (children.last ()); > > > > that's too simplistic ;) > > > > + *op1 = SLP_TREE_SCALAR_STMTS (children[0])[n]; > > > > please make op1,op2 slp_tree, not stmt_vec_info. > > > > Where do you look at SLP_TREE_LANE_PERMUTATION at all? I think the > > result of > > > > Here I have to admit that I have/am a bit confused as to the relation > between the different permute fields. > LOAD permute is quite straight forward, LANE permute I think are > shuffles/explicit permutes. > > But then I am lost as to the purpose of a VEC_PERM_EXPR nodes. Is it just a > marker to indicate that some node below has a load permute somewhere? > > > + vect_detect_pair_op (int base, slp_tree node1, int offset1, > > + slp_tree > > node2, > > + int offset2, vec<stmt_vec_info> *ops) > > > > could be simply the SLP_TREE_LANE_PERMUTATION? plus its two child > > nodes? > > Right, if I understood correctly, on the two_operands case the lane permute > would tell me whether it's + or -, and in the case of the non- two_operands > cases I probably want to check that it's vNULL since any permute in the order > changes how the instruction accepts the inputs? > > > > > In the ComplexAddPattern patch I see > > > > + /* Correct the arguments after matching. */ > > + std::swap (this->m_vects[1], this->m_vects[3]); > > > > how's that necessary? The replacement SLP node should end up with > > just a SLP_TREE_REPRESENTATIVE (the IFN function call). > > That is, the only thing necessary is verification / matching of the > > appropriate "interleaving" scheme imposed by > SLP_TREE_LANE_PERMUTATION. > > But the order or the statements are important as those decide the > LOAD_PERMUTATION that build_slp_tree creates. > > So in this case the original statement is > > stmt 0 _39 = _37 + _12; > stmt 1 _6 = _38 - _36; > > {_12, _36} result in a LOAD_PERMUTATION of {1, 0} because of how the > addition is done. > So to undo the LOAD_PERMUTE it has to build the new children using > > stmt 0 _39 = {_37, _36}; > stmt 1 _6 = {_38, _12}; > > So the scalar statements are purely a re-ordering to get it to rebuild the > children correctly. > > Now ADD is simplistic, the more complicated cases like MUL and FMA use this > property to Also rebuild the tree removing unneeded statements. This > method returns these and stores them in the PatternMatch class, so I don't > have to ask for them again later when replacing the nodes. Even for > SLP_TREE_REPRESENTATIVE don't the arguments in the IFN call need to be in > such way that they both in the same place in the load chain? > > > I guess things would go wrong if instead of effectively blending two > > vectors we'd interleave two smaller vector type vectors? Say a 4-way > > add and a 4- way sub interleaved into a 8-way addsub, using V4SF and V8SF > vectors? > > > > At this stage yes, most likely, but it should be rejected at the validate > level. > > What is also currently rejected is when some of the definitions are external, > which I think I should be able to handle. I'll fix that up with the rest of > the > changes. > > > Thanks for the feedback! > > Regards, > Tamar > > > Going to stop looking at the series at this point, one further note is > > that all of the Complex*Patterns seem to share the initial match and > > thus redundantly do a vect_detect_pair_op repeatedly on each node for > > each pattern? I wonder if it could be commoned into a single pattern, > > they all seem to share the initial mixed plus/minus, then we have the > > multiplication on one or both operand cases. > > > > Richard. > > > > > Thanks, > > > Tamar > > > > > > gcc/ChangeLog: > > > > > > * tree-vect-slp-patterns.c (complex_operation_t,class > > ComplexPattern): > > > New. > > > > > > > > > > -- > > Richard Biener <rguent...@suse.de> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 > > Nuernberg, Germany; GF: Felix Imend
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c index b81cd6856ad4db5a0cda48ca214389f007e060da..65044a77d55e24cde6c663e81c11b66ad9650056 100644 --- a/gcc/tree-vect-slp-patterns.c +++ b/gcc/tree-vect-slp-patterns.c @@ -136,6 +136,27 @@ along with GCC; see the file COPYING3. If not see To add a new pattern, implement the vect_pattern class and add the type to slp_patterns. */ +/******************************************************************************* + * General helper types + ******************************************************************************/ + +/* The COMPLEX_OPERATION enum denotes the possible pair of operations that can + be matched when looking for expressions that we are interested matching for + complex numbers addition and mla. */ + +typedef enum _complex_operation : unsigned { + PLUS_PLUS, + MINUS_PLUS, + PLUS_MINUS, + MULT_MULT, + NEG_NEG, + CMPLX_NONE +} complex_operation_t; + +/* A simple pair type used for ordering of combine operations. */ + +typedef struct map_ { int a; int b; } map_t; + /******************************************************************************* * Simple vector pattern matcher ******************************************************************************/ @@ -276,6 +297,348 @@ vect_simple_pattern_match::build () return call_stmt; } +/* Checks to see of the expression EXPR is a gimple assign with code CODE + and if this is the case the two operands of EXPR is returned in OP1 and + OP2. + + If the matching and extraction is successful TRUE is returned otherwise + FALSE in which case the value of OP1 and OP2 will not have been touched. +*/ + +static bool +vect_match_expression_p (slp_tree node, tree_code code) +{ + if (!SLP_TREE_REPRESENTATIVE (node)) + return false; + + gimple* expr = STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (node)); + if (!is_gimple_assign (expr) + || gimple_expr_code (expr) != code) + return false; + + return true; +} + +/* Check if the given lane permute in PERMUTES matches an alternating sequence + of {P0 P1 P0 P1 ...}. This to account for unrolled loops. */ +static bool +vect_check_lane_permute (lane_permutation_t &permutes, + unsigned p0, unsigned p1) +{ + unsigned val[2] = {p0, p1}; + for (unsigned i = 0; i < permutes.length (); i++) + if (permutes[i].first != val[i % 2]) + return false; + + return true; +} + +/* This function will match two gimple expressions STMT_0 and STMT_1 in + parallel and returns the pair operation that represents the two + expressions in the two statements. The statements are located in NODE1 + and NODE2 at offset base + offset1 and base + offset2 respectively. + + If match is successful then the corresponding complex_operation is + returned and the arguments to the two matched operations are returned in OPS. + + If unsuccessful then CMPLX_NONE is returned and OPS is untouched. + + e.g. the following gimple statements + + stmt 0 _39 = _37 + _12; + stmt 1 _6 = _38 - _36; + + will return PLUS_MINUS along with OPS containing {_37, _12, _38, _36}. +*/ + +static complex_operation_t +vect_detect_pair_op (slp_tree node1, slp_tree node2, lane_permutation_t &lanes, + bool two_operands = true, vec<slp_tree> *ops = NULL) +{ + complex_operation_t result = CMPLX_NONE; +#define CHECK_FOR(x, y) \ + (vect_match_expression_p (node1, x) \ + && vect_match_expression_p (node2, y)) + + if (CHECK_FOR (MINUS_EXPR, PLUS_EXPR) + && (!two_operands || vect_check_lane_permute (lanes, 0, 1))) + result = MINUS_PLUS; + else if (CHECK_FOR (PLUS_EXPR, MINUS_EXPR) + && (!two_operands || vect_check_lane_permute (lanes, 0, 1))) + result = PLUS_MINUS; + else if (CHECK_FOR (PLUS_EXPR, PLUS_EXPR)) + result = PLUS_PLUS; + else if (CHECK_FOR (MULT_EXPR, MULT_EXPR)) + result = MULT_MULT; + else if (CHECK_FOR (NEGATE_EXPR, NEGATE_EXPR)) + result = NEG_NEG; +#undef CHECK_FOR + + if (result != CMPLX_NONE && ops != NULL) + { + ops->create (2); + ops->quick_push (node1); + ops->quick_push (node2); + } + return result; +} + +/* Overload of vect_detect_pair_op where the statements are assumed to be + one after the other. This inspects node[base] and node[base+1]. */ + +static complex_operation_t +vect_detect_pair_op (slp_tree node, bool two_operands = true, + vec<slp_tree> *ops = NULL) +{ + if (!two_operands && SLP_TREE_CODE (node) == VEC_PERM_EXPR) + return CMPLX_NONE; + + if (SLP_TREE_CHILDREN (node).length () != 2) + return CMPLX_NONE; + + vec<slp_tree> children = SLP_TREE_CHILDREN (node); + lane_permutation_t &lanes = SLP_TREE_LANE_PERMUTATION (node); + + return vect_detect_pair_op (children[0], children[1], lanes, two_operands, + ops); +} + +/* This function marks every statement that is being replaced during the + the pattern matching as PURE. Normally when replacing a statement due + to a pattern we add the statement to the STMT_VINFO_PATTERN_DEF_SEQ of + the pattern that is replacing them. In this case however this won't + work as when doing the replacement we are changing the nodes that are + used by the statements. This means that when vectorized the SSA chain + is different than in the BB. + + Declaring the statements as part of the sequence will then cause SSA + verification to fail as we may refer to statements that were not in the + original USE-DEF chain of the statement we are replacing. + + The downside of this approach is that the statements will still be + seen as relevant and so we will still generate code for them and they + will be in the output, unconnected until DSE. We could mark them as + irrelevant, but that is only safe if there are no more uses of the node + in the SLP graph (So perhaps this should be done in free_slp_tree + instead of here. */ + +static void +vect_mark_stmts_as_in_pattern (hash_set<slp_tree> *cache, slp_tree node) +{ + if (cache->contains (node)) + return; + + unsigned i; + stmt_vec_info stmt_info; + slp_tree child; + + cache->add (node); + + FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info) + { + if (gimple_assign_load_p (STMT_VINFO_STMT (stmt_info))) + return; + + STMT_SLP_TYPE (stmt_info) = pure_slp; + } + + FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child) + vect_mark_stmts_as_in_pattern (cache, child); +} + + +/******************************************************************************* + * complex_pattern class + ******************************************************************************/ + +/* The complex_pattern class contains common code for pattern matchers that work + on complex numbers. These provide functionality to allow de-construction and + validation of sequences depicting/transforming REAL and IMAG pairs. */ + +class complex_pattern : public vect_pattern +{ + protected: + complex_pattern (slp_tree *node, vec_info *vinfo) + : vect_pattern (node, vinfo) + { } + + /* Create and store a new vect_pattern_match object with the current match + that was found. */ + + void save_match () + { + this->m_match + = new vect_simple_pattern_match (this->m_arity, this->m_ifn, + this->m_vinfo, + &SLP_TREE_CHILDREN (*this->m_node), + this->m_num_args); + } + + public: + bool validate_p (); +}; + +/* The post transform and validation function for the complex number + patterns. This will re-arrange the tree and re-organize the nodes such + that they can be used by the complex number instructions that are to be + created. It does this by doing the following steps: + + For each node that isn't already linear a new permute node is created which + when applied makes the node linear. If such permutation does not exist then + the function returns FALSE. For nodes that are already linear no work is + done. */ + +bool +complex_pattern::validate_p () +{ + if (!this->m_match) + return false; + + slp_tree node; + unsigned ix; + FOR_EACH_VEC_ELT (this->m_ops, ix, node) + { + auto_vec<slp_tree> nodes; + nodes.create (this->m_num_args); + slp_tree tmp = NULL; + + unsigned i; + FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, tmp) + { + slp_tree vnode = NULL; + if (vect_slp_make_linear (node, tmp, &vnode)) + nodes.quick_push (vnode); + else + { + nodes.release(); + return false; + } + } + + slp_tree new_node + = vect_create_new_slp_node (SLP_TREE_SCALAR_STMTS (node), + SLP_TREE_CHILDREN (node).length ()); + SLP_TREE_VECTYPE (new_node) = SLP_TREE_VECTYPE (node); + SLP_TREE_LANE_PERMUTATION (new_node) = SLP_TREE_LANE_PERMUTATION (node); + SLP_TREE_CODE (new_node) = SLP_TREE_CODE (node); + SLP_TREE_REF_COUNT (new_node) = SLP_TREE_REF_COUNT (node); + SLP_TREE_REPRESENTATIVE (new_node) = SLP_TREE_REPRESENTATIVE (node); + SLP_TREE_CHILDREN (new_node).safe_splice (nodes); + SLP_TREE_LANES (new_node) = SLP_TREE_LANES (node); + + SLP_TREE_CHILDREN (*this->m_node)[ix] = new_node; + } + + return true; +} + +/******************************************************************************* + * complex_operations_pattern class + ******************************************************************************/ + +/* This function combines all the existing pattern matchers above into one class + that shares the functionality between them. */ + +class complex_operations_pattern : public complex_pattern +{ + protected: + complex_operations_pattern (slp_tree *node, vec_info *vinfo) + : complex_pattern (node, vinfo) + { } + + vect_pattern *m_patt = NULL; + + public: + using complex_pattern::matches; + bool matches (); + const char* get_name (); + gcall *build (); + bool validate_p (); + uint8_t get_arity (); + bool is_optab_supported_p (tree, optimization_type); + internal_fn get_ifn (); + + ~complex_operations_pattern () + { + if (this->m_patt) + delete this->m_patt; + this->m_patt = NULL; + } + + static vect_pattern* create (slp_tree *node, vec_info *vinfo) + { + return new complex_operations_pattern (node, vinfo); + } +}; + +bool +complex_operations_pattern::matches () +{ + complex_operation_t op + = vect_detect_pair_op (*this->m_node, true, &this->m_ops); + + if (op == CMPLX_NONE) + return false; + + this->m_patt = NULL; + return false; +} + +const char* +complex_operations_pattern::get_name () +{ + if (!this->m_patt) + return "Complex Operations"; + + return this->m_patt->get_name (); +} + +gcall * +complex_operations_pattern::build () +{ + if (!this->m_patt) + return NULL; + + return this->m_patt->build (); +} + +bool +complex_operations_pattern::validate_p () +{ + if (!this->m_patt) + return false; + + return this->m_patt->validate_p (); +} + +uint8_t +complex_operations_pattern::get_arity () +{ + if (!this->m_patt) + return 0; + + return this->m_patt->get_arity (); +} + +bool +complex_operations_pattern::is_optab_supported_p (tree vectype, + optimization_type opt_type) +{ + if (!vectype || !this->m_patt) + return false; + + return this->m_patt->is_optab_supported_p (vectype, opt_type); +} + +internal_fn +complex_operations_pattern::get_ifn () +{ + if (!this->m_patt) + return IFN_LAST; + + return this->m_patt->get_ifn (); +} + /******************************************************************************* * Pattern matching definitions ******************************************************************************/ @@ -287,6 +650,7 @@ vect_pattern_decl_t slp_patterns[] order patterns from the largest to the smallest. Especially if they overlap in what they can detect. */ + SLP_PATTERN (complex_operations_pattern), }; #undef SLP_PATTERN