On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni <bilbotheelffri...@gmail.com> wrote: > I have few questions regarding genmatch: > > a) Why is 4 hard-coded here: ? > in write_nary_simplifiers: > fprintf (f, " tree captures[4] = {};\n");
Magic number (this must be big enough for all cases ...). Honestly this should be improved (but requires another scan over the matcher IL to figure out the max N used in @N). > b) Should we add syntax for a symbol to denote multiple operators ? > For exampleim in simplify_rotate: > (X << CNT1) OP (X >> CNT2) with OP being +, |, ^ (CNT1 + CNT2 == > bitsize of type of X). Something to enhance the IL with, yes. I'd say we support (define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR) thus, (define_op <operator-name> op...) > c) Remove for parsing capture in parse_expr since we reject outermost > captured expressions ? but parse_expr is also used for inner expressions, no? (plus (minus@2 @0 @1) @3) should still work > d) I am not able to follow this comment in match.pd: > /* Patterns required to avoid SCCVN testsuite regressions. */ > > /* (x >> 31) & 1 -> (x >> 31). Folding in fold-const is more > complicated here, it does > Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1)) > (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1)) > if the new mask might be further optimized. */ > (match_and_simplify > (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep) > if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) == 0) > @0) The comment is literally copied from the case I extracted the (simplified) variant from fold-const.c. See lines 11961-12056 in fold-const.c. It'll be a challenge to implement the equivalent in a pattern ;) > > Decision Tree. > I have tried to come up with a prototype for decision tree (patch > attached). > For simplicity, it handles patterns involving only unary operators and > no predicates > and returns false when the pattern fails to match (no goto to match > another pattern). > I meant to post it only for illustration, and I have not really paid > attention to code quality (bad formatting, memory leaks, etc.). > > * Basic Idea > A pattern consists of following parts: match, ifexpr and result. > Let's call <ifexpr, result> as "simplification" operand. > The common prefix between different match operands would be represented > by same nodes in the decision tree. > > Example: > (negate (bit_not @0)) > S1 > > (negate (negate @0)) > S2 > > S1, S2 denote simplifications for the above patterns respectively. > > The decision tree would look something like > (that's the way it gets constructed with the patch): > > dummy/root > | > NEGATE_EXPR > / \ > BIT_NOT NEGATE_EXPR > | | > @0 @0 > | | > S1 S2 > > a) The children of an internal node are number of decisions that > can be taken at that node. In the above case it's 2 for outer NEGATE_EXPR. > b) Simplification operand represents leaves of the decision tree > c) Instead of having list of heads, I have added one dummy node, > and heads become children of these dummy root node. > d) Code-gen for non-simplification operands involves generating, > "matching" code and for simplification operands involves generating > "transform" code > > * Overall Flow: > I guess we would build the decision tree from the AST. > So the flow would be like: > source -> struct simplify (match, ifexpr, result) -> decision tree -> c code. > > Something like (in main): > decision_tree dt; > while (there is another pattern) > { > simplify *s = parse_match_and_simplify (); > insert s into decision tree; > }; > So parsing routines are concerned with parsing and building the AST (operand), > and not with the decision tree. Is that fine ? Yes, that's good. > * Representation of decision tree. > A decision tree would need a way to represent language constructs > like capture, predicate, etc. so in some ways it would be similar to AST. > It > In the patch, I have created the following heirarchy: > dt_operand: represents a general base "operand" of in decision tree > dt_expr: for representing expression. Expression contains operation > to be performed (e_operation). > dt_capture: analogous to capture. > dt_simplify: for representing "simplification" operand. > simplification consists of ifexpr and result > dt_head: to represent "dummy" root. Maybe a separate class is not needed. > > * Constructing decision tree from AST > The algorithm i have used is similar to inserting string in a trie > outlined here: http://en.wikipedia.org/wiki/Trie > The difference shall be to traverse AST depth-first rather than > traversing the string. > Apart from that I guess it would be the same (naturally find and > compare operations would be different). > I haven't given much thought about this. Currently, I construct > decision tree for only patterns with unary operators in an > ugly way by "flattening" the AST by walking it in pre-order and > storing the nodes in vector in the order they are visited > > So to construct decision tree from the following AST: > negate > | > bit_not > | > @0 > > AST is flattened by walking in preorder (by walk_operand_preorder) and stored > as: [ expr, expr, capture ] > and this vector is used to construct decision tree. > I did it as a "quick and dirty" way, it has to be changed. > And it won't work for patterns with n-ary operators. > > We should visit each node of AST in preorder, and add that node > during traversal to decision tree. I am not yet clear on way of doing that. > > * Comparing operands > How do we compare operands ? We shall need to do this while inserting > in decision tree, since if the operand is already inserted we do not create > a new node. > In the patch (cmp_operand), I have used following rules: > a) if types not equal, they are clearly not equal. > b) if type is expr, compare operation->op->id > c) if type is capture, compare the number. > d) for predicate, compare on ident ? > > * Representing patterns with n-ary operators. > Consider following two match operands: > (plus (minus @0 @1) @1) > (plus (negate @0) @1) > > In decision tree, it would be represented as: > > *********plus********* > / \ / \ > minus @1 negate @1 > / \ | > @0 @1 @0 > > plus has got 4 children - (minus @0 @1), @1, (negate @0), @1. > However (minus @0 @1) and @1 are not separate "decisions", but > children of plus within the same expression. > > For calculating number of decisions at an expression node: > number of decisions = number of children / arity of operator. > in the above case, number of decision = 4 / 2 = 2 > > For other cases, for instance capture node (the children would be > simplification operand), > number of decisions = number of children. Hmm. I think it should be only two children from the plus, as pre-order for the first pattern is for example [plus minus @0 @1 @1] so you'd have plus / \ minus negate | | @0 @0 | | @1 @1 | @1 instead? Note that you probably have to deal with non-matching capture IDs by re-writing them on-the-fly. That is, the two pre-oder traversals [plus minus @1 ...] and [plus minus @0 ...] should commonize with renaming the non-matching capture somehow. Richard. > * Code generation. > Code shall be generated by walking the decision tree. > The way it's constructed, there's no difference between code generation > for "matching" and code generation for "transform". For non-simplificaton > operands, "matching" code is generated, and for "simplification" operands, > "transform" code is generated. The tree shall be walked twice, > once to generate GIMPLE code and second time for GENERIC. > For simplicity, I currently return false whenever there's a fail in match, > instead of trying to match the next pattern. > > Code-gen for capture - same as capture::gen_gimple_match. > > Code-gen for predicate - I haven't added support for predicate in > decision tree yet, but I guess that would be the same as > predicate::gen_gimple_match > > Code-gen for expr. > There are two types of code-gen for expr. > The patch generates following code: > Type 1 - expr is child of root node. > the only code that gets generated is (in decision_tree::gen_gimple): > if (code == <expr code>) > { > tree captures[4] = {} > <generated code for children> > } > This is similar to generating matching code in write_nary_simplifiers. > > Type 2 - expr_1 is a child of another expr_0 node. > The code gets generated as follows (dt_expr::gen_gimple): > { > gimple def_stmt = SSA_NAME_DEF_STMT (op); > if (is_gimple_assign (def_stmt) > && gimple_assign_rhs_code (def_stmt) == <expr_1-code>) > { > tree op = gimple_assign_rhs1 (def_stmt); > if (valueize && TREE_CODE (op) == SSA_NAME) > { > op = valueize (op); > if (!op) return false; > } > <code-gen for children of expr_1 node> > } > > Example: > (negate (negate @0)) > S1 > > (negate (bit_not @0)) > S2 > > decision tree: > > dummy/root > | > NEGATE_EXPR > / \ > BIT_NOT NEGATE_EXPR > | | > @0 @0 > | | > S1 S2 > > // code-gen for NEGATE_EXPR (child of root): > if (code == NEGATE_EXPR) > { > tree captures[4] = {}; > // code gen for BIT_NOT_EXPR > { > gimple def_stmt = SSA_NAME_DEF_STMT (op0); > if (is_gimple_assign (def_stmt) > && gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR) > { > tree op = gimple_assign_rhs1 (def_stmt); > if (valueize && TREE_CODE (op) == SSA_NAME) > { > op = valueize (op); > if (!op) return false; > } > > // code-gen for @0, child of BIT_NOT_EXPR > if (!captures[0]) > captures[0] = op; > else if (captures[0] != op) > return false; > > // code-gen for S1, child of @0 > < same as code generated by .gen_gimple_transform > > return true; > } > // code gen for inner NEGATE_EXPR > { > gimple def_stmt = SSA_NAME_DEF_STMT (op0); > if (is_gimple_assign (def_stmt) > && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR) > <rest similar to the BIT_NOT case> > } > > The following gets duplicated with the patch: > { > gimple_def_stmt = SSA_NAME_DEF_STMT (op0); > if (TREE_CODE (op0) != SSA_NAME) > return false; > if (is_gimple_assign (def_stmt) > && gimple_assign_rhs_code (def_stmt) == <expr-code>) > ... > } > > Improving code-gen for expr: > "gimple def_stmt = ..." and "if (TREE_CODE (op0)" get duplicated, > while they could be factored out, similar to this: > > { > gimple def_stmt = SSA_NAME_DEF_STMT (op0); > if (TREE_CODE (op0) != SSA_NAME) > return false; > if (!is_gimple_assign (def_stmt)) > return false; > if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR) > { > // code-gen for BIT_NOT_EXPR subtree > } > else if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR) > { > // code-gen for NEGATE_EXPR subtree > } > > For factoring "gimple def_stmt ..." and "if (TREE_CODE (op0) != SSA_NAME" > we could have that generated at the parent of expr's node rather than > at expr. However that would be incorrect for cases where not all children > of a node are expressions: > > Example: > // patterns only for illustration > (negate (bit_not @0)) > (negate @0) > > root > | > negate > / \ > bit_not @0 > | > @0 > > we cannot have the above code generated at negate, > since it's not applicable negate's 2nd child (@0). > > This can be done by grouping together children that are expressions. > However the patch does not do that. > > * Code-gen for simplification operand > This involves code-gen for ifexpr and result of pattern. > Calls gen_gimple_transform of ifexpr and result (dt_simplify::gen_gimple) > So this is really code-gen off AST Right (modulo replacing captures with their replacements). > * Matching multiple patterns > A pattern has following parts: match, ifexpr and result. > If pattern fails in match operand, I guess we can safely return false ? > We "club" together patterns that have same match operand, > and use goto, if one of them fails in their (ifexpr/result) and then goto the > (ifexpr/result) of the next operand. > > Example: > /* x & 0 -> 0 */ > (match_and_simplify > (bit_and @0 @1) > if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (@1 == integer_zero_node)) > { integer_zero_node; }) > > /* x & -1 -> x */ > (match_and_simplify > (bit_and @0 @1) > if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) > && (@1 == integer_minus_one_node) > @0) > > For both patterns match is same. > > Decision Tree: > bit_and > / \ > @0 @1 > | > S1, S2 I think it's worth adding a diagnostic to genmach whenever exactly same matches appear. But I'd say generally we'd support this by testing the ifexpr of the next pattern. > S1 represents <ifexpr, result> of pattern-1, and S2 represents <ifexpr, > result> > of pattern-2 respectively. > S1, S2 would be stored as children of @1 (the last operand of n-ary operator), > in dt_operand::kids vector. > > The code would be generated as: > > matching code. > if (! pattern-1 ifexpr condition) > goto simplify2; // next pattern with the same "match" operand. > transform code for pattern 1 > > simplify2: > if (! pattern-2 ifexpr condition) > return false; // last pattern > transform code for pattern 2. > > If matching itself fails, that is neither of the decisions get matched, > I believe we can then return false as it cannot match any other pattern ? > > * patterns needing hacks like cond_expr or GENERIC support > I haven't given thought to this yet. I suppose we can look to handle > these after adding support for GENERIC. Instead of generating GENERIC > matching in > gimple_match_and_simplify, could we then call generic_match_and_simplify from > within gimple_match_and_simplify ? Yes (that's what's currently done btw). > * Tests > The patch transformed the following patterns: > > (match_and_simplify > (negate (bit_not @0)) > if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) > (plus @0 { build_int_cst (TREE_TYPE (@0)), 1); })) > > (match_and_simplify > (negate (negate @0)) > @0) > > I have attached test-case I tried it with (negate.c) > > * Conclusion > Does it sound reasonable ? I am going to be re-writing the > decision tree from scratch, but is the basic idea fine ? Or should we > take a different approach ? > > Thanks and Regards, > Prathamesh