On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni
<bilbotheelffri...@gmail.com> wrote:
> I have few questions regarding genmatch:
>
> a) Why is 4 hard-coded here: ?
> in write_nary_simplifiers:
>  fprintf (f, "      tree captures[4] = {};\n");

Magic number (this must be big enough for all cases ...).  Honestly
this should be improved (but requires another scan over the matcher IL
to figure out the max N used in @N).

> b) Should we add syntax for a symbol to denote multiple operators ?
> For exampleim in simplify_rotate:
> (X << CNT1) OP (X >> CNT2) with OP being +, |, ^  (CNT1 + CNT2 ==
> bitsize of type of X).

Something to enhance the IL with, yes.  I'd say we support

(define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR)

thus,

(define_op <operator-name> op...)

> c) Remove for parsing capture in parse_expr since we reject outermost
> captured expressions ?

but parse_expr is also used for inner expressions, no?

(plus (minus@2 @0 @1) @3)

should still work

> d) I am not able to follow this comment in match.pd:
> /* Patterns required to avoid SCCVN testsuite regressions.  */
>
> /* (x >> 31) & 1 -> (x >> 31).  Folding in fold-const is more
>    complicated here, it does
>      Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1))
>      (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1))
>      if the new mask might be further optimized.  */
> (match_and_simplify
>   (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep)
>   if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) == 0)
>   @0)

The comment is literally copied from the case I extracted the
(simplified) variant from fold-const.c.  See lines 11961-12056 in fold-const.c.
It'll be a challenge to implement the equivalent in a pattern ;)

>
> Decision Tree.
>     I have tried to come up with a prototype for decision tree (patch 
> attached).
> For simplicity, it handles patterns involving only unary operators and
> no predicates
> and returns false when the pattern fails to match (no goto to match
> another pattern).
> I meant to post it only for illustration, and I have not really paid
> attention to code quality (bad formatting, memory leaks, etc.).
>
> * Basic Idea
> A pattern consists of following parts: match, ifexpr and result.
> Let's call <ifexpr, result> as "simplification" operand.
> The common prefix between different match operands would be represented
> by same nodes in the decision tree.
>
> Example:
> (negate (bit_not @0))
> S1
>
> (negate (negate @0))
> S2
>
> S1, S2 denote simplifications for the above patterns respectively.
>
> The decision tree would look something like
> (that's the way it gets constructed with the patch):
>
>                 dummy/root
>                         |
>            NEGATE_EXPR
>              /                  \
>      BIT_NOT           NEGATE_EXPR
>            |                         |
>          @0                     @0
>            |                         |
>          S1                      S2
>
> a) The children of an internal node are number of decisions that
> can be taken at that node. In the above case it's 2 for outer NEGATE_EXPR.
> b) Simplification operand represents leaves of the decision tree
> c) Instead of having list of heads, I have added one dummy node,
> and heads become children of these dummy root node.
> d) Code-gen for non-simplification operands involves generating,
> "matching" code and for simplification operands involves generating
> "transform" code
>
> * Overall Flow:
> I guess we would build the decision tree from the AST.
> So the flow would be like:
> source -> struct simplify (match, ifexpr, result) -> decision tree -> c code.
>
> Something like (in main):
> decision_tree dt;
> while (there is another pattern)
> {
>   simplify *s = parse_match_and_simplify ();
>   insert s into decision tree;
> };
> So parsing routines are concerned with parsing and building the AST (operand),
> and not with the decision tree. Is that fine ?

Yes, that's good.

> * Representation of decision tree.
> A decision tree would need a way to represent language constructs
> like capture, predicate, etc. so in some ways it would be similar to AST.
> It
> In the patch, I have created the following heirarchy:
> dt_operand: represents a general base "operand" of in decision tree
> dt_expr: for representing expression. Expression contains operation
> to be performed (e_operation).
> dt_capture: analogous to capture.
> dt_simplify: for representing "simplification" operand.
> simplification consists of ifexpr and result
> dt_head: to represent "dummy" root. Maybe a separate class is not needed.
>
> * Constructing decision tree from AST
> The algorithm i have used is similar to inserting string in a trie
> outlined here: http://en.wikipedia.org/wiki/Trie
> The difference shall be to traverse AST depth-first rather than
> traversing the string.
> Apart from that I guess it would be the same (naturally find and
> compare operations would be different).
> I haven't given much thought about this. Currently, I construct
> decision tree for only patterns with unary operators in an
> ugly way by "flattening" the AST by walking it in pre-order and
> storing the nodes in vector in the order they are visited
>
> So to construct decision tree from the following AST:
>           negate
>               |
>           bit_not
>               |
>              @0
>
> AST is flattened by walking in preorder (by walk_operand_preorder) and stored
> as: [ expr, expr, capture ]
> and this vector is used to construct decision tree.
> I did it as a "quick and dirty" way, it has to be changed.
> And it won't work for patterns with n-ary operators.
>
> We should visit each node of AST in preorder, and add that node
> during traversal to decision tree. I am not yet clear on way of doing that.
>
> * Comparing operands
> How do we compare operands ? We shall need to do this while inserting
> in decision tree, since if the operand is already inserted we do not create
> a new node.
> In the patch (cmp_operand), I have used following rules:
> a) if types not equal, they are clearly not equal.
> b) if type is expr, compare operation->op->id
> c) if type is capture, compare the number.
> d) for predicate, compare on ident ?
>
> * Representing patterns with n-ary operators.
> Consider following two match operands:
> (plus (minus @0 @1) @1)
> (plus (negate @0) @1)
>
> In decision tree, it would be represented as:
>
>             *********plus*********
>                /    \          /        \
>           minus  @1  negate  @1
>             /     \           |
>         @0   @1       @0
>
> plus has got 4 children - (minus @0 @1), @1, (negate @0), @1.
> However (minus @0 @1) and @1 are not separate "decisions", but
> children of plus within the same expression.
>
> For calculating number of decisions at an expression node:
> number of decisions =  number of children / arity of operator.
> in the above case, number of decision = 4 / 2 = 2
>
> For other cases, for instance capture node (the children would be
> simplification operand),
> number of decisions = number of children.

Hmm.  I think it should be only two children from the plus,
as pre-order for the first pattern is for example
[plus minus @0 @1 @1] so you'd have

   plus
   /     \
minus  negate
|           |
@0     @0
|           |
@1     @1
|
@1

instead?

Note that you probably have to deal with non-matching capture
IDs by re-writing them on-the-fly.  That is, the two
pre-oder traversals [plus minus @1 ...] and [plus minus @0 ...]
should commonize with renaming the non-matching capture
somehow.

Richard.

> * Code generation.
> Code shall be generated by walking the decision tree.
> The way it's constructed, there's no difference between code generation
> for "matching" and code generation for "transform". For non-simplificaton
> operands, "matching" code is generated, and for "simplification" operands,
> "transform" code is generated. The tree shall be walked twice,
> once to generate GIMPLE code and second time for GENERIC.
> For simplicity, I currently return false whenever there's a fail in match,
> instead of trying to match the next pattern.
>
> Code-gen for capture - same as capture::gen_gimple_match.
>
> Code-gen for predicate -  I haven't added support for predicate in
> decision tree yet, but I guess that would be the same as
> predicate::gen_gimple_match
>
> Code-gen for expr.
> There are two types of code-gen for expr.
> The patch generates following code:
> Type 1 - expr is child of root node.
> the only code that gets generated is (in decision_tree::gen_gimple):
> if (code == <expr code>)
> {
> tree captures[4] = {}
> <generated code for children>
> }
> This is similar to generating matching code in write_nary_simplifiers.
>
> Type 2 - expr_1 is a child of another expr_0 node.
> The code gets generated as follows (dt_expr::gen_gimple):
> {
> gimple def_stmt = SSA_NAME_DEF_STMT (op);
> if (is_gimple_assign (def_stmt)
>     && gimple_assign_rhs_code (def_stmt) == <expr_1-code>)
> {
> tree op = gimple_assign_rhs1 (def_stmt);
> if (valueize && TREE_CODE (op) == SSA_NAME)
> {
>   op = valueize (op);
>   if (!op) return false;
> }
> <code-gen for children of expr_1 node>
> }
>
> Example:
> (negate (negate @0))
> S1
>
> (negate (bit_not @0))
> S2
>
> decision tree:
>
>                 dummy/root
>                         |
>            NEGATE_EXPR
>              /                  \
>      BIT_NOT           NEGATE_EXPR
>            |                         |
>          @0                     @0
>            |                         |
>          S1                      S2
>
> // code-gen for NEGATE_EXPR (child of root):
> if (code == NEGATE_EXPR)
> {
> tree captures[4] = {};
> // code gen for BIT_NOT_EXPR
> {
> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
> if (is_gimple_assign (def_stmt)
>     && gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
> {
> tree op = gimple_assign_rhs1 (def_stmt);
> if (valueize && TREE_CODE (op) == SSA_NAME)
> {
>   op = valueize (op);
>   if (!op) return false;
> }
>
> // code-gen for @0, child of BIT_NOT_EXPR
> if (!captures[0])
>   captures[0] = op;
> else if (captures[0] != op)
>   return false;
>
> // code-gen for S1, child of @0
> < same as code generated by .gen_gimple_transform >
> return true;
> }
> // code gen for inner NEGATE_EXPR
> {
> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
> if (is_gimple_assign (def_stmt)
>     && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
> <rest similar to the BIT_NOT case>
> }
>
> The following gets duplicated with the patch:
> {
> gimple_def_stmt = SSA_NAME_DEF_STMT (op0);
> if (TREE_CODE (op0) != SSA_NAME)
>   return false;
> if (is_gimple_assign (def_stmt)
>     && gimple_assign_rhs_code (def_stmt) == <expr-code>)
> ...
> }
>
> Improving code-gen for expr:
> "gimple def_stmt = ..." and "if (TREE_CODE (op0)" get duplicated,
> while they could be factored out, similar to this:
>
> {
> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
> if (TREE_CODE (op0) != SSA_NAME)
>   return false;
> if (!is_gimple_assign (def_stmt))
>   return false;
> if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
> {
>   // code-gen for BIT_NOT_EXPR subtree
> }
> else if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
> {
>   // code-gen for NEGATE_EXPR subtree
> }
>
> For factoring "gimple def_stmt ..." and "if (TREE_CODE (op0) != SSA_NAME"
> we could have that generated at the parent of expr's node rather than
> at expr. However that would be incorrect for cases where not all children
> of a node are expressions:
>
> Example:
> // patterns only for illustration
> (negate (bit_not @0))
> (negate @0)
>
>                    root
>                      |
>                   negate
>                     /       \
>                 bit_not   @0
>                     |
>                   @0
>
> we cannot have the above code generated at negate,
> since it's not applicable negate's 2nd child (@0).
>
> This can be done by grouping together children that are expressions.
> However the patch does not do that.
>
> * Code-gen for simplification operand
> This involves code-gen for ifexpr and result of pattern.
> Calls gen_gimple_transform of ifexpr and result (dt_simplify::gen_gimple)
> So this is really code-gen off AST

Right (modulo replacing captures with their replacements).

> * Matching multiple patterns
> A pattern has following parts: match, ifexpr and result.
> If pattern fails in match operand, I guess we can safely return false ?
> We "club" together patterns that have same match operand,
> and use goto, if one of them fails in their (ifexpr/result) and then goto the
> (ifexpr/result) of the next operand.
>
> Example:
> /* x & 0 -> 0 */
> (match_and_simplify
>   (bit_and @0 @1)
>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (@1 == integer_zero_node))
>   { integer_zero_node; })
>
> /* x & -1 -> x */
> (match_and_simplify
>   (bit_and @0 @1)
>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>      && (@1 == integer_minus_one_node)
>   @0)
>
> For both patterns match is same.
>
> Decision Tree:
>                 bit_and
>                 /        \
>              @0      @1
>                            |
>                         S1,  S2

I think it's worth adding a diagnostic to genmach whenever exactly
same matches appear.  But I'd say generally we'd support this
by testing the ifexpr of the next pattern.

> S1 represents <ifexpr, result> of pattern-1, and S2 represents <ifexpr, 
> result>
> of pattern-2 respectively.
> S1, S2 would be stored as children of @1 (the last operand of n-ary operator),
> in dt_operand::kids vector.
>
> The code would be generated as:
>
> matching code.
> if (! pattern-1 ifexpr condition)
>   goto simplify2;  // next pattern with the same "match" operand.
> transform code for pattern 1
>
> simplify2:
> if (! pattern-2 ifexpr condition)
>   return false;  // last pattern
> transform code for pattern 2.
>
> If matching itself fails, that is neither of the decisions get matched,
> I believe we can then return false as it cannot match any other pattern ?
>
> * patterns needing hacks like cond_expr or GENERIC support
> I haven't given thought to this yet. I suppose we can look to handle
> these after adding support for GENERIC. Instead of generating GENERIC
> matching in
> gimple_match_and_simplify, could we then call generic_match_and_simplify from
> within gimple_match_and_simplify ?

Yes (that's what's currently done btw).

> * Tests
> The patch transformed the following patterns:
>
> (match_and_simplify
>   (negate (bit_not @0))
>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>   (plus @0 { build_int_cst (TREE_TYPE (@0)), 1); }))
>
> (match_and_simplify
>   (negate (negate @0))
>   @0)
>
> I have attached test-case I tried it with (negate.c)
>
> * Conclusion
> Does it sound reasonable ? I am going to be re-writing the
> decision tree from scratch, but is the basic idea fine ? Or should we
> take a different approach ?
>
> Thanks and Regards,
> Prathamesh

Reply via email to