On 6/11/14, Prathamesh Kulkarni <bilbotheelffri...@gmail.com> wrote: > On 6/11/14, Richard Biener <richard.guent...@gmail.com> wrote: >> On Wed, Jun 11, 2014 at 12:53 PM, Richard Biener >> <richard.guent...@gmail.com> wrote: >>> On Wed, Jun 11, 2014 at 10:51 AM, Richard Biener >>> <richard.guent...@gmail.com> wrote: >>>> On Tue, Jun 10, 2014 at 1:57 PM, Richard Biener >>>> <richard.guent...@gmail.com> wrote: >>>>> On Tue, Jun 10, 2014 at 1:06 PM, Prathamesh Kulkarni >>>>> <bilbotheelffri...@gmail.com> wrote: >>>>>> On Fri, Jun 6, 2014 at 7:18 PM, Richard Biener >>>>>> <richard.guent...@gmail.com> wrote: >>>>>>> On Fri, Jun 6, 2014 at 12:02 PM, Richard Biener >>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>> On Fri, Jun 6, 2014 at 11:02 AM, Prathamesh Kulkarni >>>>>>>> <bilbotheelffri...@gmail.com> wrote: >>>>>>>>> On Mon, Jun 2, 2014 at 6:14 PM, Richard Biener >>>>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>>>> On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni >>>>>>>>>> <bilbotheelffri...@gmail.com> wrote: >>>>>>>>>>> I have few questions regarding genmatch: >>>>>>>>>>> >>>>>>>>>>> a) Why is 4 hard-coded here: ? >>>>>>>>>>> in write_nary_simplifiers: >>>>>>>>>>> fprintf (f, " tree captures[4] = {};\n"); >>>>>>>>>> >>>>>>>>>> Magic number (this must be big enough for all cases ...). >>>>>>>>>> Honestly >>>>>>>>>> this should be improved (but requires another scan over the >>>>>>>>>> matcher >>>>>>>>>> IL >>>>>>>>>> to figure out the max N used in @N). >>>>>>>>>> >>>>>>>>>>> b) Should we add syntax for a symbol to denote multiple >>>>>>>>>>> operators >>>>>>>>>>> ? >>>>>>>>>>> For exampleim in simplify_rotate: >>>>>>>>>>> (X << CNT1) OP (X >> CNT2) with OP being +, |, ^ (CNT1 + CNT2 >>>>>>>>>>> == >>>>>>>>>>> bitsize of type of X). >>>>>>>>>> >>>>>>>>>> Something to enhance the IL with, yes. I'd say we support >>>>>>>>>> >>>>>>>>>> (define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR) >>>>>>>>>> >>>>>>>>>> thus, >>>>>>>>>> >>>>>>>>>> (define_op <operator-name> op...) >>>>>>>>>> >>>>>>>>>>> c) Remove for parsing capture in parse_expr since we reject >>>>>>>>>>> outermost >>>>>>>>>>> captured expressions ? >>>>>>>>>> >>>>>>>>>> but parse_expr is also used for inner expressions, no? >>>>>>>>>> >>>>>>>>>> (plus (minus@2 @0 @1) @3) >>>>>>>>>> >>>>>>>>>> should still work >>>>>>>>>> >>>>>>>>>>> d) I am not able to follow this comment in match.pd: >>>>>>>>>>> /* Patterns required to avoid SCCVN testsuite regressions. */ >>>>>>>>>>> >>>>>>>>>>> /* (x >> 31) & 1 -> (x >> 31). Folding in fold-const is more >>>>>>>>>>> complicated here, it does >>>>>>>>>>> Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1)) >>>>>>>>>>> (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1)) >>>>>>>>>>> if the new mask might be further optimized. */ >>>>>>>>>>> (match_and_simplify >>>>>>>>>>> (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep) >>>>>>>>>>> if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) >>>>>>>>>>> == 0) >>>>>>>>>>> @0) >>>>>>>>>> >>>>>>>>>> The comment is literally copied from the case I extracted the >>>>>>>>>> (simplified) variant from fold-const.c. See lines 11961-12056 in >>>>>>>>>> fold-const.c. >>>>>>>>>> It'll be a challenge to implement the equivalent in a pattern ;) >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Decision Tree. >>>>>>>>>>> I have tried to come up with a prototype for decision tree >>>>>>>>>>> (patch attached). >>>>>>>>>>> For simplicity, it handles patterns involving only unary >>>>>>>>>>> operators >>>>>>>>>>> and >>>>>>>>>>> no predicates >>>>>>>>>>> and returns false when the pattern fails to match (no goto to >>>>>>>>>>> match >>>>>>>>>>> another pattern). >>>>>>>>>>> I meant to post it only for illustration, and I have not really >>>>>>>>>>> paid >>>>>>>>>>> attention to code quality (bad formatting, memory leaks, etc.). >>>>>>>>>>> >>>>>>>>>>> * Basic Idea >>>>>>>>>>> A pattern consists of following parts: match, ifexpr and result. >>>>>>>>>>> Let's call <ifexpr, result> as "simplification" operand. >>>>>>>>>>> The common prefix between different match operands would be >>>>>>>>>>> represented >>>>>>>>>>> by same nodes in the decision tree. >>>>>>>>>>> >>>>>>>>>>> Example: >>>>>>>>>>> (negate (bit_not @0)) >>>>>>>>>>> S1 >>>>>>>>>>> >>>>>>>>>>> (negate (negate @0)) >>>>>>>>>>> S2 >>>>>>>>>>> >>>>>>>>>>> S1, S2 denote simplifications for the above patterns >>>>>>>>>>> respectively. >>>>>>>>>>> >>>>>>>>>>> The decision tree would look something like >>>>>>>>>>> (that's the way it gets constructed with the patch): >>>>>>>>>>> >>>>>>>>>>> dummy/root >>>>>>>>>>> | >>>>>>>>>>> NEGATE_EXPR >>>>>>>>>>> / \ >>>>>>>>>>> BIT_NOT NEGATE_EXPR >>>>>>>>>>> | | >>>>>>>>>>> @0 @0 >>>>>>>>>>> | | >>>>>>>>>>> S1 S2 >>>>>>>>>>> >>>>>>>>>>> a) The children of an internal node are number of decisions that >>>>>>>>>>> can be taken at that node. In the above case it's 2 for outer >>>>>>>>>>> NEGATE_EXPR. >>>>>>>>>>> b) Simplification operand represents leaves of the decision tree >>>>>>>>>>> c) Instead of having list of heads, I have added one dummy node, >>>>>>>>>>> and heads become children of these dummy root node. >>>>>>>>>>> d) Code-gen for non-simplification operands involves generating, >>>>>>>>>>> "matching" code and for simplification operands involves >>>>>>>>>>> generating >>>>>>>>>>> "transform" code >>>>>>>>>>> >>>>>>>>>>> * Overall Flow: >>>>>>>>>>> I guess we would build the decision tree from the AST. >>>>>>>>>>> So the flow would be like: >>>>>>>>>>> source -> struct simplify (match, ifexpr, result) -> decision >>>>>>>>>>> tree >>>>>>>>>>> -> c code. >>>>>>>>>>> >>>>>>>>>>> Something like (in main): >>>>>>>>>>> decision_tree dt; >>>>>>>>>>> while (there is another pattern) >>>>>>>>>>> { >>>>>>>>>>> simplify *s = parse_match_and_simplify (); >>>>>>>>>>> insert s into decision tree; >>>>>>>>>>> }; >>>>>>>>>>> So parsing routines are concerned with parsing and building the >>>>>>>>>>> AST (operand), >>>>>>>>>>> and not with the decision tree. Is that fine ? >>>>>>>>>> >>>>>>>>>> Yes, that's good. >>>>>>>>>> >>>>>>>>>>> * Representation of decision tree. >>>>>>>>>>> A decision tree would need a way to represent language >>>>>>>>>>> constructs >>>>>>>>>>> like capture, predicate, etc. so in some ways it would be >>>>>>>>>>> similar >>>>>>>>>>> to AST. >>>>>>>>>>> It >>>>>>>>>>> In the patch, I have created the following heirarchy: >>>>>>>>>>> dt_operand: represents a general base "operand" of in decision >>>>>>>>>>> tree >>>>>>>>>>> dt_expr: for representing expression. Expression contains >>>>>>>>>>> operation >>>>>>>>>>> to be performed (e_operation). >>>>>>>>>>> dt_capture: analogous to capture. >>>>>>>>>>> dt_simplify: for representing "simplification" operand. >>>>>>>>>>> simplification consists of ifexpr and result >>>>>>>>>>> dt_head: to represent "dummy" root. Maybe a separate class is >>>>>>>>>>> not >>>>>>>>>>> needed. >>>>>>>>>>> >>>>>>>>>>> * Constructing decision tree from AST >>>>>>>>>>> The algorithm i have used is similar to inserting string in a >>>>>>>>>>> trie >>>>>>>>>>> outlined here: http://en.wikipedia.org/wiki/Trie >>>>>>>>>>> The difference shall be to traverse AST depth-first rather than >>>>>>>>>>> traversing the string. >>>>>>>>>>> Apart from that I guess it would be the same (naturally find and >>>>>>>>>>> compare operations would be different). >>>>>>>>>>> I haven't given much thought about this. Currently, I construct >>>>>>>>>>> decision tree for only patterns with unary operators in an >>>>>>>>>>> ugly way by "flattening" the AST by walking it in pre-order and >>>>>>>>>>> storing the nodes in vector in the order they are visited >>>>>>>>>>> >>>>>>>>>>> So to construct decision tree from the following AST: >>>>>>>>>>> negate >>>>>>>>>>> | >>>>>>>>>>> bit_not >>>>>>>>>>> | >>>>>>>>>>> @0 >>>>>>>>>>> >>>>>>>>>>> AST is flattened by walking in preorder (by >>>>>>>>>>> walk_operand_preorder) >>>>>>>>>>> and stored >>>>>>>>>>> as: [ expr, expr, capture ] >>>>>>>>>>> and this vector is used to construct decision tree. >>>>>>>>>>> I did it as a "quick and dirty" way, it has to be changed. >>>>>>>>>>> And it won't work for patterns with n-ary operators. >>>>>>>>>>> >>>>>>>>>>> We should visit each node of AST in preorder, and add that node >>>>>>>>>>> during traversal to decision tree. I am not yet clear on way of >>>>>>>>>>> doing that. >>>>>>>>>>> >>>>>>>>>>> * Comparing operands >>>>>>>>>>> How do we compare operands ? We shall need to do this while >>>>>>>>>>> inserting >>>>>>>>>>> in decision tree, since if the operand is already inserted we do >>>>>>>>>>> not create >>>>>>>>>>> a new node. >>>>>>>>>>> In the patch (cmp_operand), I have used following rules: >>>>>>>>>>> a) if types not equal, they are clearly not equal. >>>>>>>>>>> b) if type is expr, compare operation->op->id >>>>>>>>>>> c) if type is capture, compare the number. >>>>>>>>>>> d) for predicate, compare on ident ? >>>>>>>>>>> >>>>>>>>>>> * Representing patterns with n-ary operators. >>>>>>>>>>> Consider following two match operands: >>>>>>>>>>> (plus (minus @0 @1) @1) >>>>>>>>>>> (plus (negate @0) @1) >>>>>>>>>>> >>>>>>>>>>> In decision tree, it would be represented as: >>>>>>>>>>> >>>>>>>>>>> *********plus********* >>>>>>>>>>> / \ / \ >>>>>>>>>>> minus @1 negate @1 >>>>>>>>>>> / \ | >>>>>>>>>>> @0 @1 @0 >>>>>>>>>>> >>>>>>>>>>> plus has got 4 children - (minus @0 @1), @1, (negate @0), @1. >>>>>>>>>>> However (minus @0 @1) and @1 are not separate "decisions", but >>>>>>>>>>> children of plus within the same expression. >>>>>>>>>>> >>>>>>>>>>> For calculating number of decisions at an expression node: >>>>>>>>>>> number of decisions = number of children / arity of operator. >>>>>>>>>>> in the above case, number of decision = 4 / 2 = 2 >>>>>>>>>>> >>>>>>>>>>> For other cases, for instance capture node (the children would >>>>>>>>>>> be >>>>>>>>>>> simplification operand), >>>>>>>>>>> number of decisions = number of children. >>>>>>>>>> >>>>>>>>>> Hmm. I think it should be only two children from the plus, >>>>>>>>>> as pre-order for the first pattern is for example >>>>>>>>>> [plus minus @0 @1 @1] so you'd have >>>>>>>>>> >>>>>>>>>> plus >>>>>>>>>> / \ >>>>>>>>>> minus negate >>>>>>>>>> | | >>>>>>>>>> @0 @0 >>>>>>>>>> | | >>>>>>>>>> @1 @1 >>>>>>>>>> | >>>>>>>>>> @1 >>>>>>>>>> >>>>>>>>>> instead? >>>>>>>>>> >>>>>>>>>> Note that you probably have to deal with non-matching capture >>>>>>>>>> IDs by re-writing them on-the-fly. That is, the two >>>>>>>>>> pre-oder traversals [plus minus @1 ...] and [plus minus @0 ...] >>>>>>>>>> should commonize with renaming the non-matching capture >>>>>>>>>> somehow. >>>>>>>>> Um, could you please elaborate on that (non-matching capture ?), >>>>>>>>> I am not sure if I understood it fully, thanks. >>>>>>>> >>>>>>>> I'm worried about >>>>>>>> >>>>>>>> (plus (minus@1 @2 @3) @2) >>>>>>>> >>>>>>>> vs. >>>>>>>> >>>>>>>> (plus (minus @1 @2) @1) >>>>>>>> >>>>>>>> (just made-up examples, of course). Here we'd like to >>>>>>>> commonize the path checking for (plus (minus ...) ... but >>>>>>>> of course we can't easily process the captures at that point >>>>>>>> as the first has two captures (it captures the minus itself while >>>>>>>> the 2nd doesn't) and the captures that are at the same position >>>>>>>> use different indexes. To finally match the minus we have to >>>>>>>> verify that in one case @2 == @2 and in the other case that >>>>>>>> @1 == @1. >>>>>>>> >>>>>>>> Re-numbering capture indexes in pre-order would improve things >>>>>>>> for the case where capture positions are in the same places but >>>>>>>> the indexes do not match. It won't handle the above case >>>>>>>> of course. >>>>>> Thanks, I understand it now -:) >>>>> >>>>> I suppose that it would be best (in the end) to not insert the >>>>> captures into the decision tree separately but leave them >>>>> "combined" and then only after the decision tree is fully built >>>>> decide on re-numbering. And treat those that are required for >>>>> the matching process differently (those that assert that two >>>>> ops are equal). >>>>> >>>>> I think we can defer the issue to later. >>>>> >>>>>>>> >>>>>>>>> I have attached patch for patterns with binary operators (in >>>>>>>>> principle >>>>>>>>> should work for n-ary operators, >>>>>>>>> but have only tested upto 2), that creates decision tree for >>>>>>>>> patterns (excluding predicates, and multiple matching). I don't >>>>>>>>> intend >>>>>>>>> to commit this (contains many hacks!). >>>>>>>>> >>>>>>>>> * Constructing decision tree from AST >>>>>>>>> We can define our decision tree to store prefixes of preorder >>>>>>>>> traversals of diffferent AST. >>>>>>>>> Insertion happens as follows (decision_tree::insert) >>>>>>>>> a) Get preorder traversal of AST into vector. >>>>>>>>> b) Insert AST nodes from vector into decision tree. >>>>>>>>> Currently, I obtain preorder traversal into a vector. We can later >>>>>>>>> change it to compute >>>>>>>>> next preorder successor lazily. >>>>>>>> >>>>>>>> Just make the code easy to understand, optimizing it should be >>>>>>>> secondary. >>>>>>>> >>>>>>>>> * Code-gen >>>>>>>>> For n-ary operators, the patch generates code as follows: >>>>>>>>> if (code == <expr code>) >>>>>>>>> { >>>>>>>>> match operand 0 >>>>>>>>> match operand 1 >>>>>>>>> .... >>>>>>>>> match operand n-1 >>>>>>>>> transform >>>>>>>>> return true; >>>>>>>>> } >>>>>>>>> >>>>>>>>> * Adding parent, level, pos fields to AST (operand). >>>>>>>>> One change (hack), I have made to code-gen, is naming of operands >>>>>>>>> that are >>>>>>>>> assigned to gimple_assign_rhs in code-gen of expressions. >>>>>>>>> I am not sure if this is really needed. >>>>>>>>> >>>>>>>>> in AST, we have following representation for expressions: >>>>>>>>> expr- >>>>>>>>> / \ >>>>>>>>> left right >>>>>>>>> operand operand >>>>>>>>> >>>>>>>>> code-gen off AST (expr::gen_gimple_match) produces code like: >>>>>>>>> { >>>>>>>>> tree op = gimple_assign_rhs1 (def_stmt); >>>>>>>>> // code-gen for left operand >>>>>>>>> } >>>>>>>>> { >>>>>>>>> tree op = gimple_assign_rhs2 (def_stmt); >>>>>>>>> // code-gen for right operand >>>>>>>>> } >>>>>>>>> >>>>>>>>> We can do this since, in expr::gen_gimple_match, we call >>>>>>>>> left_operand->gen_gimple_match, >>>>>>>>> come back to expr::gen_gimple_match (after >>>>>>>>> left_operand->gen_gimple_match() returns), and >>>>>>>>> then call right_operand->gen_gimple_match(). >>>>>>>>> >>>>>>>>> However in decision tree, the expression gets represented as >>>>>>>>> follows: >>>>>>>>> expr >>>>>>>>> | >>>>>>>>> left-operand >>>>>>>>> | >>>>>>>>> right-operand >>>>>>>>> >>>>>>>>> dt_expr::gen_gimple calls left_operand->gen_gimple, which calls >>>>>>>>> right_operand->gen_gimple, >>>>>>>>> so we return to dt_expr::gen_gimple, after code is generated for >>>>>>>>> the >>>>>>>>> whole subtree of expr. >>>>>>>>> >>>>>>>>> So I assign operands at the start before generating code for the >>>>>>>>> subtree: >>>>>>>>> tree op00 = gimple_assign_rhs1 (def_stmt); >>>>>>>>> tree op10 = gimple_assign_rhs2 (def_stmt); >>>>>>>>> <code for left-operand> >>>>>>>>> <code for right-operand> >>>>>>>>> each of the operands know their positions, so they know which >>>>>>>>> operand >>>>>>>>> to use (get_op_name()). >>>>>>>>> the names are generated as: >>>>>>>>> op<position><level>, position = index of operand in it's parent's >>>>>>>>> expr's vector (vec<operand *> ops). >>>>>>>>> level: level of AST, at which the operand is stored. >>>>>>>>> >>>>>>>>> For this I added three fields to operand (unsigned pos, and >>>>>>>>> unsigned >>>>>>>>> level, operand *parent), and accordingly >>>>>>>>> made changes to expr::append_op. In a way this is abusing the AST. >>>>>>>>> This probably works (works with test-cases i tried), but I don't >>>>>>>>> like it much. >>>>>>>> >>>>>>>> The naming is ok, but indeed it doesn't feel very clean. Somehow >>>>>>>> this >>>>>>>> belongs to the decision tree instead... (but then you need to >>>>>>>> lookup >>>>>>>> the decision tree operand info from code-gen which happens off the >>>>>>>> AST...). >>>>>>>> >>>>>>>>> * Order of matching. >>>>>>>>> Currently the order of matching operands is strictly 0, 1, .. n - >>>>>>>>> 1 >>>>>>>>> for n-ary operator. >>>>>>>>> However this might not be the best choice. >>>>>>>>> >>>>>>>>> For example, consider following patterns: >>>>>>>>> (operator op1 op) S1 >>>>>>>>> (operator op2 op) S2 >>>>>>>>> >>>>>>>>> Both have common operator, and 1st common operand, they differ in >>>>>>>>> 0th operand. >>>>>>>>> With patch, the code generated is: >>>>>>>>> if (code == <expr-code>) >>>>>>>>> { >>>>>>>>> match op1 >>>>>>>>> match op >>>>>>>>> S1 >>>>>>>>> >>>>>>>>> match op2 >>>>>>>>> match op >>>>>>>>> S2 >>>>>>>>> } >>>>>>>>> >>>>>>>>> A better ordering would be to match the 1st operand and then >>>>>>>>> respective 0th operands of both patterns: >>>>>>>>> >>>>>>>>> if (code == <expr-code> >>>>>>>>> { >>>>>>>>> match op >>>>>>>>> match op1 >>>>>>>>> S1 >>>>>>>>> match op0 >>>>>>>>> S2 >>>>>>>>> } >>>>>>>>> >>>>>>>>> This complicates insertion into decision tree. >>>>>>>> >>>>>>>> I'd say we leave "optimizing" the decision tree to a later stage - >>>>>>>> it's a global >>>>>>>> optimization problem after all (I believe you can't do optimally >>>>>>>> with >>>>>>>> just >>>>>>>> looking at the present decision tree and the AST you want to >>>>>>>> insert). >>>>>>>> >>>>>>>>> * hack: only one generated gimple_match_and_simplify function >>>>>>>>> In the patch, I generate code for gimple_match_and_simplify with 3 >>>>>>>>> operands version (op0, op1, op2) >>>>>>>>> and the other two (unary and binary versions), call it with >>>>>>>>> NULL_TREE >>>>>>>>> for extra operands. >>>>>>>>> I will soon change that. >>>>>>>>> >>>>>>>>> * Testing patterns. >>>>>>>>> I think I have written most of the test-cases wrongly in match-2.c >>>>>>>>> For example this test-case doesn't match >>>>>>>>> (i haven't checked in "fix testsuite fallout" patch, so this might >>>>>>>>> not be true >>>>>>>>> anymore) >>>>>>>>> >>>>>>>>> (match_and_simplify >>>>>>>>> plus (minus @0 @1) @1) >>>>>>>>> @0) >>>>>>>>> >>>>>>>>> for this test-case: >>>>>>>>> /* (x - y) + y -> x */ >>>>>>>>> int f6(int x, int y) >>>>>>>>> { >>>>>>>>> int t1 = x - y; >>>>>>>>> return t1 + y; >>>>>>>>> } >>>>>>>>> /* { dg-final { scan-tree-dump "gimple_match_and_simplified to >>>>>>>>> \[^\n\r\]*= x_\\d\+\\(D\\)" "forwprop1" } } */ >>>>>>>>> >>>>>>>>> I get following output (tree-forwprop-details): >>>>>>>>> >>>>>>>>> ;; Function f6 (f, funcdef_no=0, decl_uid=1744, symbol_order=0) >>>>>>>>> >>>>>>>>> gimple_match_and_simplified to _4 = y_2(D) + t1_3; >>>>>>>>> f (int x, int y) >>>>>>>>> { >>>>>>>>> int t1; >>>>>>>>> int _4; >>>>>>>>> >>>>>>>>> <bb 2>: >>>>>>>>> t1_3 = x_1(D) - y_2(D); >>>>>>>>> _4 = x_1(D); >>>>>>>>> return _4; >>>>>>>>> >>>>>>>>> } >>>>>>>>> >>>>>>>>> Shouldn't it show: gimple_match_and_simplified to _4 = x_1(D) >>>>>>>>> instead ? >>>>>>>> >>>>>>>> Yeah. I suppose you hit the issue that we don't commutate >>>>>>>> patterns, >>>>>>>> so try with a commutated pattern inserted. >>>>>>>> >>>>>>>>> The issue is this test-case fails in isolation, however it passes >>>>>>>>> when >>>>>>>>> it is placed with other >>>>>>>>> test cases that PASS and have same regex in scan-tree-dump as this >>>>>>>>> test-case: >>>>>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= >>>>>>>>> x_\\d\+\\(D\\)" >>>>>>>>> >>>>>>>>> Summary: >>>>>>>>> I think we need to (at-least) deal with following problems for >>>>>>>>> decision tree: >>>>>>>> >>>>>>>> I have to leave for a while now, I'll come back later and have a >>>>>>>> detailed >>>>>>>> look at the patch to try answer the questions below. >>>>>>> >>>>>>> Now for the remainder. >>>>>>> >>>>>>>>> a) Representation and construction of decision tree from AST - >>>>>>>>> We do this by storing prefixes of preorder traversals of AST in >>>>>>>>> decision tree. Do we finalize on that ? >>>>>>> >>>>>>> Yes. >>>>>>> >>>>>>>>> b) Order of operand matching >>>>>>> >>>>>>> I think that's the same as a) - the visiting order of the AST >>>>>>> determines >>>>>>> the order of operand matching. >>>>>>> >>>>>>>>> c) Generating patterns for built-in functions (I guess this will >>>>>>>>> be >>>>>>>>> similar to expr-gen). >>>>>>> >>>>>>> It should really be transparent. >>>>>>> >>>>>>>>> d) Multiple matching patterns >>>>>>> >>>>>>> The decision tree phase leaves us with a sequence of >>>>>>> >>>>>>> if (if-expr of pattern1) >>>>>>> transform-pattern1 >>>>>>> else (if-expr of pattern2) >>>>>>> transform-pattern2 >>>>>>> ... >>>>>>> >>>>>>> etc. thus that part should be basically unchanged (only the match >>>>>>> part is driven by a decision tree). >>>>>>> >>>>>>>>> e) Commutative ops >>>>>>> >>>>>>> I'd do these as a pass over the AST of each pattern, duplicating >>>>>>> patterns. >>>>>>> >>>>>>> Btw, I see you mirror the AST structure with >>>>>>> dt_{node,operand,expr,...}. >>>>>>> It seems to me that you can simply stick an AST operand * into >>>>>>> dt_node and be done with just having dt_node? That is, the AST >>>>>>> operand refered to contains all necessary information already. >>>>>>> >>>>>>>>> >>>>>>>>> Pattern Enhancements: >>>>>>>>> Some of these were already discussed, I am jotting them here: >>>>>>>>> a) Symbol for multiple operators >>>>>>>>> b) grouping patterns by ifexpr >>>>>>>>> c) should we have MATCH_FAILED or FAIL to explicitly denote >>>>>>>>> failure >>>>>>>>> in >>>>>>>>> manual transform (c_expr) ? >>>>>>>>> d) making operators to be commutative. >>>>>>> >>>>>>> I'd say d) would be really nice to have. Either with a grammar >>>>>>> extension to say which expression should have commutation applied >>>>>>> to (for example: (plus! (minus @1 @2) @2) would say that the >>>>>>> operands of plus should be commutated), or with a simple heuristic. >>>>>>> I'd say explicitly specifying is better. >>>>>>> >>>>>>> a) will probably be needed to avoid pattern explosion. But similar >>>>>>> to >>>>>>> d) >>>>>>> I'd do it as a pre-pass on the AST, duplicating the patterns. It's >>>>>>> also a good question how to define its semantic. The mode-iterators >>>>>>> that the machine description supports would suggest we do instead >>>>>>> sth like >>>>>>> >>>>>>> (for OP in (plus minus mult) >>>>>>> (match_and_simplify >>>>>>> (OP @1 integer_zerop) >>>>>>> ....)) >>>>>>> >>>>>>> c) should be easy, I'm going to try sth like that - also setting >>>>>>> temporaries >>>>>>> from the ifexpr and re-using them in the result like >>>>>>> >>>>>>> (match_and_simplify >>>>>>> (plus @1 @2) >>>>>>> if (INTEGRAL_TYPE_P (TREE_TYPE (@2)) >>>>>>> && find_better_ops (&@3, &@4, @1, @2)) >>>>>>> (plus @3 @4)) >>>>>>> >>>>>>> which also requires us to scan the AST for the highest capture index >>>>>>> (a good thing anyway, given my hack to statically set the size of >>>>>>> the captures array ;)) >>>>>>> >>>>>>> b) I'm not sure it's worth the trouble (apart from doing less typing >>>>>>> in match.pd). We're probably going to have GENERIC defined >>>>>>> when creating the generic routines and GIMPLE when creating >>>>>>> the gimple routines so we can easily disable patterns for one or >>>>>>> the other by doing >>>>>>> >>>>>>> #ifdef GIMPLE >>>>>>> (match_and_simplify >>>>>>> ...) >>>>>>> #endif >>>>>>> >>>>>>>>> Once the decision tree is done, I can try working on these. >>>>>>>>> Till Sunday, I will attempt to have another prototype, that covers >>>>>>>>> most of the patterns in match.pd >>>>>>>>> (except for cond_expr, and those requiring GENERIC support), and >>>>>>>>> accompanying correct test-case >>>>>>>>> for each pattern. And next-week start working on a fair patch. >>>>>>> >>>>>>> Sounds good. >>>>>> I have attached a patch, that fires most of the patterns from >>>>>> match.pd >>>>>> except for those that include >>>>>> non-matching capture or require GENERIC support, and added a >>>>>> test-case >>>>>> match-decision-tree.c >>>>>> (fires around 33 patterns). >>>>>> >>>>>> * Representation of decision tree: >>>>>> I clubbed dt_capture, dt_predicate, dt_expr into dt_operand, and >>>>>> added >>>>>> new gen_gimple_match() function in AST >>>>>> to generates matching code. dt_operand contains operand *op as >>>>>> member, >>>>>> and merely calls op->gen_gimple_match. >>>>>> (that's the intent anyway, in the patch, i needed to put some hacks >>>>>> to >>>>>> get it working). >>>>>> In the patch, i have kept: dt_node, dt_operand (dt_node), dt_simplify >>>>>> (dt_node), to separate matching operand >>>>>> from simplification (ifexpr, result). >>>>>> >>>>>> * Constructing decision tree from AST >>>>>> Same as previous patch, stores prefixes of preorder traversals of >>>>>> AST. >>>>>> decision_tree::print() pretty-prints the decision tree. >>>>>> >>>>>> I hope we won't need to change much up-to this point in the patch. >>>>>> >>>>>> * Code-gen >>>>>> code gen of (operator op0 op1 ..opN) simplify takes the following >>>>>> form: >>>>>> if (code == <expr code>) >>>>>> { >>>>>> match op0 >>>>>> match op1 >>>>>> ... >>>>>> match opN >>>>>> simplify >>>>>> return true; >>>>>> } >>>>>> >>>>>> a) Scope of temporaries (the ones that are assigned to >>>>>> gimple_assign_rhsX): >>>>>> for code-gen of expressions off the AST, we create a temporary: >>>>>> >>>>>> { >>>>>> tree op = gimple_assign_rhsX (); >>>>>> generate code for X'th operand >>>>>> } >>>>>> >>>>>> In the decision tree, an expression can only directly access it's 1st >>>>>> operand (since it's preorder traversal). >>>>>> So I decided to first create all temporaries and then generate code >>>>>> for operands: >>>>>> >>>>>> { >>>>>> tree op0 = gimple_assign_rhs1 (); >>>>>> tree op1 = gimple_assign_rhs2 (); >>>>>> tree opX-1 = gimple_assign_rhsX (); >>>>>> >>>>>> code for matching 0th operand >>>>>> code for matching 1st operand >>>>>> ... >>>>>> code for matching Nth operand >>>>>> } >>>>>> >>>>>> Unfortunately, this creates a problem if the operand itself is an >>>>>> expression, >>>>>> since it's temporaries would also be named op0, op1 ... they would >>>>>> shadow the outer op0, op1 .. temporaries. >>>>>> So we need distinct names for each temporary (for all temps within >>>>>> one >>>>>> pattern). >>>>>> The temporary name is created by: >>>>>> op<position of child operand><level of tree>; >>>>>> and each child of AST knows it's position and level, so it can >>>>>> correctly access it's temp. >>>>>> This requires adding members operand *parent, unsigned pos, unsigned >>>>>> level to struct operand. >>>>>> I have done it this way in the patch, however it doesn't feel clean, >>>>>> I >>>>>> will try to see if we can change this. >>>>>> >>>>>> b) Short-ranged gotos >>>>>> consider: (negate @0) S1 >>>>>> The code is generated as (with the patch): >>>>>> >>>>>> if (code == PLUS) >>>>>> { >>>>>> if (!captures [0]) >>>>>> { >>>>>> captures[0] = op0; >>>>>> goto L0; >>>>>> } >>>>>> else if (captures[0] == op0) >>>>>> { >>>>>> L0: >>>>>> // simplification code S1 >>>>>> return true; >>>>>> } >>>>>> } >>>>>> >>>>>> Instead of: >>>>>> if (!captures[0]) >>>>>> captures[0] = op0; >>>>>> else if (captures[0] != op0) >>>>>> goto L0; >>>>>> >>>>>> // Simplification code S1 >>>>>> return true; >>>>>> >>>>>> L0: // match next pattern >>>>>> >>>>>> Generating code the former way is more easier, since location of >>>>>> label >>>>>> immediately follows >>>>>> capture code (if (!captures[0]) ...). In the latter case, I probably >>>>>> need to store label name in the sibling's node in >>>>>> the decision tree and then retrieve that. >>>>>> Similarly for valueize (expr::gen_valueize). >>>>> >>>>> But then why not generate >>>>> >>>>> if (!captures[0] >>>>> || captures[0] == op0) >>>>> { >>>>> captures[0] = op0; >>>>> // simplification code S1 >>>>> return true; >>>>> } >>>>> >>>>> and go without label and goto? Or did I miss something? >>>>> >>>>>> c) Shared captures between all patterns: >>>>>> In the patch all patterns share the capture array tree captures[4] = >>>>>> {} >>>>>> (since all match patterns get interleaved in generated code off >>>>>> decision tree). >>>>>> This requires extra code to be generated to make sure captures are >>>>>> correctly restored for another pattern. >>>> >>>> Btw, I see that you back-track the decision tree. That is, for >>>> >>>> root >>>> |--operand: MINUS_EXPR >>>> |----operand: PLUS_EXPR >>>> |------operand: @0 >>>> |--------operand: @1 >>>> |----------operand: @0 >>>> |------------simplify >>>> |----------operand: @1 >>>> |------------simplify >>>> >>>> you do >>>> >>>> if (code == MINUS_EXPR) >>>> { >>>> tree captures[4] = {}; >>>> if (TREE_CODE (op0) == SSA_NAME) >>>> { >>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0); >>>> if (is_gimple_assign (def_stmt) && gimple_assign_rhs_code >>>> (def_stmt) == PLUS_EXPR) >>>> { >>>> tree op01 = gimple_assign_rhs1 (def_stmt); >>>> if (valueize && TREE_CODE (op01) == SSA_NAME) >>>> op01 = valueize (op01); >>>> else >>>> goto L0; >>>> if (op01) >>>> { >>>> L0: >>>> tree op11 = gimple_assign_rhs2 (def_stmt); >>>> if (valueize && TREE_CODE (op11) == SSA_NAME) >>>> op11 = valueize (op11); >>>> else >>>> goto L1; >>>> if (op11) >>>> { >>>> L1: >>>> tree captures_0 = captures[0]; >>>> if (!captures[0]) >>>> { >>>> captures[0] = op01; >>>> goto L2; >>>> } >>>> else if (captures[0] == op01) >>>> { >>>> L2: >>>> tree captures_1 = captures[1]; >>>> if (!captures[1]) >>>> { >>>> captures[1] = op11; >>>> goto L3; >>>> } >>>> else if (captures[1] == op11) >>>> { >>>> L3: >>>> tree captures_2 = captures[0]; >>>> if (!captures[0]) >>>> { >>>> captures[0] = op1; >>>> goto L4; >>>> } >>>> else if (captures[0] == op1) >>>> { >>>> L4: >>>> res_ops[0] = captures[1]; >>>> *res_code = TREE_CODE (res_ops[0]); >>>> return true; >>>> } >>>> captures [0] = captures_2; >>>> tree captures_3 = captures[1]; >>>> if (!captures[1]) >>>> { >>>> captures[1] = op1; >>>> goto L5; >>>> } >>>> else if (captures[1] == op1) >>>> { >>>> L5: >>>> res_ops[0] = captures[0]; >>>> *res_code = TREE_CODE (res_ops[0]); >>>> return true; >>>> } >>>> captures [1] = captures_3; >>>> } >>>> captures [1] = captures_1; >>>> } >>>> captures [0] = captures_0; >>>> } >>>> } >>>> } >>>> } >>>> >>>> but if you processed all children of a decision tree node and didn't >>>> hit one of the simplifies (which return true) then you know nothing >>>> else will match and you can just return false. That early out seems >>>> to be missing completely and we fall through processing all siblings >>>> of the parents. >>>> >>>> But maybe I am missing something? I see that if I make capture >>>> IDs not matching that we'd miss to detect things then (as a >>>> "first" capture always "matches"). >>>> >>>> root >>>> |--operand: MINUS_EXPR >>>> |----operand: PLUS_EXPR >>>> |------operand: @0 >>>> |--------operand: @1 >>>> |----------operand: @0 >>>> |------------simplify >>>> |------operand: @1 >>>> |--------operand: @0 >>>> |----------operand: @0 >>>> |------------simplify >>>> >>>> So I suppose we really have to distinguish "first" captures from >>>> "matching" captures, basically not treating the "first" ones as >>>> nodes of the decision tree and only populating the capture >>>> array once we hit the code generation part of a pattern (well, >>>> or the if-expr). >>> >>> It's probably a good idea to pre-parse the AST to classify >>> captures and matches differently. Given for example >>> >>> (plus @0 (minus@0 @1 @2)) >>> >>> this would say (well, if we support this), that the plus has two >>> "same" operands (on GIMPLE the "same" valueized SSA name) >>> and that this operand is defined by a (minus @1 @2). But with >>> the current operator ordering in the decision tree we'd meet >>> the no-op capture first and the match in an odd position. >>> >>> In the RTL machine description the matches are more explicit >>> like >>> >>> (plus (match @0) (minus@0 @1 @2)) >>> >>> so maybe we want to auto-detect those and rewrite the AST. >>> Just distinguish between capture-def and capture-ref, where >>> expression captures are always -def but leafs may be -refs as well. >>> >>> In the decision tree builder we'd "ignore" -defs and only make >>> -refs explicit (and the -refs would need to "materialize" the -defs >>> just like the ifexpr or transform stage). >>> >>> So I'd say add a flag to struct capture telling whether it's a ref >>> and compute that via an AST traversal (that should then error >>> for invalid refs). >>> >>> In the decision tree I'd keep a vector of operand * and record >>> a name of a temporary the operand can be accessed via >>> (in case any of the AST operands represented by the decision tree >>> node has a capture associated). >>> >>> You'd still have "empty" leaf captures that would then only record >>> the value into a temporary. >>> >>> Eventually you need a back-ref to the decision tree node for >>> each capture when you insert a new AST traversal so you >>> can decide whether a match (a capture-ref) refers to the same >>> decision tree node or not. >>> >>> Richard. >> >> Ok, let me re-phrase after some more thinking and talking with >> a collegue. >> >> When you do the AST traversal computing the array of AST operands * >> you'd initialize book-keeping like the following >> >> for operand in AST do >> if (operand is a capture) >> { >> if (this capture index was not yet seen) >> { >> record for this pattern (in its simplify node created at the >> decision tree leaf) that the capture index refers to the >> operand with the current index >> if (capture has a sub-expression) >> recurse >> else >> insert "true" op (matches everything) >> } >> else >> { >> generate a decision tree node that matches the decision >> tree index of the capture >> recurse >> } >> } >> else >> record op and recurse >> >> So when you generate code you implicitely capture all operands >> in automatic variables named like 'captureN' with 'N' being the >> current level of the decision tree (the index of the operand array >> from the AST walk). Once you hit a 'match-X' node you >> simply match against the operand from the refered to decision >> tree index. >> >> Once you reach the ifexpr point you populate the operands[] array >> from the side-information you initialized in the AST traversal and >> the automatic variables initialized from the decision tree walk. >> >> So >> >> (plus (minus@2 @1 @3) @2) >> >> would have the decision tree >> >> plus - minus - 'true' - 'true' - match(1) - simplify >> >> and the generated code would look like (simplified) >> >> [can't capture outermost expr] >> if (code == plus) >> { >> op1 = gimple_assign_rhs1 (plus-stmt); >> if (TREE_CODE (op1) == minus) >> { >> op2 = gimple_assign_rhs1 (minus-stmt); >> if (true) >> { >> op3 = gimple_assign_rhs2 (minus-stmt); >> if (true) >> { >> op4 = gimple_assign_rhs2 (plus-stmt); >> if (op4 == op1) >> { >> operands[0] = op2; >> operands[1] = op1; >> operands[2] = op3; > Did you mean captures ? > captures[0] = op2; > captures[1] = op1; > captures[2] = op3 >> <simplify> >> >> where the 'true' cases would be matched last (as "else") in case there >> are multiplie choices here if you'd have also >> >> (plus @1 (minus @2 @3)) >> >> the decision tree would become >> >> plus - minus - 'true' - 'true' - match(1) - simplify >> \ >> 'true' - minus - 'true' - 'true' - simplify >> > Thanks, I will get started on this. Today I added a patch that handles > "matching-captures" but does not commonize, I am not not posting since > it's of no > use now. The above algorithm will also get rid of generating temporary > as op<pos><level>. > > suppose these are patterns: > (plus @0 @1) S1 > (plus @1 @0) S2 > > then how would be decision tree represented as ? > > plus - true - true - S1 > \ > S2 > > I suppose pattern-1 would always match (even with earlier code gen. Maybe I am missing something, but won't this create a problem for c_expr since in c_expr we replace @n verbatim with captures[n] ? Or maybe we can find which opX was the capture assigned to, and replace @n with opX, and thus get rid of captures array altogether ?
Example: (negate (negate@0 @1)) -> if (@0) @1 decision tree: negate - negate - true - simplify if (code == NEGATE) { if (gimple_assign_rhs_code (def_stmt) == NEGATE) { tree op1 = gimple_assign_rhs1 (def_stmt); if (op1) { if (true) { if (cond (op0)) { *res_code = TREE_CODE (op1); res_ops[0] = op1; return true; } } } >> Richard. >> >>>> There are also opportunities to optimize the generated code, but >>>> basic correctness first I guess. >>>> >>>> I suppose we have to work a bit on the capture stuff. >>>> >>>> Btw, you can easily play with the code generator by doing inside >>>> the gcc build directory >>>> >>>> obj/gcc> build/genmatch test.pd > test.c >>>> >>>> with a small test.pd. I used the following for the above examples: >>>> >>>> (match_and_simplify >>>> (MINUS_EXPR (PLUS_EXPR @0 @1) @0) >>>> @1) >>>> (match_and_simplify >>>> (MINUS_EXPR (PLUS_EXPR @1 @0) @0) >>>> @1) >> >>>> Richard. >>>> >>>>>> I will change this to have capture per pattern >>>>>> tree captures1[4] = {}; // for pattern-1 >>>>>> tree captures2[4] = {}; >>>>> >>>>> Hmm, is this the matching captures issue I mentioned? Btw, I see >>>>> you do >>>>> >>>>> +void >>>>> +walk_operand_preorder(vec<operand *>& operands, operand *op) >>>>> +{ >>>>> + if (op->type == operand::OP_CAPTURE || op->type == >>>>> operand::OP_PREDICATE || op->type == operand::OP_C_EXPR) >>>>> + { >>>>> + operands.safe_push (op); >>>>> + return; >>>>> + } >>>>> >>>>> but that leaves captured expressions as a single operand? >>>>> >>>>> (plus (minus@1 @2 @3) @2) >>>>> >>>>> would have a decision tree >>>>> >>>>> plus -> minus -> @2 >>>>> >>>>> correct? >>>>> >>>>>> d) Matching multiple patterns. >>>>>> Code for patterns with same match, but different transforms is >>>>>> generated as follows: >>>>>> code for match operand. >>>>>> if (if-expr of pattern-1) >>>>>> { >>>>>> code for result of pattern-1 >>>>>> return true; >>>>>> } >>>>>> if (if-expr of pattern-2) >>>>>> { >>>>>> code for result of pattern-2 >>>>>> return true; >>>>>> } >>>>> >>>>> good. >>>>> >>>>>> ... >>>>>> Should we emit a warning for patterns that have same match operand >>>>>> but >>>>>> no if-expr and no manual transform ? >>>>>> >>>>>> for eg: >>>>>> (match_and_simplify >>>>>> (plus (minus @0 @1) @1) >>>>>> @0 >>>>>> >>>>>> (match_and_simplify >>>>>> (plus (minus @0 @1) @1) >>>>>> @1 // just for illustration >>>>>> >>>>>> Since the matching is ambiguous (same match, no if-expr, no manual >>>>>> transform). >>>>> >>>>> Yeah, I suppose we should. >>>>> >>>>>> we are left to choose between result of pattern-1 and result of >>>>>> pattern-2. >>>>>> We can emit warning and choose result of pattern-1 (first-match rule >>>>>> as in flex). >>>>>> >>>>>> e) Non-matching captures: >>>>>> Haven't thought about this yet. >>>>>> >>>>>> * Should we add "negative" predicates that match only if the >>>>>> predicate >>>>>> fails ? >>>>>> for eg: >>>>>> (match_and_simplify >>>>>> trunc_mod integer_zerop@0 !integer_zerop) >>>>>> @0) >>>>> >>>>> well, there could simply be a integer_not_zerop predicate. >>>>> >>>>>> * Testing >>>>>> Sorry to bring this up again, but I am still not clear what regex to >>>>>> write in scan-tree-dump. >>>>>> >>>>>> Suppose we have these two patterns in match.pd: >>>>>> /* (x + y) - y -> x */ >>>>>> (match_and_simplify >>>>>> (minus (plus @0 @1) @1) >>>>>> @0) >>>>>> >>>>>> /* (x - y) + y -> x */ >>>>>> (match_and_simplify >>>>>> (plus (minus @0 @1) @1) >>>>>> @0) >>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= >>>>>> x_\\d\+\\(D\\)" >>>>>> >>>>>> I have following test-cases: >>>>>> int f1(int x, int y) >>>>>> { >>>>>> int t1 = x + y; >>>>>> return t1 - y; >>>>>> } >>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= >>>>>> x_\\d\+\\(D\\)" >>>>>> >>>>>> int f2(int x, int y) >>>>>> { >>>>>> int t1 = x - y; >>>>>> return t1 + y; >>>>>> } >>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= >>>>>> x_\\d\+\\(D\\)" >>>>>> >>>>>> both the test-cases have same regex. >>>>>> Tested in isolation f1 passes (first pattern if fired) and f2 fails >>>>>> (second pattern doesn't fire, it does after >>>>>> adding it's commutative variant, but that's irrelevant for this >>>>>> case). >>>>>> >>>>>> However when both test-cases are put in one file both the test cases >>>>>> PASS. >>>>>> I think that's because both of them have same regex: \[^\n\r\]*= >>>>>> x_\\d\+\\(D\\) >>>>>> so in f1 and f2's regex both match the dump for f1 function in >>>>>> forwprop dump file: >>>>>> "gimple_match_and_simplified to \[^\n\r\]*= x_\\d\+\\(D\\) >>>>>> >>>>>> As a quick hack i rewrote f1 and f2 as: >>>>>> int f1(int x, int y) >>>>>> { >>>>>> int t1 = x + y; >>>>>> int f1_val = t1 - y; >>>>>> return f1_val; >>>>>> } >>>>>> scan-tree-dump "gimple_match_and_simplified to f1_val_\\d\+ = >>>>>> x_\\d\+\\(D\\)" >>>>>> >>>>>> int f2(int x, int y) >>>>>> { >>>>>> int t1 = x - y; >>>>>> int f2_val = t1 + y; >>>>>> return f2_val; >>>>>> } >>>>>> scan-tree-dump "gimple_match_and_simplified to f2_val_\\d\+ = >>>>>> x_\\d\+\\(D\\)" >>>>>> so both f1 and f2's scan-tree-dump have different regexes. >>>>>> and f2's regex does not match dump of f1's function. >>>>>> This matches all patterns in match-decision-tree.c however this is >>>>>> not >>>>>> ideal, >>>>>> since it does not check for matching dump across newlines. >>>>>> Could you suggest a better way ? >>>>> >>>>> There isn't a good better way (the others explicitely do _not_ match >>>>> against >>>>> a newline - see the ^ in the \[\] group). Well, apart from splitting >>>>> the testcase >>>>> into multiple files of course. >>>>> >>>>> Richard. >>>>> >>>>>> Thanks and Regards, >>>>>> Prathamesh >>>>>>> >>>>>>> Thanks, >>>>>>> Richard. >>>>>>> >>>>>>>>> Thanks and Regards, >>>>>>>>> Prathamesh >>>>>>>>>> >>>>>>>>>> Richard. >>>>>>>>>> >>>>>>>>>>> * Code generation. >>>>>>>>>>> Code shall be generated by walking the decision tree. >>>>>>>>>>> The way it's constructed, there's no difference between code >>>>>>>>>>> generation >>>>>>>>>>> for "matching" and code generation for "transform". For >>>>>>>>>>> non-simplificaton >>>>>>>>>>> operands, "matching" code is generated, and for "simplification" >>>>>>>>>>> operands, >>>>>>>>>>> "transform" code is generated. The tree shall be walked twice, >>>>>>>>>>> once to generate GIMPLE code and second time for GENERIC. >>>>>>>>>>> For simplicity, I currently return false whenever there's a fail >>>>>>>>>>> in match, >>>>>>>>>>> instead of trying to match the next pattern. >>>>>>>>>>> >>>>>>>>>>> Code-gen for capture - same as capture::gen_gimple_match. >>>>>>>>>>> >>>>>>>>>>> Code-gen for predicate - I haven't added support for predicate >>>>>>>>>>> in >>>>>>>>>>> decision tree yet, but I guess that would be the same as >>>>>>>>>>> predicate::gen_gimple_match >>>>>>>>>>> >>>>>>>>>>> Code-gen for expr. >>>>>>>>>>> There are two types of code-gen for expr. >>>>>>>>>>> The patch generates following code: >>>>>>>>>>> Type 1 - expr is child of root node. >>>>>>>>>>> the only code that gets generated is (in >>>>>>>>>>> decision_tree::gen_gimple): >>>>>>>>>>> if (code == <expr code>) >>>>>>>>>>> { >>>>>>>>>>> tree captures[4] = {} >>>>>>>>>>> <generated code for children> >>>>>>>>>>> } >>>>>>>>>>> This is similar to generating matching code in >>>>>>>>>>> write_nary_simplifiers. >>>>>>>>>>> >>>>>>>>>>> Type 2 - expr_1 is a child of another expr_0 node. >>>>>>>>>>> The code gets generated as follows (dt_expr::gen_gimple): >>>>>>>>>>> { >>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op); >>>>>>>>>>> if (is_gimple_assign (def_stmt) >>>>>>>>>>> && gimple_assign_rhs_code (def_stmt) == <expr_1-code>) >>>>>>>>>>> { >>>>>>>>>>> tree op = gimple_assign_rhs1 (def_stmt); >>>>>>>>>>> if (valueize && TREE_CODE (op) == SSA_NAME) >>>>>>>>>>> { >>>>>>>>>>> op = valueize (op); >>>>>>>>>>> if (!op) return false; >>>>>>>>>>> } >>>>>>>>>>> <code-gen for children of expr_1 node> >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> Example: >>>>>>>>>>> (negate (negate @0)) >>>>>>>>>>> S1 >>>>>>>>>>> >>>>>>>>>>> (negate (bit_not @0)) >>>>>>>>>>> S2 >>>>>>>>>>> >>>>>>>>>>> decision tree: >>>>>>>>>>> >>>>>>>>>>> dummy/root >>>>>>>>>>> | >>>>>>>>>>> NEGATE_EXPR >>>>>>>>>>> / \ >>>>>>>>>>> BIT_NOT NEGATE_EXPR >>>>>>>>>>> | | >>>>>>>>>>> @0 @0 >>>>>>>>>>> | | >>>>>>>>>>> S1 S2 >>>>>>>>>>> >>>>>>>>>>> // code-gen for NEGATE_EXPR (child of root): >>>>>>>>>>> if (code == NEGATE_EXPR) >>>>>>>>>>> { >>>>>>>>>>> tree captures[4] = {}; >>>>>>>>>>> // code gen for BIT_NOT_EXPR >>>>>>>>>>> { >>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0); >>>>>>>>>>> if (is_gimple_assign (def_stmt) >>>>>>>>>>> && gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR) >>>>>>>>>>> { >>>>>>>>>>> tree op = gimple_assign_rhs1 (def_stmt); >>>>>>>>>>> if (valueize && TREE_CODE (op) == SSA_NAME) >>>>>>>>>>> { >>>>>>>>>>> op = valueize (op); >>>>>>>>>>> if (!op) return false; >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> // code-gen for @0, child of BIT_NOT_EXPR >>>>>>>>>>> if (!captures[0]) >>>>>>>>>>> captures[0] = op; >>>>>>>>>>> else if (captures[0] != op) >>>>>>>>>>> return false; >>>>>>>>>>> >>>>>>>>>>> // code-gen for S1, child of @0 >>>>>>>>>>> < same as code generated by .gen_gimple_transform > >>>>>>>>>>> return true; >>>>>>>>>>> } >>>>>>>>>>> // code gen for inner NEGATE_EXPR >>>>>>>>>>> { >>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0); >>>>>>>>>>> if (is_gimple_assign (def_stmt) >>>>>>>>>>> && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR) >>>>>>>>>>> <rest similar to the BIT_NOT case> >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> The following gets duplicated with the patch: >>>>>>>>>>> { >>>>>>>>>>> gimple_def_stmt = SSA_NAME_DEF_STMT (op0); >>>>>>>>>>> if (TREE_CODE (op0) != SSA_NAME) >>>>>>>>>>> return false; >>>>>>>>>>> if (is_gimple_assign (def_stmt) >>>>>>>>>>> && gimple_assign_rhs_code (def_stmt) == <expr-code>) >>>>>>>>>>> ... >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> Improving code-gen for expr: >>>>>>>>>>> "gimple def_stmt = ..." and "if (TREE_CODE (op0)" get >>>>>>>>>>> duplicated, >>>>>>>>>>> while they could be factored out, similar to this: >>>>>>>>>>> >>>>>>>>>>> { >>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0); >>>>>>>>>>> if (TREE_CODE (op0) != SSA_NAME) >>>>>>>>>>> return false; >>>>>>>>>>> if (!is_gimple_assign (def_stmt)) >>>>>>>>>>> return false; >>>>>>>>>>> if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR) >>>>>>>>>>> { >>>>>>>>>>> // code-gen for BIT_NOT_EXPR subtree >>>>>>>>>>> } >>>>>>>>>>> else if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR) >>>>>>>>>>> { >>>>>>>>>>> // code-gen for NEGATE_EXPR subtree >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> For factoring "gimple def_stmt ..." and "if (TREE_CODE (op0) != >>>>>>>>>>> SSA_NAME" >>>>>>>>>>> we could have that generated at the parent of expr's node rather >>>>>>>>>>> than >>>>>>>>>>> at expr. However that would be incorrect for cases where not all >>>>>>>>>>> children >>>>>>>>>>> of a node are expressions: >>>>>>>>>>> >>>>>>>>>>> Example: >>>>>>>>>>> // patterns only for illustration >>>>>>>>>>> (negate (bit_not @0)) >>>>>>>>>>> (negate @0) >>>>>>>>>>> >>>>>>>>>>> root >>>>>>>>>>> | >>>>>>>>>>> negate >>>>>>>>>>> / \ >>>>>>>>>>> bit_not @0 >>>>>>>>>>> | >>>>>>>>>>> @0 >>>>>>>>>>> >>>>>>>>>>> we cannot have the above code generated at negate, >>>>>>>>>>> since it's not applicable negate's 2nd child (@0). >>>>>>>>>>> >>>>>>>>>>> This can be done by grouping together children that are >>>>>>>>>>> expressions. >>>>>>>>>>> However the patch does not do that. >>>>>>>>>>> >>>>>>>>>>> * Code-gen for simplification operand >>>>>>>>>>> This involves code-gen for ifexpr and result of pattern. >>>>>>>>>>> Calls gen_gimple_transform of ifexpr and result >>>>>>>>>>> (dt_simplify::gen_gimple) >>>>>>>>>>> So this is really code-gen off AST >>>>>>>>>> >>>>>>>>>> Right (modulo replacing captures with their replacements). >>>>>>>>>> >>>>>>>>>>> * Matching multiple patterns >>>>>>>>>>> A pattern has following parts: match, ifexpr and result. >>>>>>>>>>> If pattern fails in match operand, I guess we can safely return >>>>>>>>>>> false ? >>>>>>>>>>> We "club" together patterns that have same match operand, >>>>>>>>>>> and use goto, if one of them fails in their (ifexpr/result) and >>>>>>>>>>> then goto the >>>>>>>>>>> (ifexpr/result) of the next operand. >>>>>>>>>>> >>>>>>>>>>> Example: >>>>>>>>>>> /* x & 0 -> 0 */ >>>>>>>>>>> (match_and_simplify >>>>>>>>>>> (bit_and @0 @1) >>>>>>>>>>> if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (@1 == >>>>>>>>>>> integer_zero_node)) >>>>>>>>>>> { integer_zero_node; }) >>>>>>>>>>> >>>>>>>>>>> /* x & -1 -> x */ >>>>>>>>>>> (match_and_simplify >>>>>>>>>>> (bit_and @0 @1) >>>>>>>>>>> if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) >>>>>>>>>>> && (@1 == integer_minus_one_node) >>>>>>>>>>> @0) >>>>>>>>>>> >>>>>>>>>>> For both patterns match is same. >>>>>>>>>>> >>>>>>>>>>> Decision Tree: >>>>>>>>>>> bit_and >>>>>>>>>>> / \ >>>>>>>>>>> @0 @1 >>>>>>>>>>> | >>>>>>>>>>> S1, S2 >>>>>>>>>> >>>>>>>>>> I think it's worth adding a diagnostic to genmach whenever >>>>>>>>>> exactly >>>>>>>>>> same matches appear. But I'd say generally we'd support this >>>>>>>>>> by testing the ifexpr of the next pattern. >>>>>>>>>> >>>>>>>>>>> S1 represents <ifexpr, result> of pattern-1, and S2 represents >>>>>>>>>>> <ifexpr, result> >>>>>>>>>>> of pattern-2 respectively. >>>>>>>>>>> S1, S2 would be stored as children of @1 (the last operand of >>>>>>>>>>> n-ary operator), >>>>>>>>>>> in dt_operand::kids vector. >>>>>>>>>>> >>>>>>>>>>> The code would be generated as: >>>>>>>>>>> >>>>>>>>>>> matching code. >>>>>>>>>>> if (! pattern-1 ifexpr condition) >>>>>>>>>>> goto simplify2; // next pattern with the same "match" >>>>>>>>>>> operand. >>>>>>>>>>> transform code for pattern 1 >>>>>>>>>>> >>>>>>>>>>> simplify2: >>>>>>>>>>> if (! pattern-2 ifexpr condition) >>>>>>>>>>> return false; // last pattern >>>>>>>>>>> transform code for pattern 2. >>>>>>>>>>> >>>>>>>>>>> If matching itself fails, that is neither of the decisions get >>>>>>>>>>> matched, >>>>>>>>>>> I believe we can then return false as it cannot match any other >>>>>>>>>>> pattern ? >>>>>>>>>>> >>>>>>>>>>> * patterns needing hacks like cond_expr or GENERIC support >>>>>>>>>>> I haven't given thought to this yet. I suppose we can look to >>>>>>>>>>> handle >>>>>>>>>>> these after adding support for GENERIC. Instead of generating >>>>>>>>>>> GENERIC >>>>>>>>>>> matching in >>>>>>>>>>> gimple_match_and_simplify, could we then call >>>>>>>>>>> generic_match_and_simplify from >>>>>>>>>>> within gimple_match_and_simplify ? >>>>>>>>>> >>>>>>>>>> Yes (that's what's currently done btw). >>>>>>>>>> >>>>>>>>>>> * Tests >>>>>>>>>>> The patch transformed the following patterns: >>>>>>>>>>> >>>>>>>>>>> (match_and_simplify >>>>>>>>>>> (negate (bit_not @0)) >>>>>>>>>>> if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) >>>>>>>>>>> (plus @0 { build_int_cst (TREE_TYPE (@0)), 1); })) >>>>>>>>>>> >>>>>>>>>>> (match_and_simplify >>>>>>>>>>> (negate (negate @0)) >>>>>>>>>>> @0) >>>>>>>>>>> >>>>>>>>>>> I have attached test-case I tried it with (negate.c) >>>>>>>>>>> >>>>>>>>>>> * Conclusion >>>>>>>>>>> Does it sound reasonable ? I am going to be re-writing the >>>>>>>>>>> decision tree from scratch, but is the basic idea fine ? Or >>>>>>>>>>> should >>>>>>>>>>> we >>>>>>>>>>> take a different approach ? >>>>>>>>>>> >>>>>>>>>>> Thanks and Regards, >>>>>>>>>>> Prathamesh >> >