Re: Best way to compute cost of a sequence of gimple stmt
On Wed, Jun 11, 2014 at 7:45 AM, Thomas Preud'homme wrote: >> From: Richard Biener [mailto:richard.guent...@gmail.com] >> Sent: Tuesday, June 10, 2014 5:16 PM >> > >> In general this is impossible to do. I don't have a good answer on >> how to determine whether (unaligned) load + bswap is faster than >> doing sth else - but there is a very good chance that the original >> code is even worse. For the unaligned load you can expect >> an optimal code sequence to be generated - likewise for the bswap. >> Now - if you want to do the best for the combination of both I'd >> say you add support to the expr.c bitfield extraction code to do >> the bswap on-the-fly and use TER to see that you are doing the >> bswap on a memory source. > > Oh I see. Doing it there would mean instead of two independent > operations you'd do the best combination possible, is that right? Yes (but probably it's not worth the trouble). >> >> There is only two choices - disable unaligned-load + bswap on >> SLOW_UNALIGNED_ACCESS targets or not. Doing sth more >> fancy won't do the trick and isn't worth the trouble IMHO. > > There is some other reason to compute the cost that I didn't > mention. For instance, you suggested to recognize partial > load (+bswap). Quoting you: > >> unsigned foo (unsigned char *x) >> { >> return x[0] << 24 | x[2] << 8 | x[3]; >> } >> >> ? We could do an unsigned int load from x and zero byte 3 >> with an AND. > > Even with aligned access, the above might be slower if x[0] was > already loaded previously and sits in a register. Well, I think the pattern detection makes sure that it only follows single-use chains? Or rather that all original loads and bit operations are dead after the transform (not exactly following single-use chains if you'd consider to eventually handle x[0] << 24 | x[0] << 16 | x[2] << 8 | x[3] as a valid permutation)? > I'm tempted to use a simple heuristic such as comparing the > number of loads before and after, adding one if the load is > unaligned. So in the above example, supposing that there is > some computation done around x[0] before the return line, > we'd have 2 loads before Vs 2 x is unaligned and we would > cancel the optimization. If x is aligned the optimization would > proceed. > > Do you thing this approach is also too much trouble or would > not work? I'm not sure. For noop-loads I'd keep them unconditionally, even if unaligned. I'd disable unaligned-load + bswap for now. People interested and sitting on such a target should do the measurements and decide if it's worth the trouble (is arm affected?). But I see that the code currently does not limit itself to single-use chains and thus may end up keeping the whole original code life by unrelated uses. So a good thing would be to impose proper restrictions here. For example, in find_bswap_or_nop_1 do if (TREE_CODE (rhs1) != SSA_NAME || !has_single_use (rhs1)) Richard. > Best regards, > > Thomas > > >
RE: Best way to compute cost of a sequence of gimple stmt
> From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: Wednesday, June 11, 2014 4:09 PM > > > > Oh I see. Doing it there would mean instead of two independent > > operations you'd do the best combination possible, is that right? > > Yes (but probably it's not worth the trouble). I understood that. > > > I'm tempted to use a simple heuristic such as comparing the > > number of loads before and after, adding one if the load is > > unaligned. So in the above example, supposing that there is > > some computation done around x[0] before the return line, > > we'd have 2 loads before Vs 2 x is unaligned and we would > > cancel the optimization. If x is aligned the optimization would > > proceed. > > > > Do you thing this approach is also too much trouble or would > > not work? > > I'm not sure. For noop-loads I'd keep them unconditionally, even if > unaligned. I'd disable unaligned-load + bswap for now. People > interested and sitting on such a target should do the measurements > and decide if it's worth the trouble (is arm affected?). Yes it is. > > But I see that the code currently does not limit itself to single-use > chains and thus may end up keeping the whole original code life > by unrelated uses. So a good thing would be to impose proper > restrictions here. For example, in find_bswap_or_nop_1 do > > if (TREE_CODE (rhs1) != SSA_NAME > || !has_single_use (rhs1)) But then the example in gcc.dg/optimize-bswapdi-2.c would not work for instance. Same for swap32_b in gcc.dg/optimize-bswapsi-1.c To make it work you'd need to check that there is no use outside the sets of statements that form the bitwise OR operation you are considering. Best regards, Thomas
Re: [GSoC] decision tree first steps
On Tue, Jun 10, 2014 at 1:57 PM, Richard Biener wrote: > On Tue, Jun 10, 2014 at 1:06 PM, Prathamesh Kulkarni > wrote: >> On Fri, Jun 6, 2014 at 7:18 PM, Richard Biener >> wrote: >>> On Fri, Jun 6, 2014 at 12:02 PM, Richard Biener >>> wrote: On Fri, Jun 6, 2014 at 11:02 AM, Prathamesh Kulkarni wrote: > On Mon, Jun 2, 2014 at 6:14 PM, Richard Biener > wrote: >> On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni >> wrote: >>> I have few questions regarding genmatch: >>> >>> a) Why is 4 hard-coded here: ? >>> in write_nary_simplifiers: >>> fprintf (f, " tree captures[4] = {};\n"); >> >> Magic number (this must be big enough for all cases ...). Honestly >> this should be improved (but requires another scan over the matcher IL >> to figure out the max N used in @N). >> >>> b) Should we add syntax for a symbol to denote multiple operators ? >>> For exampleim in simplify_rotate: >>> (X << CNT1) OP (X >> CNT2) with OP being +, |, ^ (CNT1 + CNT2 == >>> bitsize of type of X). >> >> Something to enhance the IL with, yes. I'd say we support >> >> (define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR) >> >> thus, >> >> (define_op op...) >> >>> c) Remove for parsing capture in parse_expr since we reject outermost >>> captured expressions ? >> >> but parse_expr is also used for inner expressions, no? >> >> (plus (minus@2 @0 @1) @3) >> >> should still work >> >>> d) I am not able to follow this comment in match.pd: >>> /* Patterns required to avoid SCCVN testsuite regressions. */ >>> >>> /* (x >> 31) & 1 -> (x >> 31). Folding in fold-const is more >>>complicated here, it does >>> Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1)) >>> (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1)) >>> if the new mask might be further optimized. */ >>> (match_and_simplify >>> (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep) >>> if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) == 0) >>> @0) >> >> The comment is literally copied from the case I extracted the >> (simplified) variant from fold-const.c. See lines 11961-12056 in >> fold-const.c. >> It'll be a challenge to implement the equivalent in a pattern ;) >> >>> >>> Decision Tree. >>> I have tried to come up with a prototype for decision tree (patch >>> attached). >>> For simplicity, it handles patterns involving only unary operators and >>> no predicates >>> and returns false when the pattern fails to match (no goto to match >>> another pattern). >>> I meant to post it only for illustration, and I have not really paid >>> attention to code quality (bad formatting, memory leaks, etc.). >>> >>> * Basic Idea >>> A pattern consists of following parts: match, ifexpr and result. >>> Let's call as "simplification" operand. >>> The common prefix between different match operands would be represented >>> by same nodes in the decision tree. >>> >>> Example: >>> (negate (bit_not @0)) >>> S1 >>> >>> (negate (negate @0)) >>> S2 >>> >>> S1, S2 denote simplifications for the above patterns respectively. >>> >>> The decision tree would look something like >>> (that's the way it gets constructed with the patch): >>> >>> dummy/root >>> | >>>NEGATE_EXPR >>> / \ >>> BIT_NOT NEGATE_EXPR >>>| | >>> @0 @0 >>>| | >>> S1 S2 >>> >>> a) The children of an internal node are number of decisions that >>> can be taken at that node. In the above case it's 2 for outer >>> NEGATE_EXPR. >>> b) Simplification operand represents leaves of the decision tree >>> c) Instead of having list of heads, I have added one dummy node, >>> and heads become children of these dummy root node. >>> d) Code-gen for non-simplification operands involves generating, >>> "matching" code and for simplification operands involves generating >>> "transform" code >>> >>> * Overall Flow: >>> I guess we would build the decision tree from the AST. >>> So the flow would be like: >>> source -> struct simplify (match, ifexpr, result) -> decision tree -> c >>> code. >>> >>> Something like (in main): >>> decision_tree dt; >>> while (there is another pattern) >>> { >>> simplify *s = parse_match_and_simplify (); >>> insert s into decision tree; >>> }; >>> So parsing routines are concerned with parsing and building the AST >>> (operand), >>> and not with the d
Re: Best way to compute cost of a sequence of gimple stmt
On Wed, Jun 11, 2014 at 10:27 AM, Thomas Preud'homme wrote: >> From: Richard Biener [mailto:richard.guent...@gmail.com] >> Sent: Wednesday, June 11, 2014 4:09 PM > >> > >> > Oh I see. Doing it there would mean instead of two independent >> > operations you'd do the best combination possible, is that right? >> >> Yes (but probably it's not worth the trouble). > > I understood that. > >> >> > I'm tempted to use a simple heuristic such as comparing the >> > number of loads before and after, adding one if the load is >> > unaligned. So in the above example, supposing that there is >> > some computation done around x[0] before the return line, >> > we'd have 2 loads before Vs 2 x is unaligned and we would >> > cancel the optimization. If x is aligned the optimization would >> > proceed. >> > >> > Do you thing this approach is also too much trouble or would >> > not work? >> >> I'm not sure. For noop-loads I'd keep them unconditionally, even if >> unaligned. I'd disable unaligned-load + bswap for now. People >> interested and sitting on such a target should do the measurements >> and decide if it's worth the trouble (is arm affected?). > > Yes it is. > >> >> But I see that the code currently does not limit itself to single-use >> chains and thus may end up keeping the whole original code life >> by unrelated uses. So a good thing would be to impose proper >> restrictions here. For example, in find_bswap_or_nop_1 do >> >> if (TREE_CODE (rhs1) != SSA_NAME >> || !has_single_use (rhs1)) > > But then the example in gcc.dg/optimize-bswapdi-2.c would not > work for instance. Same for swap32_b in gcc.dg/optimize-bswapsi-1.c > > To make it work you'd need to check that there is no use outside the > sets of statements that form the bitwise OR operation you are > considering. Yes, of course (also for generic shuffles that may have duplicate entries). Richard. > Best regards, > > Thomas > >
Re: [GSoC] decision tree first steps
On Wed, Jun 11, 2014 at 10:51 AM, Richard Biener wrote: > On Tue, Jun 10, 2014 at 1:57 PM, Richard Biener > wrote: >> On Tue, Jun 10, 2014 at 1:06 PM, Prathamesh Kulkarni >> wrote: >>> On Fri, Jun 6, 2014 at 7:18 PM, Richard Biener >>> wrote: On Fri, Jun 6, 2014 at 12:02 PM, Richard Biener wrote: > On Fri, Jun 6, 2014 at 11:02 AM, Prathamesh Kulkarni > wrote: >> On Mon, Jun 2, 2014 at 6:14 PM, Richard Biener >> wrote: >>> On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni >>> wrote: I have few questions regarding genmatch: a) Why is 4 hard-coded here: ? in write_nary_simplifiers: fprintf (f, " tree captures[4] = {};\n"); >>> >>> Magic number (this must be big enough for all cases ...). Honestly >>> this should be improved (but requires another scan over the matcher IL >>> to figure out the max N used in @N). >>> b) Should we add syntax for a symbol to denote multiple operators ? For exampleim in simplify_rotate: (X << CNT1) OP (X >> CNT2) with OP being +, |, ^ (CNT1 + CNT2 == bitsize of type of X). >>> >>> Something to enhance the IL with, yes. I'd say we support >>> >>> (define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR) >>> >>> thus, >>> >>> (define_op op...) >>> c) Remove for parsing capture in parse_expr since we reject outermost captured expressions ? >>> >>> but parse_expr is also used for inner expressions, no? >>> >>> (plus (minus@2 @0 @1) @3) >>> >>> should still work >>> d) I am not able to follow this comment in match.pd: /* Patterns required to avoid SCCVN testsuite regressions. */ /* (x >> 31) & 1 -> (x >> 31). Folding in fold-const is more complicated here, it does Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1)) (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1)) if the new mask might be further optimized. */ (match_and_simplify (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep) if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) == 0) @0) >>> >>> The comment is literally copied from the case I extracted the >>> (simplified) variant from fold-const.c. See lines 11961-12056 in >>> fold-const.c. >>> It'll be a challenge to implement the equivalent in a pattern ;) >>> Decision Tree. I have tried to come up with a prototype for decision tree (patch attached). For simplicity, it handles patterns involving only unary operators and no predicates and returns false when the pattern fails to match (no goto to match another pattern). I meant to post it only for illustration, and I have not really paid attention to code quality (bad formatting, memory leaks, etc.). * Basic Idea A pattern consists of following parts: match, ifexpr and result. Let's call as "simplification" operand. The common prefix between different match operands would be represented by same nodes in the decision tree. Example: (negate (bit_not @0)) S1 (negate (negate @0)) S2 S1, S2 denote simplifications for the above patterns respectively. The decision tree would look something like (that's the way it gets constructed with the patch): dummy/root | NEGATE_EXPR / \ BIT_NOT NEGATE_EXPR | | @0 @0 | | S1 S2 a) The children of an internal node are number of decisions that can be taken at that node. In the above case it's 2 for outer NEGATE_EXPR. b) Simplification operand represents leaves of the decision tree c) Instead of having list of heads, I have added one dummy node, and heads become children of these dummy root node. d) Code-gen for non-simplification operands involves generating, "matching" code and for simplification operands involves generating "transform" code * Overall Flow: I guess we would build the decision tree from the AST. So the flow would be like: source -> struct simplify (match, ifexpr, result) -> decision tree -> c code. Something like (in main): decision_tree dt; while (there is another pattern) { simplify *s = parse_match_and_si
Re: broken links?
On 06/10/2014 06:20 PM, Hebenstreit, Michael wrote: Either something is broken on my web-access or the links on https://gcc.gnu.org/install/prerequisites.html pointing to ftp://gcc.gnu.org/pub/gcc/infrastructure/ are gone - can't find the files anywhere else :( Thanks for all your efforts Michael Michael Hebenstreit Senior Cluster Architect Intel Corporation, MS: RR1-105/H14 Software and Services Group/DCE 4100 Sara Road Tel.: +1 505-794-3144 Rio Rancho, NM 87124 UNITED STATES E-mail: michael.hebenstr...@intel.com Hello, yesterday someone noticed on IRC a similar problem caused by switching web pages from http to https. More precisely, webkit based browsers have problems with combination of http and https links for the same site, Chrome console error from https://gcc.gnu.org/: [blocked] The page at 'https://gcc.gnu.org/' was loaded over HTTPS, but ran insecure content from 'http://gcc.gnu.org/gcc.css': this content should also be loaded over HTTPS. I suggest to change all 'http://' and 'https://' links to '//'. Thanks, Martin
RE: implicit gnat_malloc seen as vararg function
Hi, I have more info concerning my gnat_malloc problem. I watched the code in gcc/ada/gcc-interface/trans.c and found the location where malloc_decl tree is built. In gigi ()function (trans.c:411), the ftype for malloc_decl is done this way : ftype = build_function_type_list (ptr_void_type_node, sizetype, NULL_TREE); I looked at the code of build_function_type_list_1 in tree.c and found that stdargs function are built with their chained list of argument ending with a void_list_node : { last = args; args = nreverse (args); TREE_CHAIN (last) = void_list_node; } But I also noticed that void_list_node was a null pointer!! instead of being a node with TREE_VALUE = void_type_node and no TREE_CHAIN (as described in tree.h) So I wondered, where should void_list_node be initialized in GNAT frontend and noticed that the initialization of void_list_node was done AFTER its use to declare ftype for malloc_decl. ftype for malloc_decl is initialized in gigi() function at trans.c:411 whereas void_list_node in only initialized in gigi() function at trans.c:665 by a call to gnat_install_builtins() ---> install_builtin_elementary_types() --> void_list_node = build_void_list_node (); It seems like a bug. Is it a known one ? someone has the proper fix ? I put the call to gnat_install_builtins() higher inside gigi() and it solve my current problem but I'm not sure exactly when it has to be called and if there are bad side effects. Regards, Selim Belbachir
ACM SIGPLAN Programming Languages Software Award
It gives me great pleasure to announce that GCC has won the ACM SIGPLAN Programming Languages Software Award Congratulations to the entire GCC Community! - David
Re: [GSoC] decision tree first steps
On Wed, Jun 11, 2014 at 12:53 PM, Richard Biener wrote: > On Wed, Jun 11, 2014 at 10:51 AM, Richard Biener > wrote: >> On Tue, Jun 10, 2014 at 1:57 PM, Richard Biener >> wrote: >>> On Tue, Jun 10, 2014 at 1:06 PM, Prathamesh Kulkarni >>> wrote: On Fri, Jun 6, 2014 at 7:18 PM, Richard Biener wrote: > On Fri, Jun 6, 2014 at 12:02 PM, Richard Biener > wrote: >> On Fri, Jun 6, 2014 at 11:02 AM, Prathamesh Kulkarni >> wrote: >>> On Mon, Jun 2, 2014 at 6:14 PM, Richard Biener >>> wrote: On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni wrote: > I have few questions regarding genmatch: > > a) Why is 4 hard-coded here: ? > in write_nary_simplifiers: > fprintf (f, " tree captures[4] = {};\n"); Magic number (this must be big enough for all cases ...). Honestly this should be improved (but requires another scan over the matcher IL to figure out the max N used in @N). > b) Should we add syntax for a symbol to denote multiple operators ? > For exampleim in simplify_rotate: > (X << CNT1) OP (X >> CNT2) with OP being +, |, ^ (CNT1 + CNT2 == > bitsize of type of X). Something to enhance the IL with, yes. I'd say we support (define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR) thus, (define_op op...) > c) Remove for parsing capture in parse_expr since we reject outermost > captured expressions ? but parse_expr is also used for inner expressions, no? (plus (minus@2 @0 @1) @3) should still work > d) I am not able to follow this comment in match.pd: > /* Patterns required to avoid SCCVN testsuite regressions. */ > > /* (x >> 31) & 1 -> (x >> 31). Folding in fold-const is more >complicated here, it does > Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1)) > (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1)) > if the new mask might be further optimized. */ > (match_and_simplify > (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep) > if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) == 0) > @0) The comment is literally copied from the case I extracted the (simplified) variant from fold-const.c. See lines 11961-12056 in fold-const.c. It'll be a challenge to implement the equivalent in a pattern ;) > > Decision Tree. > I have tried to come up with a prototype for decision tree (patch > attached). > For simplicity, it handles patterns involving only unary operators and > no predicates > and returns false when the pattern fails to match (no goto to match > another pattern). > I meant to post it only for illustration, and I have not really paid > attention to code quality (bad formatting, memory leaks, etc.). > > * Basic Idea > A pattern consists of following parts: match, ifexpr and result. > Let's call as "simplification" operand. > The common prefix between different match operands would be > represented > by same nodes in the decision tree. > > Example: > (negate (bit_not @0)) > S1 > > (negate (negate @0)) > S2 > > S1, S2 denote simplifications for the above patterns respectively. > > The decision tree would look something like > (that's the way it gets constructed with the patch): > > dummy/root > | >NEGATE_EXPR > / \ > BIT_NOT NEGATE_EXPR >| | > @0 @0 >| | > S1 S2 > > a) The children of an internal node are number of decisions that > can be taken at that node. In the above case it's 2 for outer > NEGATE_EXPR. > b) Simplification operand represents leaves of the decision tree > c) Instead of having list of heads, I have added one dummy node, > and heads become children of these dummy root node. > d) Code-gen for non-simplification operands involves generating, > "matching" code and for simplification operands involves generating > "transform" code > > * Overall Flow: > I guess we would build the decision tree from the AST. > So the flow would be like: > source -> struct simplify (match, ifexpr, result) -> decision tree
Re: [GSoC] decision tree first steps
On 6/11/14, Richard Biener wrote: > On Wed, Jun 11, 2014 at 12:53 PM, Richard Biener > wrote: >> On Wed, Jun 11, 2014 at 10:51 AM, Richard Biener >> wrote: >>> On Tue, Jun 10, 2014 at 1:57 PM, Richard Biener >>> wrote: On Tue, Jun 10, 2014 at 1:06 PM, Prathamesh Kulkarni wrote: > On Fri, Jun 6, 2014 at 7:18 PM, Richard Biener > wrote: >> On Fri, Jun 6, 2014 at 12:02 PM, Richard Biener >> wrote: >>> On Fri, Jun 6, 2014 at 11:02 AM, Prathamesh Kulkarni >>> wrote: On Mon, Jun 2, 2014 at 6:14 PM, Richard Biener wrote: > On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni > wrote: >> I have few questions regarding genmatch: >> >> a) Why is 4 hard-coded here: ? >> in write_nary_simplifiers: >> fprintf (f, " tree captures[4] = {};\n"); > > Magic number (this must be big enough for all cases ...). > Honestly > this should be improved (but requires another scan over the matcher > IL > to figure out the max N used in @N). > >> b) Should we add syntax for a symbol to denote multiple operators >> ? >> For exampleim in simplify_rotate: >> (X << CNT1) OP (X >> CNT2) with OP being +, |, ^ (CNT1 + CNT2 == >> bitsize of type of X). > > Something to enhance the IL with, yes. I'd say we support > > (define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR) > > thus, > > (define_op op...) > >> c) Remove for parsing capture in parse_expr since we reject >> outermost >> captured expressions ? > > but parse_expr is also used for inner expressions, no? > > (plus (minus@2 @0 @1) @3) > > should still work > >> d) I am not able to follow this comment in match.pd: >> /* Patterns required to avoid SCCVN testsuite regressions. */ >> >> /* (x >> 31) & 1 -> (x >> 31). Folding in fold-const is more >>complicated here, it does >> Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1)) >> (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1)) >> if the new mask might be further optimized. */ >> (match_and_simplify >> (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep) >> if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) >> == 0) >> @0) > > The comment is literally copied from the case I extracted the > (simplified) variant from fold-const.c. See lines 11961-12056 in > fold-const.c. > It'll be a challenge to implement the equivalent in a pattern ;) > >> >> Decision Tree. >> I have tried to come up with a prototype for decision tree >> (patch attached). >> For simplicity, it handles patterns involving only unary operators >> and >> no predicates >> and returns false when the pattern fails to match (no goto to >> match >> another pattern). >> I meant to post it only for illustration, and I have not really >> paid >> attention to code quality (bad formatting, memory leaks, etc.). >> >> * Basic Idea >> A pattern consists of following parts: match, ifexpr and result. >> Let's call as "simplification" operand. >> The common prefix between different match operands would be >> represented >> by same nodes in the decision tree. >> >> Example: >> (negate (bit_not @0)) >> S1 >> >> (negate (negate @0)) >> S2 >> >> S1, S2 denote simplifications for the above patterns >> respectively. >> >> The decision tree would look something like >> (that's the way it gets constructed with the patch): >> >> dummy/root >> | >>NEGATE_EXPR >> / \ >> BIT_NOT NEGATE_EXPR >>| | >> @0 @0 >>| | >> S1 S2 >> >> a) The children of an internal node are number of decisions that >> can be taken at that node. In the above case it's 2 for outer >> NEGATE_EXPR. >> b) Simplification operand represents leaves of the decision tree >> c) Instead of having list of heads, I have added one dummy node, >> and heads become children of these dummy root node. >> d) Code-gen for non-simplification operands involves generating, >> "matching" code and for simplification operands involves >>>
Re: [GSoC] decision tree first steps
On 6/11/14, Prathamesh Kulkarni wrote: > On 6/11/14, Richard Biener wrote: >> On Wed, Jun 11, 2014 at 12:53 PM, Richard Biener >> wrote: >>> On Wed, Jun 11, 2014 at 10:51 AM, Richard Biener >>> wrote: On Tue, Jun 10, 2014 at 1:57 PM, Richard Biener wrote: > On Tue, Jun 10, 2014 at 1:06 PM, Prathamesh Kulkarni > wrote: >> On Fri, Jun 6, 2014 at 7:18 PM, Richard Biener >> wrote: >>> On Fri, Jun 6, 2014 at 12:02 PM, Richard Biener >>> wrote: On Fri, Jun 6, 2014 at 11:02 AM, Prathamesh Kulkarni wrote: > On Mon, Jun 2, 2014 at 6:14 PM, Richard Biener > wrote: >> On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni >> wrote: >>> I have few questions regarding genmatch: >>> >>> a) Why is 4 hard-coded here: ? >>> in write_nary_simplifiers: >>> fprintf (f, " tree captures[4] = {};\n"); >> >> Magic number (this must be big enough for all cases ...). >> Honestly >> this should be improved (but requires another scan over the >> matcher >> IL >> to figure out the max N used in @N). >> >>> b) Should we add syntax for a symbol to denote multiple >>> operators >>> ? >>> For exampleim in simplify_rotate: >>> (X << CNT1) OP (X >> CNT2) with OP being +, |, ^ (CNT1 + CNT2 >>> == >>> bitsize of type of X). >> >> Something to enhance the IL with, yes. I'd say we support >> >> (define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR) >> >> thus, >> >> (define_op op...) >> >>> c) Remove for parsing capture in parse_expr since we reject >>> outermost >>> captured expressions ? >> >> but parse_expr is also used for inner expressions, no? >> >> (plus (minus@2 @0 @1) @3) >> >> should still work >> >>> d) I am not able to follow this comment in match.pd: >>> /* Patterns required to avoid SCCVN testsuite regressions. */ >>> >>> /* (x >> 31) & 1 -> (x >> 31). Folding in fold-const is more >>>complicated here, it does >>> Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1)) >>> (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1)) >>> if the new mask might be further optimized. */ >>> (match_and_simplify >>> (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep) >>> if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) >>> == 0) >>> @0) >> >> The comment is literally copied from the case I extracted the >> (simplified) variant from fold-const.c. See lines 11961-12056 in >> fold-const.c. >> It'll be a challenge to implement the equivalent in a pattern ;) >> >>> >>> Decision Tree. >>> I have tried to come up with a prototype for decision tree >>> (patch attached). >>> For simplicity, it handles patterns involving only unary >>> operators >>> and >>> no predicates >>> and returns false when the pattern fails to match (no goto to >>> match >>> another pattern). >>> I meant to post it only for illustration, and I have not really >>> paid >>> attention to code quality (bad formatting, memory leaks, etc.). >>> >>> * Basic Idea >>> A pattern consists of following parts: match, ifexpr and result. >>> Let's call as "simplification" operand. >>> The common prefix between different match operands would be >>> represented >>> by same nodes in the decision tree. >>> >>> Example: >>> (negate (bit_not @0)) >>> S1 >>> >>> (negate (negate @0)) >>> S2 >>> >>> S1, S2 denote simplifications for the above patterns >>> respectively. >>> >>> The decision tree would look something like >>> (that's the way it gets constructed with the patch): >>> >>> dummy/root >>> | >>>NEGATE_EXPR >>> / \ >>> BIT_NOT NEGATE_EXPR >>>| | >>> @0 @0 >>>| | >>> S1 S2 >>> >>> a) The children of an internal node are number of decisions that >>> can be taken at that node. In the above case it's 2 for outer >>> NEGATE_EXPR. >>> b) Simplification operand represents leaves of the decision tree >>> c) Instead of having list of heads, I have added one dummy node, >>>
gcc-4.9-20140611 is now available
Snapshot gcc-4.9-20140611 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20140611/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.9 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch revision 211481 You'll find: gcc-4.9-20140611.tar.bz2 Complete GCC MD5=992e436cde7387effcc5ead9dc9253ec SHA1=c202aaf3293c3d13f4e53322bffe4540d69fbb4d Diffs from 4.9-20140604 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.9 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.