Re: RFC: Doc update for attribute
After thinking about this some more, I believe I have some better text. Previously I used the word "discouraged" to describe this practice. The existing docs use the term "avoid." I believe what you want is something more like the attached. Direct and clear, just like docs should be.

If you are ok with this, I'll send it to gcc-patches.

dw

> > +While it
> > +is discouraged, it is possible to write your own prologue/epilogue code
> > +using asm and use ``C'' code in the middle.
>
> I wouldn't remove the last sentence since IMO it's not the intent of the
> feature to ever support that and the compiler doesn't guarantee it and
> may result in wrong code given that `naked' is a fragile low-level
> feature.

I'm assuming you meant "would remove."

I wasn't comfortable including that sentence, but I was following the existing docs. Since they said you could "only" use basic asm, following that with a warning to "avoid" locals/if/etc was really confusing without this text.

Also, as ugly as this is, apparently some people really do this (comment 6):

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43404#c6

We don't have to doc every crazy thing people try to do with gcc. But since it's out there, maybe we should this time? If only to discourage it.

I'm *slightly* more in favor of keeping it. But if you still feel it should go, it's gone.

Index: extend.texi
===================================================================
--- extend.texi (revision 210624)
+++ extend.texi (working copy)
@@ -3332,16 +3332,15 @@

 @item naked
 @cindex function without a prologue/epilogue code
-Use this attribute on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX and SPU
-ports to indicate that the specified function does not need prologue/epilogue
-sequences generated by the compiler.
-It is up to the programmer to provide these sequences. The
-only statements that can be safely included in naked functions are
-@code{asm} statements that do not have operands.  All other statements,
-including declarations of local variables, @code{if} statements, and so
-forth, should be avoided.  Naked functions should be used to implement the
-body of an assembly function, while allowing the compiler to construct
-the requisite function declaration for the assembler.
+This attribute is available on the ARM, AVR, MCORE, MSP430, NDS32,
+RL78, RX and SPU ports.  It allows the compiler to construct the
+requisite function declaration, while allowing the body of the
+function to be assembly code.  The specified function will not have
+prologue/epilogue sequences generated by the compiler.  Only Basic
+@code{asm} statements can safely be included in naked functions
+(@pxref{Basic Asm}).  While using Extended @code{asm} or a mixture of
+Basic @code{asm} and ``C'' code may appear to work, they cannot be
+depended upon to work reliably and are not supported.

 @item near
 @cindex functions that do not handle memory bank switching on 68HC11/68HC12
@@ -6269,6 +6268,8 @@
 efficient code, and in most cases it is a better solution. When writing
 inline assembly language outside of C functions, however, you must use
 Basic @code{asm}. Extended @code{asm} statements have to be inside a C
 function.
+Functions declared with the @code{naked} attribute also require Basic
+@code{asm} (@pxref{Function Attributes}).

 Under certain circumstances, GCC may duplicate (or remove duplicates of)
 your assembly code when optimizing. This can lead to unexpected duplicate
@@ -6388,6 +6389,8 @@
 Note that Extended @code{asm} statements must be inside a function. Only
 Basic @code{asm} may be outside functions (@pxref{Basic Asm}).
+Functions declared with the @code{naked} attribute also require Basic
+@code{asm} (@pxref{Function Attributes}).

 While the uses of @code{asm} are many and varied, it may help to think
 of an @code{asm} statement as a series of low-level instructions that
 convert input
Re: RFC: Doc update for attribute
On 05/16/2014 07:16 PM, Carlos O'Donell wrote:
> On 05/12/2014 11:13 PM, David Wohlferd wrote:
>> After updating gcc's docs about inline asm, I'm trying to improve some
>> of the related sections. One that I feel has problems with clarity is
>> __attribute__ naked. I have attached my proposed update.
>> Comments/corrections are welcome.
>>
>> In a related question: To better understand how this attribute is used,
>> I looked at the Linux kernel. While the existing docs say "only ... asm
>> statements that do not have operands" can safely be used, Linux
>> routinely uses asm WITH operands.
>
> That's a bug. Period. You must not use naked with an asm that has
> operands. Any kind of operand might inadvertently cause the compiler to
> generate code and that would violate the requirements of the attribute
> and potentially generate an ICE.

There is a target hook TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS that is intended to cater for that case. For example, the documentation indicates it only works with optimization turned off, but I don't know how reliable it is in general. For the avr target it works as expected.

https://gcc.gnu.org/onlinedocs/gccint/Misc.html#index-TARGET_005fALLOCATE_005fSTACK_005fSLOTS_005fFOR_005fARGS-4969

Johann
Roadmap for 4.9.1, 4.10.0 and onwards?
Hi,

I've been tracking the latest releases of gcc since 4.7 or so (variously interested in C++1y support, cilk and openmp). One thing I've found hard to locate is information about planned inclusions for future releases. As much relies on unpredictable community contributions, I don't expect there to be a concrete or reliable plan. However, equally, I'm sure the steering committee has some ideas about what ought to be in upcoming releases. Is this published anywhere?

For example, if I look at https://gcc.gnu.org/projects/cxx1y.html there are 3 items marked "no" under C++14 support. Which, if any, are tabled for 4.10.0? More generally, what targets (obviously subject to change) are there for 4.10.0? Or 4.9.1?

Regards,
Bruce.
Supported targets
Hi,

Slightly related to my previous question about the roadmap. I have two quite old targets based on (so far as I know) standard Linux distributions. Should they still be supported?

RHEL4 (kernel 2.6.9-55.ELsmp): I was able to compile 4.8.1 successfully when it was released. 4.9.0 fails as below. RHEL4 is end of life (but not extended life). My feeling is this ought to work and is probably a regression I should report?

SUSE LINUX Enterprise Server 9 (i586) (kernel 2.6.5-7.111-smp): I was able to compile gcc 4.7.0 successfully when it was released. I had less luck with 4.8.0. 4.9.0 fails as below. However, this machine/distribution is so old it is not unreasonable to say it should be scrapped.

My main targets are RHEL5 and RHEL6 which work perfectly. I also tried bootstrapping using 4.8.1 to build 4.9.0 on RHEL4 and 4.7.0 to build 4.9.0 on the SUSE box rather than the ancient system-installed versions (RHEL4 = gcc 3.4.6, SUSE 9 = 3.3.3) but without success.

Regards,
Bruce.

RHEL4 (kernel 2.6.9-55.ELsmp):

[snip]
../../../../gcc-4.9.0/libsanitizer/include/system/linux/aio_abi.h:2:32: fatal error: linux/aio_abi.h: No such file or directory
 #include_next <linux/aio_abi.h>
                                ^
compilation terminated.
make[3]: *** [sanitizer_platform_limits_linux.lo] Error 1
make[3]: Leaving directory `/development/brucea/gcc/build/build/x86_64-unknown-linux-gnu/libsanitizer/sanitizer_common'
make[2]: *** [install-recursive] Error 1
[snip]

SUSE LINUX Enterprise Server 9 (i586) (kernel 2.6.5-7.111-smp):

[snip]
/development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../../../../i686-pc-linux-gnu/bin/ld: /home/brucea/gcc4.9/lib/libmpfr.so: undefined reference to symbol '___tls_get_addr@@GLIBC_2.3'
/development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../../../../i686-pc-linux-gnu/bin/ld: note: '___tls_get_addr@@GLIBC_2.3' is defined in DSO /lib/ld-linux.so.2 so try adding it to the linker command line
/lib/ld-linux.so.2: could not read symbols: Invalid operation
collect2: error: ld returned 1 exit status
make[3]: *** [cc1] Error 1
[snip]

Requires a later version of glibc?
Re: Supported targets
> [snip]
> /development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../../../../i686-pc-linux-gnu/bin/ld: /home/brucea/gcc4.9/lib/libmpfr.so: undefined reference to symbol '___tls_get_addr@@GLIBC_2.3'
> /development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../../../../i686-pc-linux-gnu/bin/ld: note: '___tls_get_addr@@GLIBC_2.3' is defined in DSO /lib/ld-linux.so.2 so try adding it to the linker command line
> /lib/ld-linux.so.2: could not read symbols: Invalid operation
> collect2: error: ld returned 1 exit status
> make[3]: *** [cc1] Error 1
>
> [snip]
>
> Requires a later version of glibc?

Yes, glibc 2.4 is required for GCC 4.9 because of this.

--
Eric Botcazou
Re: Supported targets
On 20 May 2014 11:26, Bruce Adams wrote: > > RHEL4 (kernel 2.6.9-55.ELsmp): > > > I was able to compile 4.8.1 successfully when it was released. 4.9.0 fails as > below. > RHEL4 is end of life (but not extended life). > > My feeling is this ought to work and is probably a regression I should report? Yes, I think it should be reported if it isn't in Bugzilla yet. You can use --disable-libsanitizer to build GCC without the failing library.
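The suggested workaround might look like the following configure invocation (a sketch: the prefix, source path, and language list are illustrative, not from the thread; only `--disable-libsanitizer` is the relevant flag):

```shell
# Build GCC 4.9.0 without libsanitizer, which fails to build against
# RHEL4-era kernel headers.  Paths and other options are illustrative.
mkdir -p build && cd build
../gcc-4.9.0/configure --prefix=/opt/gcc-4.9.0 \
    --enable-languages=c,c++ \
    --disable-libsanitizer
make -j4 && make install
```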
Re: Supported targets
On 20 May 2014 11:55, Eric Botcazou wrote:
>> [snip]
>> /development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../../../../i686-pc-linux-gnu/bin/ld: /home/brucea/gcc4.9/lib/libmpfr.so: undefined reference to symbol '___tls_get_addr@@GLIBC_2.3'
>> /development/dev1/brucea/gcc4.7/bin/../lib/gcc/i686-pc-linux-gnu/4.7.0/../../../../i686-pc-linux-gnu/bin/ld: note: '___tls_get_addr@@GLIBC_2.3' is defined in DSO /lib/ld-linux.so.2 so try adding it to the linker command line
>> /lib/ld-linux.so.2: could not read symbols: Invalid operation
>> collect2: error: ld returned 1 exit status
>> make[3]: *** [cc1] Error 1
>>
>> [snip]
>>
>> Requires a later version of glibc?
>
> Yes, glibc 2.4 is required for GCC 4.9 because of this.

Should that be noted at https://gcc.gnu.org/install/specific.html#x-x-linux-gnu ?
Re: Supported targets
> > Yes, glibc 2.4 is required for GCC 4.9 because of this.
>
> Should that be noted at
> https://gcc.gnu.org/install/specific.html#x-x-linux-gnu ?

Probably, unless someone knows how to work around it. We traced it to the missing AS_NEEDED in /usr/lib/libc.so:

/* GNU ld script
   Use the shared library, but some functions are only in
   the static library, so try that secondarily.  */
OUTPUT_FORMAT(elf32-i386)
GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a AS_NEEDED ( /lib/ld-linux.so.2 ) )

--
Eric Botcazou
Re: Supported targets
On Tue, May 20, 2014 at 01:14:24PM +0200, Eric Botcazou wrote:
> > > Yes, glibc 2.4 is required for GCC 4.9 because of this.
> >
> > Should that be noted at
> > https://gcc.gnu.org/install/specific.html#x-x-linux-gnu ?
>
> Probably, unless someone knows how to work around it. We traced it to the
> missing AS_NEEDED in /usr/lib/libc.so:
>
> /* GNU ld script
>    Use the shared library, but some functions are only in
>    the static library, so try that secondarily.  */
> OUTPUT_FORMAT(elf32-i386)
> GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a AS_NEEDED ( /lib/ld-linux.so.2 ) )

But that should be generally needed only when linking with -Wl,-z,defs; without it the linker shouldn't care.

Jakub
Re: Supported targets
> But that should be generally needed only when linking with -Wl,-z,defs , > without it the linker shouldn't care. Yet using a local libc.so with the missing AS_NEEDED is a (poor) workaround. -- Eric Botcazou
Re: [GSoC] writing test-case
On Mon, May 19, 2014 at 5:51 PM, Michael Matz wrote:
> Hi,
>
> On Thu, 15 May 2014, Richard Biener wrote:
>
>> To me predicate (and capture without expression or predicate) differs
>> from expression in that predicate is clearly a leaf of the expression
>> tree while we have to recurse into expression operands.
>>
>> Now, if we want to support applying predicates to the midst of an
>> expression, like
>>
>> (plus predicate(minus @0 @1)
>>       @2)
>> (...)
>>
>> then this would no longer be true. At the moment you'd write
>>
>> (plus (minus@3 @0 @1)
>>       @2)
>> if (predicate (@3))
>> (...)
>>
>> which makes it clearer IMHO (with the decision tree building you'd
>> apply the predicates after matching the expression tree anyway I
>> suppose, so code generation would be equivalent).
>
> Syntaxwise I had this idea for adding generic predicates to expressions:
>
> (plus (minus @0 @1):predicate
>       @2)
> (...)

So you'd write

(plus @0 :integer_zerop)

instead of

(plus @0 integer_zerop)

?

> If prefix or suffix doesn't matter much, but using a different syntax
> to separate expression from predicate seems to make things clearer.
> Optionally adding things like and/or for predicates might also make sense:
>
> (plus (minus @0 @1):positive_p(@0) || positive_p(@1)
>       @2)
> (...)

Negation would be more useful I guess. You open up a can of worms with ordering though:

(plus (minus @0 @1) @2:operand_equal_p (@1, @2, 0))

which might be declared invalid or is equivalent to

(plus (minus @0 @1) @2):operand_equal_p (@1, @2, 0)

?

Note that your predicate placement doesn't match placement of captures for non-innermost expressions. Capturing the outer plus would be

(plus@3 (minus @0 @1) @2)

not

(plus (minus @0 @1) @2)@3

so maybe apply predicates there as well:

(plus:operand_equal_p (@1, @2, 0) (minus @0 @1) @2)

But I still think that doing all predicates within an if-expr makes the pattern less convoluted.
Enabling/disabling a whole set of patterns with a common condition might still be a worthwhile addition.

Richard.

> Ciao,
> Michael.
Re: [GSoC] first phase
On Mon, May 19, 2014 at 7:30 PM, Prathamesh Kulkarni wrote: > Hi, >Unfortunately I shall need to take this week off, due to university exams, > which are up-to 27th May. I will start working from 28th on pattern > matching with decision tree, and try to cover up for the first week. I > am extremely sorry about this. > I thought I would be able to do both during exam week, but the exam > load has become too much -:( Ok. > In the first phase (up-to 23rd June), I hope to get genmatch ready: > a) pattern matching with decision tree. > b) Add patterns to test genmatch. > c) Depending upon the patterns, extending the meta-description > d) Other fixes: > > * capturing outermost expressions. > For example this pattern does not get simplified > (match_and_simplify > (plus@2 (negate @0) @1) > if (!TYPE_SATURATING (TREE_TYPE (@2))) > (minus @1 @0)) > I guess this happens because in write_nary_simplifiers: > if (s->match->type != OP_EXPR) > continue; Yeah. > Maybe this is not correct way to fix this, should we also pass lhs to > generated gimple_match_and_simplify ? I guess that would be the capture > for outermost expression. Unfortunately it is not available for all API entries. The type of the expression is, though. I lean towards rejecting the capture at parsing time and providing a "special" capture (for example @@, or just @0, or @T to denote it's a type, or just refer "magically" to 'type'). That is, (match_and_simplify (plus (negate @0) @1) if (!TYPE_SATURATING (type)) (minus @1 @0)) works for me. > For above pattern, I guess @2 represents lhs. > > So for this test-case: > int foo (int x, int y) > { > int t1 = -x; > int t2 = t1 + y; > return t2; > } > t2 would be @2, t1 would be @0 and y would be @1. > Is that correct ? > This would create issues when lhs is NULL, for example, > in call to built-in functions ? Yeah, or if the machinery is called via gimple_build () where there is no existing lhs. 
> * avoid using statement expressions for code gen of expression > * rewriting code-generator using visitor classes, and other refactoring > (using std::string for example), etc. > > I have a very rough time-line in mind, for completing tasks: > 28th may - 31st may > a) Have test-case for each pattern present (except COND_EXPR) in match.pd > I guess most of it is already done, a few patterns are remaining. Good. > b) Small fixes (for example, those mentioned above). Good. > c) Have an initial idea/prototype for implementing decision tree > > 1st June - 15th June > a) Implementing decision tree > b) Adding patterns in match.pd to test the decision tree in match.pd, > and accompanying test-cases in tree-ssa/match-*.c > > 16th June - 23rd June > a) Support for GENERIC code generation. > b) Refactoring and backup time for backlog. > > GENERIC code generation: > I am a bit confused about this. Currently, pattern matching is > implemented for GENERIC. However I believe simplification is done on > GIMPLE. > For example: > (match_and_simplify > (plus (negate @0) @1) > (minus @0 @1)) > If given input is GENERIC , it would do matching on GENERIC, but shall > transform (minus @0 @1) to it's GIMPLE equivalent. > Is that correct ? Correct. Err, not sure what it will do - I implemented it only to support the weird cases where GENERIC is nested inside GIMPLE, like for a_2 = b_3 < 0 ? c_4 : d_5; thus the comment in match.pd: /* Due to COND_EXPRs weirdness in GIMPLE the following won't work without some hacks in the code generator. */ (match_and_simplify (cond (bit_not @0) @1 @2) (cond @0 @2 @1)) the code generator would need to know that COND_EXPR has a GENERIC op0 ... same applies to REALPART_EXPR, but there the hacks are already in place ;) > > * Should we have a separate GENERIC match-and-simplify API like for gimple > instead of having GENERIC matching in gimple_match_and_simplify ? Yes. The GENERIC API follows the API of fold_{unary,binary,ternary}. 
I suppose we simply provide a slightly different name for them (but use the original API for recursing and call ourselves from the original API). > * Do we add another pattern type, something like > generic_match_and_simplify that will do the transform on GENERIC > for example: > (generic_match_and_simplify > (plus (negate @0) @1) > (minus @0 @1)) > would produce GENERIC equivalent of (minus @0 @1). > > or maybe keep match_and_simplify, and tell the transform operand > to produce GENERIC. > Something like: > (match_and_simplify > (plus (negate @0) @1) > GENERIC: (minus @0 @1)) we simply process each pattern twice, once we generate the GIMPLE match-and-simplify routine and once we generate the GENERIC match-and-simplify routine. The patterns are supposed to be the same for both and always apply to both. > Another thing I would like to do in first phase is figure out dependencies > of tree-ssa-forwprop on GENERIC folding (for instance fold_comparison > patterns). Yeah. Having patterns for comparison simpli
Re: [GSoC] writing test-case
Hi,

On Tue, 20 May 2014, Richard Biener wrote:
> > Syntaxwise I had this idea for adding generic predicates to expressions:
> >
> > (plus (minus @0 @1):predicate
> >       @2)
> > (...)
>
> So you'd write
>
> (plus @0 :integer_zerop)
>
> instead of
>
> (plus @0 integer_zerop)
>
> ?

plus is binary, where is your @1? If you want to not capture the second operand but still have it tested for a predicate, then yes, the first form it would be.

> > If prefix or suffix doesn't matter much, but using a different syntax
> > to separate expression from predicate seems to make things clearer.
> > Optionally adding things like and/or for predicates might also make sense:
> >
> > (plus (minus @0 @1):positive_p(@0) || positive_p(@1)
> >       @2)
> > (...)
>
> Negation would be more useful I guess. You open up a can of worms with
> ordering though:
>
> (plus (minus @0 @1) @2:operand_equal_p (@1, @2, 0))
>
> which might be declared invalid or is equivalent to

It wouldn't necessarily be invalid; the predicate would apply to @2, but check operands 1 and 0 as well, which might be surprising. In this case it might indeed be equivalent to:

> (plus (minus @0 @1) @2):operand_equal_p (@1, @2, 0)

> Note that your predicate placement doesn't match placement of captures
> for non-innermost expressions. Capturing the outer plus would be
>
> (plus@3 (minus @0 @1) @2)

You're right, I'd allow placing the predicate directly behind the capture, i.e.:

(plus@3:predicate (minus @0 @1) @2)

> But I still think that doing all predicates within an if-expr makes the
> pattern less convoluted.

I think it simply depends on the scope of the predicate. If it's a predicate applying to multiple operands from different nesting levels, an if-expr is clearer (IMHO). If it applies to one operand it seems more natural to place it directly next to that operand. I.e.:

(minus @0 @1:non_negative)   // better

vs.
(minus @0 @1)
(if (non_negative (@1))

But:

(plus@3 (minus @0 @1) @2)   // better
(if (operand_equal_p (@1, @2, 0))

vs:

(plus@3:operand_equal_p (@1, @2, 0) (minus @0 @1) @2)

That is, we could require that predicates that are applied with ':' need to be unary and apply to the one expression to which they are bound.

> Enabling/disabling a whole set of patterns with a common condition
> might still be a worthwhile addition.

Right, but that seems orthogonal to the above?

Ciao,
Michael.
Re: [GSoC] first phase
On Tue, May 20, 2014 at 5:46 PM, Richard Biener wrote: > On Mon, May 19, 2014 at 7:30 PM, Prathamesh Kulkarni > wrote: >> Hi, >>Unfortunately I shall need to take this week off, due to university exams, >> which are up-to 27th May. I will start working from 28th on pattern >> matching with decision tree, and try to cover up for the first week. I >> am extremely sorry about this. >> I thought I would be able to do both during exam week, but the exam >> load has become too much -:( > > Ok. > >> In the first phase (up-to 23rd June), I hope to get genmatch ready: >> a) pattern matching with decision tree. >> b) Add patterns to test genmatch. >> c) Depending upon the patterns, extending the meta-description >> d) Other fixes: >> >> * capturing outermost expressions. >> For example this pattern does not get simplified >> (match_and_simplify >> (plus@2 (negate @0) @1) >> if (!TYPE_SATURATING (TREE_TYPE (@2))) >> (minus @1 @0)) >> I guess this happens because in write_nary_simplifiers: >> if (s->match->type != OP_EXPR) >> continue; > > Yeah. > >> Maybe this is not correct way to fix this, should we also pass lhs to >> generated gimple_match_and_simplify ? I guess that would be the capture >> for outermost expression. > > Unfortunately it is not available for all API entries. The type of the > expression is, though. > > I lean towards rejecting the capture at parsing time and providing > a "special" capture (for example @@, or just @0, or @T to denote > it's a type, or just refer "magically" to 'type'). That is, > > (match_and_simplify > (plus (negate @0) @1) > if (!TYPE_SATURATING (type)) > (minus @1 @0)) > > works for me. > >> For above pattern, I guess @2 represents lhs. >> >> So for this test-case: >> int foo (int x, int y) >> { >> int t1 = -x; >> int t2 = t1 + y; >> return t2; >> } >> t2 would be @2, t1 would be @0 and y would be @1. >> Is that correct ? >> This would create issues when lhs is NULL, for example, >> in call to built-in functions ? 
> > Yeah, or if the machinery is called via gimple_build () where > there is no existing lhs. > >> * avoid using statement expressions for code gen of expression >> * rewriting code-generator using visitor classes, and other refactoring >> (using std::string for example), etc. >> >> I have a very rough time-line in mind, for completing tasks: >> 28th may - 31st may >> a) Have test-case for each pattern present (except COND_EXPR) in match.pd >> I guess most of it is already done, a few patterns are remaining. > > Good. > >> b) Small fixes (for example, those mentioned above). > > Good. > >> c) Have an initial idea/prototype for implementing decision tree >> >> 1st June - 15th June >> a) Implementing decision tree >> b) Adding patterns in match.pd to test the decision tree in match.pd, >> and accompanying test-cases in tree-ssa/match-*.c >> >> 16th June - 23rd June >> a) Support for GENERIC code generation. >> b) Refactoring and backup time for backlog. >> >> GENERIC code generation: >> I am a bit confused about this. Currently, pattern matching is >> implemented for GENERIC. However I believe simplification is done on >> GIMPLE. >> For example: >> (match_and_simplify >> (plus (negate @0) @1) >> (minus @0 @1)) >> If given input is GENERIC , it would do matching on GENERIC, but shall >> transform (minus @0 @1) to it's GIMPLE equivalent. >> Is that correct ? > > Correct. Err, not sure what it will do - I implemented it only to support > the weird cases where GENERIC is nested inside GIMPLE, like for > a_2 = b_3 < 0 ? c_4 : d_5; thus the comment in match.pd: > > /* Due to COND_EXPRs weirdness in GIMPLE the following won't work >without some hacks in the code generator. */ > (match_and_simplify > (cond (bit_not @0) @1 @2) > (cond @0 @2 @1)) > > the code generator would need to know that COND_EXPR has > a GENERIC op0 ... 
same applies to REALPART_EXPR, but there > the hacks are already in place ;) > >> >> * Should we have a separate GENERIC match-and-simplify API like for gimple >> instead of having GENERIC matching in gimple_match_and_simplify ? > > Yes. The GENERIC API follows the API of fold_{unary,binary,ternary}. > I suppose we simply provide a slightly different name for them > (but use the original API for recursing and call ourselves from the original > API). > >> * Do we add another pattern type, something like >> generic_match_and_simplify that will do the transform on GENERIC >> for example: >> (generic_match_and_simplify >> (plus (negate @0) @1) >> (minus @0 @1)) >> would produce GENERIC equivalent of (minus @0 @1). >> >> or maybe keep match_and_simplify, and tell the transform operand >> to produce GENERIC. >> Something like: >> (match_and_simplify >> (plus (negate @0) @1) >> GENERIC: (minus @0 @1)) > > we simply process each pattern twice, once we generate the > GIMPLE match-and-simplify routine and once we generate the > GENERIC match-and-simplify routine. The patterns are supposed > to be the same for bo
Re: [GSoC] first phase
On Tue, May 20, 2014 at 2:59 PM, Prathamesh Kulkarni wrote: > On Tue, May 20, 2014 at 5:46 PM, Richard Biener > wrote: >> On Mon, May 19, 2014 at 7:30 PM, Prathamesh Kulkarni >> wrote: >>> Hi, >>>Unfortunately I shall need to take this week off, due to university >>> exams, >>> which are up-to 27th May. I will start working from 28th on pattern >>> matching with decision tree, and try to cover up for the first week. I >>> am extremely sorry about this. >>> I thought I would be able to do both during exam week, but the exam >>> load has become too much -:( >> >> Ok. >> >>> In the first phase (up-to 23rd June), I hope to get genmatch ready: >>> a) pattern matching with decision tree. >>> b) Add patterns to test genmatch. >>> c) Depending upon the patterns, extending the meta-description >>> d) Other fixes: >>> >>> * capturing outermost expressions. >>> For example this pattern does not get simplified >>> (match_and_simplify >>> (plus@2 (negate @0) @1) >>> if (!TYPE_SATURATING (TREE_TYPE (@2))) >>> (minus @1 @0)) >>> I guess this happens because in write_nary_simplifiers: >>> if (s->match->type != OP_EXPR) >>> continue; >> >> Yeah. >> >>> Maybe this is not correct way to fix this, should we also pass lhs to >>> generated gimple_match_and_simplify ? I guess that would be the capture >>> for outermost expression. >> >> Unfortunately it is not available for all API entries. The type of the >> expression is, though. >> >> I lean towards rejecting the capture at parsing time and providing >> a "special" capture (for example @@, or just @0, or @T to denote >> it's a type, or just refer "magically" to 'type'). That is, >> >> (match_and_simplify >> (plus (negate @0) @1) >> if (!TYPE_SATURATING (type)) >> (minus @1 @0)) >> >> works for me. >> >>> For above pattern, I guess @2 represents lhs. >>> >>> So for this test-case: >>> int foo (int x, int y) >>> { >>> int t1 = -x; >>> int t2 = t1 + y; >>> return t2; >>> } >>> t2 would be @2, t1 would be @0 and y would be @1. 
>>> Is that correct ? >>> This would create issues when lhs is NULL, for example, >>> in call to built-in functions ? >> >> Yeah, or if the machinery is called via gimple_build () where >> there is no existing lhs. >> >>> * avoid using statement expressions for code gen of expression >>> * rewriting code-generator using visitor classes, and other refactoring >>> (using std::string for example), etc. >>> >>> I have a very rough time-line in mind, for completing tasks: >>> 28th may - 31st may >>> a) Have test-case for each pattern present (except COND_EXPR) in match.pd >>> I guess most of it is already done, a few patterns are remaining. >> >> Good. >> >>> b) Small fixes (for example, those mentioned above). >> >> Good. >> >>> c) Have an initial idea/prototype for implementing decision tree >>> >>> 1st June - 15th June >>> a) Implementing decision tree >>> b) Adding patterns in match.pd to test the decision tree in match.pd, >>> and accompanying test-cases in tree-ssa/match-*.c >>> >>> 16th June - 23rd June >>> a) Support for GENERIC code generation. >>> b) Refactoring and backup time for backlog. >>> >>> GENERIC code generation: >>> I am a bit confused about this. Currently, pattern matching is >>> implemented for GENERIC. However I believe simplification is done on >>> GIMPLE. >>> For example: >>> (match_and_simplify >>> (plus (negate @0) @1) >>> (minus @0 @1)) >>> If given input is GENERIC , it would do matching on GENERIC, but shall >>> transform (minus @0 @1) to it's GIMPLE equivalent. >>> Is that correct ? >> >> Correct. Err, not sure what it will do - I implemented it only to support >> the weird cases where GENERIC is nested inside GIMPLE, like for >> a_2 = b_3 < 0 ? c_4 : d_5; thus the comment in match.pd: >> >> /* Due to COND_EXPRs weirdness in GIMPLE the following won't work >>without some hacks in the code generator. 
*/ >> (match_and_simplify >> (cond (bit_not @0) @1 @2) >> (cond @0 @2 @1)) >> >> the code generator would need to know that COND_EXPR has >> a GENERIC op0 ... same applies to REALPART_EXPR, but there >> the hacks are already in place ;) >> >>> >>> * Should we have a separate GENERIC match-and-simplify API like for gimple >>> instead of having GENERIC matching in gimple_match_and_simplify ? >> >> Yes. The GENERIC API follows the API of fold_{unary,binary,ternary}. >> I suppose we simply provide a slightly different name for them >> (but use the original API for recursing and call ourselves from the original >> API). >> >>> * Do we add another pattern type, something like >>> generic_match_and_simplify that will do the transform on GENERIC >>> for example: >>> (generic_match_and_simplify >>> (plus (negate @0) @1) >>> (minus @0 @1)) >>> would produce GENERIC equivalent of (minus @0 @1). >>> >>> or maybe keep match_and_simplify, and tell the transform operand >>> to produce GENERIC. >>> Something like: >>> (match_and_simplify >>> (plus (negate @0) @1) >>> GENERIC: (minus @0 @1)) >>
Re: [GSoC] writing test-case
On Tue, May 20, 2014 at 2:20 PM, Michael Matz wrote: > Hi, > > On Tue, 20 May 2014, Richard Biener wrote: > >> > Syntaxwise I had this idea for adding generic predicates to expressions: >> > >> > (plus (minus @0 @1):predicate >> > @2) >> > (...) >> >> So you'd write >> >> (plus @0 :integer_zerop) >> >> instead of >> >> (plus @0 integer_zerop) >> >> ? > > plus is binary, where is your @1? I know it's zero so I don't need it captured. (match_and_simplify (plus @0 integer_zerop) @0) mind that all predicates apply to leafs only at the moment. > If you want to not capture the second > operand but still have it tested for a predicates, then yes, the first > form it would be. Ok. >> >> > If prefix or suffix doesn't matter much, but using a different syntax >> > to separate expression from predicate seems to make things clearer. >> > Optionally adding things like and/or for predicates might also make sense: >> > >> > (plus (minus @0 @1):positive_p(@0) || positive_p(@1) >> > @2) >> > (...) >> >> negation whould be more useful I guess. You open up a can of >> worms with ordering though: >> >> (plus (minus @0 @1) @2:operand_equal_p (@1, @2, 0)) >> >> which might be declared invalid or is equivalent to > > It wouldn't necessarily be invalid, the predicate would apply to @2; > but check operands 1 and 0 as well, which might be surprising. In this > case it might indeed be equivalent to : > >> (plus (minus @0 @1) @2):operand_equal_p (@1, @2, 0) > > > >> Note that your predicate placement doesn't match placement of >> captures for non-innermost expressions. capturing the outer >> plus would be >> >> (plus@3 (minus @0 @1) @2) > > > You're right, I'd allow placing the predicate directly behind the capture, > i.e.: > > (plus@3:predicate (minus @0 @1) @2) > >> But I still think that doing all predicates within a if-expr makes the >> pattern less convoluted. > > I think it simply depends on the scope of the predicate. 
If it's a
> predicate applying to multiple operands from different nesting levels, an
> if-expr is clearer (IMHO). If it applies to one operand it seems more
> natural to place it directly next to that operand. I.e.:
>
> (minus @0 @1:non_negative) // better
>
> vs.
>
> (minus @0 @1)
> (if (non_negative (@1))
>
> But:
>
> (plus@3 (minus @0 @1) @2) // better
> (if (operand_equal_p (@1, @2, 0))
>
> vs:
>
> (plus@3:operand_equal_p (@1, @2, 0) (minus @0 @1) @2)
>
> That is, we could require that predicates applied with ':' need to
> be unary and apply to the one expression to which they are bound.

Your example applies to a leaf, which we already support as

(minus @0 non_negative@1)

Is there a non-convoluted example where the predicate applies to a non-leaf position?

>> Enabling/disabling a whole set of patterns with a common condition
>> might still be a worthwhile addition.
>
> Right, but that seems orthogonal to the above?

Right.

Richard.

>
> Ciao,
> Michael.
Re: RFC: Doc update for attribute
On 05/20/2014 03:02 AM, David Wohlferd wrote: > After thinking about this some more, I believe I have some better > text. Previously I used the word "discouraged" to describe this > practice. The existing docs use the term "avoid." I believe what you > want is something more like the attached. Direct and clear, just like > docs should be. David, Thanks for the new patch. > If you are ok with this, I'll send it to gcc-patches. Looks good to me. Cheers, Carlos. > Index: extend.texi > === > --- extend.texi (revision 210624) > +++ extend.texi (working copy) > @@ -3332,16 +3332,15 @@ > > @item naked > @cindex function without a prologue/epilogue code > -Use this attribute on the ARM, AVR, MCORE, MSP430, NDS32, RL78, RX and SPU > -ports to indicate that the specified function does not need prologue/epilogue > -sequences generated by the compiler. > -It is up to the programmer to provide these sequences. The > -only statements that can be safely included in naked functions are > -@code{asm} statements that do not have operands. All other statements, > -including declarations of local variables, @code{if} statements, and so > -forth, should be avoided. Naked functions should be used to implement the > -body of an assembly function, while allowing the compiler to construct > -the requisite function declaration for the assembler. > +This attribute is available on the ARM, AVR, MCORE, MSP430, NDS32, > +RL78, RX and SPU ports. It allows the compiler to construct the > +requisite function declaration, while allowing the body of the > +function to be assembly code. The specified function will not have > +prologue/epilogue sequences generated by the compiler. Only Basic > +@code{asm} statements can safely be included in naked functions > +(@pxref{Basic Asm}). While using Extended @code{asm} or a mixture of > +Basic @code{asm} and ``C'' code may appear to work, they cannot be > +depended upon to work reliably and are not supported. 
> > @item near > @cindex functions that do not handle memory bank switching on 68HC11/68HC12 > @@ -6269,6 +6268,8 @@ > efficient code, and in most cases it is a better solution. When writing > inline assembly language outside of C functions, however, you must use Basic > @code{asm}. Extended @code{asm} statements have to be inside a C function. > +Functions declared with the @code{naked} attribute also require Basic > +@code{asm} (@pxref{Function Attributes}). > > Under certain circumstances, GCC may duplicate (or remove duplicates of) > your > assembly code when optimizing. This can lead to unexpected duplicate > @@ -6388,6 +6389,8 @@ > > Note that Extended @code{asm} statements must be inside a function. Only > Basic @code{asm} may be outside functions (@pxref{Basic Asm}). > +Functions declared with the @code{naked} attribute also require Basic > +@code{asm} (@pxref{Function Attributes}). > > While the uses of @code{asm} are many and varied, it may help to think of an > @code{asm} statement as a series of low-level instructions that convert > input
Re: RFC: Doc update for attribute
On 05/20/2014 03:59 AM, Georg-Johann Lay wrote: > Am 05/16/2014 07:16 PM, schrieb Carlos O'Donell: >> On 05/12/2014 11:13 PM, David Wohlferd wrote: >>> After updating gcc's docs about inline asm, I'm trying to >>> improve some of the related sections. One that I feel has >>> problems with clarity is __attribute__ naked. >>> >>> I have attached my proposed update. Comments/corrections are >>> welcome. >>> >>> In a related question: >>> >>> To better understand how this attribute is used, I looked at the >>> Linux kernel. While the existing docs say "only ... asm >>> statements that do not have operands" can safely be used, Linux >>> routinely uses asm WITH operands. >> >> That's a bug. Period. You must not use naked with an asm that has >> operands. Any kind of operand might inadvertently cause the >> compiler to generate code and that would violate the requirements >> of the attribute and potentially generate an ICE. > > There is target hook TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS that is > intended to cater that case. For example, the documentation > indicates it only works with optimization turned off. But I don't > know how reliable it is in general. For avr target it works as > expected. > > https://gcc.gnu.org/onlinedocs/gccint/Misc.html#index-TARGET_005fALLOCATE_005fSTACK_005fSLOTS_005fFOR_005fARGS-4969 It's still a bug for now. That hook is there because we've allowed bad code to exist for so long that at this point we must for legacy reasons allow some type of input arguments in the asm. However, that doesn't mean we should actively promote this feature or let users use it (until we fix it). Ideally you do want to use the named input arguments as "r" types to avoid needing to know the exact registers used in the call sequence. Referencing the variables by name and letting gcc emit the right register is useful, but only if it works consistently and today it doesn't. 
Features that fail to work depending on the optimization level should not be promoted in the documentation. We should document what works and file bugs or fix what doesn't work. Cheers, Carlos.
Weird startup issue with -fsplit-stack
Hello, I'm trying to support -fsplit-stack in GNU Emacs. The most important problem is that GC uses conservative scanning of a C stack, so I need to iterate over stack segments. I'm doing this by using __splitstack_find, as described in libgcc/generic-morestack.c; but now I'm facing the weird issue with startup: Core was generated by `./temacs --batch --load loadup bootstrap'. Program terminated with signal SIGSEGV, Segmentation fault. #0 __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:486 486 pushq %rax (gdb) bt 10 #0 __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:486 #1 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #2 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #3 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #4 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #5 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #6 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #7 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #8 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #9 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 (More stack frames follow...) 
(gdb) bt -10 #87310 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #87311 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #87312 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #87313 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #87314 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #87315 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #87316 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #87317 0x005f15df in __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502 #87318 0x003791a21d65 in __libc_start_main (main=0x4d111d , argc=5, argv=0x7fffacc868d8, init=, fini=, rtld_fini=, stack_end=0x7fffacc868c8) at libc-start.c:285 #87319 0x00405f69 in _start () (gdb) Unfortunately I was unable to reproduce this issue with small test programs, so there is no simple and easy-to-use recipe. Anyway, if someone would like to try: bzr branch bzr://bzr.savannah.gnu.org/emacs/trunk cd trunk cat /path/to/emacs_split_stack.patch | patch -p0 # 'configure' options for 'smallest possible' configuration CPPFLAGS='-DSPLIT_STACK=1' CFLAGS='-O0 -g3 -fsplit-stack' ./configure --prefix=/some/dir --without-all --without-x --disable-acl make I'm using (homebrew) GCC 4.9.0 and (stock) gold 2.24 on a Fedora 20 system. Dmitry === modified file 'src/alloc.c' --- src/alloc.c 2014-05-19 19:19:05 + +++ src/alloc.c 2014-05-20 14:01:56 + @@ -4932,11 +4932,28 @@ #endif /* not GC_SAVE_REGISTERS_ON_STACK */ #endif /* not HAVE___BUILTIN_UNWIND_INIT */ - /* This assumes that the stack is a contiguous region in memory. If - that's not the case, something has to be done here to iterate - over the stack segments. */ +#ifdef SPLIT_STACK + + /* This assumes gcc >= 4.6.0 with -fsplit-stack + and corresponding support in libgcc. 
*/ + { +size_t stack_size; +extern void * __splitstack_find (void *, void *, size_t *, + void **, void **, void **); +void *next_segment = NULL, *next_sp = NULL, *initial_sp = NULL, *stack; + +while ((stack = __splitstack_find (next_segment, next_sp, &stack_size, + &next_segment, &next_sp, &initial_sp))) + mark_memory (stack, (char *) stack + stack_size); + } + +#else /* not SPLIT_STACK */ + + /* This assumes that the stack is a contiguous region in memory. */ mark_memory (stack_base, end); +#endif /* SPLIT_STACK */ + /* Allow for marking a secondary stack, like the register stack on the ia64. */ #ifdef GC_MARK_SECONDARY_STACK
Re: soft-fp functions support without using libgcc
>> If you have a working compiler that is missing some functions
>> provided by libgcc, that should be sufficient to build libgcc.

Meaning that even if I am unable to build libgcc for my new architecture, I should be able to provide soft-fp support for the architecture?

Btw, I get the following error when I build gcc:

configure:2627: error: in `/target-arch/target-arch-gcc/builddir/target-arch/libgcc':
configure:2630: error: cannot compute suffix of object files: cannot compile

And regarding soft-fp, I get the following error when I use soft-fp functions in a test program:

: In function `test':
(.text+0x0): undefined reference to `__floatsisf'
: In function `test':
(.text+0x2c): undefined reference to `__mulsf3'
: In function `test':
(.text+0x2e): undefined reference to `__fixsfsi'

Is this due to the libgcc build failure, or is it just a linking error?

>> In other words, if you want soft-fp for IEEE float, the job should be very
>> simple because that has already been done. If you want soft-fp for CDC 6000
>> float, you have to do a full implementation of that.

Actually I want soft-fp for standard IEEE 754.

Sheheryar

On Fri, May 16, 2014 at 6:34 PM, wrote:
>
> On May 16, 2014, at 12:25 PM, Ian Bolton wrote:
>>> On Fri, May 16, 2014 at 6:34 AM, Sheheryar Zahoor Qazi wrote:
 I am trying to provide soft-fp support to an 18-bit soft-core processor architecture at my university. But the problem is that libgcc has not been cross-compiled for my target architecture and
>>> some functions are missing, so I cannot build libgcc. I believe soft-fp is compiled in libgcc, so I am unable to invoke soft-fp functions from libgcc. Is it possible for me to provide soft-fp support without using
>>> libgcc? How should I proceed in defining the functions? Any idea? And does
>>> any architecture provide floating point support without using libgcc?
>>>
>>> I'm sorry, I don't understand the premise of your question. It is not
>>> necessary to build libgcc before building libgcc.
That would not make >>> sense. If you have a working compiler that is missing some functions >>> provided by libgcc, that should be sufficient to build libgcc. >> >> If you replace "cross-compiled" with "ported", I think it makes senses. >> Can one provide soft-fp support without porting libgcc for their >> architecture? > > By definition, in soft-fp you have to implement the FP operations in > software. That’s not quite the same as porting libgcc to the target > architecture. It should translate to porting libgcc (the FP emulation part) > to the floating point format being used. > > In other words, if you want soft-fp for IEEE float, the job should be very > simple because that has already been done. If you want soft-fp for CDC 6000 > float, you have to do a full implementation of that. > > paul >
Re: negative latencies
On 05/19/2014 02:13 AM, shmeel gutl wrote:
> Are there hooks in gcc to deal with negative latencies? In other
> words, an architecture that permits an instruction to use a result
> from an instruction that will be issued later.
>

Could you explain, with *an example*, what you are trying to achieve with the negative latency. The scheduler is based on a critical path algorithm. Generally speaking, latency can be negative for this algorithm. But I guess that is not what you are asking.

> At first glance it seems that it will break a few things.
> 1) The definition of dependencies cannot come from the simple ordering
> of rtl.
> 2) The scheduling problem starts to look like "get off the train 3
> stops before me".
> 3) The definition of live ranges needs to use actual instruction
> timing information, not just instruction sequencing.
>
> The hooks in the scheduler seem to be enough to stop damage but not
> enough to take advantage of this "feature".
>
Re: Roadmap for 4.9.1, 4.10.0 and onwards?
On Tue, 2014-05-20 at 11:09 +0100, Bruce Adams wrote: > Hi, > I've been tracking the latest releases of gcc since 4.7 or so (variously > interested in C++1y support, cilk and openmp). > One thing I've found hard to locate is information about planned inclusions > for future releases. > As much relies on unpredictable community contributions I don't expect there > to be a concrete or reliable plan. > However, equally I'm sure the steering committee have some ideas over what > ought > to be upcoming releases. As a whole, the steering committee does not have any idea, because GCC development is based upon volunteer contributions. However, some members of the steering committee might work in large organization having a team of GCC contributors. That team might have its own (private) agenda. But every patch has to be approved by someone else. So I don't think that the steering committee knows a lot more than you and me. Regards. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basilestarynkevitchnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mine, sont seulement les miennes} ***
RE: Roadmap for 4.9.1, 4.10.0 and onwards?
> -Original Message-
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf
> Of Basile Starynkevitch
> Sent: 20 May 2014 16:29
> To: Bruce Adams
> Cc: gcc@gcc.gnu.org
> Subject: Re: Roadmap for 4.9.1, 4.10.0 and onwards?
>
> On Tue, 2014-05-20 at 11:09 +0100, Bruce Adams wrote:
> > Hi,
> > I've been tracking the latest releases of gcc since 4.7 or so
> (variously interested in C++1y support, cilk and openmp).
> > One thing I've found hard to locate is information about planned
> inclusions for future releases.
> > As much relies on unpredictable community contributions I don't
> expect there to be a concrete or reliable plan.
> >
> > However, equally I'm sure the steering committee have some ideas
> over
> > what ought to be upcoming releases.
>
> As a whole, the steering committee does not have any idea, because GCC
> development is based upon volunteer contributions.
>

I understand the argument, but I am not sure it's the way to go. Even if the project is based on volunteer contributions, it would be interesting to have a tentative roadmap. This, I would think, would also help possible beginner volunteers know where to start if they wanted to contribute to the project. The roadmap could be a list of features (big or small) or bug fixes that we would like fixed for a particular version. Even if we don't want to name it a roadmap, it would still be interesting to have a list of things that are being worked on or in the process of being merged into mainline and that will therefore make it into the next major version.

That being said, I know it's hard to set some time apart to write this kind of thing, given most of us prefer to be hacking on GCC.
Re: soft-fp functions support without using libgcc
On Tue, May 20, 2014 at 7:37 AM, Sheheryar Zahoor Qazi wrote: >>>If you have a working compiler that is missing some functions >>>provided by libgcc, that should be sufficient to build libgcc. > Meaning that even if i am unable build libgcc to my new architecture, > I should be able to able to provide soft-fp support to the > architecture? You need to build soft-fp as part of libgcc. What I am saying is that you don't need soft-fp support in order to build libgcc. > Btw i get the following error when i build gcc: > configure:2627: error: in > `/target-arch/target-arch-gcc/builddir/target-arch/libgcc': > configure:2630: error: cannot compute suffix of object files: cannot compile You need to look in target-arch/libgcc/config.log to see what the problem is. > And regarding soft-fp, I get the following error when i use soft-fp > functions in a test program: > : In function `test': > (.text+0x0): undefined reference to `__floatsisf' > In function `test': > : In function `test': > (.text+0x2c): undefined reference to `__mulsf3' > : In function `test': > (.text+0x2e): undefined reference to `__fixsfsi' > > Is this due to libgcc build fail or it just linking error? It's because libgcc was not built. Ian
Re: Roadmap for 4.9.1, 4.10.0 and onwards?
- Original Message - > From: Paulo Matos > To: Basile Starynkevitch ; Bruce Adams > > Cc: "gcc@gcc.gnu.org" > Sent: Tuesday, May 20, 2014 5:04 PM > Subject: RE: Roadmap for 4.9.1, 4.10.0 and onwards? > >> -Original Message- >> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf >> Of Basile Starynkevitch >> Sent: 20 May 2014 16:29 >> To: Bruce Adams >> Cc: gcc@gcc.gnu.org >> Subject: Re: Roadmap for 4.9.1, 4.10.0 and onwards? >> >> On Tue, 2014-05-20 at 11:09 +0100, Bruce Adams wrote: >> > Hi, >> > I've been tracking the latest releases of gcc since 4.7 or so >> (variously interested in C++1y support, cilk and openmp). >> > One thing I've found hard to locate is information about planned >> inclusions for future releases. >> > As much relies on unpredictable community contributions I don't >> expect there to be a concrete or reliable plan. >> >> > However, equally I'm sure the steering committee have some ideas >> over >> > what ought to be upcoming releases. >> >> As a whole, the steering committee does not have any idea, because GCC >> development is based upon volunteer contributions. >> > > I understand the argument but I am not sure it's the way to go. Even if the > project is based on volunteer contributions it would be interesting to have a > tentative roadmap. This, I would think, would also help possible beginner > volunteers know where to start if they wanted to contribute to the project. > So > the roadmap could be a list of features (big or small) of bug fixes that we > would like fixed for a particular version. Even if we don't want to name it > roadmap it would still be interesting to have a list of things that are being > worked on or on the process of being merged into mainline and therefore will > make it to the next major version. > > That being said I know it's hard to set sometime apart to write this kind of > thing given most of us prefer to be hacking on GCC. 
From a newcomer point of
> view, however, not having things like a roadmap makes it look like the
> project is heading nowhere.
>

If you think of gcc as a large distributed agile project, the roadmap may be buried somewhere in the bug database. Perhaps it's a matter of mining the relevant details, or encouraging practices that make them mineable? Bugzilla has fields for assignee, priority and target milestone that could be used as hints. The trouble is that it's very low level. The intent is buried in the community's subjective interpretation of priority. I don't know how well that mirrors the actual values in the priority fields; I wouldn't expect it to without a conscious effort.

If I search for "ALL cilk 4.9" or "ALL cilk" it is still not obvious that the cilk branch was merged into mainline prior to release 4.9.0. Though that could be down to my unfamiliarity with more complex queries in bugzilla.

Regards,
Bruce.
Re: Roadmap for 4.9.1, 4.10.0 and onwards?
On 05/20/14 04:09, Bruce Adams wrote: Hi, I've been tracking the latest releases of gcc since 4.7 or so (variously interested in C++1y support, cilk and openmp). One thing I've found hard to locate is information about planned inclusions for future releases. As much relies on unpredictable community contributions I don't expect there to be a concrete or reliable plan. However, equally I'm sure the steering committee have some ideas over what ought to be upcoming releases. Is this published anywhere? The steering committee doesn't get involved in that aspect of development. It's just not in the committee's charter. There is no single roadmap for the GCC project and that's a direct result of the decentralized development. Looking forward to the next major GCC release (4.10 or 5.0): At a high level, wrapping up the C++11 ABI transition is high on the list for the next major GCC release. As is the ongoing efforts to clean up the polymorphism in gimple (and maybe RTL). Those aren't really user visible features, but they're a ton of work. I'm hoping the Intel team can push the last remaining Cilk+ feature through (Cilk_for). Jakub is working on Fortran support for OpenMP4. Others are working on OpenACC support. Richi's work on folding looks promising, but I'm not sure of its relative priority. There's work to bring AArch64 and Power 8 to first class support... Honza's work on IPA, etc etc. C++14 support will continue to land as bits are written. I'm certainly missing lots of important stuff... WRT to gcc-4.9.1, like most (all?) point releases, it's primarily meant to address bugs in the prior release. I wouldn't expect significant features to be appearing in 4.9.x releases. Jeff
Re: Weird startup issue with -fsplit-stack
On Tue, May 20, 2014 at 7:18 AM, Dmitry Antipov wrote:
>
> I'm trying to support -fsplit-stack in GNU Emacs. The most important problem is that
> GC uses conservative scanning of a C stack, so I need to iterate over stack segments.
> I'm doing this by using __splitstack_find, as described in libgcc/generic-morestack.c;
> but now I'm facing the weird issue with startup:
>
> Core was generated by `./temacs --batch --load loadup bootstrap'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:486
> 486 pushq %rax
> (gdb) bt 10
> #0 __morestack () at ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:486
> #1 0x005f15df in __morestack () at
> ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
> #2 0x005f15df in __morestack () at
> ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502
> #3 0x005f15df in __morestack () at
> ../../../gcc-4.9.0/libgcc/config/i386/morestack.S:502

This is the call to __morestack_block_signals in morestack.S. It should only be possible if __morestack_block_signals, or something it calls directly, has a split stack. __morestack_block_signals has the no_split_stack attribute, meaning that it should never call __morestack. __morestack_block_signals only calls pthread_sigmask or sigprocmask, neither of which should be compiled with -fsplit-stack. So something has gone wrong, but I don't know what.

I would recommend tracing the code instruction by instruction to see why __morestack_block_signals calls back into __morestack. Or, if that analysis is wrong, see what else is happening. I can advise but I don't have time to look at this in detail. Sorry.

Ian
Re: Roadmap for 4.9.1, 4.10.0 and onwards?
> If I search for "ALL cilk 4.9" or "ALL cilk" it is still not obvious that the > cilk branch > was merged into main prior to release 4.9.0. Though that could be down to my > unfamiliarity with more complex queries in bugzilla. Our bugzilla is usually used for tracking bugs, not merging of feature branches. https://gcc.gnu.org/gcc-4.9/changes.html#c-family announces the addition of Cilk Plus. Merges of major new features should probably also be announced on https://gcc.gnu.org/news.html
Re: negative latencies
On 20-May-14 06:13 PM, Vladimir Makarov wrote: On 05/19/2014 02:13 AM, shmeel gutl wrote: Are there hooks in gcc to deal with negative latencies? In other words, an architecture that permits an instruction to use a result from an instruction that will be issued later. Could you explain more on *an example* what are you trying to achieve with the negative latency. Scheduler is based on a critical path algorithm. Generally speaking latency time can be negative for this algorithm. But I guess that is not what you are asking. The architecture has an exposed pipeline where instructions read registers during the required cycle. So if one instruction produces its results in the third pipeline stage and a second instruction reads the register in the sixth pipeline stage, the second instruction can read the results of the first instruction even if it is issued three cycles earlier. The problem that I see is that the haifa scheduler schedules one cycle at a time, in a forward order, by picking from a list of instructions that can be scheduled without delays. So, in the above example, if instruction one is scheduled during cycle 3, it can't schedule instruction two during cycle 0, 1, or 2 because its producer dependency (instruction one) hasn't been scheduled yet. It won't be able to schedule it until cycle 3. So I am asking if there is an existing mechanism to back schedule instruction two once instruction one is issued. Thanks, Shmeel At first glance it seems that it will will break a few things. 1) The definition of dependencies cannot come from the simple ordering of rtl. 2) The scheduling problem starts to look like "get off the train 3 stops before me". 3) The definition of live ranges needs to use actual instruction timing information, not just instruction sequencing. The hooks in the scheduler seem to be enough to stop damage but not enough to take advantage of this "feature".
Re: Zero/Sign extension elimination using value ranges
On 20/05/14 16:52, Jakub Jelinek wrote: > On Tue, May 20, 2014 at 12:27:31PM +1000, Kugan wrote: >> 1. Handling NOP_EXPR or CONVERT_EXPR that are in the IL because they >> are required for type correctness. We have two cases here: >> >> A) Mode is smaller than word_mode. This is usually from where the >> zero/sign extensions are showing up in final assembly. >> For example : >> int = (int) short >> which usually expands to >> (set (reg:SI ) >> (sext:SI (subreg:HI (reg:SI >> We can expand this >> (set (reg:SI ) (((reg:SI >> >> If following is true: >> 1. Value stored in RHS and LHS are of the same signedness >> 2. Type can hold the value. i.e., In cases like char = (char) short, we >> check that the value in short is representable char type. (i.e. look at >> the value range in RHS SSA_NAME and see if that can be represented in >> types of LHS without overflowing) >> >> Subreg here is not a paradoxical subreg. We are removing the subreg and >> zero/sign extend here. >> >> I am assuming here that QI/HI registers are represented in SImode >> (basically word_mode) with zero/sign extend is used as in >> (zero_extend:SI (subreg:HI (reg:SI 117)). > > Wouldn't it be better to just set proper flags on the SUBREG based on value > range info (SUBREG_PROMOTED_VAR_P and SUBREG_PROMOTED_UNSIGNED_P)? > Then not only the optimizers could eliminate in zext/sext when possible, but > all other optimizations could benefit from that. Thanks for the comments. Here is an attempt (attached) that sets SUBREG_PROMOTED_VAR_P based on value range into. Is this the good place to do this ? 
Thanks,
Kugan

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b7f6360..d23ae76 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3120,6 +3120,60 @@ expand_return (tree retval)
     }
 }
 
+
+static bool
+is_assign_promotion_redundant (struct separate_ops *ops)
+{
+  double_int type_min, type_max;
+  double_int min, max;
+  bool uns = TYPE_UNSIGNED (ops->type);
+  double_int msb;
+
+  /* We remove extension for integral stmts.  */
+  if (!INTEGRAL_TYPE_P (ops->type))
+    return false;
+
+  if (TREE_CODE_CLASS (ops->code) == tcc_unary)
+    {
+      switch (ops->code)
+        {
+        case CONVERT_EXPR:
+        case NOP_EXPR:
+
+          /* Get the value range.  */
+          if (TREE_CODE (ops->op0) != SSA_NAME
+              || POINTER_TYPE_P (TREE_TYPE (ops->op0))
+              || get_range_info (ops->op0, &min, &max) != VR_RANGE)
+            return false;
+
+          msb = double_int_one.rshift (TYPE_PRECISION (TREE_TYPE (ops->op0)));
+          if (!uns && min.cmp (msb, uns) == 1
+              && max.cmp (msb, uns) == 1)
+            {
+              min = min.sext (TYPE_PRECISION (TREE_TYPE (ops->op0)));
+              max = max.sext (TYPE_PRECISION (TREE_TYPE (ops->op0)));
+            }
+
+          /* Signedness of LHS and RHS should match, or the value range of
+             the RHS should be all positive values, to make the zero/sign
+             extension redundant.  */
+          if ((uns != TYPE_UNSIGNED (TREE_TYPE (ops->op0)))
+              && (min.cmp (double_int_zero, TYPE_UNSIGNED (TREE_TYPE (ops->op0))) == -1))
+            return false;
+
+          type_max = tree_to_double_int (TYPE_MAX_VALUE (ops->type));
+          type_min = tree_to_double_int (TYPE_MIN_VALUE (ops->type));
+
+          /* If the RHS value range fits the LHS type, the zero/sign
+             extension is redundant.  */
+          if (max.cmp (type_max, uns) != 1
+              && (type_min.cmp (min, uns)) != 1)
+            return true;
+        }
+    }
+
+  return false;
+}
+
 /* A subroutine of expand_gimple_stmt, expanding one gimple statement
    STMT that doesn't require special handling for outgoing edges.  That
    is no tailcalls and no GIMPLE_COND.  */
@@ -3240,6 +3294,12 @@ expand_gimple_stmt_1 (gimple stmt)
         }
       ops.location = gimple_location (stmt);
 
+      if (promoted && is_assign_promotion_redundant (&ops))
+        {
+          promoted = false;
+          SUBREG_PROMOTED_VAR_P (target) = 0;
+        }
+
       /* If we want to use a nontemporal store, force the value to
          register first.  If we store into a promoted register,
          don't directly expand to target.  */
Re: Roadmap for 4.9.1, 4.10.0 and onwards?
> On 05/20/14 04:09, Bruce Adams wrote: > >Hi, I've been tracking the latest releases of gcc since 4.7 or so > >(variously interested in C++1y support, cilk and openmp). One thing > >I've found hard to locate is information about planned inclusions for > >future releases. As much relies on unpredictable community > >contributions I don't expect there to be a concrete or reliable plan. > >However, equally I'm sure the steering committee have some ideas over > >what ought to be upcoming releases. Is this published anywhere? > The steering committee doesn't get involved in that aspect of > development. It's just not in the committee's charter. > > There is no single roadmap for the GCC project and that's a direct > result of the decentralized development. > > Looking forward to the next major GCC release (4.10 or 5.0): > > At a high level, wrapping up the C++11 ABI transition is high on the > list for the next major GCC release. As is the ongoing efforts to > clean up the polymorphism in gimple (and maybe RTL). Those aren't > really user visible features, but they're a ton of work. > > I'm hoping the Intel team can push the last remaining Cilk+ feature > through (Cilk_for). Jakub is working on Fortran support for > OpenMP4. Others are working on OpenACC support. > > Richi's work on folding looks promising, but I'm not sure of its > relative priority. There's work to bring AArch64 and Power 8 to > first class support... Honza's work on IPA, etc etc. For IPA/FDO I think we are on track to merge some of more interesting Google's changes (autoFDO, perhaps LIPO and other FDO improvements) and Martin's pass for merging identical code. 
I am personally trying to focus on two things. The first is to clean
up the APIs of the symbol table and IPA infrastructure after the C++
conversion, and to get things working well for LTO of large binaries.
This is an important change for the optimizers, since we go from units
consisting of hundreds of functions to units consisting of millions of
functions, and the heuristics need to be retuned. I also hope we will
continue pushing bits that make LTO more transparent and reliable
(command line arguments, debug info, etc.).

Honza

> C++14 support will continue to land as bits are written.
>
> I'm certainly missing lots of important stuff...
>
> WRT gcc-4.9.1: like most (all?) point releases, it's primarily meant
> to address bugs in the prior release. I wouldn't expect significant
> features to be appearing in 4.9.x releases.
>
> Jeff
Reducing Register Pressure through Live Range Shrinking in Loops
Hello All:

Simpson's live range shrinking reduces register pressure by moving
arithmetic computations, rather than loads and stores. When a
computation's operands and result are live at the entry and exit of a
basic block but are not touched inside the block, the computation is
moved to the end of the block, reducing the register pressure inside
the block by one.

I propose extending Simpson's work from basic blocks to loops: if a
live range spans a loop, and is live at the entry and exit of the
loop, but the computation is not touched inside the loop, then the
computation is moved to just after the exit of the loop.

REDUCTION OF REGISTER PRESSURE THROUGH LIVE RANGE SHRINKING INSIDE LOOPS

for each loop, starting from the innermost and moving outward:
begin
  RELIEFIN(i)  = null, if i is the entry of the CFG
               = intersection over all predecessors j of RELIEFOUT(j), otherwise
  RELIEFOUT(i) = RELIEFIN(i) union exposed_relief(i)
  INSERT(i,j)  = (RELIEFOUT(i) minus RELIEFIN(j)) intersection LIVE(j)
end

Simpson's approach does take the nesting depth into consideration when
placing the computation to relieve register pressure. However, it does
not consider computations that span an entire loop, whose operands and
results are live at the entry and exit of the loop but are not touched
inside it; moving these can reduce register pressure inside the loop.
This would also be useful in a region-based register allocator for
live range splitting at region boundaries.

The extension of Simpson's approach is to run the data flow analysis
with respect to a given loop rather than the entire control flow
graph, starting from the innermost loop and extending outward. If the
computation is not referenced at a given nesting depth, it can be
placed accordingly.
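To make the data flow above concrete, here is a minimal sketch of the RELIEF-style propagation for one loop, in Python. The CFG shape, the exposed_relief and LIVE sets, and the choice of intersection as the meet operator are illustrative assumptions on my part, not an implementation from Simpson's work or GCC.

```python
# Sketch of a RELIEF-style data flow (illustrative assumptions only).

def relief_dataflow(blocks, preds, entry, exposed, live):
    """Iterate RELIEFIN/RELIEFOUT to a fixed point, then compute the
    per-edge INSERT sets: candidates available at the end of block i
    that are not yet available on entry to j but are live in j."""
    relief_in = {b: set() for b in blocks}
    relief_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                new_in = set()
            else:
                # meet (intersection) over all predecessors
                new_in = set.intersection(*(relief_out[p] for p in preds[b]))
            new_out = new_in | exposed[b]
            if (new_in, new_out) != (relief_in[b], relief_out[b]):
                relief_in[b], relief_out[b] = new_in, new_out
                changed = True
    insert = {(i, j): (relief_out[i] - relief_in[j]) & live[j]
              for j in blocks for i in preds[j]}
    return relief_in, relief_out, insert

# Tiny diamond CFG: the computation of t is movable ("exposed") in
# `left` only, and t is live at the join block.
blocks = ["entry", "left", "right", "join"]
preds = {"entry": [], "left": ["entry"], "right": ["entry"],
         "join": ["left", "right"]}
exposed = {"entry": set(), "left": {"t"}, "right": set(), "join": set()}
live = {"entry": set(), "left": {"t"}, "right": set(), "join": {"t"}}

rin, rout, ins = relief_dataflow(blocks, preds, "entry", exposed, live)
print(ins[("left", "join")])  # t must be materialized on the left->join edge
```

Running this reports that the only required insertion is on the left->join edge, i.e. the moved computation lands exactly where its value first becomes needed again.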
For a graph coloring register allocator, the live ranges formed by the
operands and results of such computations are pushed onto the stack
during the simplify phase of coloring, so there is a better chance
that these live ranges are colorable, which reduces register pressure.
This extends to a splitting approach based on containment of live
ranges.

OPTIMAL PLACEMENT OF THE COMPUTATION FOR SINGLE ENTRY, MULTIPLE EXIT LOOPS

For single entry, multiple exit loops, Simpson's placement leads to a
suboptimal solution, because the exit node of the loop does not
post-dominate all the basic blocks inside the loop. Placing the
computation just after the tail block of the loop can therefore
produce incorrect results. For a correct and optimal placement, the
computation must be placed in the block where all the exit paths of
the loop reconverge, which post-dominates all the blocks of the loop.
This handles register pressure reduction for loops that are single
entry with multiple exits.

Irreducible loops are converted to reducible form before register
allocation, so the transformation applies to structured control flow
and likewise reduces register pressure there.

The existing live range shrinkage that reduces register pressure takes
loads and stores into consideration, but not computations as proposed
by Simpson. I am proposing to extend GCC to move computations as
described above, for both single entry/single exit and single
entry/multiple exit loops.

Please let me know what you think.

Thanks & Regards,
Ajit
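As a sketch of the reconvergence-point placement for a single entry, multiple exit loop, the snippet below computes post-dominator sets on a toy CFG and checks that the merge block where both exit paths reconverge, and not either exit target itself, post-dominates every block of the loop. The CFG shape and block names are illustrative assumptions; the algorithm is the standard iterative post-dominator computation.

```python
# Post-dominators on a toy CFG containing a single-entry, two-exit
# loop {header, body}; block names are illustrative.

def post_dominators(succs, exit_block):
    """Standard iterative post-dominator sets over the reverse CFG:
    pdom(b) = {b} | intersection of pdom(s) over successors s."""
    blocks = list(succs)
    pdom = {b: set(blocks) for b in blocks}
    pdom[exit_block] = {exit_block}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == exit_block:
                continue
            new = {b} | set.intersection(*(pdom[s] for s in succs[b]))
            if new != pdom[b]:
                pdom[b] = new
                changed = True
    return pdom

# Loop {header, body}: header can leave via exit2, body via exit1;
# both exit paths reconverge at `merge`.
succs = {
    "entry": ["header"],
    "header": ["body", "exit2"],
    "body": ["header", "exit1"],
    "exit1": ["merge"],
    "exit2": ["merge"],
    "merge": ["end"],
    "end": [],
}
pdom = post_dominators(succs, "end")

loop = {"header", "body"}
# `merge` post-dominates the whole loop, so a computation moved out of
# the loop is safe there; neither exit target post-dominates the loop.
safe = all("merge" in pdom[b] for b in loop)
print(safe)  # True
```

The check shows why placing the computation after one exit (say exit1) would be wrong: exit1 does not post-dominate the loop header, so paths leaving through exit2 would never execute the moved computation.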
Re: Weird startup issue with -fsplit-stack
On 05/20/2014 10:16 PM, Ian Lance Taylor wrote:
> This is the call to __morestack_block_signals in morestack.S.  It
> should only be possible if __morestack_block_signals or something it
> calls directly has a split stack.  __morestack_block_signals has the
> no_split_stack attribute, meaning that it should never call
> __morestack.  __morestack_block_signals only calls pthread_sigmask
> or sigprocmask, neither of which should be compiled with
> -fsplit-stack.  So something has gone wrong, but I don't know what.

Thanks - that was the application's own copy of pthread_sigmask
(compiled with -fsplit-stack) linked into the binary due to a subtle
configuration issue.

The next major problem is that -fsplit-stack code randomly crashes
with a useless gdb backtrace, usually pointing to the very beginning
of the function (plus occasional "Cannot access memory at..."
messages), e.g.:

(gdb) bt 1
#0  0x005a615b in mark_object (arg=0) at ../../trunk/src/alloc.c:6039
6037	void
6038	mark_object (Lisp_Object arg)
==> 6039	{

IIUC, with a traditional stack this usually happens due to stack
overflow. But what may be the cause with -fsplit-stack? I do not
receive any error messages from libgcc, and there is a lot of free
heap memory. If it matters, mark_object is recursive, and the
recursion depth may be very high, up to a few tens of thousands of
calls.

Dmitry