Re: [gsoc 2014] moving fold-const patterns to gimple
On Tue, Mar 11, 2014 at 12:20 PM, Richard Biener wrote:
> On Mon, Mar 10, 2014 at 7:29 PM, Prathamesh Kulkarni wrote:
>> Hi Richard,
>> Sorry for the late reply. I would like to have a few clarifications
>> regarding the following points:
>>
>> a) Pattern matching: Currently, gimple_match_and_simplify() matches
>> patterns one-by-one. Could we use a decision tree to do the matching
>> instead (similar to insn-recog.c)?
>> For the moment, let's consider pattern matching on only unary
>> expressions without valueize and predicates:
>> pattern 1: (negate (negate @0))
>> pattern 2: (negate (bit_not @0))
>>
>> From the two ASTs corresponding to the patterns (match::op), we can build
>> a decision tree, something similar to:
>>
>>             NEGATE_EXPR
>>     NEGATE_EXPR     BIT_NOT_EXPR
>>
>> and then generate code corresponding to this decision tree in gimple-match.c
>> so the generated code should look something similar to:
>>
>> tree
>> gimple_match_and_simplify (enum tree_code code, tree type, tree op0,
>>                            gimple_seq *seq, tree (*valueize)(tree))
>> {
>>   if (code == NEGATE_EXPR)
>>     {
>>       tree captures[4] = {};
>>       if (TREE_CODE (op0) != SSA_NAME)
>>         return NULL_TREE;
>>       gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>       if (!is_gimple_assign (def_stmt))
>>         return NULL_TREE;
>>       tree op = gimple_assign_rhs1 (def_stmt);
>>       if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>         {
>>           /* pattern (negate (negate @0)) matched */
>>         }
>>       else if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>         {
>>           /* pattern (negate (bit_not @0)) matched */
>>         }
>>       else
>>         return NULL_TREE;
>>     }
>>   else
>>     return NULL_TREE;
>> }
>>
>> For commutative ops, the pattern can be duplicated by walking the
>> children of the node in reverse order.
>> (I am not exactly clear so far about representing binary operators in a
>> decision tree.) Is this the right way to go? I shall try to shortly post
>> a patch that implements this.
>
> Yes, that's the way to go (well, I'd even use a switch ()).
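As a rough illustration of the switch-based decision tree discussed in point (a), here is a minimal sketch over a toy expression type. Everything in it (toy_code, toy_expr, toy_match, toy_match_unary) is hypothetical and stands in for GCC's real tree codes and GIMPLE statements; it is not what genmatch generates, only the shape of the idea: the root code is tested once and both patterns share that test.

```c
#include <stddef.h>

/* Toy operation codes; stand-ins for GCC tree codes, not the real enum.  */
enum toy_code { TOY_LEAF, TOY_NEGATE, TOY_BIT_NOT };

/* Toy unary expression node (hypothetical, for illustration only).  */
struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op0;   /* operand, NULL for a leaf */
};

/* Which of the two example patterns matched, if any.  */
enum toy_match { MATCH_NONE, MATCH_NEG_NEG, MATCH_NEG_NOT };

/* Walk the decision tree.  The outer switch tests the root code once;
   the inner switch tests the operand's code.  On a match, *capture is
   set to the sub-expression bound to @0.  */
enum toy_match
toy_match_unary (const struct toy_expr *e, const struct toy_expr **capture)
{
  if (e == NULL || e->op0 == NULL)
    return MATCH_NONE;

  switch (e->code)
    {
    case TOY_NEGATE:
      switch (e->op0->code)
        {
        case TOY_NEGATE:
          /* pattern (negate (negate @0)) */
          if (e->op0->op0 == NULL)
            return MATCH_NONE;
          *capture = e->op0->op0;
          return MATCH_NEG_NEG;
        case TOY_BIT_NOT:
          /* pattern (negate (bit_not @0)) */
          if (e->op0->op0 == NULL)
            return MATCH_NONE;
          *capture = e->op0->op0;
          return MATCH_NEG_NOT;
        default:
          return MATCH_NONE;
        }
    default:
      return MATCH_NONE;
    }
}
```

Adding a third pattern rooted at NEGATE_EXPR would only add a case to the inner switch, which is the main attraction over trying each pattern in turn.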
>
>> b) Targeting GENERIC, separating AST from gimple/generic:
>> For generating a GENERIC pattern should there be another pattern
>> something like match_and_simplify_generic?
>
> Yes, there is an existing API in GCC for this that operates on GENERIC.
> It's fold_unary_loc, fold_binary_loc, fold_ternary_loc. The interface
> the GENERIC match_and_simplify variant provides should match
> that one.
>
>> Currently, the AST data structures (operand, expr, etc.)
>> are tied to gimple (gen_gimple_match, gen_gimple_transform).
>> We could also have similar functions: gen_generic_match,
>> gen_generic_transform for generating GENERIC?
>
> Yeah, but I'm not sure if keeping the (virtual) methods for generating
> code makes much sense with a rewritten code generator.
>
>> Instead will it be better if we separate the AST
>> from target IR (generic/gimple) and make simplify a visitor on AST
>> (make simplify an abstract class, with simplify_generic and
>> simplify_gimple visitor classes that generate corresponding IR code)?
>
> Yes. Keep in mind the current state of genmatch.c is "quick hack
> to make playing with the API side and with patterns possible" ;)
>
>> c) Shall it be a good idea in define_match, for
>> name to act as a substitute for pattern (similar to flex pattern
>> definitions), so the name can be used in bigger patterns?
>
> Maybe, I suppose we'll see when adding more patterns.
>
>> d) This is silly, but maybe use constants to denote corresponding tree
>> nodes? For example instead of { build_int_cst (integer_type_node, 0); },
>> one could directly write 0, to denote an INTEGER_CST node with value 0.
>
> Yes, that might be possible - though it will require more knowledge
> in the pattern matcher (you also want to match '0'?) and the code
> generator.
>
>> e) There was a mention on the thread, regarding testing of patterns
>> integrated into DSL. I wasn't able to understand that clearly. Could
>> you explain that briefly?
>
> DSL? Currently I'd say it would be nice to make sure each pattern
> is triggered by at least one GCC testcase - this requires looking
> at a particular pass dump (that of forwprop or ccp are probably most suitable
> as they are run very early). I mentioned the possibility to do offline
> (thus not with C testcases) testing but that would require some tool
> to do that and it would be correctness testing (some automatic proof
> generation tool - ISTR academics have this kind of stuff). But that was
> just an idea.
>
>> Regarding gsoc proposal, I would like to align it on the following points:
>> a) Pattern matching using decision tree
>
> good.
>
>> b) Generate GIMPLE folding patterns (tree-ssa-forwprop,
>> tree-ssa-sccvn, gimple-fold)
>
> I'd narrow it down a bit, you can optionally do more if time permits.
> I'd say
> 0) add basic arithmetic ide
RE: dom requires PROP_loops
> -----Original Message-----
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: 11 March 2014 10:52
> To: Paulo Matos
> Cc: gcc@gcc.gnu.org
> Subject: Re: dom requires PROP_loops
>
> On Mon, Mar 10, 2014 at 12:57 PM, Paulo Matos wrote:
> > Hello,
> >
> > In an attempt to test some optimization I destroyed the loop property in
> > pass_tree_loop_done and reinstated it in pass_rtl_loop_init, however then I
> > noticed that pass_dominator started generating wrong code.
> > My guess is that we should mark pass_dominator with PROP_loops as a required
> > property? Do you agree?
>
> No, "PROP_loops" is something artificial. Passes needing loops
> will compute them (call loop_optimizer_init).
>
> You probably did sth wrong with how you "destroy" PROP_loops.
>

Haven't done anything out of the ordinary. I actually copied what other
passes do:
cfun->curr_properties &= ~PROP_loops;

before calling loop_optimizer_finalize.
I find it strange that you actually need to do this since something like this
should be done automatically if you mark the property as being destroyed.

The background of my investigation into this is that we don't like
instructions in loop latches. This blocks generation of zero-overhead loops
for us.
PROP_loops is enabled from tree loopinit to rtl loop_done2 and with this
property enabled cfg_cleanup doesn't remove empty latches, allowing GCC to
move instructions into the latch in the meantime.

If I destroy this property (as above) in tree_ssa_loop_done and recreate it
in rtl loop_init we have a major performance boost because CFG cleanup removes
empty loop latches, but I found a case where jump threading generates wrong
code in this case.

> Richard.
Re: dom requires PROP_loops
On Thu, Mar 13, 2014 at 12:20 PM, Paulo Matos wrote:
>
>
>> -----Original Message-----
>> From: Richard Biener [mailto:richard.guent...@gmail.com]
>> Sent: 11 March 2014 10:52
>> To: Paulo Matos
>> Cc: gcc@gcc.gnu.org
>> Subject: Re: dom requires PROP_loops
>>
>> On Mon, Mar 10, 2014 at 12:57 PM, Paulo Matos wrote:
>> > Hello,
>> >
>> > In an attempt to test some optimization I destroyed the loop property in
>> > pass_tree_loop_done and reinstated it in pass_rtl_loop_init, however then I
>> > noticed that pass_dominator started generating wrong code.
>> > My guess is that we should mark pass_dominator with PROP_loops as a
>> > required property? Do you agree?
>>
>> No, "PROP_loops" is something artificial. Passes needing loops
>> will compute them (call loop_optimizer_init).
>>
>> You probably did sth wrong with how you "destroy" PROP_loops.
>>
>
> Haven't done anything out of the ordinary. I actually copied what other
> passes do:
> cfun->curr_properties &= ~PROP_loops;
>
> before calling loop_optimizer_finalize.

That should work. Eventually.

> I find it strange that you actually need to do this since something like this
> should be done automatically if you mark the property as being destroyed.

It was never entirely clear how "properties" were designed and certainly
they are not used consistently. For 4.10 PROP_loops will eventually
just go away.

> The background of my investigation into this is that we don't like
> instructions in loop latches. This blocks generation of zero-overhead loops
> for us.
> PROP_loops is enabled from tree loopinit to rtl loop_done2 and with this
> property enabled cfg_cleanup doesn't remove empty latches allowing GCC to
> move instructions into the latch in the meantime.

Isn't that fixed on trunk now? (for gimple cfgcleanup at least?)

> If I destroy this property (as above) in tree_ssa_loop_done and recreate it
> in rtl loop_init we have major performance boost because CFG cleanup removes
> empty loop latches but I found a case when jump threading generates wrong
> code in this case.

Well, you also lose all meta-data attached to loops.

Also which jump-threading? RTL or GIMPLE? On GIMPLE loops are
requested from both jump-threading passes, DOM and VRP.

Richard.

>> Richard.
>
RE: dom requires PROP_loops
> -----Original Message-----
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Paulo
> Matos
> Sent: 13 March 2014 11:21
> To: Richard Biener
> Cc: gcc@gcc.gnu.org
> Subject: RE: dom requires PROP_loops
>
>
> PROP_loops is enabled from tree loopinit to rtl loop_done2 and with this
> property enabled cfg_cleanup doesn't remove empty latches allowing GCC to
> move instructions into the latch in the meantime.
>

Let me clarify this statement. With PROP_loops enabled,
loop_optimizer_finalize doesn't free loops and current_loops is non-null.
This has an array of consequences, one of them being that loop latches are
not removed. So, while PROP_loops is the reason for the non-removal of empty
loop latches, removal actually depends on the value of current_loops.

--
Paulo Matos
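The distinction drawn in the clarification above (the property bit versus the loop structures themselves) can be sketched with a toy model. All names here (TOY_PROP_LOOPS, toy_function, toy_loop_finalize, toy_may_remove_empty_latch) are made up for illustration; the model only encodes the behaviour as described in this thread and is not GCC's implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical property bit, standing in for PROP_loops.  */
#define TOY_PROP_LOOPS (1u << 0)

/* Placeholder for the loop tree.  */
struct toy_loops
{
  int num_loops;
};

/* Toy per-function state.  */
struct toy_function
{
  unsigned curr_properties;
  struct toy_loops *loops;      /* plays the role of current_loops */
};

/* As described in the thread: while the property bit is still set,
   finalizing keeps the loop structures alive.  Only once the bit has
   been cleared are they released.  */
void
toy_loop_finalize (struct toy_function *fn)
{
  if (fn->curr_properties & TOY_PROP_LOOPS)
    return;
  fn->loops = NULL;
}

/* Latch removal does not look at the property bit at all; it depends
   only on whether loop structures are present.  */
bool
toy_may_remove_empty_latch (const struct toy_function *fn)
{
  return fn->loops == NULL;
}
```

In this model, clearing the bit with `fn.curr_properties &= ~TOY_PROP_LOOPS;` changes nothing for cleanup by itself; empty latches only become removable after the subsequent finalize call drops the loop structures.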
Re: dom requires PROP_loops
On Thu, Mar 13, 2014 at 1:58 PM, Paulo Matos wrote:
>
>> -----Original Message-----
>> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Paulo
>> Matos
>> Sent: 13 March 2014 11:21
>> To: Richard Biener
>> Cc: gcc@gcc.gnu.org
>> Subject: RE: dom requires PROP_loops
>>
>>
>> PROP_loops is enabled from tree loopinit to rtl loop_done2 and with this
>> property enabled cfg_cleanup doesn't remove empty latches allowing GCC to
>> move instructions into the latch in the meantime.
>>
>
> Let me clarify this statement. With PROP_loops enabled,
> loop_optimizer_finalize doesn't free loops and current_loops is non-null.
> This has an array of consequences, one of them being that loop latches are
> not removed. So, while PROP_loops is the reason for the non-removal of empty
> loop latches, removal actually depends on the value of current_loops.

Probably RTL cfgcleanup needs the same treatment as GIMPLE cfgcleanup
then - allow removal if loop properties allow it.

Richard.

> --
> Paulo Matos
GCC 4.9.0 Status Report (2014-03-13)
Status
======

The trunk is still in Stage 4, which means only patches fixing
regressions and documentation issues are appropriate.

Compared to last year's status reports, we are somewhere between
a fortnight and a month behind last year's schedule, but if enough
attention is given to the remaining P1 blockers, we could still
release around the beginning of April.

The list of secondary architectures has changed recently, so to remind
people I'm including it here:

The primary platforms are:

    arm-linux-gnueabi
    i386-unknown-freebsd
    i686-pc-linux-gnu
    mipsisa64-elf
    powerpc64-unknown-linux-gnu
    sparc-sun-solaris2.10
    x86_64-unknown-linux-gnu

The secondary platforms are:

    aarch64-elf
    powerpc-ibm-aix7.1.0.0
    i686-apple-darwin
    i686-pc-cygwin
    i686-mingw32
    s390x-linux-gnu

Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1                9   - 23
P2               75   - 12
P3               15   -  6
--------        ---   -----------------------
Total            99   - 41

Previous Report
===============

http://gcc.gnu.org/ml/gcc/2014-02/msg00013.html

The next report will be sent by me again, hopefully announcing
the first GCC 4.9.0 release candidate soon.
RE: [RFC] Introducing MIPS O32 ABI Extension for FR0 and FR1 Interlinking
Hi Richard/all,

The spec on:
https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
has been updated and attempts to account for all the feedback. Not everything
has been possible to simplify/rework as requested but I believe I have managed
to address many points cleanly.

Sections 9 and 10 contain pretty much all the changes and a fresh read is
better than me attempting to summarise. All the attributes, flags, etc are
now defined with specific values and specific comments regarding kernel
support for UFR have been added.

I have an implementation of everything except the program loader which I will
post to ensure the overall approach in code is acceptable. I'll do this on
each project's list appropriately. Since I am only just starting testing I
have no test cases to offer alongside the patches currently; I will be
working on that next. I'm deferring writing all the tests in case the
implementation/behaviour changes, hence initial review for now.

The implementation in GCC relies on LRA due to the way in which caller-save
is handled. A patch to enable LRA and fix all regressions is being developed
concurrently and will be posted ready for stage 1.

Let me know if there is any feedback on the updated spec. I'm afraid the last
aspect we discussed is still a point of contention :-) I'm sure we'll get
there though.

I've added more comments inline below:

Richard Sandiford writes:
> Matthew Fortune writes:
> >> I think instead we should have a configuration switch that allows a
> >> particular -mfp option to be inserted alongside -mabi=32 if no
> >> explicit -mfp is given. This is how most --with options work. Maybe
> >> --with-fp-32={32|64|xx}? Specific triples could set a default value
> >> if they like.
> >> E.g. the MTI, SDE and mipsisa* ones would probably want to default to
> >> --with-32-fp=xx. Triples aimed at MIPS IV and below would stay as
> >> they are. (MIPS IV is sometimes used with -mabi=32.)
> >>
> >> --with-fp-32 isn't the greatest name but is at least consistent with
> >> --with-arch-32 and -mabi=32. Maybe --with-fp-32=64 is so weird that
> >> breaking consistency is better though.
> >
> > Tying the use of fpxx by default to a configure time setting is OK
> > with me. When enabled it would still have to follow the rules as
> > defined in the design in that it can only apply to architectures that
> > can support the variant.
>
> Right. It's really equivalent to putting the -mfp on every command line
> that doesn't have one.
>
> > Currently that means everything but mips1.
>
> Yeah, using -mips1 on a --with-{o}32-fp=xx toolchain would be an error.
>
> > I'm not sure this is the same as tying an ABI to an architecture as
> > both fp32 and fpxx are O32 and link compatible. Perhaps the configure
> > switch would be --with-o32-fp={32|64|xx}. This shows it is just an O32
> > related setting.
>
> What I meant is that -march= and -mips shouldn't imply a different -mfp
> setting. The -mfp setting should be self-contained and it should be an
> error if the architecture isn't compatible.
>
> We might be in violent agreement here :-) Like I say, I was just a bit
> worried by the earlier -mips32r2 thing because there was a time when a
> -mips option really could imply things like -mabi, -mgp and -mfp.
>
> --with-o32-fp would be OK with me. I'm just worried about the ABI being
> spelt differently from -mabi=, but there's probably no perfect
> alternative.

I'd like to encourage the perspective that -mfp* options do not lead to a
different ABI in the same sense that other variations do. While it is true
that the calling conventions and code generation rules vary, 2 out of 3
combinations of -mfp32, -mfpxx and -mfp64 with -mabi=o32 are link compatible.
The introduction of the modeless O32 ABI is intended to remove the part of
the O32 definition that says 'FR=0' and hence the architecture then gets to
dictate this and the generated code is still O32.
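The "2 out of 3 combinations" remark can be made concrete with a small predicate. This is a sketch under an assumption that is not spelled out in the message itself: that the two compatible pairings are fp32 with fpxx and fpxx with fp64, while fp32 with fp64 is the incompatible one. The enum and function names are invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical tags for the three O32 floating-point variants.  */
enum toy_fp_mode { TOY_FP32, TOY_FPXX, TOY_FP64 };

/* Assumed rule: an fpxx object is mode-independent and links with
   either of the others; fp32 and fp64 objects do not link together.  */
bool
toy_fp_link_compatible (enum toy_fp_mode a, enum toy_fp_mode b)
{
  return a == b || a == TOY_FPXX || b == TOY_FPXX;
}
```

Under that assumption, of the three distinct pairings only (TOY_FP32, TOY_FP64) is rejected, which matches the count given above.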
It is true today that we have several architectures that mandate FR=0, some
that cannot support fpxx and some that can support all fp* variations. I see
nothing preventing the future having an architecture only supporting FR=1
though, which we should also think about. When considering such a scenario it
would be highly desirable for the following to just work, as I believe
architectural restrictions should be accounted for when designing default
options. If the architecture gives no choice then it should just work IMO:

Some ideas (speculating that someone builds a core called mips_n with only
FR=1):

--with-o32-fp=32
mips-*-gcc -march=mips1 fp.c           ==> generates fp32 code
mips-*-gcc -march=mips2 fp.c           ==> generates fp32 code
mips-*-gcc -march=mips32r2 fp.c        ==> generates fp32 code
mips-*-gcc -march=mips32r2 -mfp64 fp.c ==> generates fp64 code
mips-*-gcc -march=mips_n fp.c          ==> generates fp64 code

--with-o32-fp=xx
mips-*-gcc -march=mips1 fp.c           ==> generates fp32 code
mips-*-gcc -march=mips2 fp.c           ==> gene
RE: dom requires PROP_loops
> -----Original Message-----
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: 13 March 2014 13:24
> To: Paulo Matos
> Cc: gcc@gcc.gnu.org
> Subject: Re: dom requires PROP_loops
>
>
> Probably RTL cfgcleaup needs the same treatment as GIMPLE cfgcleanup
> then - allow removal if loop properties allows it.
>

In both cfgcleanup.c and tree-cfgcleanup.c I can see code that protects loop
latches, but I see no code that allows removal of a latch if the property
allows it.
From what you say I would expect this would already be implemented in
tree-cfgcleanup.c, however what actually happens is that since current_loops
is non-null (PROP_loops is not destroyed in tree loopdone), the
tree-cfgcleanup call chain ends up calling cleanup_tree_cfg_bb on the bb loop
latch and tree_forwarder_block_p returns false for bb because of the
following code, thereby not removing the latch:

  if (current_loops)
    {
      basic_block dest;
      /* Protect loop latches, headers and preheaders.  */
      if (bb->loop_father->header == bb)
        return false;
      dest = EDGE_SUCC (bb, 0)->dest;

      if (dest->loop_father->header == dest)
        return false;
    }

Why do we need to protect the latch?

Paulo Matos

> Richard.
>
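To see which blocks the quoted check actually protects, here is a self-contained toy version of it. The types toy_bb and toy_loop and the function toy_block_is_protected are hypothetical simplifications (a single successor pointer instead of an edge vector, a flag instead of current_loops); only the two comparisons mirror the snippet quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_bb;

/* Toy loop: only the header matters for this check.  */
struct toy_loop
{
  struct toy_bb *header;
};

/* Toy basic block with at most one successor.  */
struct toy_bb
{
  struct toy_loop *loop_father; /* innermost enclosing loop, never NULL */
  struct toy_bb *succ;          /* single successor, or NULL */
};

/* Returns true when the block must not be removed as a forwarder:
   either it is a loop header itself, or its successor is a loop
   header, which covers both latches and preheaders.  */
bool
toy_block_is_protected (const struct toy_bb *bb, bool have_loops)
{
  const struct toy_bb *dest;

  if (!have_loops)
    return false;
  if (bb->loop_father->header == bb)
    return true;
  dest = bb->succ;
  if (dest != NULL && dest->loop_father->header == dest)
    return true;
  return false;
}
```

With a loop made of header, body and latch, plus a preheader outside it, the header, latch and preheader are all reported as protected while the body block is not; with have_loops false nothing is protected, which corresponds to the current_loops == NULL situation discussed earlier in the thread.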
#51253 seriously limits the use of G++ for C++11
Hi all,

There is an issue that (imho) seriously limits the use of G++ for one of the
most significant improvements in C++11, variadic templates:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51253

Because of this limitation, I have not been able to use GCC at all for
several months in my main project, and I have to impose Clang on my users.

Of course, there is no warranty, etc., and I am not entitled to ask for
anything; however, I wanted to draw some attention to this bug, as it might
have been overlooked. I don't mean to bother, I'm just hoping to get some
feedback.

Cheers!

Akim
RE: dom requires PROP_loops
On March 13, 2014 5:00:53 PM CET, Paulo Matos wrote:
>> -----Original Message-----
>> From: Richard Biener [mailto:richard.guent...@gmail.com]
>> Sent: 13 March 2014 13:24
>> To: Paulo Matos
>> Cc: gcc@gcc.gnu.org
>> Subject: Re: dom requires PROP_loops
>>
>>
>> Probably RTL cfgcleaup needs the same treatment as GIMPLE cfgcleanup
>> then - allow removal if loop properties allows it.
>>
>
>In both cfgcleanup.c and tree-cfgcleanup.c I can see code that protects
>loop latches, but I see no code that allows removal of latch if
>property allows it.
>From what you say I would expect this would already be implemented in
>tree-cfgcleanup.c, however what actually happens is that since
>current_loops is non-null (PROP_loops is not destroyed in tree
>loopdone), tree-cfgcleanup call chain ends up calling
>cleanup_tree_cfg_bb on the bb loop latch and tree_forwarder_block_p
>returns false for bb because of the following code thereby not removing
>the latch:
>  if (current_loops)
>    {
>      basic_block dest;
>      /* Protect loop latches, headers and preheaders.  */
>      if (bb->loop_father->header == bb)
>        return false;
>      dest = EDGE_SUCC (bb, 0)->dest;
>
>      if (dest->loop_father->header == dest)
>        return false;
>    }
>
>Why do we need to protect the latch?

You are looking at old sources.

Richard.

>Paulo Matos
>
>> Richard.
>>
SET_EXPR_LOCATION usage for unused tree?
Hi!

In gcc/c/c-parser.c:c_parser_omp_clause_num_threads (as well as other,
similar functions), what is the point of setting the boolean tree c's
location, given that this tree won't be used in the following?

      /* Attempt to statically determine when the number isn't positive.  */
      c = fold_build2_loc (expr_loc, LE_EXPR, boolean_type_node, t,
                           build_int_cst (TREE_TYPE (t), 0));
      if (CAN_HAVE_LOCATION_P (c))
        SET_EXPR_LOCATION (c, expr_loc);
      if (c == boolean_true_node)
        {
          warning_at (expr_loc, 0,
                      "%<num_threads%> value must be positive");
          t = integer_one_node;
        }
      [c not used anymore]

Both with and without the SET_EXPR_LOCATION, the error is the same:

    ../../loop.c: In function 'main':
    ../../loop.c:10:34: warning: 'num_threads' value must be positive
     #pragma omp parallel num_threads(-1)
                                      ^

Grüße,
 Thomas
Ian Lance Taylor and Ramana Radhakrishnan join GCC Steering Committee
On behalf of the entire GCC Steering Committee, it gives me great
pleasure to welcome Ian Lance Taylor and Ramana Radhakrishnan as the
newest members of the GCC Steering Committee.

We hope that everyone will join us to wish them all of the support and
wisdom for this new challenge. We are happy to gain their experience
and knowledge.

The GCC Steering Committee

David Edelsohn
Kaveh Ghazi
Jeffrey Law
Marc Lehmann
Jason Merrill
David Miller
Toon Moene
Joseph Myers
Gerald Pfeifer
Ramana Radhakrishnan
Joel Sherrill
Ian Lance Taylor
Jim Wilson
Re: Completing GCC Go escape analysis in GSoC 2014
On Wed, Mar 12, 2014 at 8:31 PM, Ray Li wrote:
>
> Hi, I'm a student interested in working on GCC and want to make a
> proposal for GSoC 2014 on GCC Go escape analysis.
>
> I've read code under /gcc/testsuite/go.* and some source code of
> gofrontend, and realization of escape analysis and furthermore
> optimization is needed.
>
> Right now I have come up with a small patch of escape tests as a
> start. My patch aims to test whether escape analysis is
> working. Then I want to start on some small part of the performance
> function and write more tests for optimization. Am I on the right track?
> Thanks a lot if anyone can give me some advice.

Thanks for your interest. Yes, all of your examples look correct to me.

There is a larger escape analysis test in libgo/go/fmt/fmt_test.go.
That file is copied from the master repository, but in mallocTest the
numbers are changed. Instead of 7 5's and a 20, it should be 0, 1, 1,
2, 1, 2, 0, 1.

Ian
gcc-4.8-20140313 is now available
Snapshot gcc-4.8-20140313 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20140313/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch revision 208558

You'll find:

 gcc-4.8-20140313.tar.bz2             Complete GCC

  MD5=cd65801f0c1ccb277a0f4d25544affde
  SHA1=ad63c184c056b2d4f54fce1b2d7f02fce2489fa7

Diffs from 4.8-20140306 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.