Re: [RFC] Introducing MIPS O32 ABI Extension for FR0 and FR1 Interlinking
Matthew Fortune writes:
> Richard Sandiford writes:
>> Matthew Fortune writes:
>> >> I think instead we should have a configuration switch that allows a
>> >> particular -mfp option to be inserted alongside -mabi=32 if no
>> >> explicit -mfp is given. This is how most --with options work. Maybe
>> >> --with-fp-32={32|64|xx}? Specific triples could set a default value
>> >> if they like.
>> >>
>> >> E.g. the MTI, SDE and mipsisa* ones would probably want to default to
>> >> --with-32-fp=xx. Triples aimed at MIPS IV and below would stay as
>> >> they are. (MIPS IV is sometimes used with -mabi=32.)
>> >>
>> >> --with-fp-32 isn't the greatest name but is at least consistent with
>> >> --with-arch-32 and -mabi=32. Maybe --with-fp-32=64 is so weird that
>> >> breaking consistency is better though.
>> >
>> > Tying the use of fpxx by default to a configure time setting is OK
>> > with me. When enabled it would still have to follow the rules as
>> > defined in the design in that it can only apply to architectures that
>> > can support the variant.
>>
>> Right. It's really equivalent to putting the -mfp on every command line
>> that doesn't have one.
>>
>> > Currently that means everything but mips1.
>>
>> Yeah, using -mips1 on a --with-{o}32-fp=xx toolchain would be an error.
>>
>> > I'm not sure this is the same as tying an ABI to an architecture as
>> > both fp32 and fpxx are O32 and link compatible. Perhaps the configure
>> > switch would be --with-o32-fp={32|64|xx}. This shows it is just an O32
>> > related setting.
>>
>> What I meant is that -march= and -mips shouldn't imply a different -mfp
>> setting. The -mfp setting should be self-contained and it should be an
>> error if the architecture isn't compatible.
>>
>> We might be in violent agreement here :-) Like I say, I was just a bit
>> worried by the earlier -mips32r2 thing because there was a time when a
>> -mips option really could imply things like -mabi, -mgp and -mfp.
>>
>> --with-o32-fp would be OK with me. I'm just worried about the ABI being
>> spelt differently from -mabi=, but there's probably no perfect
>> alternative.
>
> I'd like to encourage the perspective that -mfp* options do not lead to
> a different ABI in the same sense that other variations do. While it is
> true that the calling conventions and code generation rules vary, 2 out
> of 3 combinations of -mfp32 -mfpxx and -mfp64 with -mabi=o32 are link
> compatible.

-mfp32 and -mfp64 aren't link-compatible though, so -mfp is part of the
ABI. What you're adding is a new variant that is individually
link-compatible with the other two (but obviously not both
simultaneously). It's a third ABI variant in itself.

> The introduction of the modeless O32 ABI is intended to
> remove the part of the O32 definition that says 'FR=0' and hence the
> architecture then gets to dictate this and the generated code is still
> O32. It is true today that we have several architectures that mandate
> FR=0, some that cannot support fpxx and some that can support all fp*
> variations. I see nothing preventing the future having an architecture
> only supporting FR=1 though which we should also think about.

Agreed.

> When considering such a scenario it would be highly desirable for the
> following to just work as I believe architectural restrictions should
> be accounted for when designing default options. If the architecture
> gives no choice then it should just work IMO:
>
> Some ideas (speculating that someone builds a core called mips_n with
> only FR=1):
>
> --with-o32-fp=32
>
> mips-*-gcc -march=mips1 fp.c            ==> generates fp32 code
> mips-*-gcc -march=mips2 fp.c            ==> generates fp32 code
> mips-*-gcc -march=mips32r2 fp.c         ==> generates fp32 code
> mips-*-gcc -march=mips32r2 -mfp64 fp.c  ==> generates fp64 code
> mips-*-gcc -march=mips_n fp.c           ==> generates fp64 code
>
> --with-o32-fp=xx
>
> mips-*-gcc -march=mips1 fp.c            ==> generates fp32 code
> mips-*-gcc -march=mips2 fp.c            ==> generates fpxx code
> mips-*-gcc -march=mips32r2 fp.c         ==> generates fpxx code
> mips-*-gcc -march=mips32r2 -mfp64 fp.c  ==> generates fp64 code
> mips-*-gcc -march=mips_n fp.c           ==> generates fp64 code
>
> --with-o32-fp=64
>
> mips-*-gcc -march=mips1 fp.c            ==> generates fp32 code
> mips-*-gcc -march=mips2 fp.c            ==> generates fpxx code
> mips-*-gcc -march=mips32r2 fp.c         ==> generates fp64 code
> mips-*-gcc -march=mips32r2 -mfp64 fp.c  ==> generates fp64 code
> mips-*-gcc -march=mips32r2 -mfpxx fp.c  ==> generates fpxx code
> mips-*-gcc -march=mips_n fp.c           ==> generates fp64 code
>
> With these defaults, the closest supported ABI is used for each
> architecture based on the --with-o32-fp build option. The only one I
> really care about is the middle one as it makes full use of the O32 FPXX
> ABI without a user needing to account for arch restrictions.

Note that --with-* options just insert a canned -mfoo=bar option under
certain conditions, with those conditions being the same regardless of
"bar". So --with-o32-fp=32 should inser
RE: dom requires PROP_loops
> -----Original Message-----
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: 13 March 2014 18:46
> To: Paulo Matos
> Cc: gcc@gcc.gnu.org
> Subject: RE: dom requires PROP_loops
>
> On March 13, 2014 5:00:53 PM CET, Paulo Matos wrote:
> >> -----Original Message-----
> >> From: Richard Biener [mailto:richard.guent...@gmail.com]
> >> Sent: 13 March 2014 13:24
> >> To: Paulo Matos
> >> Cc: gcc@gcc.gnu.org
> >> Subject: Re: dom requires PROP_loops
> >>
> >>
> >> Probably RTL cfgcleanup needs the same treatment as GIMPLE cfgcleanup
> >> then - allow removal if loop properties allows it.
> >>
> >
> >In both cfgcleanup.c and tree-cfgcleanup.c I can see code that protects
> >loop latches, but I see no code that allows removal of latch if
> >property allows it.
> >From what you say I would expect this would already be implemented in
> >tree-cfgcleanup.c, however what actually happens is that since
> >current_loops is non-null (PROP_loops is not destroyed in tree
> >loopdone), tree-cfgcleanup call chain ends up calling
> >cleanup_tree_cfg_bb on the bb loop latch and tree_forwarder_block_p
> >returns false for bb because of the following code thereby not removing
> >the latch:
> >  if (current_loops)
> >    {
> >      basic_block dest;
> >      /* Protect loop latches, headers and preheaders.  */
> >      if (bb->loop_father->header == bb)
> >        return false;
> >      dest = EDGE_SUCC (bb, 0)->dest;
> >
> >      if (dest->loop_father->header == dest)
> >        return false;
> >    }
> >
> >Why do we need to protect the latch?
>
> You are looking at old sources.
>

That's correct. I was looking at 4.8. Let me take a look at what trunk
is doing... :)

> Richard.
>
> >Paulo Matos
> >
> >> Richard.
> >>
>
Re: SET_EXPR_LOCATION usage for unused tree?
On Thu, Mar 13, 2014 at 10:44 PM, Thomas Schwinge wrote:
> Hi!
>
> In gcc/c/c-parser.c:c_parser_omp_clause_num_threads (as well as other,
> similar functions), what is the point of setting the boolean tree c's
> location, given that this tree won't be used in the following?
>
>     /* Attempt to statically determine when the number isn't positive.  */
>     c = fold_build2_loc (expr_loc, LE_EXPR, boolean_type_node, t,
>                          build_int_cst (TREE_TYPE (t), 0));
>     if (CAN_HAVE_LOCATION_P (c))
>       SET_EXPR_LOCATION (c, expr_loc);
>     if (c == boolean_true_node)
>       {
>         warning_at (expr_loc, 0,
>                     "%<num_threads%> value must be positive");
>         t = integer_one_node;
>       }
>     [c not used anymore]
>
> Both with and without the SET_EXPR_LOCATION, the error is the same:
>
>     ../../loop.c: In function 'main':
>     ../../loop.c:10:34: warning: 'num_threads' value must be positive
>      #pragma omp parallel num_threads(-1)
>                                       ^

That can be even simplified to avoid building the tree if it doesn't
simplify with

  c = fold_binary (LE_EXPR, boolean_type_node, t,
                   build_int_cst (TREE_TYPE (t), 0));
  if (c && c == boolean_true_node)
    {
      warning_at (

Richard.

>
> Grüße,
>  Thomas
Re: [RFC] Introducing MIPS O32 ABI Extension for FR0 and FR1 Interlinking
Matthew Fortune writes:
> The spec on:
> https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
> has been updated and attempts to account for all the feedback. Not
> everything has been possible to simplify/rework as requested but I
> believe I have managed to address many points cleanly.

(FWIW there seem to be some weird line breaks in the page which make it
a bit hard to read.)

The main thing that stood out for me was section 9. If we have the
attributes and the program header (both good to have IMO) then we
shouldn't have an ELF flag too. "Static" consumers should use the
attribute and "dynamic" consumers should use the program header.
The main point of encoding future info in a program header was to
relieve the pressure on the ELF flags.

As far as the program header encoding goes: I was thinking of a more
general mechanism that specifies a block of data, a bit like the current
PT_MIPS_OPTIONS does. Encoding the information directly in the
enumeration wouldn't scale well, since we'd end up with the same problem
as we have now for ELF flags. It would also be a bit wasteful to
specify two bits of information this way since the other parts of the
header structure don't carry any weight.

Thanks,
Richard
Legitimize address after reload
Hello,

I'm writing a simple gcc backend and I'm seeing something odd in the
address legitimization process. Two scenarios:

If I only allow addresses to be either a register or a symbol, my gcc
works. To do so I add the restrictions to the
TARGET_LEGITIMATE_ADDRESS_P macro. This makes gcc force every address
into a register.

If I also allow a 'PLUS' expression to be a valid address (adding the
restriction that the two addends are a register and a constant) it
happens (sometimes) that gcc comes up with an expression like this
one:

(plus:SI (plus:SI (reg:SI somereg)
                  (const_int 4))
         (const_int 8))

After taking a look at the 386 backend (and others) I just discovered
that there is a function called LEGITIMIZE_RELOAD_ADDRESS which is
responsible for handling this case. My issue is that this function is
not being called and, from what I saw while debugging, it seems that
the offending RTX expression is created after the address_reload pass,
and thus impossible for this pass to legitimize the address.

Looking at other architectures it seems that they are doing more or
less the same, so I don't know what the issue might be.

Do you have any idea?

Thanks,
David
Re: Legitimize address after reload
On Fri, 14 Mar 2014 12:52:35 +0100 David Guillen wrote:

> If I also allow a 'PLUS' expression to be a valid address (adding the
> restriction that the two addends are a register and a constant) it
> happens (sometimes) that gcc comes up with an expression like this
> one:
>
> (plus:SI (plus:SI (reg:SI somereg)
>                   (const_int 4))
>          (const_int 8))
>
>
> After taking a look at the 386 backend (and others) I just discovered
> that there is a function called LEGITIMIZE_RELOAD_ADDRESS which is
> responsible for handling this case. My issue is that this function is
> not being called and, from what I saw while debugging, it seems that
> the offending RTX expression is created after the address_reload pass,
> and thus impossible for this pass to legitimize the address.

Look at how e.g. the ARM backend and others handle the "strict"
parameter to the legitimate_address hook -- you need to use that to
forbid pseudo registers being allowed in RTXs in the strict case.
LEGITIMIZE_RELOAD_ADDRESS is probably a red herring (at least for the
simple cases you're probably dealing with to start with), and isn't used
for LRA anyway. Getting these bits right can be very fiddly! The (plus
(reg) (const)) operands can arise before/during register
elimination, IIRC. (You might need to get the register-elimination bits
right, too...)

Just a guess, anyway. (http://gcc.gnu.org/wiki/reload might be helpful
if you've not read it.)

Julian
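
To make the "strict" distinction concrete, here is a toy model in plain C. It is only a sketch and not GCC's real API: every `toy_*` name, the simplified RTX struct and the choice of 16 hard registers are assumptions invented for illustration. It accepts a register, a symbol, or (plus (reg) (const)), and in the strict case it refuses a pseudo register as the base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: registers 0..15 are hard registers, anything above is a
   pseudo that has not been assigned a hard register.  */
#define TOY_FIRST_PSEUDO 16

enum toy_code { TOY_REG, TOY_CONST_INT, TOY_SYMBOL, TOY_PLUS };

/* Drastically simplified stand-in for an RTX.  */
struct toy_rtx
{
  enum toy_code code;
  int regno;                  /* meaningful for TOY_REG */
  long value;                 /* meaningful for TOY_CONST_INT */
  const struct toy_rtx *op0;  /* meaningful for TOY_PLUS */
  const struct toy_rtx *op1;  /* meaningful for TOY_PLUS */
};

/* A base must be a register; pseudos are accepted only when not strict.  */
static bool
toy_base_reg_ok (const struct toy_rtx *x, bool strict)
{
  if (x == NULL || x->code != TOY_REG)
    return false;
  if (x->regno < TOY_FIRST_PSEUDO)
    return true;
  return !strict;
}

/* Accept reg, symbol, or (plus (reg) (const_int)); reject everything
   else, including a nested (plus (plus ...) ...).  */
bool
toy_legitimate_address_p (const struct toy_rtx *x, bool strict)
{
  assert (x != NULL);
  switch (x->code)
    {
    case TOY_REG:
      return toy_base_reg_ok (x, strict);
    case TOY_SYMBOL:
      return true;
    case TOY_PLUS:
      return toy_base_reg_ok (x->op0, strict)
             && x->op1 != NULL && x->op1->code == TOY_CONST_INT;
    default:
      return false;
    }
}

/* Example operands for experimenting with the predicate.  */
const struct toy_rtx toy_hard_reg = { TOY_REG, 3, 0, NULL, NULL };
const struct toy_rtx toy_pseudo_reg = { TOY_REG, 42, 0, NULL, NULL };
const struct toy_rtx toy_four = { TOY_CONST_INT, 0, 4, NULL, NULL };
const struct toy_rtx toy_eight = { TOY_CONST_INT, 0, 8, NULL, NULL };
const struct toy_rtx toy_hard_plus_4
  = { TOY_PLUS, 0, 0, &toy_hard_reg, &toy_four };
const struct toy_rtx toy_pseudo_plus_4
  = { TOY_PLUS, 0, 0, &toy_pseudo_reg, &toy_four };
const struct toy_rtx toy_nested
  = { TOY_PLUS, 0, 0, &toy_hard_plus_4, &toy_eight };
```

In this model `toy_pseudo_plus_4` is accepted when `strict` is false and rejected when it is true, while `toy_nested` (the shape from the original report) is rejected either way. A real backend has more to consider, for example pseudos that have already been given a hard register.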
RE: [RFC] Introducing MIPS O32 ABI Extension for FR0 and FR1 Interlinking
Richard Sandiford writes:
> Matthew Fortune writes:
> > The spec on:
> > https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
> > has been updated and attempts to account for all the feedback. Not
> > everything has been possible to simplify/rework as requested but I
> > believe I have managed to address many points cleanly.
>
> (FWIW there seem to be some weird line breaks in the page which make it
> a bit hard to read.)

Apologies, I edited it offline and didn't check the result carefully
enough. I'll clean it up.

> The main thing that stood out for me was section 9. If we have the
> attributes and the program header (both good to have IMO) then we
> shouldn't have an ELF flag too. "Static" consumers should use the
> attribute and "dynamic" consumers should use the program header.
> The main point of encoding future info in a program header was to
> relieve the pressure on the ELF flags.

I know what you mean. I kept the ELF flag around because it firstly
already exists (with the correct meaning as it happens) and secondly ELF
flags are already consumed in the program loader whereas a small amount
of new framework in the kernel is needed for the loader to respond to
program headers. The 'executable stack' header is currently consumed but
the mechanism is not extensible today. My thinking is that the ELF flag
eases us into the program loader but could validly be dropped/not
required long term. It is largely ignored by the tools anyway in favour
of the program headers. I am happy to remove the ELF flag if I can
confirm with our MIPS kernel developers that they can implement the
program header inspection sooner rather than later.

> As far as the program header encoding goes: I was thinking of a more
> general mechanism that specifies a block of data, a bit like the current
> PT_MIPS_OPTIONS does. Encoding the information directly in the
> enumeration wouldn't scale well, since we'd end up with the same problem
> as we have now for ELF flags. It would also be a bit wasteful to
> specify two bits of information this way since the other parts of the
> header structure don't carry any weight.

I was trying to avoid the need for a program header to refer to a block
of data as that is another part of the object that has to be loaded to
determine the flag information. There are 2^28 processor specific
program headers available which seems quite generous (I half thought of
using 2 for the two modes), but I do also recognise that most of the
header then becomes wasted space. I guess there may be some complaint if
we choose to abuse every field of a header to encode information (i.e.
address, size, alignment etc) but this would be a nice compact way to
store flags. It would be more visible to put flags in the address fields
as these are already printed by readelf et al. but the processor
specific flags are not.

Personally I'd open up all the fields to abuse over adding a block of
data. The block of data increases the complexity of the program loader
and dynamic loader as they have to ensure more of an object is read in
order to make a decision. The extra data needed from an object would
also be target specific, all do-able I'm just not sure on complexity.

I wonder if Joseph or Maciej have any thoughts here as I believe they
discussed this idea of using program headers in the past.

Since I'm far from being an expert in this area I'm OK with anything as
long as I can get all maintainers of dynamic loaders and program loaders
to agree (ha!). Bionic, glibc, uclibc and linux kernel are the primary
targets here.

Regards,
Matthew
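
As a quick check of the "2^28 processor specific program headers" figure, the sketch below counts the p_type values between PT_LOPROC and PT_HIPROC. The two range values are the ones from the generic ELF specification; the `SKETCH_*` and `sketch_*` names are made up here so that the snippet does not depend on a system `<elf.h>`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Processor-specific p_type range from the generic ELF specification,
   redefined under sketch-local names.  */
#define SKETCH_PT_LOPROC UINT32_C(0x70000000)
#define SKETCH_PT_HIPROC UINT32_C(0x7fffffff)

/* True when a program header type falls in the processor-specific range.  */
bool
sketch_is_proc_specific (uint32_t p_type)
{
  return p_type >= SKETCH_PT_LOPROC && p_type <= SKETCH_PT_HIPROC;
}

/* Number of distinct processor-specific p_type values.  */
uint32_t
sketch_proc_specific_count (void)
{
  return SKETCH_PT_HIPROC - SKETCH_PT_LOPROC + 1;
}

/* log2 of the count, i.e. how many bits the range spans.  */
unsigned
sketch_proc_specific_bits (void)
{
  uint32_t count = sketch_proc_specific_count ();
  unsigned bits = 0;

  assert (count != 0 && (count & (count - 1)) == 0);  /* power of two */
  while (count > 1)
    {
      count >>= 1;
      bits++;
    }
  return bits;
}
```

The range holds 0x10000000 values, which is 2^28, matching the number quoted above. Whether spending those values on individual flags is a good trade is exactly the open question in this thread.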
Re: Legitimize address after reload
Thanks for your info Julian.

I actually read all the docs and I think I 'more or less' understand
the inner workings of gcc. What surprises me most is that during the
non-strict RTL generation I do not see any 'strange' address pattern,
but during the post-reload process the non-legitimate address comes up.
I guess it is due to the fact that one of the PLUS operands is a memory
operand (a local var.?), thus resulting in a double indirect memory
address.

In any case I'm not using the strict variable and I'm assuming strict
is zero, that is, not checking the hard registers themselves. This is
because any reg is OK as a base reg. I'm pretty sure I'm behaving
similarly to the arm, cris or x86 backends.

Thanks,
David

2014-03-14 13:11 GMT+01:00 Julian Brown :
> On Fri, 14 Mar 2014 12:52:35 +0100
> David Guillen wrote:
>
>> If I also allow a 'PLUS' expression to be a valid address (adding the
>> restriction that the two addends are a register and a constant) it
>> happens (sometimes) that gcc comes up with an expression like this
>> one:
>>
>> (plus:SI (plus:SI (reg:SI somereg)
>>                   (const_int 4))
>>          (const_int 8))
>>
>>
>> After taking a look at the 386 backend (and others) I just discovered
>> that there is a function called LEGITIMIZE_RELOAD_ADDRESS which is
>> responsible for handling this case. My issue is that this function is
>> not being called and, from what I saw while debugging, it seems that
>> the offending RTX expression is created after the address_reload pass,
>> and thus impossible for this pass to legitimize the address.
>
> Look at how e.g. the ARM backend and others handle the "strict"
> parameter to the legitimate_address hook -- you need to use that to
> forbid pseudo registers being allowed in RTXs in the strict case.
> LEGITIMIZE_RELOAD_ADDRESS is probably a red herring (at least for the
> simple cases you're probably dealing with to start with), and isn't used
> for LRA anyway. Getting these bits right can be very fiddly! The (plus
> (reg) (const)) operands can arise before/during register
> elimination, IIRC. (You might need to get the register-elimination bits
> right, too...)
>
> Just a guess, anyway. (http://gcc.gnu.org/wiki/reload might be helpful
> if you've not read it.)
>
> Julian
Re: Reg Alloc Problem.
Hi All,

To handle the problem below, i.e. making a specific set of registers
(a subset of the general register set) the base registers, we looked at
the *.c.208.ira log:

Pass 0 for finding pseudo/allocno costs

    r21: preferred BASE_REGS, alternative GENERAL_REGS, allocno GENERAL_REGS
    a2 (r21,l0) best BASE_REGS, allocno GENERAL_REGS
    r19: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
    a0 (r19,l0) best GENERAL_REGS, allocno GENERAL_REGS
    r18: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
    a1 (r18,l0) best GENERAL_REGS, allocno GENERAL_REGS

  a0(r19,l0) costs: LOW_8BIT_REGS:0 BASE_REGS:0 GENERAL_REGS:0 ALL_REGS:0 MEM:8
  a1(r18,l0) costs: LOW_8BIT_REGS:0 BASE_REGS:0 GENERAL_REGS:0 ALL_REGS:0 MEM:8
  a2(r21,l0) costs: LOW_8BIT_REGS:2 BASE_REGS:0 GENERAL_REGS:4 ALL_REGS:4 MEM:8

Here IRA chooses GENERAL_REGS over BASE_REGS (the preferred class) for
the r21 pseudo. I'm looking for the cause in our backend, but meanwhile
if anyone in the group can share their experience with this, it will
help me solve the issue asap.

Thank you
~Umesh

On Wed, Mar 12, 2014 at 7:30 PM, Umesh Kalappa wrote:
> Hi All,
>
> We are porting gcc 4.8.1 to a new target which has paired 16-bit
> registers such as AB, CD or EF. We modeled them in reg_class as
> AB, CD and EF 16-bit pair_regs, CD and EF as 16-bit base_regs, and
> A, B, C, D, E and F as 8-bit general_regs.
>
> We are stuck on the issues below:
>
> 1) How do we model this so that the register allocator picks the
> respective base_regs, i.e. CD or EF instead of AB, as shown in the
> case below?
>
> LD AB, _a;      // invalid; it should emit LD CD, _a instead
>
> LD (AB), #100;  // invalid; it should emit LD (CD), #100 instead
>
>
> Please note that we override target hooks such as REGNO_REG_CLASS,
> but still no luck here.
>
> 2) The target restricts multiplication to operands from the same
> register pair, like
>
> MUL A,B or MUL C,D or MUL E,F
>
> but not MUL A,C or MUL B,C etc., i.e. not across the pair_regs.
>
>
> If anyone can shed some light here, it would be much appreciated and
> a great help to us.
>
> Thank you for your patience
>
> ~Umesh
Re: [gsoc 2014] moving fold-const patterns to gimple
On Thu, Mar 13, 2014 at 4:44 PM, Richard Biener wrote:
> On Tue, Mar 11, 2014 at 12:20 PM, Richard Biener
> wrote:
>> On Mon, Mar 10, 2014 at 7:29 PM, Prathamesh Kulkarni
>> wrote:
>>> Hi Richard,
>>> Sorry for the late reply. I would like to have few clarifications
>>> regarding the following points:
>>>
>>> a) Pattern matching: Currently, gimple_match_and_simplify() matches
>>> patterns one-by-one. Could we use a decision tree to do the matching
>>> instead (similar to insn-recog.c) ?
>>> For the moment, let's consider pattern matching on only unary
>>> expressions without valueize and predicates:
>>> pattern 1: (negate (negate @0))
>>> pattern 2: (negate (bit_not @0))
>>>
>>> from the two AST's corresponding to patterns (match::op), we can build
>>> a decision tree:
>>> Something similar to:
>>>            NEGATE_EXPR
>>>   NEGATE_EXPR    BIT_NOT_EXPR
>>>
>>> and then generate code corresponding to this decision tree in gimple-match.c
>>> so the generated code should look something similar to:
>>>
>>> tree
>>> gimple_match_and_simplify (enum tree_code code, tree type, tree op0,
>>> gimple_seq *seq, tree (*valueize)(tree))
>>> {
>>>   if (code == NEGATE_EXPR)
>>>     {
>>>       tree captures[4] = {};
>>>       if (TREE_CODE (op0) != SSA_NAME)
>>>         return NULL_TREE;
>>>       gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>       if (!is_gimple_assign (def_stmt))
>>>         return NULL_TREE;
>>>       tree op = gimple_assign_rhs1 (def_stmt);
>>>       if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>         {
>>>            /* pattern (negate (negate @0)) matched */
>>>         }
>>>       else if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>         {
>>>            /* pattern (negate (bit_not_expr @0)) matched */
>>>         }
>>>       else
>>>            return NULL_TREE;
>>>     }
>>>   else
>>>      return NULL_TREE;
>>> }
>>>
>>> For commutative ops, the pattern can be duplicated by walking the
>>> children of the node in reverse order.
>>> (I am not exactly clear so far about representing binary operators in a
>>> decision
>>> tree) Is this the right way to go ? I shall try to shortly post a patch that
>>> implements this.
>>
>> Yes, that's the way to go (well, I'd even use a switch ()).
>>
>>> b) Targeting GENERIC, separating AST from gimple/generic:
>>> For generating a GENERIC pattern should there be another pattern
>>> something like match_and_simplify_generic ?
>>
>> Yes, there is an existing API in GCC for this that operates on GENERIC.
>> It's fold_unary_loc, fold_binary_loc, fold_ternary_loc. The interface
>> the GENERIC match_and_simplify variant provides should match
>> that one.
>>
>>> Currently, the AST data structures (operand, expr, etc.)
>>> are tied to gimple (gen_gimple_match, gen_gimple_transform).
>>> We could also have similar functions: gen_generic_match,
>>> gen_generic_transform for generating GENERIC ?
>>
>> Yeah, but I'm not sure if keeping the (virtual) methods for generating
>> code makes much sense with a rewritten code generator.
>>
>>> Instead will it be better if we separate the AST
>>> from target IR (generic/gimple) and make simplify a visitor on AST
>>> (make simplify
>>> abstract class, with simplify_generic and simplify_gimple visitor
>>> classes that generate corresponding IR code) ?
>>
>> Yes. Keep in mind the current state of genmatch.c is "quick hack
>> to make playing with the API side and with patterns possible" ;)
>>
>>> c) Shall it be a good idea in define_match , for
>>> name to act as a substitute for pattern (similar to flex pattern
>>> definitions), so the name can be used in bigger patterns ?
>>
>> Maybe, I suppose we'll see when adding more patterns.
>>
>>> d) This is silly, but maybe use constants to denote corresponding tree
>>> nodes ?
>>> for example instead of { build_int_cst (integer_type_node, 0); }, one
>>> could directly write 0, to denote a INTEGER_CST node with value 0.
>>
>> Yes, that might be possible - though it will require more knowledge
>> in the pattern matcher (you also want to match '0'?) and the code
>> generator.
>>
>>> e) There was a mention on the thread, regarding testing of patterns
>>> integrated into DSL. I wasn't able to understand that clearly. Could
>>> you explain that briefly ?
>>
>> DSL? Currently I'd say it would be nice to make sure each pattern
>> is triggered by at least one GCC testcase - this requires looking
>> at a particular pass dump (that of forwprop or ccp are probably most suitable
>> as they are run very early). I mentioned the possibility to do offline
>> (thus not with C testcases) testing but that would require some tool
>> to do that and it would be correctness testing (some automatic proof
>> generation tool - ISTR academics have this kind of stuff). But that was
>> just an idea.
>>
>>> Regarding gsoc proposal, I would like to align it on the following points:
>>> a) Pattern matching using decision tree
>>
>> good.
>>
>>> b) Generate GIMPLE fold
Re: [gsoc 2014] moving fold-const patterns to gimple
On Fri, Mar 14, 2014 at 9:01 PM, Prathamesh Kulkarni wrote:
> On Thu, Mar 13, 2014 at 4:44 PM, Richard Biener
> wrote:
>> On Tue, Mar 11, 2014 at 12:20 PM, Richard Biener
>> wrote:
>>> On Mon, Mar 10, 2014 at 7:29 PM, Prathamesh Kulkarni
>>> wrote:
>>>> Hi Richard,
>>>> Sorry for the late reply. I would like to have few clarifications
>>>> regarding the following points:
>>>>
>>>> a) Pattern matching: Currently, gimple_match_and_simplify() matches
>>>> patterns one-by-one. Could we use a decision tree to do the matching
>>>> instead (similar to insn-recog.c) ?
>>>> For the moment, let's consider pattern matching on only unary
>>>> expressions without valueize and predicates:
>>>> pattern 1: (negate (negate @0))
>>>> pattern 2: (negate (bit_not @0))
>>>>
>>>> from the two AST's corresponding to patterns (match::op), we can build
>>>> a decision tree:
>>>> Something similar to:
>>>>            NEGATE_EXPR
>>>>   NEGATE_EXPR    BIT_NOT_EXPR
>>>>
>>>> and then generate code corresponding to this decision tree in gimple-match.c
>>>> so the generated code should look something similar to:
>>>>
>>>> tree
>>>> gimple_match_and_simplify (enum tree_code code, tree type, tree op0,
>>>> gimple_seq *seq, tree (*valueize)(tree))
>>>> {
>>>>   if (code == NEGATE_EXPR)
>>>>     {
>>>>       tree captures[4] = {};
>>>>       if (TREE_CODE (op0) != SSA_NAME)
>>>>         return NULL_TREE;
>>>>       gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>       if (!is_gimple_assign (def_stmt))
>>>>         return NULL_TREE;
>>>>       tree op = gimple_assign_rhs1 (def_stmt);
>>>>       if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>>         {
>>>>            /* pattern (negate (negate @0)) matched */
>>>>         }
>>>>       else if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>>         {
>>>>            /* pattern (negate (bit_not_expr @0)) matched */
>>>>         }
>>>>       else
>>>>            return NULL_TREE;
>>>>     }
>>>>   else
>>>>      return NULL_TREE;
>>>> }
>>>>
>>>> For commutative ops, the pattern can be duplicated by walking the
>>>> children of the node in reverse order.
>>>> (I am not exactly clear so far about representing binary operators in a
>>>> decision tree) Is this the right way to go ? I shall try to shortly
>>>> post a patch that implements this.
>>>
>>> Yes, that's the way to go (well, I'd even use a switch ()).
>>>
>>>> b) Targeting GENERIC, separating AST from gimple/generic:
>>>> For generating a GENERIC pattern should there be another pattern
>>>> something like match_and_simplify_generic ?
>>>
>>> Yes, there is an existing API in GCC for this that operates on GENERIC.
>>> It's fold_unary_loc, fold_binary_loc, fold_ternary_loc. The interface
>>> the GENERIC match_and_simplify variant provides should match
>>> that one.
>>>
>>>> Currently, the AST data structures (operand, expr, etc.)
>>>> are tied to gimple (gen_gimple_match, gen_gimple_transform).
>>>> We could also have similar functions: gen_generic_match,
>>>> gen_generic_transform for generating GENERIC ?
>>>
>>> Yeah, but I'm not sure if keeping the (virtual) methods for generating
>>> code makes much sense with a rewritten code generator.
>>>
>>>> Instead will it be better if we separate the AST
>>>> from target IR (generic/gimple) and make simplify a visitor on AST
>>>> (make simplify abstract class, with simplify_generic and
>>>> simplify_gimple visitor classes that generate corresponding IR code) ?
>>>
>>> Yes. Keep in mind the current state of genmatch.c is "quick hack
>>> to make playing with the API side and with patterns possible" ;)
>>>
>>>> c) Shall it be a good idea in define_match , for
>>>> name to act as a substitute for pattern (similar to flex pattern
>>>> definitions), so the name can be used in bigger patterns ?
>>>
>>> Maybe, I suppose we'll see when adding more patterns.
>>>
>>>> d) This is silly, but maybe use constants to denote corresponding tree
>>>> nodes ?
>>>> for example instead of { build_int_cst (integer_type_node, 0); }, one
>>>> could directly write 0, to denote a INTEGER_CST node with value 0.
>>>
>>> Yes, that might be possible - though it will require more knowledge
>>> in the pattern matcher (you also want to match '0'?) and the code
>>> generator.
>>>
>>>> e) There was a mention on the thread, regarding testing of patterns
>>>> integrated into DSL. I wasn't able to understand that clearly. Could
>>>> you explain that briefly ?
>>>
>>> DSL? Currently I'd say it would be nice to make sure each pattern
>>> is triggered by at least one GCC testcase - this requires looking
>>> at a particular pass dump (that of forwprop or ccp are probably most
>>> suitable
>>> as they are run very early). I mentioned the possibility to do offline
>>> (thus not with C testcases) testing but that would require some tool
>>> to do that and it would be correctness testing (some automatic proof
>>> generation tool - ISTR academics have this kind of stuff). Bu
Re: [gsoc 2014] moving fold-const patterns to gimple
On Fri, 14 Mar 2014, Prathamesh Kulkarni wrote:

> I had a look at PR 14753
> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14753) from the first
> link. I have tried to implement those transforms (attached patch,
> stage-1 compiled).
> I have written the transforms to operate on GENERIC.

Why not directly gimple or the .pd file?

> Is that correct ?
> The patterns mentioned in the links were:
> a) (X >> CST1) >= CST2 -> X >= CST2 << CST1
> however, an expression Y >= CST gets folded to Y > CST - 1
> so the transform I wrote:
> (X >> CST1) > CST2 -> X > CST2 << CST1

That's not the same, try X=1, CST1=1, CST2=0.

> b) (X & ~CST) == 0 -> X <= CST

Uh, that can't be true for all constants, only some with a very specific
shape (7 is 2^3-1).

--
Marc Glisse
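
The two objections can be checked mechanically. The following is a small sketch, assuming unsigned operands and using made-up helper names; each function reports whether the original expression and the proposed rewrite give the same answer for one set of values.

```c
#include <assert.h>
#include <stdbool.h>

/* Compares (X >> CST1) > CST2 with the proposed X > (CST2 << CST1).
   Returns true when both sides agree for these values.  */
bool
naive_shift_rewrite_agrees (unsigned x, unsigned cst1, unsigned cst2)
{
  assert (cst1 < 16);  /* keep the shift count below the type width */
  bool before = (x >> cst1) > cst2;
  bool after = x > (cst2 << cst1);
  return before == after;
}

/* Compares (X & ~CST) == 0 with the proposed X <= CST.
   Returns true when both sides agree for these values.  */
bool
mask_rewrite_agrees (unsigned x, unsigned cst)
{
  bool before = (x & ~cst) == 0;
  bool after = x <= cst;
  return before == after;
}
```

With X=1, CST1=1, CST2=0 the first helper returns false: (1 >> 1) > 0 is false while 1 > 0 is true. The second helper agrees for CST=7 (for example X=5 or X=8) but returns false for X=2, CST=5, since 2 & ~5 is nonzero even though 2 <= 5.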
Re: Legitimize address after reload
On 03/14/14 05:52, David Guillen wrote:
> Hello,
>
> I'm writing a simple gcc backend and I'm seeing something odd in the
> address legitimization process. Two scenarios:
>
> If I only allow addresses to be either a register or a symbol, my gcc
> works. To do so I add the restrictions to the
> TARGET_LEGITIMATE_ADDRESS_P macro. This makes gcc force every address
> into a register.
>
> If I also allow a 'PLUS' expression to be a valid address (adding the
> restriction that the two addends are a register and a constant) it
> happens (sometimes) that gcc comes up with an expression like this
> one:
>
> (plus:SI (plus:SI (reg:SI somereg)
>                   (const_int 4))
>          (const_int 8))
>
> After taking a look at the 386 backend (and others) I just discovered
> that there is a function called LEGITIMIZE_RELOAD_ADDRESS which is
> responsible for handling this case. My issue is that this function is
> not being called and, from what I saw while debugging, it seems that
> the offending RTX expression is created after the address_reload pass,
> and thus impossible for this pass to legitimize the address.
>
> Looking at other architectures it seems that they are doing more or
> less the same, so I don't know what the issue might be.

LEGITIMIZE_RELOAD_ADDRESS is a hook that allows the target to rewrite
invalid addresses during the reload pass in such a way that reload can
generate the address more efficiently than the generic code in reload
can do.

It should always be safe for LEGITIMIZE_RELOAD_ADDRESS to do nothing.
For your problem I'm sure it's a total red herring.

Jeff
Re: [gsoc 2014] moving fold-const patterns to gimple
On Fri, Mar 14, 2014 at 9:25 PM, Marc Glisse wrote:
> On Fri, 14 Mar 2014, Prathamesh Kulkarni wrote:
>
>> I had a look at PR 14753
>> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14753) from the first
>> link. I have tried to implement those transforms (attached patch,
>> stage-1 compiled).
>> I have written the transforms to operate on GENERIC.
>
>
> Why not directly gimple or the .pd file?
>
>
>> Is that correct ?
>> The patterns mentioned in the links were:
>> a) (X >> CST1) >= CST2 -> X >= CST2 << CST1
>> however, an expression Y >= CST gets folded to Y > CST - 1
>> so the transform I wrote:
>> (X >> CST1) > CST2 -> X > CST2 << CST1
>
>
> That's not the same, try X=1, CST1=1, CST2=0.

Ah yes. Would the following be correct ?
(X >> CST1) > CST2 -> X > ( (CST2 + 1) << CST1 ) - 1
Works correctly for X=1, CST1 = 1, CST2 = 0

(X >> CST1) > CST2
=> (X >> CST1) >= (CST2 + 1) // this pattern is mentioned in PR
=> X >= (CST2 + 1) << CST1
=> X > ((CST2 + 1) << CST1) - 1

>
>
>> b) (X & ~CST) == 0 -> X <= CST
>
>
> Uh, that can't be true for all constants, only some with a very specific
> shape (7 is 2^3-1).

Agreed. Shall the pattern be folded if CST is 2^(n-1) ?

>
> --
> Marc Glisse
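
A brute-force check over small values is a cheap way to gain confidence in both points. This is a sketch only: it assumes unsigned X and constants small enough that (CST2 + 1) << CST1 cannot overflow, and it reads the "specific shape" for pattern b) as CST = 2^n - 1, the form Marc's example (7 = 2^3 - 1) has. All function names are invented here.

```c
#include <assert.h>
#include <stdbool.h>

/* True when cst is 0, 1, 3, 7, 15, ..., i.e. of the form 2^n - 1.  */
bool
is_low_bit_mask (unsigned cst)
{
  return (cst & (cst + 1u)) == 0;
}

/* Counts (x, cst1, cst2) triples for which (x >> cst1) > cst2 and
   x > ((cst2 + 1) << cst1) - 1 give different answers.  */
unsigned long
count_shift_mismatches (unsigned limit)
{
  assert (limit <= 256);  /* keeps (cst2 + 1) << cst1 far from overflow */
  unsigned long bad = 0;
  for (unsigned x = 0; x < limit; x++)
    for (unsigned cst1 = 0; cst1 < 8; cst1++)
      for (unsigned cst2 = 0; cst2 < limit; cst2++)
        {
          bool before = (x >> cst1) > cst2;
          bool after = x > ((cst2 + 1u) << cst1) - 1u;
          if (before != after)
            bad++;
        }
  return bad;
}

/* Counts constants for which "(x & ~cst) == 0 equals x <= cst for every
   tested x" and "cst has the form 2^n - 1" disagree.  */
unsigned long
count_mask_shape_mismatches (unsigned limit)
{
  assert (limit <= 256);
  unsigned long bad = 0;
  for (unsigned cst = 0; cst < limit; cst++)
    {
      bool valid_for_all_x = true;
      for (unsigned x = 0; x < 2 * limit; x++)
        if (((x & ~cst) == 0) != (x <= cst))
          valid_for_all_x = false;
      if (valid_for_all_x != is_low_bit_mask (cst))
        bad++;
    }
  return bad;
}
```

Over the tested range both counters come out as zero: the corrected shift rewrite agrees with the original comparison, and the mask rewrite is valid exactly for constants of the form 2^n - 1. Note that a single power of two such as 4 does not pass `is_low_bit_mask`. Signed operands and overflow of the shifted constant would need separate treatment in a real fold.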
Re: [gsoc 2014] moving fold-const patterns to gimple
On Fri, Mar 14, 2014 at 4:31 PM, Prathamesh Kulkarni wrote:
> On Thu, Mar 13, 2014 at 4:44 PM, Richard Biener wrote:
>> On Tue, Mar 11, 2014 at 12:20 PM, Richard Biener wrote:
>>> On Mon, Mar 10, 2014 at 7:29 PM, Prathamesh Kulkarni wrote:
>>>> Hi Richard,
>>>> Sorry for the late reply. I would like to have a few clarifications
>>>> regarding the following points:
>>>>
>>>> a) Pattern matching: Currently, gimple_match_and_simplify() matches
>>>> patterns one-by-one. Could we use a decision tree to do the matching
>>>> instead (similar to insn-recog.c) ?
>>>> For the moment, let's consider pattern matching on only unary
>>>> expressions without valueize and predicates:
>>>> pattern 1: (negate (negate @0))
>>>> pattern 2: (negate (bit_not @0))
>>>>
>>>> from the two AST's corresponding to patterns (match::op), we can
>>>> build a decision tree. Something similar to:
>>>>
>>>>          NEGATE_EXPR
>>>>         /           \
>>>>   NEGATE_EXPR   BIT_NOT_EXPR
>>>>
>>>> and then generate code corresponding to this decision tree in
>>>> gimple-match.c, so the generated code should look something similar to:
>>>>
>>>> tree
>>>> gimple_match_and_simplify (enum tree_code code, tree type, tree op0,
>>>>                            gimple_seq *seq, tree (*valueize)(tree))
>>>> {
>>>>   if (code == NEGATE_EXPR)
>>>>     {
>>>>       tree captures[4] = {};
>>>>       if (TREE_CODE (op0) != SSA_NAME)
>>>>         return NULL_TREE;
>>>>       gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>       if (!is_gimple_assign (def_stmt))
>>>>         return NULL_TREE;
>>>>       tree op = gimple_assign_rhs1 (def_stmt);
>>>>       if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>>         {
>>>>           /* pattern (negate (negate @0)) matched */
>>>>         }
>>>>       else if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>>         {
>>>>           /* pattern (negate (bit_not @0)) matched */
>>>>         }
>>>>       else
>>>>         return NULL_TREE;
>>>>     }
>>>>   else
>>>>     return NULL_TREE;
>>>> }
>>>>
>>>> For commutative ops, the pattern can be duplicated by walking the
>>>> children of the node in reverse order.
>>>> (I am not exactly clear so far about representing binary operators in
>>>> a decision tree)
>>>> Is this the right way to go ?
>>>> I shall try to shortly post a patch that implements this.
>>>
>>> Yes, that's the way to go (well, I'd even use a switch ()).
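As a rough illustration of the decision-tree idea and the "use a switch ()" suggestion above, the two unary patterns can be dispatched with nested switch statements so that the shared outer NEGATE test is done only once. This is a minimal sketch on a toy expression type: toy_expr, toy_code, toy_match and toy_match_unary are hypothetical names invented for the example, not GCC's tree or gimple API, and no simplification is actually performed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for tree codes; NOT GCC's enum tree_code.  */
enum toy_code { TOY_LEAF, TOY_NEGATE, TOY_BIT_NOT };

struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op0;   /* NULL for leaves */
};

enum toy_match { MATCH_NONE, MATCH_NEGATE_NEGATE, MATCH_NEGATE_BIT_NOT };

/* Walk the decision tree: the root tests the outer code once, then
   each child tests the code of the operand.  On a match, *capture is
   set to the operand bound to @0; otherwise it is set to NULL.  */
static enum toy_match
toy_match_unary (const struct toy_expr *e, const struct toy_expr **capture)
{
  assert (e != NULL && capture != NULL);
  *capture = NULL;

  switch (e->code)
    {
    case TOY_NEGATE:
      switch (e->op0->code)
        {
        case TOY_NEGATE:        /* (negate (negate @0)) */
          *capture = e->op0->op0;
          return MATCH_NEGATE_NEGATE;
        case TOY_BIT_NOT:       /* (negate (bit_not @0)) */
          *capture = e->op0->op0;
          return MATCH_NEGATE_BIT_NOT;
        default:
          return MATCH_NONE;
        }
    default:
      return MATCH_NONE;
    }
}
```

Adding a third pattern that starts with negate would only add one more case to the inner switch, which is the point of sharing the common prefix instead of matching patterns one by one.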
>>>> b) Targeting GENERIC, separating AST from gimple/generic:
>>>> For generating a GENERIC pattern should there be another pattern
>>>> something like match_and_simplify_generic ?
>>>
>>> Yes, there is an existing API in GCC for this that operates on GENERIC.
>>> It's fold_unary_loc, fold_binary_loc, fold_ternary_loc. The interface
>>> the GENERIC match_and_simplify variant provides should match
>>> that one.
>>>
>>>> Currently, the AST data structures (operand, expr, etc.) are tied to
>>>> gimple (gen_gimple_match, gen_gimple_transform). We could also have
>>>> similar functions: gen_generic_match, gen_generic_transform for
>>>> generating GENERIC ?
>>>
>>> Yeah, but I'm not sure if keeping the (virtual) methods for generating
>>> code makes much sense with a rewritten code generator.
>>>
>>>> Instead will it be better if we separate the AST from target IR
>>>> (generic/gimple) and make simplify a visitor on AST (make simplify
>>>> abstract class, with simplify_generic and simplify_gimple visitor
>>>> classes that generate corresponding IR code) ?
>>>
>>> Yes. Keep in mind the current state of genmatch.c is "quick hack
>>> to make playing with the API side and with patterns possible" ;)
>>>
>>>> c) Shall it be a good idea in define_match, for name to act as a
>>>> substitute for pattern (similar to flex pattern definitions), so the
>>>> name can be used in bigger patterns ?
>>>
>>> Maybe, I suppose we'll see when adding more patterns.
>>>
>>>> d) This is silly, but maybe use constants to denote corresponding
>>>> tree nodes ? for example instead of
>>>> { build_int_cst (integer_type_node, 0); }, one could directly write 0,
>>>> to denote an INTEGER_CST node with value 0.
>>>
>>> Yes, that might be possible - though it will require more knowledge
>>> in the pattern matcher (you also want to match '0'?) and the code
>>> generator.
>>>
>>>> e) There was a mention on the thread, regarding testing of patterns
>>>> integrated into DSL. I wasn't able to understand that clearly. Could
>>>> you explain that briefly ?
>>>
>>> DSL?
>>> Currently I'd say it would be nice to make sure each pattern
>>> is triggered by at least one GCC testcase - this requires looking
>>> at a particular pass dump (that of forwprop or ccp are probably most
>>> suitable as they are run very early). I mentioned the possibility to do
>>> offline (thus not with C testcases) testing but that would require some
>>> tool to do that and it would be correctness testing (some automatic
>>> proof generation tool - ISTR academics have this kind of stuff). Bu
Re: [gsoc 2014] moving fold-const patterns to gimple
On Fri, 14 Mar 2014, Prathamesh Kulkarni wrote:
> On Fri, Mar 14, 2014 at 9:25 PM, Marc Glisse wrote:
>> On Fri, 14 Mar 2014, Prathamesh Kulkarni wrote:
>>> The patterns mentioned in the links were:
>>> a) (X >> CST1) >= CST2 -> X >= CST2 << CST1
>>> however, an expression Y >= CST gets folded to Y > CST - 1
>>> so the transform I wrote:
>>> (X >> CST1) > CST2 -> X > CST2 << CST1
>>
>> That's not the same, try X=1, CST1=1, CST2=0.
>
> Ah yes. Shall the following be correct?
> (X >> CST1) > CST2 -> X > ((CST2 + 1) << CST1) - 1
> It works correctly for X=1, CST1=1, CST2=0.

Looks better. Though there is still the case where the new constant overflows, in which case we can fold the comparison to false.

>>> b) (X & ~CST) == 0 -> X <= CST
>>
>> Uh, that can't be true for all constants, only some with a very
>> specific shape (7 is 2^3-1).
>
> Agreed. Shall the pattern be folded if CST is 2^(n-1) ?

Wrong parentheses. And I didn't really think about it, so that may not be the right test.

I think it would be a good idea to write, in comments, next to each non-trivial transformation, a short "proof" (at least some form of explanation). It would help people re-reading it later see quickly why the conditions are what they are.

--
Marc Glisse
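In the spirit of the short "proof" suggested above, the transforms in this thread can at least be checked exhaustively on a small model. The sketch below assumes X is an 8-bit unsigned value and computes the new constant in a wider type so that overflow is visible; it says nothing about signed operands. All function names and the TOY_MAX macro are made up for the example.

```c
#include <assert.h>

#define TOY_MAX 0xffu   /* assumption: X is modelled as 8-bit unsigned */

/* First attempt: (X >> c1) > c2  ->  X > (c2 << c1).  */
static int
naive_shift_rewrite_holds (unsigned c1, unsigned c2)
{
  assert (c1 < 8 && c2 <= TOY_MAX);
  for (unsigned x = 0; x <= TOY_MAX; x++)
    if (((x >> c1) > c2) != (x > (c2 << c1)))
      return 0;
  return 1;
}

/* Corrected form: (X >> c1) > c2  ->  X > ((c2 + 1) << c1) - 1,
   folded to false when the new constant does not fit in 8 bits.  */
static int
shift_rewrite_holds (unsigned c1, unsigned c2)
{
  assert (c1 < 8 && c2 <= TOY_MAX);
  unsigned bound = (c2 + 1u) << c1;     /* wider than 8 bits */
  for (unsigned x = 0; x <= TOY_MAX; x++)
    {
      int original = (x >> c1) > c2;
      int rewritten = bound > TOY_MAX ? 0 : x > bound - 1;
      if (original != rewritten)
        return 0;
    }
  return 1;
}

/* (X & ~cst) == 0  ->  X <= cst, checked for one given constant.  */
static int
mask_rewrite_holds (unsigned cst)
{
  assert (cst <= TOY_MAX);
  for (unsigned x = 0; x <= TOY_MAX; x++)
    if (((x & ~cst & TOY_MAX) == 0) != (x <= cst))
      return 0;
  return 1;
}
```

On this model the naive shift rewrite fails for CST1=1, CST2=0 (the counterexample X=1 given in the thread), the corrected one holds for every pair of constants, and the mask rewrite holds for CST=7 but not for CST=5. A brute-force check like this is only evidence on a narrow type, not a replacement for the written explanation next to the pattern.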
Re: Legitimize address after reload
David Guillen writes:
> In any case I'm not using the strict variable and I'm assuming
> strict is zero, that is, not checking the hard registers themselves.
> This is because any reg is OK for base reg. I'm pretty sure I'm
> behaving similarly to arm, cris or x86 backends.

"strict" doesn't mean which hard register it is; "strict" means whether or not it's a hard register at all. If "strict" is true, you must assume any REG which isn't a real hard register (i.e. REGNO >= FIRST_PSEUDO_REGISTER) does NOT match.
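The rule above, combined with the address shapes described earlier in the thread (a register, a symbol, or register plus constant), can be sketched roughly as follows. This is a self-contained toy model, not code for a real port: toy_rtx, toy_base_reg_ok, toy_legitimate_address_p and the value 16 for TOY_FIRST_PSEUDO_REGISTER are all assumptions made for illustration, and a real backend would work on GCC's rtx type and its own FIRST_PSEUDO_REGISTER.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical value; a real port defines FIRST_PSEUDO_REGISTER.  */
#define TOY_FIRST_PSEUDO_REGISTER 16

enum toy_rtx_code { TOY_REG, TOY_CONST_INT, TOY_SYMBOL_REF, TOY_PLUS };

struct toy_rtx
{
  enum toy_rtx_code code;
  unsigned regno;                       /* used by TOY_REG */
  long value;                           /* used by TOY_CONST_INT */
  const struct toy_rtx *op0, *op1;      /* used by TOY_PLUS */
};

/* Non-strict: any REG is an acceptable base.
   Strict: only a real hard register matches; pseudos do NOT.  */
static bool
toy_base_reg_ok (const struct toy_rtx *x, bool strict)
{
  if (x->code != TOY_REG)
    return false;
  return !strict || x->regno < TOY_FIRST_PSEUDO_REGISTER;
}

/* Accept reg, symbol_ref, or (plus reg const_int).  A nested
   (plus (plus reg const) const) is rejected because op0 is not a REG.  */
static bool
toy_legitimate_address_p (const struct toy_rtx *x, bool strict)
{
  assert (x != NULL);
  switch (x->code)
    {
    case TOY_REG:
      return toy_base_reg_ok (x, strict);
    case TOY_SYMBOL_REF:
      return true;
    case TOY_PLUS:
      return toy_base_reg_ok (x->op0, strict)
             && x->op1->code == TOY_CONST_INT;
    default:
      return false;
    }
}
```

With this model, (plus (reg 100) (const_int 4)) is accepted when strict is false and rejected when strict is true, because register 100 is a pseudo in the toy numbering. Ignoring the strict argument would accept it in both cases, which is the behaviour the reply warns against.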
Integration of ISL code generator into Graphite
Dear gcc contributors,

I am going to try to participate in Google Summer of Code 2014. My project is "Integration of ISL code generator into Graphite". My proposal can be found at the following link: https://drive.google.com/file/d/0B2Wloo-931AoTWlkMzRobmZKT1U/edit?usp=sharing . I would be very grateful for your comments, feedback and ideas about its improvement.

- Roman Gareev
Re: Integration of ISL code generator into Graphite
On 03/14/2014 09:21 PM, Roman Gareev wrote:
> Dear gcc contributors,
>
> I am going to try to participate in Google Summer of Code 2014. My
> project is "Integration of ISL code generator into Graphite". My
> proposal can be found at the following link:
> https://drive.google.com/file/d/0B2Wloo-931AoTWlkMzRobmZKT1U/edit?usp=sharing .
> I would be very grateful for your comments, feedback and ideas about
> its improvement.

Thanks Roman, I will have a look later on. In the meantime, please make sure you register your proposal in Google Melange now. You can always upload better/improved versions later.

Thanks,
Tobias
PLEASE RE-ADD MIRRORS
Hello,

We previously had these same mirrors up under Go-Part.com but then changed our domain to Go-Parts.com. The mirror links then dropped off. We apologize deeply for this, and assure you that this is a one-time event. Going forward, the mirrors will stay up for a very long time to come, and are being served from very reliable and fast servers, and being monitored and maintained by a very competent server admin team.

PLEASE ADD:

(USA)
http://mirrors-usa.go-parts.com/gcc
ftp://mirrors.go-parts.com/gcc
rsync://mirrors.go-parts.com/gcc

(Australia)
http://mirrors-au.go-parts.com/gcc
ftp://mirrors-au.go-parts.com/gcc
rsync://mirrors-au.go-parts.com/gcc

(Russia)
http://mirrors-ru.go-parts.com/gcc
ftp://mirrors-ru.go-parts.com/gcc
rsync://mirrors-ru.go-parts.com/gcc

Thanks,
Dan
PLEASE RE-ADD MIRRORS (small correction)
I made a small mistake below on the ftp/rsync mirrors for the USA mirror. They should be:

(USA)
http://mirrors-usa.go-parts.com/gcc
ftp://mirrors-usa.go-parts.com/gcc
rsync://mirrors-usa.go-parts.com/gcc

> From: dan1...@msn.com
> To: gcc@gcc.gnu.org
> Subject: PLEASE RE-ADD MIRRORS
> Date: Fri, 14 Mar 2014 16:53:22 -0700
>
> Hello,
>
> We previously had these same mirrors up under Go-Part.com but then changed
> our domain to Go-Parts.com. The mirror links then dropped off. We apologize
> deeply for this, and assure you that this is a one-time event. Going forward,
> the mirrors will stay up for a very long time to come, and are being served
> from very reliable and fast servers, and being monitored and maintained by a
> very competent server admin team.
>
> PLEASE ADD:
>
> (USA)
> http://mirrors-usa.go-parts.com/gcc
> ftp://mirrors.go-parts.com/gcc
> rsync://mirrors.go-parts.com/gcc
>
> (Australia)
> http://mirrors-au.go-parts.com/gcc
> ftp://mirrors-au.go-parts.com/gcc
> rsync://mirrors-au.go-parts.com/gcc
>
> (Russia)
> http://mirrors-ru.go-parts.com/gcc
> ftp://mirrors-ru.go-parts.com/gcc
> rsync://mirrors-ru.go-parts.com/gcc
>
> Thanks,
> Dan