Insn missing in Size optimization(-Os)
Hi,

I have hit a bug in my GCC port. The program works fine when compiled with -O0, but with -Os an insn goes missing. I dumped the RTL and checked it: the insn is still present in movebug.c.175r.lreg, but in the next dump, movebug.c.176r.greg, it is gone.

Here is the simple C program (movebug.c):

void fun() __attribute__((noinline));

void fun()
{
    volatile int a=0;
}

int main()
{
    int i;
    for(i=2; i<16; ++i)
    {
        fun();
    }
    return 0;
}

In *.175r.lreg, the RTL code is:

(insn:HI 6 3 8 2 movebug.c:8 (set (reg/v:SI 37 [ i ])
        (const_int 2 [0x2])) 2 {constant_load_si} (expr_list:REG_EQUAL (const_int 2 [0x2])
        (nil)))

(insn:HI 8 6 11 2 movebug.c:12 (set (reg/f:SI 42)
        (symbol_ref:SI ("fun") [flags 0x3])) 15 {symbolic_address_load} (expr_list:REG_EQUIV (symbol_ref:SI ("fun") [flags 0x3])
        (nil)))

(code_label:HI 11 8 7 3 4 "" [1 uses])

(note:HI 7 11 10 3 [bb 3] NOTE_INSN_BASIC_BLOCK)

(insn:HI 10 7 9 3 movebug.c:10 (set (reg/v:SI 37 [ i ])
        (plus:SI (reg/v:SI 37 [ i ])
            (const_int 1 [0x1]))) 45 {rice_addsi3} (nil))

(call_insn:HI 9 10 13 3 movebug.c:12 (parallel [
            (call (mem:SI (reg/f:SI 42) [0 S4 A32])
                (const_int 0 [0x0]))
            (clobber (reg:SI 17 LINK))
        ]) 99 {call} (expr_list:REG_EH_REGION (const_int 0 [0x0])
        (nil))
    (nil))

The address of "fun" is first loaded into a register (pseudo 42), and the CALL insn then refers to that register to call the function.

The *.sched1 dump gives the data flow:

;; ==
;; -- basic block 2 from 6 to 8 -- before reload
;; ==
;;0--> 6 r37=0x2 :nothing
;;1--> 8 r42=`fun' :nothing
;; Ready list (final):
;; total time = 1
;; new head = 6
;; new tail = 8

;; ==
;; -- basic block 3 from 9 to 13 -- before reload
;; ==
changing bb of uid 10
;;0-->10 r37=r37+0x1 :nothing
;;1--> 9 call [r42]:nothing
;;2-->13 pc={(r37!=0x10)?L11:pc} :nothing
;; Ready list (final):
;; total time = 2
;; new head = 10
;; new tail = 13

;; ==
;; -- basic block 4 from 19 to 25 -- before reload
;; ==
;;0-->19 R2=0x0:nothing
;;1-->25 use R2:nothing
;; Ready list (final):
;; total time = 1
;; new head = 19
;; new tail = 25

But in movebug.c.176r.greg, the insn that moves the function address into a register has disappeared (insn 8 is now a NOTE_INSN_DELETED), while the call goes through R0. I don't know why this step removes the insn, since without it the generated code is wrong.

(insn:HI 6 3 37 2 movebug.c:8 (set (reg:SI 4 R4)
        (const_int 2 [0x2])) 2 {constant_load_si} (expr_list:REG_EQUAL (const_int 2 [0x2])
        (nil)))

(insn 37 6 8 2 movebug.c:8 (set (mem/c:SI (reg/f:SI 15 R15) [3 i+0 S4 A32])
        (reg:SI 4 R4)) 8 {store_si} (nil))

(note:HI 8 37 11 2 NOTE_INSN_DELETED)

(code_label:HI 11 8 7 3 4 "" [1 uses])

(note:HI 7 11 38 3 [bb 3] NOTE_INSN_BASIC_BLOCK)

(insn 38 7 10 3 movebug.c:10 (set (reg:SI 4 R4)
        (mem/c:SI (reg/f:SI 15 R15) [3 i+0 S4 A32])) 11 {load_si} (nil))

(insn:HI 10 38 39 3 movebug.c:10 (set (reg:SI 4 R4)
        (plus:SI (reg:SI 4 R4)
            (const_int 1 [0x1]))) 45 {rice_addsi3} (nil))

(insn 39 10 9 3 movebug.c:10 (set (mem/c:SI (reg/f:SI 15 R15) [3 i+0 S4 A32])
        (reg:SI 4 R4)) 8 {store_si} (nil))

(call_insn:HI 9 39 40 3 movebug.c:12 (parallel [
            (call (mem:SI (reg:SI 0 R0) [0 S4 A32])
                (const_int 0 [0x0]))
            (clobber (reg:SI 17 LINK))
        ]) 99 {call} (expr_list:REG_EH_REGION (const_int 0 [0x0])
        (nil))
    (nil))

movebug.c.194r.sched2 also gives an overview of the flow:

;; ==
;; -- basic block 2 from 41 to 37 -- after reload
;; ==
;;0-->41 [R15-0x4]=R14 :nothing
;;1-->42 R0=LINK :nothing
changing bb of uid 44
;;2-->44 R15=R15-0xc :nothing
;;3-->43 [R14-0x8]=R0 :nothing
changing bb of uid 6
;;4--> 6 R4=0x2:nothing
;;5-->45 R14=R15 :nothing
;;6-->37 [R15]=R4 :nothing
;; Ready list (final):
;; total time = 6
;; new head = 46
;; new tail = 37
;;
Re: Insn missing in Size optimization(-Os)
Hi,

Some additional information I just found: the insn was deleted in the function void set_insn_deleted (rtx insn) in emit-rtl.c, which is called by reload() in reload1.c.

Here is the code in reload():

  /* If a pseudo has no hard reg, delete the insns that made the equivalence.
     If that insn didn't set the register (i.e., it copied the register to
     memory), just delete that insn instead of the equivalencing insn plus
     anything now dead.  If we call delete_dead_insn on that insn, we may
     delete the insn that actually sets the register if the register dies
     there and that is incorrect.  */

  for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
    {
      if (reg_renumber[i] < 0 && reg_equiv_init[i] != 0)
        {
          rtx list;
          for (list = reg_equiv_init[i]; list; list = XEXP (list, 1))
            {
              rtx equiv_insn = XEXP (list, 0);

              /* If we already deleted the insn or if it may trap, we can't
                 delete it.  The latter case shouldn't happen, but can
                 if an insn has a variable address, gets a REG_EH_REGION
                 note added to it, and then gets converted into a load
                 from a constant address.  */
              if (NOTE_P (equiv_insn)
                  || can_throw_internal (equiv_insn))
                ;
              else if (reg_set_p (regno_reg_rtx[i], PATTERN (equiv_insn)))
                delete_dead_insn (equiv_insn);
              else
                SET_INSN_DELETED (equiv_insn);
            }
        }
    }

But I don't know why the pseudo register could not be assigned a hard register. The insn:

(insn:HI 8 6 11 2 movebug.c:12 (set (reg/f:SI 42)
        (symbol_ref:SI ("fun") [flags 0x3])) 15 {symbolic_address_load} (expr_list:REG_EQUIV (symbol_ref:SI ("fun") [flags 0x3])
        (nil)))

is obviously a very useful insn, not a dead one.

Does anybody know why this happens?

Thanks very much.
daniel.tian
GCC 4.5 Status Report (2009-12-02)
Status
======

The trunk is in regression and documentation fixes only mode; Stage 3 ended yesterday. Release branch rules are now in effect for all changes to trunk that touch release critical parts of the compiler (primary and secondary targets, C and C++ and their runtimes).

There will be a release candidate made available when there are no remaining P1 regressions on the trunk. If you have found a bug that you think should block the release and that is not marked as P1 regression make sure to mark it as regression with P3 and CC one of the release managers. Likewise if you think a bug should not be P1.

In general all regressions toward GCC 4.4 in release critical parts of the compiler should be P1 at this point if they are build issues, ICEs or wrong-code issues.

We seem to get more testing of GCC 4.5 in the field now, so bugs are more frequently reported now. Please help in analyzing (and possibly finding duplicates) and fixing bugs.

Quality Data
============

Priority          #     Change from Last Report
--------        ---     -----------------------
P1               26     +  7
P2               93     - 15
P3                4     -  3
--------        ---     -----------------------
Total           123     - 11

There are 67 P4 and 74 P5 priority regressions. 71 of all regressions (including P4 and P5) are new in GCC 4.5.

Previous Report
===============

http://gcc.gnu.org/ml/gcc/2009-09/msg00657.html

The next report will be sent by Jakub.

-- 
Richard Guenther
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex
Re: Insn missing in Size optimization(-Os)
daniel tian writes:

> Addition information, I just found. It was deleted in function: void
> set_insn_deleted (rtx insn), in emit-rtl.c.

You need to figure out how register 42 in the call insn got changed to register 0 if reg_renumber[42] was not set to 0.

Ian
Re: Insn missing in Size optimization(-Os)
On 12/02/09 05:29, daniel tian wrote:
> Hi,
> Addition information, I just found. It was deleted in function: void
> set_insn_deleted (rtx insn), in emit-rtl.c. It is called by reload() in
> reload1.c.
> Here is the code in reload():
>
>   /* If a pseudo has no hard reg, delete the insns that made the equivalence.
>      If that insn didn't set the register (i.e., it copied the register to
>      memory), just delete that insn instead of the equivalencing insn plus
>      anything now dead.  If we call delete_dead_insn on that insn, we may
>      delete the insn that actually sets the register if the register dies
>      there and that is incorrect.  */
>
>   for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
>     {
>       if (reg_renumber[i] < 0 && reg_equiv_init[i] != 0)
>         {
>           rtx list;
>           for (list = reg_equiv_init[i]; list; list = XEXP (list, 1))
>             {
>               rtx equiv_insn = XEXP (list, 0);
>
>               /* If we already deleted the insn or if it may trap, we can't
>                  delete it.  The latter case shouldn't happen, but can
>                  if an insn has a variable address, gets a REG_EH_REGION
>                  note added to it, and then gets converted into a load
>                  from a constant address.  */
>               if (NOTE_P (equiv_insn)
>                   || can_throw_internal (equiv_insn))
>                 ;
>               else if (reg_set_p (regno_reg_rtx[i], PATTERN (equiv_insn)))
>                 delete_dead_insn (equiv_insn);
>               else
>                 SET_INSN_DELETED (equiv_insn);
>             }
>         }
>     }
>
> But I don't know why the pseudo register can not be fit into a hard
> register. The insn :
> (insn:HI 8 6 11 2 movebug.c:12 (set (reg/f:SI 42)
>         (symbol_ref:SI ("fun") [flags 0x3])) 15 {symbolic_address_load}
>     (expr_list:REG_EQUIV (symbol_ref:SI ("fun") [flags 0x3])
>         (nil)))
> is obvious a very useful insn, not a dead one.

When a pseudo which has an equivalent form (via the REG_EQUIV note) fails to get a hard register, reload deletes the insn which sets the pseudo and instead will reload the equivalent form into a suitable hard register prior to use points.

What you want to do is look at the reloads generated for insn #9. I'd hazard a guess one of them loaded the value (symbol_ref ("fun")) into R0. Then for some reason (you'll have to figure that out), the reload insn which sets R0 was deleted (or possibly doesn't get emitted because reload thought it was unnecessary).

jeff
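
The behaviour described in this reply can be sketched as a small toy model in C. This is only an illustration under stated assumptions, not GCC code: struct toy_pseudo, toy_plan_use and toy_delete_equiv_init are hypothetical names invented here, with hard_reg playing the role of reg_renumber[i] and equiv standing in for the REG_EQUIV form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical, not GCC code) of one pseudo after allocation. */
struct toy_pseudo {
    int hard_reg;         /* assigned hard register, or -1 if none
                             (plays the role of reg_renumber[i] < 0) */
    const char *equiv;    /* equivalent constant form, e.g. "fun", or NULL */
    bool init_insn_live;  /* is the insn that set the pseudo still present? */
};

enum toy_action {
    TOY_USE_HARD_REG,     /* pseudo got a hard register: use it directly */
    TOY_RELOAD_EQUIV,     /* no hard reg, but an equivalence exists:
                             reload the equivalent form before each use */
    TOY_SPILL_TO_STACK    /* no hard reg and no equivalence: stack slot */
};

/* Decide how a use of the pseudo has to be satisfied. */
enum toy_action toy_plan_use(const struct toy_pseudo *p)
{
    assert(p != NULL);
    if (p->hard_reg >= 0)
        return TOY_USE_HARD_REG;
    if (p->equiv != NULL)
        return TOY_RELOAD_EQUIV;
    return TOY_SPILL_TO_STACK;
}

/* Mirrors the condition of the quoted loop: when the pseudo has no hard
   register but has an equivalence, the insn that initialised it is dropped.
   Returns true if the insn was deleted by this call. */
bool toy_delete_equiv_init(struct toy_pseudo *p)
{
    assert(p != NULL);
    if (p->hard_reg < 0 && p->equiv != NULL && p->init_insn_live) {
        p->init_insn_live = false;
        return true;
    }
    return false;
}
```

In this model, pseudo 42 from the thread would be { -1, "fun", true }: its initialising insn is deleted, and every use is then expected to be preceded by a reload of "fun" into some hard register. The open question in the thread is what happened to that reload for insn 9.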
RE: Understanding IRA
Jeff Law wrote:
> Ian Bolton wrote:
> > Initial results showed that IRA was moving input arguments out of their
> > BOTTOM_REGS (e.g. $c1) into TOP_CREGS to do work on them, since it
> > thought TOP_CREGS were less costly to use, despite the cost of the move
> > instruction to get the input argument into a TOP_CREG.
> >
> That may indicate a cost scaling issue or more general weaknesses in
> IRA's cost modeling.
>
> > I addressed this problem by splitting my register bank a little
> > differently: instead of making a distinction between BOTTOM_REGS and
> > TOP_CREGS, I made it so there was only a penalty if you used one of the
> > non-argument BOTTOM_REGS (i.e. a callee-save BOTTOM_REG). This meant
> > that IRA was happy to leave input arguments in their BOTTOM_REGS but
> > erred towards using TOP_CREGS once the caller-save BOTTOM_REGS had run
> > out. This was an improvement, but there was still a case where these
> > '?' penalties were not aligned with reality:
> >
> > T1 = A + B; // can use any register, TOP_CREGS appears cheaper
> > T2 = A - C; // can use any register, TOP_CREGS appears cheaper
> > T3 = A & D; // must use BOTTOM_REGS
> >
> > The constraints for the first two instructions show that TOP_CREGS is
> > cheaper, but then you have to plant a move to get A into a BOTTOM_REG
> > to do the AND; in reality, we know it cheaper to have A in a BOTTOM_REG
> > all along, but the '?' constraint suggests there is a cost in doing this
> > for the ADD and SUB and so IRA will put A in a TOP_CREG at first and
> > incur the cost of the move because it is still cheaper than the costs I
> > have defined in with my constraints. I don't believe there is a way to
> > communicate a conditional cost, so I'm thinking that constraints are not
> > the solution for me at this time. What are your thoughts?
> >
> See above. This might be a problem with scaling or weaknesses in IRA's
> costing model.
> In theory IRA accumulates the cost of using each class over the set of
> insns using a particular pseudo.

I had an epiphany this morning and came up with an idea to achieve the lookahead I thought I needed, thereby making the costs created by '?' a lot more concrete and reliable.

Firstly, I have altered the alt_cost adjustment (for '?') in ira-costs.c, so that it only happens on the second pass *and* only when the first pass has determined that this allocno never needs a BOTTOM_REG.

The first condition (that it only occurs on the second pass) is there so that the preferred class calculated for an allocno is based on hard constraints, as opposed to the fuzzier constraints of '?'. Without this change, the second pass cannot use the preferred class to correctly add costs for each class that doesn't intersect with the preferred class.

e.g. If an insn has an allocno as an operand that requires BOTTOM_REGS, then we want the cost for TOP_CREGS for that allocno in that operand to be higher to show the cost of moving it into BOTTOM_REGS. But if we let the '?' constraints dictate the pref class, and this allocno appears in other insns where the '?' constraint has appeared, then TOP_CREGS may end up being the preferred class and so this insn, which actually needs BOTTOM_REGS for its operand, will end up increasing the costs for any class that doesn't intersect with TOP_CREGS (i.e. BOTTOM_REGS).

I'm thinking that this change will be generally useful. What are your thoughts?

The second condition is determined by me storing a new variable in each allocno on the first pass to flag whether it ever appears as an operand that must have BOTTOM_REGS. On the second pass, I can then only penalise an allocno for using my precious BOTTOM_REGS if I have already determined that it will never need them.

This change is probably too specific to our case at the moment, but it could probably be made generic, if it is of use to other architectures.
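
As a rough illustration of the two conditions, here is a toy model in C. It is a sketch under stated assumptions and not the actual ira-costs.c change: the types, the function names and the constants TOY_MOVE_COST and TOY_QMARK_PENALTY are all hypothetical, and real IRA costs are far more detailed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_class { TOY_BOTTOM_REGS, TOY_TOP_CREGS, TOY_NUM_CLASSES };

/* One appearance of an allocno as an insn operand. */
struct toy_use {
    bool must_be_bottom;  /* hard constraint: operand only accepts BOTTOM_REGS */
};

struct toy_allocno {
    int cost[TOY_NUM_CLASSES];
    bool needs_bottom;    /* flag recorded on the first pass */
};

#define TOY_MOVE_COST     2  /* assumed cost of a TOP_CREG -> BOTTOM_REG move */
#define TOY_QMARK_PENALTY 1  /* assumed cost added by a '?' on BOTTOM_REGS */

/* Pass 1: hard constraints only. Charge TOP_CREGS for every use that must be
   in BOTTOM_REGS, and remember that such a use exists. */
void toy_first_pass(struct toy_allocno *a, const struct toy_use *uses, size_t n)
{
    assert(a != NULL);
    a->cost[TOY_BOTTOM_REGS] = 0;
    a->cost[TOY_TOP_CREGS] = 0;
    a->needs_bottom = false;
    for (size_t i = 0; i < n; i++) {
        if (uses[i].must_be_bottom) {
            a->needs_bottom = true;
            a->cost[TOY_TOP_CREGS] += TOY_MOVE_COST;
        }
    }
}

/* Pass 2: apply the '?' penalty on BOTTOM_REGS, but only for allocnos that
   pass 1 found never to need a BOTTOM_REG. */
void toy_second_pass(struct toy_allocno *a, const struct toy_use *uses, size_t n)
{
    assert(a != NULL);
    if (a->needs_bottom)
        return;
    for (size_t i = 0; i < n; i++) {
        if (!uses[i].must_be_bottom)
            a->cost[TOY_BOTTOM_REGS] += TOY_QMARK_PENALTY;
    }
}

/* Cheapest class; ties go to BOTTOM_REGS. */
enum toy_class toy_preferred(const struct toy_allocno *a)
{
    assert(a != NULL);
    return a->cost[TOY_TOP_CREGS] < a->cost[TOY_BOTTOM_REGS]
               ? TOY_TOP_CREGS
               : TOY_BOTTOM_REGS;
}
```

With the uses of A from the ADD/SUB/AND example ({false, false, true}), the model keeps A in BOTTOM_REGS, whereas an allocno that only appears in the ADD and SUB is steered towards TOP_CREGS by the '?' penalty.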

These two changes together make the example I created above (where A appears in multiple operations) correctly allocate a BOTTOM_REG from the outset. I have yet to run benchmarks with this solution, but I think it will end up being similar to my alternate REG_ALLOC_ORDER, where I just gave out TOP_CREGS first.

Sadly, I don't think either approach will handle the case where there is low pressure and we first grab a TOP_CREG for something like an ADD (due to our constraints telling us to), then we free up that register cos it's not needed anymore, and then we have to use a different (BOTTOM_REG) reg for something like an AND; we should have just used one BOTTOM_REG.

I guess solving this would involve bigger changes to the algorithm, so that each allocno is aware of how many conflicting allocnos want the BOTTOM_REGS. We can then only penalise them for considering using BOTTOM_REGS if we know it will impact someone.

> >> You might try something like this:
> >>
> >> 1. Crank up the callee-saved register cost adjustment in
> >> assign_hard_reg so that it's scaled based
Re: GCC 4.5 Status Report (2009-12-02)
Hello Richard,

* Richard Guenther wrote on Wed, Dec 02, 2009 at 01:32:24PM CET:
> The trunk is in regression and documentation fixes only mode,
> Stage 3 has ended yesterday. Release branch rules are now
> in effect for all changes to trunk that touch release critical
> parts of the compiler (primary and secondary targets, C and
> C++ and their runtimes).

> We seem to get more testing of GCC 4.5 in the field now, so
> bugs are more frequently reported now. Please help in
> analyzing (and possibly finding duplicates) and fixing bugs.

The Libtool update would fix a couple of bugs, one of which is important for MinGW at least. Any chance this could still be considered?

Thanks, and sorry I didn't get it sent earlier,
Ralf
Re: i370 port
> I think I would stop right there. Why can't the i370 port support
> 64-bit integers? Plenty of 32-bit hosts support them.

It got an internal error. I don't have the skills to get that to work, but I do have the skills to bypass it one way or another (and I demonstrated what I am doing now, but I know that that intrusive code will break everything else, so want to back it out, without losing the functionality that I want).

> A failure in your target is not a reason to change target-independent
> code.

Well I found out what was causing this - the adddi3 definition. Commenting that out allowed the target to be built the normal way.

However, when doing the host build, I wanted everything done by the (ansi) book so that I end up with code that can be compiled with a C90 compiler, be it IBM's C/370 or Borland C++. I found that I could achieve that by making my dummy cross-compile script introduce the -ansi -pedantic-errors options.

However, that triggered off some more changes to configure like this ...

Index: gccnew/libiberty/configure
diff -c gccnew/libiberty/configure:1.1.1.3 gccnew/libiberty/configure:1.25
*** gccnew/libiberty/configure:1.1.1.3	Sun Nov 15 19:41:46 2009
--- gccnew/libiberty/configure	Wed Dec 2 17:18:07 2009
***************
*** 4190,4196 ****
  #if defined (__stub_$ac_func) || defined (__stub___$ac_func)
  choke me
  #else
! char (*f) () = $ac_func;
  #endif
  #ifdef __cplusplus
  }
--- 4190,4196 ----
  #if defined (__stub_$ac_func) || defined (__stub___$ac_func)
  choke me
  #else
! char (*f) () = (char (*)())$ac_func;
  #endif
  #ifdef __cplusplus
  }
***************
*** 4199,4205 ****
  int
  main ()
  {
! return f != $ac_func;
    ;
    return 0;
  }
--- 4199,4205 ----
  int
  main ()
  {
! return f != (char (*)())$ac_func;
    ;
    return 0;
  }

I still haven't found the wild pointer in my GCC 3.2.3 port that gets masked with xcalloc. It's a tough one because the problem keeps on disappearing whenever I try to introduce debug info and I haven't found a technique yet.

So I'm working on 3.2.3, 3.4.6 and 4.4 at any particular time. :-)

BFN. Paul.
Re: GCC 4.5 Status Report (2009-12-02)
On Wed, 2 Dec 2009, Ralf Wildenhues wrote:

> Hello Richard,
>
> * Richard Guenther wrote on Wed, Dec 02, 2009 at 01:32:24PM CET:
> > The trunk is in regression and documentation fixes only mode,
> > Stage 3 has ended yesterday. Release branch rules are now
> > in effect for all changes to trunk that touch release critical
> > parts of the compiler (primary and secondary targets, C and
> > C++ and their runtimes).
>
> > We seem to get more testing of GCC 4.5 in the field now, so
> > bugs are more frequently reported now. Please help in
> > analyzing (and possibly finding duplicates) and fixing bugs.
>
> The Libtool update would fix a couple of bugs, one of which is important
> for MinGW at least. Any chance this could still be considered?

I think this is up to you - I can't assess the benefits or risks of the change.

Thanks,
Richard.
Re: plugin issues to fix (or document) before 4.5 release
2009/11/29 Basile STARYNKEVITCH :
> Hello All,
>
> I believe there are several plugin issues to fix before 4.5 releases:
>
> 1. use of libiberty from plugins.
>
> As several patches recently sent demonstrated, the current state of the
> trunk does not work with plugins calling some of the libiberty functions is
> IMHO not acceptable.
>
> we could take some of the following solutions
>
> a) explicitly document in plugins.texi that libiberty is not callable from
> plugins (e.g. functions like pex_execute or make_temp_file, which are not
> currently linked into cc1). This is the easiest to do. My feeling is that it
> would be very unfortunate. Libiberty is documented as a portability layer;
> if plugins cannot use it, that means that plugins will never be a way of
> experimenting code which might (much latter) be (partly) proposed into the
> trunk. So giving up libiberty in plugins is an important social decision; it
> discourage people coding plugins to try to propose (much latter, when their
> plugin has a big enough audience) their code into the trunk latter.

So, for 4.5 I think that documenting the issue might be the right thing. For 4.6 I am not sure. Do the functions you want to use keep internal state? If they don't, it should be safe to link a static (but PIC) libiberty in the plugin.

The objection for a libiberty.so was that we would have to start versioning it and making sure the ABI was stable. Was there something else?

If the only objection to using a libiberty.so is the ABI, maybe we could install it in a non-standard place and use rpaths. That way gcc 4.4 and 4.5 can use completely different and incompatible versions of libiberty. The plugin would have to make sure that it is linked with the correct one, but that is in line with our not making any promises for backward compatibility.

Are rpaths as portable as shared libraries or do we support a host architecture that has shared libraries but no equivalent to rpath?

That failing, IMHO, the best proposal is to have the build system do its best at passing --whole-archive or equivalent when linking libiberty in cc1.

> Regards.
>
> --
> Basile STARYNKEVITCH http://starynkevitch.net/Basile/
> email: basilestarynkevitchnet mobile: +33 6 8501 2359
> 8, rue de la Faiencerie, 92340 Bourg La Reine, France
> *** opinions {are only mines, sont seulement les miennes} ***

Cheers,
-- 
Rafael Ávila de Espíndola
[alpha] Request for help wrt gcc bugs 27468, 27469
Hi,

Could someone please take a look at these two bugs?

27468 - sign-extending Alpha instructions not exploited
27469 - zero extension not eliminated [on Alpha]

Andrew Pinski confirmed both of them three and a half years ago. My uninformed feeling after seeing bugs 8603 and 42113 fixed is that both of them are relatively simple.

I CC'd Richard since you probably know more about Alpha than anyone else, and I CC'd you, Uros, since you were extremely nice and helpful with the other two previously mentioned bugs.

I'm more than willing to do any testing I can, and I can get you access to a quad-833MHz ES40 to do testing on, if need be.

Thanks,
Matt Turner
Re: [alpha] Request for help wrt gcc bugs 27468, 27469
On 12/02/2009 02:46 PM, Matt Turner wrote:
> Hi,
> Could someone please take a look at these two bugs?
>
> 27468 - sign-extending Alpha instructions not exploited
> 27469 - zero extension not eliminated [on Alpha]
>
> Andrew Pinski has confirmed both of them three and a half years ago.
> My uninformed feeling after seeing bugs 8603 and 42113 fixed is that
> both of them are relatively simple.

They aren't.

As originally suggested in the PR, these are thought to be addressed by the "ZEE" pass, which still hasn't been committed to mainline. And definitely won't be for 4.5.

The last iteration of the patch appears to have been

http://gcc.gnu.org/ml/gcc-patches/2009-10/msg00088.html

which also did something stupid like move the patch down into the i386 subdirectory. It'll never be applied in that form if I have anything to say on the subject, because obviously this is a generic sort of optimization.

r~
Re: GCC 4.5 Status Report (2009-12-02)
On Wed, Dec 02, 2009 at 11:08:49PM +0100, Richard Guenther wrote:
> On Wed, 2 Dec 2009, Ralf Wildenhues wrote:
>
> > Hello Richard,
> >
> > * Richard Guenther wrote on Wed, Dec 02, 2009 at 01:32:24PM CET:
> > > The trunk is in regression and documentation fixes only mode,
> > > Stage 3 has ended yesterday. Release branch rules are now
> > > in effect for all changes to trunk that touch release critical
> > > parts of the compiler (primary and secondary targets, C and
> > > C++ and their runtimes).
> >
> > > We seem to get more testing of GCC 4.5 in the field now, so
> > > bugs are more frequently reported now. Please help in
> > > analyzing (and possibly finding duplicates) and fixing bugs.
> >
> > The Libtool update would fix a couple of bugs, one of which is important
> > for MinGW at least. Any chance this could still be considered?
>
> I think this is up to you - I can't asses the benefits or risks
> of the change.
>
> Thanks,
> Richard.

Richard,

Darwin10 should benefit from a libtool update since Peter O'Gorman's patch to leverage the new -force_load flag in Snow Leopard's ld...

http://git.savannah.gnu.org/cgit/libtool.git/commit/?id=4ad8b63bd1069246c540ba2973da967fe7b68c9c

is supposed to provide the equivalent to the linux ld's --whole-archive.

-force_load path_to_archive
    Loads all members of the specified static archive library. Note: -all_load forces all members of all archives to be loaded. This option allows you to target a specific archive.

This should eliminate the issues with missing object files when libtool's convenience archives are used...

http://gcc.gnu.org/ml/gcc/2008-10/msg00083.html

Actually, I would be very interested in testing the proposed libtool patch to verify that this new change works as advertised (coupled with the pending dsymutil fixes...
http://gcc.gnu.org/ml/gcc-patches/2009-12/msg00094.html and
http://gcc.gnu.org/ml/gcc-patches/2009-12/msg00096.html).
This should finally fix up darwin for properly debugging in libstdc++ and libjava which both use such libtool convenience archives.

Jack
Re: plugin issues to fix (or document) before 4.5 release
> Are rpaths as portable as shared libraries or do we support a host
> architecture that has shared libraries but no equivalent to rpath?

Windows (mingw) comes to mind at least.

Arno