[Bug rtl-optimization/26855] [4.2 Regression] ICE in add_deps_for_def with -fmodulo-sched -maltivec
--- Comment #3 from zadeck at naturalbridge dot com 2006-04-26 14:50 --- Yes janis, it is quite likely that that patch will fix this problem. This looks like exactly the same failure as the other bug that this patch was submitted for. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26855
[Bug rtl-optimization/26855] [4.2 Regression] ICE in add_deps_for_def with -fmodulo-sched -maltivec
--- Comment #5 from zadeck at naturalbridge dot com 2006-04-26 20:51 --- Subject: Re: [4.2 Regression] ICE in add_deps_for_def with -fmodulo-sched -maltivec janis at gcc dot gnu dot org wrote: > --- Comment #4 from janis at gcc dot gnu dot org 2006-04-26 17:48 --- > The patch doesn't apply cleanly now, which isn't surprising, but it also > doesn't apply to mainline sources as of 2006-03-28, when it was submitted. > What date or revision can I start with to try this patch, without porting it > forward to today's sources? > > > I will redo the patch tomorrow on the way home from california. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26855
[Bug rtl-optimization/26855] [4.2 Regression] ICE in add_deps_for_def with -fmodulo-sched -maltivec
--- Comment #8 from zadeck at naturalbridge dot com 2006-04-29 04:23 --- Subject: Re: [4.2 Regression] ICE in add_deps_for_def with -fmodulo-sched -maltivec janis at gcc dot gnu dot org wrote: > --- Comment #7 from janis at gcc dot gnu dot org 2006-04-29 00:02 --- > I tried the patch at http://gcc.gnu.org/ml/gcc-patches/2006-04/msg01061.html > on > powerpc64-linux and used the resulting compilers with "-O2 -fmodulo-sched" to > build SPEC CPU2000 and run with the small, test input, and also built and ran > the special version of HMMER (which uses AltiVec macros) with those same > options. I still get lots of failures: some tests ICE in the build, others > get > runtime failures. I got failures with different tests when I moved the > compiler install tree to a different system, or when I ran it as a different > user. There's something very flaky going on. > > > Janis, I have not tried spec on my powerpc system. Could you send me a spec config file, any scripts you use, and your special version of HMMER? I can build this over the weekend on my g5. Kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26855
[Bug rtl-optimization/20972] Register allocator/reload uses auto-inc register in non-addressing operand
--- Comment #10 from zadeck at naturalbridge dot com 2006-06-17 04:14 --- (In reply to comment #9) > The bug is in flow.c and fixed by the new df.c rewrite of dataflow. Ken and I > tripped over the same problem. > While I thought this earlier, I do not believe it now. There is a problem in flow that it fails to generate reg-dead notes for dead index regs in auto-inc insns, but this is a separate problem. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20972
[Bug rtl-optimization/36365] [4.3/4.4 Regression] Hang in df_analyze
--- Comment #13 from zadeck at naturalbridge dot com 2008-12-06 22:33 --- Subject: Re: [4.3/4.4 Regression] Hang in df_analyze steven at gcc dot gnu dot org wrote: > --- Comment #12 from steven at gcc dot gnu dot org 2008-12-06 21:25 > --- > Patch here: > http://gcc.gnu.org/ml/gcc-patches/2008-12/msg00409.html > > Approval mail never made it through, but you can see traces of it here: > http://gcc.gnu.org/ml/gcc-patches/2008-12/msg00410.html > > > just to make it official, approved. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36365
[Bug rtl-optimization/38532] New: dse broken for frame related stores
Some time ago, rth changed reload so that calls to dse_record_singleton_alias_set and dse_invalidate_singleton_alias_set were removed. I believe that this was an accidental side effect of fixing some other bug. These calls identified these addresses as being "special", in the sense that the values died at the end of the function. I had discussed this with vlad, because his method of allocating stack slots was different than the old ra's and he was supposed to add these calls into where ira allocates stack slots. As of this morning's trunk, this has not been done. So I am adding this bugzilla as a reminder. Kenny -- Summary: dse broken for frame related stores Product: gcc Version: 4.4.0 Status: UNCONFIRMED Keywords: ra Severity: normal Priority: P3 Component: rtl-optimization AssignedTo: vmakarov at gcc dot gnu dot org ReportedBy: zadeck at naturalbridge dot com GCC build triplet: all GCC host triplet: all GCC target triplet: all http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38532
[Bug target/30271] -mstrict-align can an store extra for struct agrument passing
--- Comment #9 from zadeck at naturalbridge dot com 2008-12-15 15:32 --- Andrew, What is your point here? 1) Is it your claim that anything that is arg_pointer_rtx related would automatically qualify as being safe enough to remove dead stores to? or 2) Is it your claim that we could generalize the game proposed in comment #7 to cover the arg_pointer_rtx's also? Kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30271
[Bug c++/37922] [4.3/4.4 Regression] code generation error
--- Comment #16 from zadeck at naturalbridge dot com 2008-12-16 18:43 --- and how would you ask that question in a machine independent way? I am going to find the shift sequence and if it has a set or clobber of any currently live hard reg, i will reject the sequence. I am working on a fix now. kenny -- zadeck at naturalbridge dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |zadeck at naturalbridge dot |dot org |com Status|NEW |ASSIGNED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37922
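Editorial sketch of the check described in comment #16. It is not the committed fix, only an illustration under stated assumptions: hard registers are modeled as bits in a 64-bit mask instead of GCC's own register-set type, and the names `hard_reg_mask`, `insn_effect` and `sequence_is_safe` are hypothetical. The idea is to walk the candidate shift sequence and reject it if any insn sets or clobbers a hard register that is currently live.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per hard register (at most 64).  */
typedef uint64_t hard_reg_mask;

/* Hard registers that one insn of the candidate sequence sets or
   clobbers.  */
struct insn_effect
{
  hard_reg_mask sets;
  hard_reg_mask clobbers;
};

/* Return true if no insn in SEQ sets or clobbers a hard register that
   is in LIVE.  An empty sequence is trivially safe.  */
bool
sequence_is_safe (const struct insn_effect *seq, size_t n, hard_reg_mask live)
{
  assert (seq != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if ((seq[i].sets | seq[i].clobbers) & live)
      return false;
  return true;
}
```

A caller would build the `insn_effect` array from the generated sequence and pass the set of hard registers live at the insertion point; a `false` result means the sequence should be rejected.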
[Bug c++/37922] [4.3/4.4 Regression] code generation error
--- Comment #20 from zadeck at naturalbridge dot com 2008-12-18 14:23 --- committed patch to fix this. -- zadeck at naturalbridge dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37922
[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806
--- Comment #3 from zadeck at naturalbridge dot com 2008-12-29 23:40 --- additional info. gcc.c-torture/compile/930523-1.c on x86-32. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805
[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806
--- Comment #4 from zadeck at naturalbridge dot com 2009-01-02 00:38 --- Subject: Re: [ira] error in start_allocno_priorities, at ira-color.c:1806

2009-01-01 Kenneth Zadeck

        PR rtl-optimization/35805
        * df-problems.c (df_lr_finalize): Add recursive call to resolve lr
        problem if fast dce is able to remove any instructions.
        * dce.c (dce_process_block): Fix dump message.

This patch fixes the problem. The comment in the patch describes the issue. Since this was not really a failure, it would be hard to make this issue into a testcase.

Ok to commit? Bootstrapped and regression tested on x86*.

Kenny

Index: df-problems.c
===
--- df-problems.c (revision 142954)
+++ df-problems.c (working copy)
@@ -1001,22 +1001,32 @@ df_lr_transfer_function (int bb_index)
 /* Run the fast dce as a side effect of building LR. */
 
 static void
-df_lr_finalize (bitmap all_blocks ATTRIBUTE_UNUSED)
+df_lr_finalize (bitmap all_blocks)
 {
   if (df->changeable_flags & DF_LR_RUN_DCE)
     {
       run_fast_df_dce ();
-      if (df_lr->problem_data && df_lr->solutions_dirty)
+
+      /* If dce deletes some instructions, we need to recompute the lr
+         solution before proceeding further. The problem is that fast
+         dce is a pessimestic dataflow algorithm. In the case where
+         it deletes a statement S inside of a loop, the uses inside of
+         S may not be deleted from the dataflow solution because they
+         were carried around the loop. While it is conservatively
+         correct to leave these extra bits, the standards of df
+         require that we maintain the best possible (least fixed
+         point) solution. The only way to do that is to redo the
+         iteration from the beginning. See PR35805 for an
+         example. */
+      if (df_lr->solutions_dirty)
         {
-          /* If we are here, then it is because we are both verifying
-             the solution and the dce changed the function. In that case
-             the verification info built will be wrong. So we leave the
-             dirty flag true so that the verifier will skip the checking
-             part and just clean up.*/
-          df_lr->solutions_dirty = true;
+          df_clear_flags (DF_LR_RUN_DCE);
+          df_lr_alloc (all_blocks);
+          df_lr_local_compute (all_blocks);
+          df_worklist_dataflow (df_lr, all_blocks, df->postorder, df->n_blocks);
+          df_lr_finalize (all_blocks);
+          df_set_flags (DF_LR_RUN_DCE);
         }
-      else
-        df_lr->solutions_dirty = false;
     }
   else
     df_lr->solutions_dirty = false;
Index: dce.c
===
--- dce.c (revision 142954)
+++ dce.c (working copy)
@@ -601,7 +601,7 @@ dce_process_block (basic_block bb, bool
 
   if (dump_file)
     {
-      fprintf (dump_file, "processing block %d live out = ", bb->index);
+      fprintf (dump_file, "processing block %d lr out = ", bb->index);
       df_print_regset (dump_file, DF_LR_OUT (bb));
     }
 

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805
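Editorial sketch of the control flow in this first version of the patch, reduced to a toy model. Everything here is a hypothetical stand-in: `lr_state`, `toy_fast_dce` and `toy_lr_finalize` are not GCC names, and the counter `recomputations` replaces the real calls to `df_lr_alloc`, `df_lr_local_compute` and `df_worklist_dataflow`. The point it tries to show is the guard: the DCE flag is cleared around the nested call so that the re-entry cannot run DCE again, and the nested call only resets the dirty flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the df state touched by the patch.  */
struct lr_state
{
  bool dce_enabled;      /* models the DF_LR_RUN_DCE flag */
  bool solutions_dirty;  /* set when DCE deleted something */
  int pending_dead;      /* dead insns that DCE would still find */
  int recomputations;    /* number of full re-solves performed */
};

/* Models the fast DCE: delete the dead insns and mark the solution
   dirty if anything was deleted.  */
void
toy_fast_dce (struct lr_state *s)
{
  if (s->pending_dead > 0)
    {
      s->pending_dead = 0;
      s->solutions_dirty = true;
    }
}

/* Models the shape of the patched finalize function.  */
void
toy_lr_finalize (struct lr_state *s)
{
  assert (s);
  if (s->dce_enabled)
    {
      toy_fast_dce (s);
      if (s->solutions_dirty)
        {
          s->dce_enabled = false;  /* keep the nested call from running DCE */
          s->recomputations++;     /* stands in for the full LR re-solve */
          toy_lr_finalize (s);     /* takes the else branch, clears dirty */
          s->dce_enabled = true;
        }
    }
  else
    s->solutions_dirty = false;
}
```

In this model, a state that is already dirty on entry triggers a re-solve even when DCE deletes nothing, which is the case the follow-up in comment #9 addresses by clearing the flag at the start of the function.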
[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806
--- Comment #6 from zadeck at naturalbridge dot com 2009-01-02 14:09 --- Subject: Re: [ira] error in start_allocno_priorities, at ira-color.c:1806 Paolo Bonzini wrote: > Kenneth Zadeck wrote: > >> 2009-01-01 Kenneth Zadeck >> >> PR rtl-optimization/35805 >> * df-problems.c (df_lr_finalize): Add recursive call to resolve lr >> problem if fast dce is able to remove any instructions. >> * dce.c (dce_process_block): Fix dump message. >> >> This patch fixes the problem. The comment in the patch describes the >> issue.Since this was not really a failure, it would be hard to make >> this issue into a testcase. >> > > IIUC the bugzilla comment trail, this caused > gcc.c-torture/compile/930523-1.c to fail with --enable-checking=df; > that's already a testcase. > > >> Ok to commit? >> > > Hmmm... I am not sure I like this patch, for two reasons. > > 1) it might incur a compile-time penalty for the sake of verification, > even with df checking disabled. OTOH having possibly different code for > checking and non-checking compilation is even worse. > > There is a compile time penalty here but it is not for the sake of verification. It is for the sake of getting the best answer going forward, into the computation of live. There was a deeper bug here. The code that was removed which cleared the solutions_dirty flag is really wrong, because it lets the conservative solution go forward and the next call to df_analyze will not even try to redo anything and thus improve the solution. That was how vlad saw the extra bits even though he was calling df_analyze before using the bits. On the other hand, if you do not clear that flag in the old way, the verifier will fail. > 2) there are already provisions in dce.c to redo the analysis. But they > do not get to the least fixed point because they just rebuild the local > bitmaps and iterate from the existing solution. 
Instead of iterating > "while (global_changed)", we could try doing only one iteration (it's a > fast DCE after all, and the pessimistic dataflow makes me guess that > subsequent DCE iterations won't find much?) and zap the solution there. > This has the advantage that we can skip the recomputation if > global_changed is false. > > Did I miss anything? > > I think so. The global_changed flag allows it to delete the case: loop: ... <- x // This is dead. x- <- ... go to loop it just is not going to get rid of it if there is no kill of x inside the loop. Anyway, the loop inside the fast dce code will only cause one extra iteration of the blocks, and because of that it is still pessimistic. > > Paolo > -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805
[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806
--- Comment #8 from zadeck at naturalbridge dot com 2009-01-02 15:20 --- Subject: Re: [ira] error in start_allocno_priorities, at ira-color.c:1806 Paolo Bonzini wrote: >> I think so. The global changed flag allows it to delete the case: >> >> loop: >> ... <- x // This is dead. >> x- <- ... >> go to loop >> >> it just is not going to get rid of it if there is no kill of x inside >> the loop. >> > > I just don't think it's acceptable to load each and every "fast DCE" > with the burden of a full df solution. We need to find a way to limit > this to the cases when it is needed, or at least not to be too > conservative in ascertaining *when* it is needed. > i am not. i am not doing it for each and every dce, only if the dce actually deletes code. If there was a faster way to determine if the solution was too conservative than redoing it, you would have an effective incremental dataflow analysis algorithm. I strongly believe that such a technique does not exist. > Hence my first and foremost question is: does it happen that the > solution is wrong and global_changed never became true? > > The example in the pr exhibits this property. the problem is that deleting the use of pseudo 69 does not cause bit 69 to ever get turned off because it was live at the bottom of the loop (since it had been propagated around the loop to start with.) Hence, when you get to the top of the loop, there are no changes at all with respect to pseudo 69 and local_changed would not have been set. (I do not know if it is really true for the example that local_changed is not set, because the deletion of the kill on the set side of the insn could have caused that to happen. But the point is that with respect to position 69, the use in the deleted insn would not have caused local_changed to be set.)
> If the answer is "definitely no", then an alternative preferable > patch would be to move the code you added to df-problems.c into dce.c, > so that the full analysis (including rebuilding the bitmaps and > iterating possibly many times) is not run if it was to yield the same > answer that was before in the bitmaps. > > Paolo > -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805
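Editorial sketch of the stale-bit effect argued in comment #8, as a toy backward liveness solver. The three-block CFG, the `liveness` struct and both function names are hypothetical and far simpler than the df framework; registers are bits in an `unsigned` mask. It illustrates that, once a use inside a loop is deleted, iterating again from the existing solution keeps the bit alive because it is fed back around the loop edge, while solving again from empty sets reaches the least fixed point.

```c
#include <assert.h>
#include <stdbool.h>

#define N_BLOCKS 3

/* Hypothetical CFG: 0 (entry) -> 1 (loop) -> 2 (exit), plus the back
   edge 1 -> 1.  */
static const bool succ[N_BLOCKS][N_BLOCKS] = {
  { false, true, false },
  { false, true, true },
  { false, false, false },
};

struct liveness
{
  unsigned use[N_BLOCKS];  /* upward-exposed uses per block */
  unsigned def[N_BLOCKS];  /* registers killed per block */
  unsigned in[N_BLOCKS];   /* live-in solution */
  unsigned out[N_BLOCKS];  /* live-out solution */
};

/* Iterate the backward liveness equations to a fixed point, starting
   from whatever is currently stored in IN and OUT.  */
void
solve_from_current (struct liveness *lv)
{
  assert (lv);
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (int b = N_BLOCKS - 1; b >= 0; b--)
        {
          unsigned out = 0;
          for (int s = 0; s < N_BLOCKS; s++)
            if (succ[b][s])
              out |= lv->in[s];
          unsigned in = lv->use[b] | (out & ~lv->def[b]);
          if (in != lv->in[b] || out != lv->out[b])
            changed = true;
          lv->in[b] = in;
          lv->out[b] = out;
        }
    }
}

/* Clear the solution first; iterating up from empty sets yields the
   least fixed point.  */
void
solve_from_scratch (struct liveness *lv)
{
  assert (lv);
  for (int b = 0; b < N_BLOCKS; b++)
    lv->in[b] = lv->out[b] = 0;
  solve_from_current (lv);
}
```

With register x as bit 0, defined in block 0 and used only in block 1, deleting that use and calling `solve_from_current` leaves bit 0 set in `out[1]`; `solve_from_scratch` clears it.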
[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806
--- Comment #9 from zadeck at naturalbridge dot com 2009-01-02 15:34 --- Subject: Re: [ira] error in start_allocno_priorities, at ira-color.c:1806 On looking at the code, there is an issue with the first patch. I should have been clearing solutions_dirty flag at the start of the function. However, I do not think that this is the issue that you are complaining about. What this corrects is the case where the solution was dirty before the first call to df_analyze and dce finds nothing to delete. In that case, the code would have redone the lr solution for no reason. I will test this patch, but we still need to resolve your issues with my approach. Kenny zadeck at naturalbridge dot com wrote: > --- Comment #8 from zadeck at naturalbridge dot com 2009-01-02 15:20 > --- > Subject: Re: [ira] error in start_allocno_priorities, > at ira-color.c:1806 > > Paolo Bonzini wrote: > >>> I think so. The global changed flag allows it to delete the case: >>> >>> loop: >>> ... <- x // This is dead. >>> x- <- ... >>> go to loop >>> >>> it just is not going to get rid of it if there is is no kill of x inside >>> the loop. >>> >>> >> I just don't think it's acceptable to load each and every "fast DCE" >> with the burden of a full df solution. We need to find a way to limit >> this to the cases when it is needed, or at least not to be too >> conservative in ascertaining *when* it is needed. >> >> > i am not, i am only doing it for each and every dce, only if the dce > actually deletes code. > > If there was a faster way to determine if the solution was too > conservative than redoing it, you would have an effective incremental > dataflow analysis algorithm. I strongly believe that such a technique > does not exist. > >> Hence my first and foremost question is: does it happen that the >> solution is wrong and global_changed never became true? >> >> >> > The example in the pr exhibits this property. 
the problem is that > deleting the use of pseudo 69 does not cause bit 69 to ever get turned > off because it was live at the bottom of the loop (since it had been > propagated around the loop to start with.) Hence, when you get to the > top of the loop, there are no changes at all with respect to pseudo 69 > and local_changed would not have been set. (I do not know if it is > really true for the example that local_changes is not set, because the > deletion of the kill on the set side of the insn could have caused that > to happen. But the point is that with respect to position 69, the use > in the deleted insn would not have caused local_changed to be set.) > > >> If the answer is "definitely no", then an alternative preferrable >> patch would be to move the code you added to df-problems.c into dce.c, >> so that the full analysis (including rebuilding the bitmaps and >> iterating possibly many times) is not run if it was to yield the same >> answer that was before in the bitmaps. >> >> Paolo >> >> > > > Index: ChangeLog === --- ChangeLog (revision 142954) +++ ChangeLog (working copy) @@ -1,3 +1,10 @@ +2009-01-01 Kenneth Zadeck + + PR rtl-optimization/35805 + * df-problems.c (df_lr_finalize): Add recursive call to resolve lr + problem if fast dce is able to remove any instructions. + * dce.c (dce_process_block): Fix dump message. + 2008-12-29 Seongbae Park * tree-profile.c (tree_init_ic_make_global_vars): Make static Index: df-problems.c === --- df-problems.c (revision 142954) +++ df-problems.c (working copy) @@ -1001,25 +1001,34 @@ df_lr_transfer_function (int bb_index) /* Run the fast dce as a side effect of building LR. 
*/ static void -df_lr_finalize (bitmap all_blocks ATTRIBUTE_UNUSED) +df_lr_finalize (bitmap all_blocks) { + df_lr->solutions_dirty = false; if (df->changeable_flags & DF_LR_RUN_DCE) { run_fast_df_dce (); - if (df_lr->problem_data && df_lr->solutions_dirty) + + /* If dce deletes some instructions, we need to recompute the lr +solution before proceeding further. The problem is that fast +dce is a pessimestic dataflow algorithm. In the case where +it deletes a statement S inside of a loop, the uses inside of +S may not be deleted from the dataflow solution because they +were carried around the loop. While it is conservatively +correct to leave these extra bits, the standards of df +
[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806
--- Comment #11 from zadeck at naturalbridge dot com 2009-01-02 18:21 --- Subject: Re: [ira] error in start_allocno_priorities, at ira-color.c:1806 Paolo Bonzini wrote: >> I will test this patch, but we still need to resolve your issues with my >> approach. >> > > The problem is that you're really doubling the cost of computing the > live registers. I know that previously it was wrong, but at this point > there's no difference with the full-blown pass... Despite the idea of > DF_LR_RUN_DCE being that it was "free", now it would do the same work as > a pass_fast_rtl_dce modulo some O(#bbs) work. > you are being too pessimistic. most of the time, dce finds nothing. If DCE finds nothing, then the second pass does not run. I considered just fixing the verification part (not clearing the solutions_dirty flag) and letting the next call to df_analyze clean things up. In this way it would be like every other pass and leave things dirty until the next pass that needed the info. StevenB talked me out of this because he considered it wrong to have the client pass get conservative info. I agreed with him but I am willing to change my mind if you really want to push your case. > At this point, if your patch costs say 0.3%, and removing all traces of > DF_LR_RUN_DCE (instead scheduling a dozen more pass_fast_rtl_dce in > passes.c) costs 0.5%, I'd rather see the latter, at least it's easier to > look for opportunities to remove some useless DCE. > > If it wasn't for verification, we could just decide that DF_LR_RUN_DCE > is only for passes that can tolerate a little inaccurate info... > > This was in fact my argument to stevenb. The point is that the live info which is run after it will generally hide this conservativeness. On the other hand we do have standards that we always use the best info. As i pointed out on irc, the only reason that vlad noticed this at all was that he uses the wrong sets in his code (and he was running at O1 in this case.) 
At O2 and above he should be using the DF_LIVE sets. Kenny > Paolo > -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805
[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806
--- Comment #14 from zadeck at naturalbridge dot com 2009-01-02 18:54 --- Subject: Re: [ira] error in start_allocno_priorities, at ira-color.c:1806 Steven Bosscher wrote: > On Fri, Jan 2, 2009 at 7:37 PM, Paolo Bonzini wrote: > >>>> At this point, if your patch costs say 0.3%, and removing all traces of >>>> DF_LR_RUN_DCE (instead scheduling a dozen more pass_fast_rtl_dce in >>>> passes.c) costs 0.5%, I'd rather see the latter, at least it's easier to >>>> look for opportunities to remove some useless DCE. >>>> >> I'll try to do this for 4.5. >> > > It might be more worthwhile to just "fix" IRA to use DF_LIVE (which > Vlad should have done in the first place). Then we wouldn't need > Kenny's patch and DF_LR_RUN_DCE would still be essentially free. > > Gr. > Steven There is the issue of correctness vs rot. I actually think that one of the reasons that flow was so bad was that people went down this long slippery slope of well it is good enough here ... and we really can get away with it not being right here ... and after a while, all you have is garbage. The problem with this game is that it is not maintainable. Those kinds of decisions tend to get forgotten and lost as the personnel supporting the compiler changes. Even if it is a fractional percentage slower, the fact that you do not have to reason about it as the compiler evolves is actually quite important. Thus, I plan to both fix this bug and add another one for vlad to fix the sets that he uses. Kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805
[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806
--- Comment #16 from zadeck at naturalbridge dot com 2009-01-03 00:35 --- Subject: Re: [ira] error in start_allocno_priorities, at ira-color.c:1806 Kenneth Zadeck wrote: > Steven Bosscher wrote: > >> On Fri, Jan 2, 2009 at 7:37 PM, Paolo Bonzini wrote: >> >> >>>>> At this point, if your patch costs say 0.3%, and removing all traces of >>>>> DF_LR_RUN_DCE (instead scheduling a dozen more pass_fast_rtl_dce in >>>>> passes.c) costs 0.5%, I'd rather see the latter, at least it's easier to >>>>> look for opportunities to remove some useless DCE. >>>>> >>>>> >>> I'll try to do this for 4.5. >>> >>> >> It might be more worthwhile to just "fix" IRA to use DF_LIVE (which >> Vlad should have done in the first place). Then we wouldn't need >> Kenny's patch and DF_LR_RUN_DCE would still be essentially free. >> >> Gr. >> Steven >> > There is the issue of correctness vs rot. I actually think that one of > the reasons that flow was so bad was that people went down this long > slippery slope of well it is good enough here ... and we really can get > away with it not being right here ... and after a while, all you have is > garbage. > > The problem with this game is that it is not maintainable. Those kinds > of decisions tend to get forgotten and lost as the personnel supporting > the compiler changes. Even if it is a fractional percentage slower, > the fact that you do not have to reason about it as the compiler evolves > is actually quite important. > > Thus, I plan to both fix this bug and add another one for vlad to fix > the sets that he uses. > > Kenny > 2009-01-02 Kenneth Zadeck PR rtl-optimization/35805 * df-problems.c (df_lr_finalize): Add recursive call to resolve lr problem if fast dce is able to remove any instructions. * dce.c (dce_process_block): Fix dump message. Rebootstrapped and regression tested on x86*. Committed as revision 143027. 
Kenny

Index: ChangeLog
===
--- ChangeLog (revision 142954)
+++ ChangeLog (working copy)
@@ -1,3 +1,10 @@
+2009-01-01 Kenneth Zadeck
+
+        PR rtl-optimization/35805
+        * df-problems.c (df_lr_finalize): Add recursive call to resolve lr
+        problem if fast dce is able to remove any instructions.
+        * dce.c (dce_process_block): Fix dump message.
+
 2008-12-29 Seongbae Park
 
         * tree-profile.c (tree_init_ic_make_global_vars): Make static
Index: df-problems.c
===
--- df-problems.c (revision 142954)
+++ df-problems.c (working copy)
@@ -1001,25 +1001,34 @@ df_lr_transfer_function (int bb_index)
 /* Run the fast dce as a side effect of building LR. */
 
 static void
-df_lr_finalize (bitmap all_blocks ATTRIBUTE_UNUSED)
+df_lr_finalize (bitmap all_blocks)
 {
+  df_lr->solutions_dirty = false;
   if (df->changeable_flags & DF_LR_RUN_DCE)
     {
       run_fast_df_dce ();
-      if (df_lr->problem_data && df_lr->solutions_dirty)
+
+      /* If dce deletes some instructions, we need to recompute the lr
+         solution before proceeding further. The problem is that fast
+         dce is a pessimestic dataflow algorithm. In the case where
+         it deletes a statement S inside of a loop, the uses inside of
+         S may not be deleted from the dataflow solution because they
+         were carried around the loop. While it is conservatively
+         correct to leave these extra bits, the standards of df
+         require that we maintain the best possible (least fixed
+         point) solution. The only way to do that is to redo the
+         iteration from the beginning. See PR35805 for an
+         example. */
+      if (df_lr->solutions_dirty)
         {
-          /* If we are here, then it is because we are both verifying
-             the solution and the dce changed the function. In that case
-             the verification info built will be wrong. So we leave the
-             dirty flag true so that the verifier will skip the checking
-             part and just clean up.*/
-          df_lr->solutions_dirty = true;
+          df_clear_flags (DF_LR_RUN_DCE);
+          df_lr_alloc (all_blocks);
+          df_lr_local_compute (all_blocks);
+          df_worklist_dataflow (df_lr, all_blocks, df->postorder, df->n_blocks);
+          df_lr_finalize (all_blocks);
+          df_set_flags (DF_LR_RUN_DCE);
         }
-      else
-        df_lr->solutions_dirty = false;
     }
-  else
-    df_lr->solutions_dirty = false;
 }
 
 
Index: dce.c
===
--- dce.c (revision 1429
[Bug rtl-optimization/38711] New: ira should not be using df-lr except at -O1.
Ira should be using the DF-LIVE sets, which are smaller than the DF-LR sets when they are available (typically at O2 and above). The proper sets can be conveniently accessed using the df_get_live_[in,out] functions which use DF-LIVE if it is available and fall back to DF-LR if it is not. -- Summary: ira should not be using df-lr except at -O1. Product: gcc Version: unknown Status: UNCONFIRMED Keywords: ra Severity: normal Priority: P3 Component: rtl-optimization AssignedTo: vmakarov at gcc dot gnu dot org ReportedBy: zadeck at naturalbridge dot com GCC build triplet: all GCC host triplet: all GCC target triplet: all http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38711
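Editorial sketch of the fallback behavior this report attributes to the df_get_live_[in,out] functions. It is a self-contained mock, not GCC code: `block_sets`, `get_live_in` and `get_live_out` are hypothetical names, and register sets are plain `unsigned` masks rather than bitmaps attached to basic blocks. A NULL pointer stands for "DF-LIVE was not computed", as the report says is the case at -O1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-block dataflow information.  */
struct block_sets
{
  unsigned lr_in, lr_out;              /* DF-LR: always computed */
  const unsigned *live_in, *live_out;  /* DF-LIVE: NULL when not computed */
};

/* Prefer the DF-LIVE set and fall back to DF-LR when it is absent.  */
unsigned
get_live_in (const struct block_sets *bb)
{
  assert (bb != NULL);
  return bb->live_in ? *bb->live_in : bb->lr_in;
}

unsigned
get_live_out (const struct block_sets *bb)
{
  assert (bb != NULL);
  return bb->live_out ? *bb->live_out : bb->lr_out;
}
```

A client that always goes through such accessors does not need to know which problem was solved for the current optimization level, which is the convenience the report points at.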
[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806
--- Comment #17 from zadeck at naturalbridge dot com 2009-01-03 01:05 --- patch committed to fix this. -- zadeck at naturalbridge dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805
[Bug rtl-optimization/38774] [4.4 Regression] ice in df_refs_verify, at df-scan.c:4307
--- Comment #2 from zadeck at naturalbridge dot com 2009-01-09 12:41 --- i will have my best people work on it. -- zadeck at naturalbridge dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |zadeck at naturalbridge dot |dot org |com Status|NEW |ASSIGNED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38774
[Bug rtl-optimization/38774] [4.4 Regression] ice in df_refs_verify, at df-scan.c:4307
--- Comment #3 from zadeck at naturalbridge dot com 2009-01-10 01:57 --- Created an attachment (id=17068) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17068&action=view) patch to cause df to verify after every pass this is a combine bug. The df verification fails after combine makes some modification to the cc arg of insn 28 in the foo function that bypasses the implicit and explicit calls to mark the insn as being changed. I am looking into trying to figure out what path thru combine is doing this. However, if some combine expert (or just someone who wants to prove that they have better skill with the debugger than I do) wants to get there first, be my guest. I have attached a patch that improves some of the debugging and causes df to verify after every pass. This patch causes the failure to move from being in ira, to the start of if conversion after combine. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38774
[Bug rtl-optimization/36365] [4.3 Regression] Hang in df_analyze
--- Comment #17 from zadeck at naturalbridge dot com 2009-01-24 20:28 --- Subject: Re: [4.3 Regression] Hang in df_analyze rguenth at gcc dot gnu dot org wrote: > --- Comment #16 from rguenth at gcc dot gnu dot org 2009-01-24 10:20 > --- > GCC 4.3.3 is being released, adjusting target milestone. > > > steven, did you fix this and forget to close it? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36365
[Bug middle-end/35854] [4.3/4.4 Regression] life passes dump option still documented
--- Comment #4 from zadeck at naturalbridge dot com 2009-01-28 16:03 --- Subject: Re: [4.3/4.4 Regression] life passes dump option still documented rguenth at gcc dot gnu dot org wrote: > --- Comment #3 from rguenth at gcc dot gnu dot org 2009-01-24 10:20 > --- > GCC 4.3.3 is being released, adjusting target milestone. > > > This may be more a change than is acceptable right now for 4.4. If so I will sit on this patch until 4.5 opens up. The patch is basically a complete rewrite of the part of invoke.texi that deals with dump options for the rtl passes. This section had badly rotted. I started from a grep of the sources looking for "rtl_opt_pass" and documented all of the passes that i found in mostly alphabetical order. Where the old version documented several passes together, I kept that unless things had changed. In total there were about a half dozen passes that were no longer there and about a dozen new passes that had not been documented. I did make some changes in the code, which is the reason that this may not be acceptable to 4.4. The changes are pretty harmless: all of them involve either removing the pass name or changing it. 1) Pass names that contained dashes had the dashes changed to underscores. About half used dashes and half underscores and I went with underscores to avoid a possible ambiguity with the options parsing. 2) I also removed the pass name from 6 passes that do not print anything or dump the code. 3) Files that contained multiple passes with names of the form xx, xx2... were renamed xx1,xx2. This latter change causes a test suite failure which was fixed. All of these changes are pretty minor. The only possible failure these can cause are in the test suite where dump files are scanned. I tested this on x86 and ppc both 32 and 64. It is possible that there are platform specific regression tests that scan for dump files that were not caught on these four targets. 
I also left in lreg and greg. These are at the end and need to be deleted along with those passes. I have enclosed a copy of the new text. The diff is unreadable. ok for 4.4 or should i wait for 4.5? Kenny 2009-01-28 Kenneth Zadeck PR middle-end/35854 * doc/invoke.texi (rtl debug options): Complete rewrite. * auto-inc-dec.c (pass_inc_dec): Rename pass from "auto-inc-dec" to "auto_inc_dec". * df-core.c (df_pass_initialize_opt, df_pass_initialize_no_opt, df_pass_finish): Removed pass name. * mode-switching.c (pass_mode_switching): Rename pass from "mode-sw" to "mode_sw". * except.c (pass_convert_to_eh_ranges): Rename pass from "eh-ranges" to "eh_ranges". * regclass.c (pass_regclass_init, pass_subregs_of_mode_init, pass_subregs_of_mode_finish): Removed pass name. * lower-subreg.c (pass_lower_subreg): Renamed pass from "subreg" to "subreg1". 2009-01-28 Kenneth Zadeck PR middle-end/35854 * gcc.dg/lower-subreg-1.c: Renamed dump pass from "subreg" to "subreg1" == @item -...@var{letters} @itemx -fdump-r...@var{pass} @opindex d Says to make debugging dumps during compilation at times specified by @var{letters}. This is used for debugging the RTL-based passes of the compiler. The file names for most of the dumps are made by appending a pass number and a word to the @var{dumpname}. @var{dumpname} is generated from the name of the output file, if explicitly specified and it is not an executable, otherwise it is the basename of the source file. These switches may have different effects when @option{-E} is used for preprocessing. Debug dumps can be enabled with a @option{-fdump-rtl} switch or some @option{-d} option @var{letters}. Here are the possible letters for use in @var{pass} and @var{letters}, and their meanings: @table @gcctabopt @item -fdump-rtl-alignments @opindex fdump-rtl-alignments Dump after branch alignments have been computed. @item -fdump-rtl-asmcons @opindex fdump-rtl-asmcons Dump after fixing rtl statements that have unsatisfied in/out constraints. 
@item -fdump-rtl-auto_inc_dec
@opindex fdump-rtl-auto_inc_dec
Dump after auto-inc-dec discovery. This pass is only run on architectures that have auto inc or auto dec instructions.

@item -fdump-rtl-barriers
@opindex fdump-rtl-barriers
Dump after cleaning up the barrier instructions.

@item -fdump-rtl-bbpart
@opindex fdump-rtl-bbpart
Dump after partitioning hot and cold basic blocks.

@item -fdump-rtl-bbro
@opindex fdump-rtl-bbro
Dump after block reordering.

@item -fdump-rtl-btl1
@itemx -fdump-rtl-btl2
@opindex fdump-rtl-btl1
@opindex fdump-rtl-btl2
@option{-fdump-rtl-btl1} and @option{-fdump-rtl-btl2} enable dumping after the two branch target load optimization passes.

@item -fdump-rtl-bypass
@opindex fdump-rtl-bypass
Dump after jump bypassing and control flow optimizatio
[Bug middle-end/35854] [4.3/4.4 Regression] life passes dump option still documented
--- Comment #7 from zadeck at naturalbridge dot com 2009-01-29 14:38 --- Subject: Re: [4.3/4.4 Regression] life passes dump option still documented Richard Guenther wrote: > On Wed, Jan 28, 2009 at 5:03 PM, Kenneth Zadeck > wrote: > >> rguenth at gcc dot gnu dot org wrote: >> >>> --- Comment #3 from rguenth at gcc dot gnu dot org 2009-01-24 10:20 >>> --- >>> GCC 4.3.3 is being released, adjusting target milestone. >>> >>> >>> >>> >> This may be more a change than is acceptable right now for 4.4. If so >> I will sit on this patch until 4.5 opens up. The patch is basically a >> complete rewrite of the part of invoke.texi that deals with dump options >> for the rtl pass. This section had badly rotted. >> >> I started from a grep of the sources looking for "rtl_opt_pass" and >> documented all of the passes that i found in mostly alphabetical >> order. Where the old version documented several passes together, I >> kept that unless things had changed. In total there were about a half >> dozen passes that were no longer there and about a dozen new passes that >> had not been documented. >> >> I did make some changes in the code, which is the reason that this may >> not be acceptable to 4.4. The changes are pretty harmless: all of them >> involve either removing the pass name or changing it. >> >> 1) Pass names that contained dashes had the dashes changed to >> underscores. About half used slashes and half underscores and I went >> with underscores to avoid a possible ambiguity with the options parsing. >> >> 2) I also removed the pass name from 6 passes that do not print anything >> or dump the code. >> > > I think this change is agains what was asked for in the past. We want to have > pass names for all passes. > > >> 3) Files that contained multiple passes with names of the form xx, >> xx2... were renamed xx1,xx2. >> This later change causes a test suit failure which was fixed. >> >> All of these changes are pretty minor. 
The only possible failure these >> can cause are in the test suite where dump files are scanned. >> >> I tested this on x86 and ppc both 32 and 64. It is possible that there >> are platform specific regression tests that scan for dump files that >> were not caught on these four targets. >> >> I also left in lreg and greg.These are at the end and need to be >> deleted along with those passes. >> >> I have enclosed a copy of the new text. The diff is unreadable. >> >> ok for 4.4 or should i wait for 4.5? >> > > This is ok for 4.4 if you remove the parts that remove pass names. Please > wait a day for comments from others. > > Thanks, > Richard. > > > I put those pass names back, but I documented them as producing no output. I also removed the lreg and greg part since the RA removal patch has been approved. committed as revision 143756 kenny 2009-01-29 Kenneth Zadeck PR middle-end/35854 * doc/invoke.texi (rtl debug options): Complete rewrite. * auto-inc-dec.c (pass_inc_dec): Rename pass from "auto-inc-dec" to auto_inc_dec". * mode-switching.c (pass_mode_switching): Rename pass from "mode-sw" to "mode_sw". * except.c (pass_convert_to_eh_ranges): Rename pass from "eh-ranges" to "eh_ranges". * lower-subreg.c (pass_lower_subreg): Renamed pass from "subreg" to "subreg1". 2009-01-29 Kenneth Zadeck PR middle-end/35854 * gcc.dg/lower-subreg-1.c: Renamed dump pass from "subreg" to "subreg1" Index: doc/invoke.texi === --- doc/invoke.texi (revision 143754) +++ doc/invoke.texi (working copy) @@ -4545,172 +4545,275 @@ preprocessing. Debug dumps can be enabled with a @option{-fdump-rtl} switch or some @option{-d} option @var{letters}. Here are the possible -letters for use in @var{letters} and @var{pass}, and their meanings: +letters for use in @var{pass} and @var{letters}, and their meanings: @table @gcctabopt -...@item -dA -...@opindex dA -Annotate the assembler output with miscellaneous debugging information. 
+ +...@item -fdump-rtl-alignments +...@opindex fdump-rtl-alignments +Dump after branch alignments have been computed. + +...@item -fdump-rtl-asmcons +...@opindex fdump-rtl-asmcons +Dump after fixing rtl statements that have unsatisfied in/out constraints. + +...@item -fdump-rtl-auto_inc_dec +...@opindex fdump-rtl-auto_inc_dec +Dump after auto-inc-dec discovery. This pass is only run on +architectures that hav
[Bug middle-end/35854] [4.3/4.4 Regression] life passes dump option still documented
--- Comment #8 from zadeck at naturalbridge dot com 2009-01-29 14:42 --- patch committed. closed for 4.4. richi said not to backport to 4.3 on irc. -- zadeck at naturalbridge dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35854
[Bug rtl-optimization/25483] [4.2 Regression] ICE on valid code with -O2 -fmove-loop-invariants
-- zadeck at naturalbridge dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |zadeck at naturalbridge dot |dot org |com Status|NEW |ASSIGNED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25483
[Bug rtl-optimization/25483] [4.2 Regression] ICE on valid code with -O2 -fmove-loop-invariants
--- Comment #7 from zadeck at naturalbridge dot com 2005-12-19 19:43 --- I had messed up the original change to df.c. -- zadeck at naturalbridge dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25483
[Bug rtl-optimization/25799] [4.2 Regression] cc1 stalled with -O1 -fmodulo-sched
-- zadeck at naturalbridge dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |zadeck at naturalbridge dot |dot org |com Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2006-01-16 19:11:26 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25799
[Bug rtl-optimization/25799] [4.2 Regression] cc1 stalled with -O1 -fmodulo-sched
--- Comment #8 from zadeck at naturalbridge dot com 2006-01-20 01:33 --- 2005-01-19 Kenneth Zadeck <[EMAIL PROTECTED]> PR rtl-optimization/25799 * df-problems.c (df_ru_confluence_n, df_rd_confluence_n): Corrected confluence operator to remove bits from op2 before oring with op1 rather than removing bits from op1. (df_ru_transfer_function): Corrected test on wrong bitmap which caused infinite loop. Both of these problems were introduced in the dataflow rewrite. -- zadeck at naturalbridge dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25799
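A minimal sketch of the confluence fix described in the ChangeLog entry above, assuming a 64-bit mask as a stand-in for a GCC bitmap. The names `bitset64`, `confluence_buggy`, `confluence_fixed` and the `kill` mask are hypothetical and are not taken from df-problems.c; the exact shape of the original bug is an assumption based only on the wording "removing bits from op1".

```c
#include <stdint.h>

/* Hypothetical 64-bit set standing in for a GCC bitmap.  */
typedef uint64_t bitset64;

/* Assumed shape of the bug: the kill mask is applied to the
   accumulator OP1, so facts already merged from other predecessor
   edges are thrown away.  */
bitset64
confluence_buggy (bitset64 op1, bitset64 op2, bitset64 kill)
{
  return (op1 & ~kill) | op2;
}

/* Corrected shape: filter the incoming set OP2 first, then OR the
   survivors into OP1.  Nothing already in OP1 is lost.  */
bitset64
confluence_fixed (bitset64 op1, bitset64 op2, bitset64 kill)
{
  return op1 | (op2 & ~kill);
}
```

With op1 = 0x3, op2 = 0x4 and kill = 0x2, the corrected form keeps bit 1 in the result while the assumed buggy form drops it. A confluence operator that can drop bits from the accumulator is not monotone in the usual sense, which is one common way for an iterative solver to misbehave.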
[Bug target/29083] useless clrlwi instruction produced for 16-bit bitfield
--- Comment #2 from zadeck at naturalbridge dot com 2006-09-14 12:51 --- Subject: Re: useless clrlwi instruction produced for 16-bit bitfield bonzini at gnu dot org wrote: > --- Comment #1 from bonzini at gnu dot org 2006-09-14 12:07 --- > The sole difference in the IR is > > ;; if ((int) node->x == a) goto ; else (void) 0; > (insn 19 18 20 (set (reg:HI 125) > (mem/s/j:HI (reg/v/f:SI 123 [ node ]) > [2 .x+0 S2 A32])) -1 (nil) > > ;; if ((int) MEM[base: (short unsigned int *) node] == a) goto ; else > (void) 0; > (insn 20 19 21 (set (reg:HI 125) > (mem/s:HI (reg/v/f:SI 123 [ node ]) > [3 .x+0 S2 A8])) -1 (nil) > (nil)) > > (COMPONENT_REF vs. TARGET_MEM_REF, the first produces A32 and the second A8) > > > > > > It's actually flow's fault, because it fails to recognize a PRE_MODIFY > address, > and things go downhill from there: life1 dump is > >16 r121:SI=r121:SI+0x1 |17 r122:SI=r122:SI+0x1 >18 r123:SI=r123:SI-0x4 |20 r126:HI=[--r124:SI] >19 r125:HI=[r123:SI] | REG_INC: r124:SI >20 r124:SI=zero_extend(r125:HI)|21 r125:SI=zero_extend(r126:HI) > REG_DEAD: r125:HI | REG_DEAD: r126:HI >21 r126:CC=cmp(r124:SI,r121:SI)|22 r127:CC=cmp(r125:SI,r122:SI) > REG_DEAD: r124:SI | REG_DEAD: r125:SI >22 pc={(r126:CC==0x0)?L13:pc} |23 pc={(r127:CC==0x0)?L14:pc} > REG_DEAD: r126:CC | REG_DEAD: r127:CC > REG_BR_PROB: 0x22c4 REG_BR_PROB: 0x22c4 >24 NOTE_INSN_BASIC_BLOCK |25 NOTE_INSN_BASIC_BLOCK >28 NOTE_INSN_FUNCTION_END |29 NOTE_INSN_FUNCTION_END >31 r3:SI=r121:SI |32 r3:SI=r122:SI > REG_DEAD: r121:SI | REG_DEAD: r122:SI >37 use r3:SI |38 use r3:SI > > while combine dump is > >14 NOTE_INSN_BASIC_BLOCK |15 NOTE_INSN_BASIC_BLOCK >16 r121:SI=r121:SI+0x1 |17 r122:SI=r122:SI+0x1 >18 NOTE_INSN_DELETED |20 NOTE_INSN_DELETED >19 {r125:HI=[r123:SI-0x4];r123:SI= |21 r125:SI=zero_extend([--r124:SI] >20 r124:SI=zero_extend(r125:HI)| REG_INC: r124:SI > REG_DEAD: r125:HI |22 r127:CC=cmp(r125:SI,r122:SI) >21 r126:CC=cmp(r124:SI,r121:SI)| REG_DEAD: r125:SI > REG_DEAD: r124:SI > > where it has 
synthesized a movsi_movhi_update1, but then failed to implement > the merged. > > Could this be fixed on dataflow-branch? > > > The current flow does not recognize any pre modify cases. What flow does do is recognize pre_increment, which is a subset of pre_modify that has the restriction that the width of the load be equal to the amount of the increment. By changing the type of x, you made the example fit into the restrictions of the current code. The post side of things in flow is a little more general than the pre side because this was hacked for the ia-64. My code on the dataflow branch knows what the machine is capable of doing and would get this case, since the ppc is capable of much more general updates. Kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29083
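The restriction described above (pre_increment is the subset of pre_modify where the step equals the access width) can be sketched as a pair of predicates. This is only an illustration: the function names and the byte-based parameters are hypothetical and do not correspond to anything in flow.c or auto-inc-dec.c.

```c
#include <stdbool.h>

/* ACCESS_BYTES is the width of the load or store; STEP_BYTES is the
   signed amount added to the base register before the access.  */

/* PRE_INC form: the step must equal the access width.  */
bool
fits_pre_increment (int access_bytes, int step_bytes)
{
  return access_bytes > 0 && step_bytes == access_bytes;
}

/* PRE_DEC form: the step must be the negated access width.  */
bool
fits_pre_decrement (int access_bytes, int step_bytes)
{
  return access_bytes > 0 && step_bytes == -access_bytes;
}

/* Any other nonzero step needs the more general PRE_MODIFY form,
   which only some targets can express.  */
bool
needs_pre_modify (int access_bytes, int step_bytes)
{
  return step_bytes != 0
	 && !fits_pre_increment (access_bytes, step_bytes)
	 && !fits_pre_decrement (access_bytes, step_bytes);
}
```

For instance, a 2-byte load combined with a step of -4 satisfies neither restricted form in this sketch, while a 4-byte load with the same step fits the decrement form.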
[Bug debug/31412] [4.3] inf loop/long compile time, time spent in var-tracking.c
--- Comment #14 from zadeck at naturalbridge dot com 2007-04-03 16:47 --- Subject: Re: [4.3] inf loop/long compile time, time spent in var-tracking.c steven at gcc dot gnu dot org wrote: > --- Comment #13 from steven at gcc dot gnu dot org 2007-04-03 16:40 > --- > So this may be a non-monotonous dataflow problem...? > > Do we have the dataflow equations of the var-tracking problem somewhere? It'd > be interesting to check them against the actual implementation. > > > this is a pretty complex problem. I gave it a cursory once over and it looks like the problem may not terminate if the location (stack offset) of a variable is not the same on all paths into a block. (the code may be different than the comments and i did just scan this) I assume that this case has a "bug" where a variable appears to be at a different location coming across an exception edge. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31412
[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90
--- Comment #17 from zadeck at naturalbridge dot com 2007-07-16 23:26 --- Subject: Re: [4.3 regression]: gfortran.dg/auto_array_1.f90 hjl at lucon dot org wrote: > --- Comment #16 from hjl at lucon dot org 2007-07-16 19:27 --- > revision 125923 works. Kenny, it looks like your patch > > http://gcc.gnu.org/ml/gcc-patches/2007-06/msg01557.html > > causes this regression. Can you look into it? Thanks. > > > I will look into this as soon as the bootstrap starts working again on the ia-64. Kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32749
[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90
--- Comment #21 from zadeck at naturalbridge dot com 2007-07-26 17:35 --- Subject: Re: [4.3 regression]: gfortran.dg/auto_array_1.f90 Seongbae Park (???, ???) wrote: > On 7/26/07, Kenneth Zadeck <[EMAIL PROTECTED]> wrote: >> This patch extends the fix in >> http://gcc.gnu.org/ml/gcc-patches/2007-06/msg01557.html >> to handle the case of clobbers inside conditional calls. >> >> This problem caused the regression of gfortran.dg/matmul_3.f90 on the >> ia-64 in addition to the regression cited in this pr. >> >> Tested on ppc-32, ia-64 and x86-64. >> >> 2007-07-26 Kenneth Zadeck <[EMAIL PROTECTED]> >> >> PR middle-end/32749 >> >> * df-problems.c (df_note_bb_compute): Handle case of clobber >> inside conditional call. >> >> ok to commit? > > This change is OK. > Though I wonder if we need to do similar checking > for the regular insn case below. No the checking is done in df_create_unused_note. The only reason you have to do it here is that you are not calling that. thanks kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32749
[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90
--- Comment #19 from zadeck at naturalbridge dot com 2007-07-26 11:51 ---
Subject: Re: [4.3 regression]: gfortran.dg/auto_array_1.f90

This patch extends the fix in
http://gcc.gnu.org/ml/gcc-patches/2007-06/msg01557.html
to handle the case of clobbers inside conditional calls.

This problem caused the regression of gfortran.dg/matmul_3.f90 on the ia-64 in addition to the regression cited in this pr.

Tested on ppc-32, ia-64 and x86-64.

2007-07-26 Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/32749

* df-problems.c (df_note_bb_compute): Handle case of clobber inside conditional call.

ok to commit?

kenny

Index: df-problems.c
===
--- df-problems.c (revision 126918)
+++ df-problems.c (working copy)
@@ -3989,7 +3989,7 @@ df_note_bb_compute (unsigned int bb_inde
               /* However a may or must clobber still needs to kill the
                  reg so that REG_DEAD notes are later placed
                  appropriately. */
-              else
+              else if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
                 bitmap_clear_bit (live, DF_REF_REGNO (def));
             }
         }

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32749
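The rule this patch introduces can be sketched outside of GCC as follows: during the backward scan, a def removes its register from the live set only when it is neither partial nor conditional. The flag names and `live_before_def` below are hypothetical stand-ins for the DF_REF_* flags and the bitmap calls in the diff, and registers are assumed to be numbered below 64 so a single mask can hold the set.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the DF_REF_* flag bits.  */
enum
{
  REF_PARTIAL = 1 << 0,      /* def writes only part of the register */
  REF_CONDITIONAL = 1 << 1,  /* def may not execute (predicated insn) */
  REF_MAY_CLOBBER = 1 << 2,
  REF_MUST_CLOBBER = 1 << 3
};

/* Backward liveness step for one def of register REGNO (assumed
   < 64).  LIVE is the set live after the insn; the result is the set
   with this def accounted for.  A partial or conditional def cannot
   kill the register, because the old value may survive it.  */
uint64_t
live_before_def (uint64_t live, unsigned int regno, unsigned int flags)
{
  if (flags & (REF_PARTIAL | REF_CONDITIONAL))
    return live;
  return live & ~(UINT64_C (1) << regno);
}
```

In this sketch a clobber flagged `REF_MUST_CLOBBER | REF_CONDITIONAL` leaves the register live, which is the behavior the added `else if` condition is meant to give for clobbers inside conditional calls.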
[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90
--- Comment #18 from zadeck at naturalbridge dot com 2007-07-25 18:41 --- i am testing a patch. -- zadeck at naturalbridge dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |zadeck at naturalbridge dot |dot org |com Status|NEW |ASSIGNED Last reconfirmed|2007-07-13 00:25:37 |2007-07-25 18:41:41 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32749
[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90
--- Comment #25 from zadeck at naturalbridge dot com 2007-07-27 17:29 --- Subject: Re: [4.3 regression]: gfortran.dg/auto_array_1.f90 This patch rearranges the updating of the local dataflow info when building reg_dead notes. The need for this was that processing was not correctly handled for clobbers that occurred within conditional call insns. A rare case but one that at least happens on the ia-64. This patch not only fixes the regressions listed in pr32749, but also fixes the gfortran.dg/matmul_3.f90 on the ia-64 regressions. This patch was bootstrapped and regression tested yesterday on x86-64 and ia-64 and was again bootstrapped this morning on x86-64 (just to make sure there were no interactions with richard sandiford's fixes to closely related code that was just committed.) Committed as revision 126987. Kenny 2007-07-26 Kenneth Zadeck <[EMAIL PROTECTED]> PR middle-end/32749 * df-problems.c (df_create_unused_note): Removed do_not_gen parm and the updating of the live and do_not_gen sets. (df_note_bb_compute): Added updating of live and do_not_gen sets for regular defs so that the case of clobber inside conditional call is processed correctly. Index: df-problems.c === --- df-problems.c (revision 126979) +++ df-problems.c (working copy) @@ -3868,13 +3868,12 @@ df_set_dead_notes_for_mw (rtx insn, rtx } -/* Create a REG_UNUSED note if necessary for DEF in INSN updating LIVE - and DO_NOT_GEN. Do not generate notes for registers in artificial - uses. */ +/* Create a REG_UNUSED note if necessary for DEF in INSN updating + LIVE. Do not generate notes for registers in ARTIFICIAL_USES. 
*/ static rtx df_create_unused_note (rtx insn, rtx old, struct df_ref *def, - bitmap live, bitmap do_not_gen, bitmap artificial_uses) + bitmap live, bitmap artificial_uses) { unsigned int dregno = DF_REF_REGNO (def); @@ -3899,12 +3898,6 @@ df_create_unused_note (rtx insn, rtx old #endif } - if (!(DF_REF_FLAGS (def) & (DF_REF_MUST_CLOBBER + DF_REF_MAY_CLOBBER))) -bitmap_set_bit (do_not_gen, dregno); - - /* Kill this register if it is not a subreg store or conditional store. */ - if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))) -bitmap_clear_bit (live, dregno); return old; } @@ -3915,7 +3908,7 @@ df_create_unused_note (rtx insn, rtx old static void df_note_bb_compute (unsigned int bb_index, - bitmap live, bitmap do_not_gen, bitmap artificial_uses) + bitmap live, bitmap do_not_gen, bitmap artificial_uses) { basic_block bb = BASIC_BLOCK (bb_index); rtx insn; @@ -4012,17 +4005,17 @@ df_note_bb_compute (unsigned int bb_inde for (def_rec = DF_INSN_UID_DEFS (uid); *def_rec; def_rec++) { struct df_ref *def = *def_rec; - if (!(DF_REF_FLAGS (def) & (DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER))) - old_unused_notes - = df_create_unused_note (insn, old_unused_notes, - def, live, do_not_gen, - artificial_uses); - - /* However a may or must clobber still needs to kill the -reg so that REG_DEAD notes are later placed -appropriately. 
*/ - else - bitmap_clear_bit (live, DF_REF_REGNO (def)); + unsigned int dregno = DF_REF_REGNO (def); + if (!DF_REF_FLAGS_IS_SET (def, DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER)) + { + old_unused_notes + = df_create_unused_note (insn, old_unused_notes, +def, live, artificial_uses); + bitmap_set_bit (do_not_gen, dregno); + } + + if (!DF_REF_FLAGS_IS_SET (def, DF_REF_PARTIAL | DF_REF_CONDITIONAL)) + bitmap_clear_bit (live, dregno); } } else @@ -4043,10 +4036,16 @@ df_note_bb_compute (unsigned int bb_inde for (def_rec = DF_INSN_UID_DEFS (uid); *def_rec; def_rec++) { struct df_ref *def = *def_rec; + unsigned int dregno = DF_REF_REGNO (def); old_unused_notes = df_create_unused_note (insn, old_unused_notes, -def, live, do_not_gen, -artificial_uses); +def, live, artificial_uses); + + if (!DF_REF_FLAGS_IS_SET (def, DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER)) + bitmap_set_bit (do_not_gen, dregno); + + if (!DF_REF_FLAGS_IS_SET (def, DF_REF_PARTIAL | DF_REF_CONDITIONAL)) + bitmap_clear_bit (live, dregno); } } -- http://gcc.gnu.org/b
[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90
--- Comment #26 from zadeck at naturalbridge dot com 2007-07-27 17:33 --- revision 126987 -- zadeck at naturalbridge dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32749
[Bug target/32431] [4.3 Regression] ICE in df_refs_verify, at df-scan.c:4066
--- Comment #3 from zadeck at naturalbridge dot com 2007-08-02 19:19 --- Given that the rtl passes are moving to not allow illegally shared rtl, i do not believe that the resolution of this bug has anything to do with the dataflow port. If this bug is to be resolved, it will be done by cleaning up this back end. -- zadeck at naturalbridge dot com changed: What|Removed |Added CC|zadeck at naturalbridge dot | |com | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32431
[Bug rtl-optimization/32300] [4.3 Regression] ICE with -O2 -fsee
--- Comment #10 from zadeck at naturalbridge dot com 2007-08-17 12:48 --- Subject: Re: [4.3 Regression] ICE with -O2 -fsee wouter dot vermaelen at scarlet dot be wrote: > --- Comment #9 from wouter dot vermaelen at scarlet dot be 2007-08-17 > 12:44 --- > Here is a simpler testcase: > > int f(int i) { return 100LL / (1 + i); } > > > thanks, everyone knows what the problems with see.c are, it is simply a matter of having the authors fix their code. Virtually anything that invokes this pass will cause it to fail. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32300
[Bug target/33151] Invalid insn with pre_inc
--- Comment #4 from zadeck at naturalbridge dot com 2007-08-23 18:59 --- Subject: Re: Invalid insn with pre_inc pinskia at gcc dot gnu dot org wrote: > --- Comment #3 from pinskia at gcc dot gnu dot org 2007-08-22 22:41 > --- > I think we need a new predicate for this rtl instruction, currently we just > have: >(clobber (match_operand:DF 4 "memory_operand" "=o")) > > > After thinking about this last night, i believe that this problem should be solved at the machine description level, not by changing auto-inc-dec.c. Auto-inc-dec.c uses all of the standard interfaces to keep from generating invalid rtl. So it seems proper to have the md level not allow the creation of this insn. Kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33151
[Bug middle-end/32758] [4.3 Regression] ecj1 hangs
--- Comment #30 from zadeck at naturalbridge dot com 2007-08-29 15:34 ---
Subject: Re: [4.3 Regression] ecj1 hangs

bonzini at gnu dot org wrote:
> --- Comment #29 from bonzini at gnu dot org 2007-08-29 14:16 ---
> (When I said "post your first patch", I meant the first one from comment #26;
> if my "fixing the mess" works, it'll not be necessary anymore).
>

For some reason, I was not copied on any of the postings for this patch until this morning.

First, thank you Jakub and Andreas for doing this. I think that it is obvious that you have spotted the exact problem: in some way, shape, form or fashion, the artificial uses at the end of the block need to be re-added into the live set after the processing of each insn in the block.

There are two ways of doing this (assume that you have a local variable called artificial_uses_fixup which is a pointer to either df->eh_block_artificial_uses or df->regular_block_artificial_uses depending on if the block has eh preds):

1) you can explicitly OR artificial_uses_fixup into local_live after processing each insn.

2) you can test artificial_uses_fixup along with local_live when setting needed.

As noted, (1) has the problem that it may cause an infinite loop. This infinite loop could be fixed by changing the equation for block_changed to be

!bitmap_equal (local_live, DF_LR_IN (BB) || artificial_uses_fixup)

i.e. the infinite loop is because DF_LR_IN may be deficient in some of the bits in artificial_uses_fixup for basically the same reason that caused the bug in the first place.

I personally think that solution (1) is preferable to (2) because it is fewer bitmap operations even though it will require an extra temp bitmap to hold the OR. But either patch is a reasonable approach.

As far as why there are all of the df_simulate functions that do things in different ways, the answer is that the code has evolved and sometimes things get missed.
The addition of the df->eh_block_artificial_uses and df->regular_block_artificial_uses sets is fairly recent and it would most likely be useful to replace walks of artificial_uses with them. Kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32758
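Option (1) from the comment above, together with the adjusted block_changed test, can be sketched roughly as below. This is an illustration under stated assumptions, not the code from any posted patch: 64-bit masks replace GCC bitmaps, and `insn_effect`, `scan_block_backward` and `block_changed` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-insn summary: registers defined and used.  */
struct insn_effect
{
  uint64_t defs;
  uint64_t uses;
};

/* Walk the insns of a block from last to first.  After each insn the
   artificial uses are ORed back in, so a def inside the block can
   never make them look dead.  */
uint64_t
scan_block_backward (const struct insn_effect *insns, size_t n,
		     uint64_t live_out, uint64_t artificial_uses_fixup)
{
  uint64_t local_live = live_out | artificial_uses_fixup;

  for (size_t i = n; i-- > 0; )
    {
      local_live &= ~insns[i].defs;
      local_live |= insns[i].uses;
      local_live |= artificial_uses_fixup;
    }
  return local_live;
}

/* Compare against the stored live-in set with the artificial uses
   added, so that bits missing from the stored set do not make the
   block look changed on every iteration.  */
bool
block_changed (uint64_t local_live, uint64_t lr_in,
	       uint64_t artificial_uses_fixup)
{
  return local_live != (lr_in | artificial_uses_fixup);
}
```

The temporary needed to hold `lr_in | artificial_uses_fixup` is the extra bitmap mentioned in the comment; with plain integer masks it costs nothing, which is why the sketch does not show an allocation.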
[Bug rtl-optimization/33224] failing rtl iv analysis (maybe due to df)
--- Comment #4 from zadeck at naturalbridge dot com 2007-08-30 14:43 ---
Subject: Re: failing rtl iv analysis (maybe due to df)

dorit at gcc dot gnu dot org wrote:
> --- Comment #3 from dorit at gcc dot gnu dot org 2007-08-30 08:12 ---
> (In reply to comment #2)
>
>> I suspect this might be due to not updating the rd information after
>> unrolling.
>> Can you check if
>> analyze_insns_in_loop() (which calls df_analyze()) is being called just
>> before
>> the problematic unrolling ?
>>
>
> it looks like it's called just before the unroller actually transforms
> somthing, but not before the (failing) analysis. But when I add a call to it
> in
> decide_peel_completely the analysis still fails.
>

dorit,

i am having trouble exactly reproducing this example because you did not give the svn revision and so all of the numbers are a little bit different. However, I am going to submit a patch which improves the dump information a lot for these passes and we should talk about it after we can get on the same page.

However, from looking at your posting, there are some issues that you may want to look at before we talk:

The reaching defs problem makes a scan for all of the defs in the blocks in the region. Once all of the defs are found, they are sorted where the primary key is the regno. The id's (DF_REF_ID) are then assigned based on this sorting. The reaching defs problem actually depends on all of the defs for a regno being contiguous.

The DF_REF_IDs are not stable between calls to df_set_blocks and any def outside of the region has an undefined DF_REF_ID.

In your posting you have:

> Below is the output of df_ref_debug for adef in each iteration of the loop in
> latch_dominating_def:
> d40 reg 187 bb 3 insn 255 flag 0x0 type 0x0 loc 0xf7da4608(0xf7d9a4e0) chain
> { }
> d93 reg 187 bb 2 insn 40 flag 0x0 type 0x0 loc 0xf7d89cc8(0xf7d9a4e0) chain {
> }

The number after the first "d" is the DF_REF_ID. Note that they are not contiguous.
Given the sorting that occurred, they must be contiguous. I assume from this that someone is holding on to old id's. This is not correct. If you are going to play the game with df_set_blocks, you are allowed to hold onto a def, but not the DF_REF_ID, you cannot look at the DF_REF_ID for a def that is not in the blocks set by df_set_blocks.

Kenny

Index: df-core.c
===
--- df-core.c (revision 127917)
+++ df-core.c (working copy)
@@ -1761,6 +1761,7 @@ df_print_regset (FILE *file, bitmap r)
 /* Dump dataflow info. */
+
 void
 df_dump (FILE *file)
 {
@@ -1778,6 +1779,33 @@ df_dump (FILE *file)
 }
 
+/* Dump dataflow info for df->blocks_to_analyze. */
+
+void
+df_dump_region (FILE *file)
+{
+  if (df->blocks_to_analyze)
+    {
+      bitmap_iterator bi;
+      unsigned int bb_index;
+
+      fprintf (file, "\n\nstarting region dump\n");
+      df_dump_start (file);
+
+      EXECUTE_IF_SET_IN_BITMAP (df->blocks_to_analyze, 0, bb_index, bi)
+        {
+          basic_block bb = BASIC_BLOCK (bb_index);
+
+          df_print_bb_index (bb, file);
+          df_dump_top (bb, file);
+          df_dump_bottom (bb, file);
+        }
+      fprintf (file, "\n");
+    }
+  else df_dump (file);
+}
+
+
 /* Dump the introductory information for each problem defined. */
 
 void
Index: df.h
===
--- df.h (revision 127917)
+++ df.h (working copy)
@@ -836,6 +836,7 @@ extern bool df_reg_used (rtx, rtx);
 extern void df_worklist_dataflow (struct dataflow *,bitmap, int *, int);
 extern void df_print_regset (FILE *file, bitmap r);
 extern void df_dump (FILE *);
+extern void df_dump_region (FILE *);
 extern void df_dump_start (FILE *);
 extern void df_dump_top (basic_block, FILE *);
 extern void df_dump_bottom (basic_block, FILE *);
Index: loop-invariant.c
===
--- loop-invariant.c (revision 127917)
+++ loop-invariant.c (working copy)
@@ -644,6 +644,7 @@ find_defs (struct loop *loop, basic_bloc
   if (dump_file)
     {
+      df_dump_region (dump_file);
       fprintf (dump_file, "*starting processing of loop **\n");
       print_rtl_with_bb (dump_file, get_insns ());
       fprintf (dump_file, "*ending processing of loop **\n");
Index: loop-iv.c
===
--- loop-iv.c (revision 127917)
+++ loop-iv.c (working copy)
@@ -280,7 +280,7 @@ iv_analysis_loop_init (struct loop *loop
   df_set_blocks (blocks);
   df_analyze ();
   if (dump_file)
-    df_dump (dump_file);
+    df_dump_region (dump_file);
   check_iv_ref_table_size ();
   BITMAP_FREE (blocks);

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33224
[Bug rtl-optimization/33224] failing rtl iv analysis (maybe due to df)
--- Comment #8 from zadeck at naturalbridge dot com 2007-08-30 18:51 ---
Subject: Re: failing rtl iv analysis (maybe due to df)

rakdver at kam dot mff dot cuni dot cz wrote:
> --- Comment #7 from rakdver at kam dot mff dot cuni dot cz 2007-08-30
> 18:09 ---
> Subject: Re: failing rtl iv analysis (maybe due to df)
>
>
>> The only thing that you are allowed to do with the DF_REF_ID is to get
>> it from a df_def
>> AFTER YOU ARE SURE THAT THE DEF IS IN THE REGION
>>
>
> OK, this might be the problem; the code takes the defs from the reg->def
> lists, and checks whether the defs are set in the reaching def bitmaps.
> Naturally, it assumes that when the region is set by df_set_blocks, the
> reaching def bitmaps will only contain the defs that belong to the
> region (which used to be true before your changes).
>

And it is still true now. The set of bits in the bitmaps is EXACTLY the set of defs inside the region. The thing that has changed is that the location (slot) in the bitmap is only defined after the calls to df_set_blocks and df_analyze, i.e. the slots in the bitvectors are moved around by these calls.

In your example, you asked about 2 defs. One of those defs is in the region and one of them is outside the region. It is not that the bits are zero for a def outside of the region; there is no slot in the bitvectors that corresponds to that def. You are not allowed to look in the bitmap for the def outside of the region or ask any questions at all if they involve the DF_REF_ID. For the def that is in the region, you can ask but you cannot use the DF_REF_ID that it had before the call to set_blocks. That old one is trash.

What has changed, and this was a very old change, from the time that danny still worked at ibm, was that the DF_REF_ID's are not stable and the slots change after setting the blocks in the region.
One of the first df patches that was committed by us was to reorganize the bits so that all of the refs for a single reg were contiguous. This gave a factor of 7 speedup over the old code because it allowed for the use of new bitmap operations that worked over dense range indexes. I assume that this code has not really worked since then. > Anyway, it would be nice to have some documentation for df (there > is only a short notice in > http://gcc.gnu.org/onlinedocs/gccint/Liveness-information.html#Liveness-information, > which appears wrong given the importance of this api), in particular > pointing out such non-obvious traps would be great. > > > This is only an issue if you use df_set_blocks and the only passes that use it are these zdenek's loop passes. If I had my way (and infinite free time) I would get rid of df_set_blocks anyway. The information that it provides is generally wrong since it ignores information that enters a block from the outside, but if you are very careful to only ask a very limited range of questions, as Zdenek did, it can give you what you want relatively inexpensively. Furthermore, it has been a real pain to keep it correct as the rest of df has evolved. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33224
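The id assignment described in this thread (collect the defs in the region, sort by regno, number them in that order) can be sketched as below to show why a cached id goes stale when the region changes. This is a simplified model, not the df-scan.c implementation: `def_ref`, `NO_ID`, `assign_region_ids` and the `bb_in_region` array are hypothetical.

```c
#include <stdlib.h>

/* Marker for a def that has no slot because it is outside the region.  */
#define NO_ID ((unsigned int) -1)

/* Hypothetical, much reduced model of a def.  */
struct def_ref
{
  unsigned int regno;
  unsigned int bb;
  unsigned int id;
};

/* qsort comparator over pointers to defs; primary key is the regno.  */
static int
by_regno (const void *a, const void *b)
{
  const struct def_ref *x = *(const struct def_ref *const *) a;
  const struct def_ref *y = *(const struct def_ref *const *) b;
  return (x->regno > y->regno) - (x->regno < y->regno);
}

/* Renumber the defs whose block is in the region.  BB_IN_REGION is
   indexed by block number.  Defs outside the region get NO_ID.
   Returns the number of ids handed out, or (size_t) -1 if memory
   could not be allocated.  */
size_t
assign_region_ids (struct def_ref *defs, size_t n,
		   const unsigned char *bb_in_region)
{
  struct def_ref **picked = malloc ((n ? n : 1) * sizeof *picked);
  size_t count = 0;

  if (!picked)
    return (size_t) -1;

  for (size_t i = 0; i < n; i++)
    {
      defs[i].id = NO_ID;
      if (bb_in_region[defs[i].bb])
	picked[count++] = &defs[i];
    }

  /* After sorting, all defs of one regno occupy a contiguous id range.  */
  qsort (picked, count, sizeof *picked, by_regno);
  for (size_t i = 0; i < count; i++)
    picked[i]->id = (unsigned int) i;

  free (picked);
  return count;
}
```

Calling this twice with different regions gives the same def different ids, and a def that falls outside the second region ends up with no usable id at all. A pass that remembered the first id and used it to index a bitmap built for the second region would be reading an unrelated slot, which matches the failure mode discussed above.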
[Bug rtl-optimization/33224] failing rtl iv analysis (maybe due to df)
--- Comment #9 from zadeck at naturalbridge dot com 2007-08-30 18:57 --- Subject: Re: failing rtl iv analysis (maybe due to df) zadeck at naturalbridge dot com wrote: > --- Comment #8 from zadeck at naturalbridge dot com 2007-08-30 18:51 > --- > Subject: Re: failing rtl iv analysis (maybe due > to df) > > rakdver at kam dot mff dot cuni dot cz wrote: > >> --- Comment #7 from rakdver at kam dot mff dot cuni dot cz 2007-08-30 >> 18:09 --- >> Subject: Re: failing rtl iv analysis (maybe due to df) >> >> >> >>> The only thing that you are allowed to do with the DF_REF_ID is to get >>> it from a df_def >>> AFTER YOU ARE SURE THAT THE DEF IS IN THE REGION >>> >>> >> OK, this might be the problem; the code takes the defs from the reg->def >> lists, and checks whether the defs are set in the reaching def bitmaps. >> Naturally, it assumes that when the region is set by df_set_blocks, the >> reaching def bitmaps will only contain the defs that belong to the >> region (which used to be true before your changes). >> >> >> > And it is still true now. The set of bits in the bitmaps are EXACTLY > the set of defs inside the region. The thing that has changed is that > the location (slot) in the bitmap is only defined after the calls to > df_set_blocks and df_analyze, i.e. the slots in the bitvectors are moved > around by these calls. > > In your example, you asked about 2 defs. One of those defs is in the > region and one of them is outside the region. It is not that the bits > are zero for a def outside of the region, there is no slot in the > bitvectors that corresponds to that def in the bitvectors. You are not > allowed to look in the bitmap for the def outside of the region as ask > any questions at all if they involve the DF_REF_ID. For the def that is > in the region, you can ask but you cannot use the DF_REF_ID that it had > before the call to set_blocks. That old one is trash. 
> > What has changed, and this was a very old change, from the time that > danny still worked at ibm, was that the DF_REF_ID's are not stable and > the slots change after setting the blocks in the region. One of the > first df patches that was committed by us was to reorganize the bits so > that all of the refs for a single reg were contiguous. This gave a > factor of 7 speedup over the old code because it allowed for the use of > new bitmap operations that worked over dense range indexes. I assume > that this code has not really worked since then. > > >> Anyway, it would be nice to have some documentation for df (there >> is only a short notice in >> http://gcc.gnu.org/onlinedocs/gccint/Liveness-information.html#Liveness-information, >> which appears wrong given the importance of this api), in particular >> pointing out such non-obvious traps would be great. >> >> >> >> > This is only an issue if you use df_set_blocks and the only passes that > use it are these zdenek's loop passes. If I had my way (and infinite > free time) I would get rid of df_set_blocks anyway. The information > that it provides is generally wrong since it ignores information that > enters a block from the outside, but if you are very careful to only ask > a very limited range of questions, as Zdenek did, it can give you what > you want relatively inexpensively. > > Furthermore, it has been a real pain to keep it correct as the rest of > df has evolved. > > > > sorry zdenek, i misread who this was from, i would not have referred to you in the third person if i had read it correctly. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33224
[Bug rtl-optimization/33224] failing rtl iv analysis (maybe due to df)
--- Comment #11 from zadeck at naturalbridge dot com 2007-08-30 21:46 --- Subject: Re: failing rtl iv analysis (maybe due to df) rakdver at gcc dot gnu dot org wrote: > --- Comment #10 from rakdver at gcc dot gnu dot org 2007-08-30 20:05 > --- > I know how to fix the problem, now. > > > thanks kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33224
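The slot renumbering discussed in this thread can be sketched with a toy model in C. This is only an illustration under stated assumptions: `toy_def` and `toy_set_blocks` are hypothetical names, not part of the df API, and the real renumbering in df is more involved. The sketch only shows why an id read before the region is set cannot be used to index the bit vectors afterwards, and why a def outside the region has no slot at all.

```c
#include <assert.h>

/* Toy model; the names are hypothetical and only loosely echo df.  */
struct toy_def
{
  int block;    /* basic block containing the def */
  int id;       /* slot in the region's bit vectors, or -1 if none */
};

/* Restrict the analysis to the blocks flagged in IN_REGION and hand
   out dense slot numbers to the defs inside the region.  Defs outside
   the region get no slot at all.  Returns the number of slots used.  */
static int
toy_set_blocks (struct toy_def *defs, int n_defs, const int *in_region)
{
  int next_slot = 0;

  for (int i = 0; i < n_defs; i++)
    defs[i].id = in_region[defs[i].block] ? next_slot++ : -1;
  return next_slot;
}
```

In this model, a caller that caches `defs[i].id`, then narrows the region and looks the cached value up in the new bit vectors, is reading somebody else's slot. The id has to be fetched again after the region is set, and only for defs known to be inside it.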
[Bug bootstrap/32161] stage1 libgcc is being built unoptimized
--- Comment #3 from zadeck at naturalbridge dot com 2007-08-31 21:34 --- At least on the x86-32, libgcc is currently being built optimized, but the options are slightly different. the stage1 build does not do -fomit-frame-pointer. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32161
[Bug rtl-optimization/32300] [4.3 Regression] ICE with -O2 -fsee
--- Comment #13 from zadeck at naturalbridge dot com 2007-09-05 01:24 --- Subject: Re: [4.3 Regression] ICE with -O2 -fsee jakub at gcc dot gnu dot org wrote: > --- Comment #12 from jakub at gcc dot gnu dot org 2007-09-04 23:37 > --- > Fixed. > > > jakub thanks for doing this. The changes to df are fine, but i think that it exceeds my authority to approve more than that. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32300
[Bug target/32481] ICE in df_refs_verify, at df-scan.c:4058
--- Comment #11 from zadeck at naturalbridge dot com 2007-10-04 20:51 ---
spark fixed this in comment #10.

--
zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32481
[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr
--
zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
         AssignedTo|unassigned at gcc dot gnu   |zadeck at naturalbridge dot
                   |dot org                     |com
             Status|NEW                         |ASSIGNED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638
[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr
--- Comment #12 from zadeck at naturalbridge dot com 2007-10-05 13:02 --- Subject: Re: [4.3 regression]: wrong code with -fforce-addr rguenth at gcc dot gnu dot org wrote: > --- Comment #11 from rguenth at gcc dot gnu dot org 2007-10-05 12:36 > --- > But powf is pure/const, so the call is not a use. > > > that is the reason that the call did not kill the potential set of dead stores. I will look at this later today. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638
[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr
--- Comment #15 from zadeck at naturalbridge dot com 2007-10-05 20:17 ---
Subject: Re: [4.3 regression]: wrong code with -fforce-addr

kargl at gcc dot gnu dot org wrote:
> --- Comment #13 from kargl at gcc dot gnu dot org 2007-10-05 17:50 ---
> (In reply to comment #9)
>
>>> Hope this helps.
>>>
>> Sure, I've got the problem. The problem is actually in RTL optimization,
>> where the dse1 pass removes the wrong insn.
>>
>> Surprisingly, the problem is in line 61 of comunpack.f:
>>
>> -->    bscale = 2.0**real(idrstmpl(2))
>>        dscale = 10.0**real(-idrstmpl(3))
>>
>
> This was meant for Manfred instead of Uros, but it does contain the
> relevant info. Manfred, you told me elsewhere that you use -fforce-addr
> to achieve better performance. Whoever wrote this code should be
> flogged. idrstmpl is an INTEGER variable, and gfortran can generate
> much faster code for integer exponents than calling __builtin_powf.
>
> Try changing the lines to
>
>        bscale = 2.0**idrstmpl(2)
>        dscale = 10.0**(-idrstmpl(3))
>
> This, of course, doesn't fix the underlying bug.
>

neither richi nor i have been able to reproduce the problem.

./xgcc -B. -O2 -march=pentium4 -c mova2i.c -DLINUX
./gfortran -fforce-addr -B. -B../i686-pc-linux-gnu/libgfortran/.libs -O2 -march=pentium4 -o main main.f comunpack.f rdieee.f gbytesc.f mova2i.o

and i get the same thing with and without -fforce-addr.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638
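To make the point about integer exponents concrete, here is a minimal C sketch of exponentiation by repeated squaring. It is not what gfortran actually emits for `2.0**idrstmpl(2)`, and `powi_f` is a hypothetical helper name. It only shows why an integer exponent can be handled with a handful of multiplies rather than a call to powf.

```c
#include <assert.h>

/* Hypothetical helper: raise BASE to the integer power N by repeated
   squaring, which needs O(log |N|) multiplies and no libm call.  */
static float
powi_f (float base, int n)
{
  /* Take the magnitude in unsigned arithmetic so that negating the
     most negative int cannot overflow.  */
  unsigned int e = n < 0 ? 0u - (unsigned int) n : (unsigned int) n;
  float result = 1.0f;

  while (e != 0)
    {
      if (e & 1)
        result *= base;     /* this bit of the exponent is set */
      base *= base;         /* base^(2^k) for the next bit */
      e >>= 1;
    }

  /* A negative exponent is the reciprocal of the positive power.  */
  return n < 0 ? 1.0f / result : result;
}
```

With this helper the two assignments in the quoted Fortran would correspond to `powi_f (2.0f, i2)` and `powi_f (10.0f, -i3)`, where `i2` and `i3` stand for the integer values of `idrstmpl(2)` and `idrstmpl(3)`. As noted in the thread, this only sidesteps the powf calls and does nothing about the underlying dse bug.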
[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf
--- Comment #7 from zadeck at naturalbridge dot com 2007-10-06 04:11 --- Subject: Re: [4.3 Regression] Revision 128957 miscompiles 481.wrf hjl at lucon dot org wrote: > --- Comment #5 from hjl at lucon dot org 2007-10-06 02:07 --- > Kenny, does your patch > > http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00124.html > > handle cases where number of consecutive hard regs needed to hold some mode > > 1 > correctly? IA32 needs 2 hard registers to hold long long and your patch > miscompiles the testcase in comment #4. > > > I will look into it. It should do this correctly. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669
[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr
--- Comment #17 from zadeck at naturalbridge dot com 2007-10-06 12:27 --- Subject: Re: [4.3 regression]: wrong code with -fforce-addr ubizjak at gmail dot com wrote: > --- Comment #16 from ubizjak at gmail dot com 2007-10-06 06:49 --- > (In reply to comment #14) > >> The testcase works for me, that is, it produces the expected output good.out. >> > > Uh, you have to un-comment the line 315 of the comunpack.f test. The testcase, > as attached, produces good code. Un-commenting line 315, you will get: > > > you are making this into something of a scavenger hunt. Kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638
[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr
--- Comment #18 from zadeck at naturalbridge dot com 2007-10-06 13:07 ---
Subject: Re: [4.3 regression]: wrong code with -fforce-addr

ubizjak at gmail dot com wrote:
> --- Comment #16 from ubizjak at gmail dot com 2007-10-06 06:49 ---
> (In reply to comment #14)
>
>> The testcase works for me, that is, it produces the expected output good.out.
>>
>
> Uh, you have to un-comment the line 315 of the comunpack.f test. The testcase,
> as attached, produces good code. Un-commenting line 315, you will get:
>
> .L80:
>         movl    $0x4000, (%esp)
>         call    powf
>         fstps   -152(%ebp)
>         negl    -136(%ebp)
>         fildl   -136(%ebp)
>         fstps   4(%esp)
>         movl    $0x4120, (%esp)
>         call    powf
>
> Note that only one argument is loaded to the stack before first powf.
>
> Without -fforce-addr on un-commented testcase, we got:
>
> .L80:
>         fildl   -132(%ebp)
>         fstps   4(%esp)
>         movl    $0x4000, (%esp)
>         call    powf
>         fstps   -140(%ebp)
>         negl    -128(%ebp)
>         fildl   -128(%ebp)
>         fstps   4(%esp)
>         movl    $0x4120, (%esp)
>         call    powf
>

ian,

As you may remember, the dse code assumes that it can "see" all of the stores that are frame_related. It appears that with the -fforce-addr option this is not true. In this particular example, a frame related pointer gets loaded into register 755 very early on (in a different block), and since const calls only disqualify frame-related stores (since they may push params onto the stack), the parameter push is considered dead.

My question to you: is the proper fix to check flag_force_addr and, if it is set, just assume that every store may be frame related, or is there some sort of tea leaf that i might have access to that would tell me that reg 755 is used in this way?

(note that you have to jump thru a few hoops to recreate this, since comunpack.f is in a separate attachment from the rest of the code and you have to uncomment line 315 to recreate the bug.)

Kenny

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638
[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr
--- Comment #20 from zadeck at naturalbridge dot com 2007-10-06 21:20 --- Subject: Re: [4.3 regression]: wrong code with -fforce-addr ubizjak at gmail dot com wrote: > --- Comment #19 from ubizjak at gmail dot com 2007-10-06 19:58 --- > In dse.c, scan_insn(), we have: > > if ((GET_CODE (PATTERN (insn)) == CLOBBER) > || volatile_refs_p (PATTERN (insn)) > || (flag_non_call_exceptions && may_trap_p (PATTERN (insn))) > || (RTX_FRAME_RELATED_P (insn)) > || find_reg_note (insn, REG_FRAME_RELATED_EXPR, NULL_RTX)) > insn_info->cannot_delete = true; > > And since the docs say that: > > `RTX_FRAME_RELATED_P (X)' > Nonzero in an `insn', `call_insn', `jump_insn', `barrier', or > `set' which is part of a function prologue and sets the stack > pointer, sets the frame pointer, or saves a register. This flag > should also be set on an instruction that sets up a temporary > register to use in place of the frame pointer. Stored in the > `frame_related' field and printed as `/f'. > > I wonder if the insn that stores to (or uses(?)) this temporary register (in > place of the frame pointer) should also be marked as frame related insn? > > So, all the insns in the sequence of > > set tmpreg, FP + const > ... > store (tmpreg) > > should be marked as frame related insns. > > > i was not referring to the frame_related flag, though i guess it could be taken over for this purpose. Note that the frame_related flag is for use by the prologue and this is not. This is just a register that happens to point into the frame, which i think is only ever created if you say -fforce-addr. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638
[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf
--- Comment #9 from zadeck at naturalbridge dot com 2007-10-07 03:18 ---
Subject: Re: [4.3 Regression] Revision 128957 miscompiles 481.wrf

hj,

here is a fix. I will most likely post the patch on monday after i get it really tested on a bunch of platforms. The fix is in the third stanza, the rest is better logging.

The failure only happens if you have a block with 2 or more uses of a multiword pseudo register that is local to this block and has been allocated by local_alloc. The uses must be in a particular form: the last use was a subreg use that only used some of the hard registers and a previous non subreg use of the multiword register.

When all of this happens, the code did not properly expand this to a whole multiregister when the second to last use is encountered in the backwards scan.

I.e. a lot of things have to happen to get this to fail.

Thanks for the small test case, that really helped.

Kenny

Index: ra-conflict.c
===================================================================
--- ra-conflict.c       (revision 129036)
+++ ra-conflict.c       (working copy)
@@ -76,7 +76,7 @@ record_one_conflict_between_regnos (enum
                                     enum machine_mode mode2, int r2)
 {
   if (dump_file)
-    fprintf (dump_file, " rocbr adding %d<=>%d\n", r1, r2);
+    fprintf (dump_file, "rocbr adding %d<=>%d\n", r1, r2);
   if (reg_allocno[r1] >= 0 && reg_allocno[r2] >= 0)
     {
       int tr1 = reg_allocno[r1];
@@ -293,9 +293,6 @@ set_conflicts_for_earlyclobber (rtx insn
                                       recog_data.operand[use + 1]);
         }
     }
-
-  if (dump_file)
-    fprintf (dump_file, " finished early clobber conflicts.\n");
 }
 
 
@@ -876,7 +873,7 @@ global_conflicts (void)
                                       allocnum, renumber);
                 }
 
-              else if (GET_ALLOCNO_LIVE (allocnos_live, allocnum) == 0)
+              else
                 {
                   if (dump_file)
                     fprintf (dump_file, "dying pseudo\n");
@@ -963,6 +960,8 @@ global_conflicts (void)
              FIXME: We should consider either adding a new kind of
              clobber, or adding a flag to the clobber distinguish
              these two cases.  */
+          if (dump_file && VEC_length (df_ref_t, clobbers))
+            fprintf (dump_file, " clobber conflicts\n");
           for (k = VEC_length (df_ref_t, clobbers) - 1; k >= 0; k--)
             {
               struct df_ref *def = VEC_index (df_ref_t, clobbers, k);
@@ -1024,6 +1023,8 @@ global_conflicts (void)
           if (GET_CODE (PATTERN (insn)) == PARALLEL && multiple_sets (insn))
             {
               int j;
+              if (dump_file)
+                fprintf (dump_file, " multiple sets\n");
               for (j = VEC_length (df_ref_t, dying_regs) - 1; j >= 0; j--)
                 {
                   int used_in_output = 0;

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669
[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf
--
zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
         AssignedTo|unassigned at gcc dot gnu   |zadeck at naturalbridge dot
                   |dot org                     |com
             Status|NEW                         |ASSIGNED
 GCC target triplet|                            |linux/ia32
   Last reconfirmed|2007-10-07 09:41:07         |2007-10-07 11:36:14
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669
[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf
--- Comment #10 from zadeck at naturalbridge dot com 2007-10-07 21:57 ---
Subject: Re: [4.3 Regression] Revision 128957 miscompiles 481.wrf

This patch fixes pr33669.

The failure only happens if you have a block with 2 or more uses of a multiword pseudo register that is local to this block and has been allocated by local_alloc. The uses must be in a particular form: the last use must be a subreg use that only used some of the hard registers and a previous non subreg use of the multiword register.

When all of this happens, the code did not properly expand this to a whole multiregister when the second to last use is encountered in the backwards scan.

I.e. a lot of things have to happen to get this to fail.

I have tested this patch on ia-64, x86-{64,32} and ppc-32.

Ok for commit?

Kenny

2007-10-07  Kenneth Zadeck <[EMAIL PROTECTED]>

        PR middle-end/33669
        * ra-conflict.c (record_one_conflict_between_regnos,
        set_conflicts_for_earlyclobber, global_conflicts): Improved logging.
        (global_conflicts): Removed incorrect check.

2007-10-07  Kenneth Zadeck <[EMAIL PROTECTED]>

        PR middle-end/33669
        * gcc.c-torture/execute/pr33669.c: New.

Index: ra-conflict.c
===================================================================
--- ra-conflict.c       (revision 129053)
+++ ra-conflict.c       (working copy)
@@ -196,7 +196,7 @@ record_one_conflict_between_regnos (enum
   int allocno2 = reg_allocno[r2];
 
   if (dump_file)
-    fprintf (dump_file, " rocbr adding %d<=>%d\n", r1, r2);
+    fprintf (dump_file, "rocbr adding %d<=>%d\n", r1, r2);
 
   if (allocno1 >= 0 && allocno2 >= 0)
     set_conflict (allocno1, allocno2);
@@ -401,9 +401,6 @@ set_conflicts_for_earlyclobber (rtx insn
                                       recog_data.operand[use + 1]);
         }
     }
-
-  if (dump_file)
-    fprintf (dump_file, " finished early clobber conflicts.\n");
 }
 
 
@@ -984,7 +981,7 @@ global_conflicts (void)
                                       allocnum, renumber);
                 }
 
-              else if (!sparseset_bit_p (allocnos_live, allocnum))
+              else
                 {
                   if (dump_file)
                     fprintf (dump_file, "dying pseudo\n");
@@ -1071,6 +1068,8 @@ global_conflicts (void)
              FIXME: We should consider either adding a new kind of
              clobber, or adding a flag to the clobber distinguish
              these two cases.  */
+          if (dump_file && VEC_length (df_ref_t, clobbers))
+            fprintf (dump_file, " clobber conflicts\n");
           for (k = VEC_length (df_ref_t, clobbers) - 1; k >= 0; k--)
             {
               struct df_ref *def = VEC_index (df_ref_t, clobbers, k);
@@ -1132,6 +1131,8 @@ global_conflicts (void)
           if (GET_CODE (PATTERN (insn)) == PARALLEL && multiple_sets (insn))
             {
               int j;
+              if (dump_file)
+                fprintf (dump_file, " multiple sets\n");
               for (j = VEC_length (df_ref_t, dying_regs) - 1; j >= 0; j--)
                 {
                   int used_in_output = 0;
Index: testsuite/gcc.c-torture/execute/pr33669.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr33669.c   (revision 0)
+++ testsuite/gcc.c-torture/execute/pr33669.c   (revision 0)
@@ -0,0 +1,40 @@
+extern void abort (void);
+
+typedef struct foo_t
+{
+  unsigned int blksz;
+  unsigned int bf_cnt;
+} foo_t;
+
+#define _RNDUP(x, unit)  ((((x) + (unit) - 1) / (unit)) * (unit))
+#define _RNDDOWN(x, unit)  ((x) - ((x)%(unit)))
+
+long long
+foo (foo_t *const pxp, long long offset, unsigned int extent)
+{
+  long long blkoffset = _RNDDOWN(offset, (long long )pxp->blksz);
+  unsigned int diff = (unsigned int)(offset - blkoffset);
+  unsigned int blkextent = _RNDUP(diff + extent, pxp->blksz);
+
+  if (pxp->blksz < blkextent)
+    return -1LL;
+
+  if (pxp->bf_cnt > pxp->blksz)
+    pxp->bf_cnt = pxp->blksz;
+
+  return blkoffset;
+}
+
+int
+main ()
+{
+  foo_t x;
+  long long xx;
+
+  x.blksz = 8192;
+  x.bf_cnt = 0;
+  xx = foo (&x, 0, 4096);
+  if (xx != 0LL)
+    abort ();
+  return 0;
+}

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669
[Bug middle-end/33662] [4.3 Regression] Wrong register allocation on SH
--- Comment #2 from zadeck at naturalbridge dot com 2007-10-08 03:53 ---

*** This bug has been marked as a duplicate of 33669 ***

--
zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33662
[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf
--- Comment #11 from zadeck at naturalbridge dot com 2007-10-08 03:53 ---
*** Bug 33662 has been marked as a duplicate of this bug. ***

--
zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
                 CC|                            |kkojima at gcc dot gnu dot
                   |                            |org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669
[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf
--- Comment #14 from zadeck at naturalbridge dot com 2007-10-09 15:32 --- Subject: Re: [4.3 Regression] Revision 128957 miscompiles 481.wrf hjl at gcc dot gnu dot org wrote: > --- Comment #13 from hjl at gcc dot gnu dot org 2007-10-09 14:00 --- > Subject: Bug 33669 > > Author: hjl > Date: Tue Oct 9 14:00:11 2007 > New Revision: 129166 > > URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129166 > Log: > gcc/ > > 2007-10-09 Kenneth Zadeck <[EMAIL PROTECTED]> > > PR middle-end/33669 > * ra-conflict.c (record_one_conflict_between_regnos, > set_conflicts_for_earlyclobber, global_conflicts): Improved > logging. > (global_conflicts): Removed incorrect check. > > gcc/testsuite/ > > 2007-10-09 Kenneth Zadeck <[EMAIL PROTECTED]> > > PR middle-end/33669 > * gcc.c-torture/execute/pr33669.c: New. > > Added: > trunk/gcc/testsuite/gcc.c-torture/execute/pr33669.c > Modified: > trunk/gcc/ChangeLog > trunk/gcc/ra-conflict.c > trunk/gcc/testsuite/ChangeLog > > > please back this out. i have a different patch that i have finished testing. this one is too conservative. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669
[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf
--- Comment #15 from zadeck at naturalbridge dot com 2007-10-09 15:41 ---
Subject: Re: [4.3 Regression] Revision 128957 miscompiles 481.wrf

This patch fixes the problem in a slightly different way. The other patch was too conservative in that it ended up setting the added flag too often, which has some downstream quality issues.

I just finished testing this on x86-64, x86-32, ppc-32 and ia-64.

kenny

2007-10-07  Kenneth Zadeck <[EMAIL PROTECTED]>

        PR middle-end/33669
        * ra-conflict.c (record_one_conflict_between_regnos,
        set_conflicts_for_earlyclobber, global_conflicts): Improved logging.
        (global_conflicts): Removed incorrect check.

2007-10-07  Kenneth Zadeck <[EMAIL PROTECTED]>

        PR middle-end/33669
        * gcc.c-torture/execute/pr33669.c: New.

Index: ra-conflict.c
===================================================================
--- ra-conflict.c       (revision 129053)
+++ ra-conflict.c       (working copy)
@@ -196,7 +196,7 @@ record_one_conflict_between_regnos (enum
   int allocno2 = reg_allocno[r2];
 
   if (dump_file)
-    fprintf (dump_file, " rocbr adding %d<=>%d\n", r1, r2);
+    fprintf (dump_file, "rocbr adding %d<=>%d\n", r1, r2);
 
   if (allocno1 >= 0 && allocno2 >= 0)
     set_conflict (allocno1, allocno2);
@@ -401,9 +401,6 @@ set_conflicts_for_earlyclobber (rtx insn
                                       recog_data.operand[use + 1]);
         }
     }
-
-  if (dump_file)
-    fprintf (dump_file, " finished early clobber conflicts.\n");
 }
 
 
@@ -983,12 +980,12 @@ global_conflicts (void)
                   set_renumbers_live (&renumbers_live, live_subregs, live_subregs_used,
                                       allocnum, renumber);
                 }
-
-              else if (!sparseset_bit_p (allocnos_live, allocnum))
+              else if (live_subregs_used[allocnum] > 0
+                       || !sparseset_bit_p (allocnos_live, allocnum))
                 {
                   if (dump_file)
-                    fprintf (dump_file, "dying pseudo\n");
-
+                    fprintf (dump_file, "%sdying pseudo\n",
+                             (live_subregs_used[allocnum] > 0) ? "partially ": "");
                   /* Resetting the live_subregs_used is effectively
                      saying do not use the subregs because we are
                      reading the whole pseudo.  */
@@ -1071,6 +1068,8 @@ global_conflicts (void)
              FIXME: We should consider either adding a new kind of
              clobber, or adding a flag to the clobber distinguish
              these two cases.  */
+          if (dump_file && VEC_length (df_ref_t, clobbers))
+            fprintf (dump_file, " clobber conflicts\n");
           for (k = VEC_length (df_ref_t, clobbers) - 1; k >= 0; k--)
             {
               struct df_ref *def = VEC_index (df_ref_t, clobbers, k);
@@ -1132,6 +1131,8 @@ global_conflicts (void)
           if (GET_CODE (PATTERN (insn)) == PARALLEL && multiple_sets (insn))
             {
               int j;
+              if (dump_file)
+                fprintf (dump_file, " multiple sets\n");
               for (j = VEC_length (df_ref_t, dying_regs) - 1; j >= 0; j--)
                 {
                   int used_in_output = 0;
@@ -1166,7 +1167,7 @@ global_conflicts (void)
             }
         }
 
-      /* Add the renumbers live to the hard_regs_live for the next few
+      /* Add the renumbers live to the hard_regs_live for the next few
          calls.  All of this gets recomputed at the top of the loop so
          there is no harm.  */
       IOR_HARD_REG_SET (hard_regs_live, renumbers_live);
Index: testsuite/gcc.c-torture/execute/pr33669.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr33669.c   (revision 0)
+++ testsuite/gcc.c-torture/execute/pr33669.c   (revision 0)
@@ -0,0 +1,40 @@
+extern void abort (void);
+
+typedef struct foo_t
+{
+  unsigned int blksz;
+  unsigned int bf_cnt;
+} foo_t;
+
+#define _RNDUP(x, unit)  ((((x) + (unit) - 1) / (unit)) * (unit))
+#define _RNDDOWN(x, unit)  ((x) - ((x)%(unit)))
+
+long long
+foo (foo_t *const pxp, long long offset, unsigned int extent)
+{
+  long long blkoffset = _RNDDOWN(offset, (long long )pxp->blksz);
+  unsigned int diff = (unsigned int)(offset - blkoffset);
+  unsigned int blkextent = _RNDUP(diff + extent, pxp->blksz);
+
+  if (pxp->blksz < blkextent)
+    return -1LL;
+
+  if (pxp->bf_cnt > pxp->blksz)
+    pxp->bf_cnt = pxp->blksz;
+
+  return blkoffset;
+}
+
+int
+main ()
+{
+  foo_t x;
+  long long xx;
+
+  x.blksz = 8192;
+  x.bf_cnt = 0;
+  xx = foo (&x, 0, 4096);
+  if (xx != 0LL)
+    abort ();
+  return 0;
+}

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669
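The failure condition described in this bug can be sketched as a toy backwards scan in C. This is a simplified model under stated assumptions, not the ra-conflict.c code: there is a single two-word pseudo, `scan_backwards`, `struct use` and the `check_partial` switch are hypothetical, and the two booleans only stand in for the roles of `allocnos_live` and `live_subregs_used`. It shows why a whole-register use has to widen the live set even when an earlier-scanned subreg use has already marked the pseudo live.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (all names hypothetical): one two-word pseudo register.
   Bit i of a word mask says that word i of the pseudo is live.  */
enum { WORD0 = 1u, WORD1 = 2u, ALL_WORDS = WORD0 | WORD1 };

struct use
{
  bool is_subreg;       /* true: the use reads only the words in WORDS */
  unsigned int words;   /* words read by a subreg use */
};

/* Scan the uses from last to first and return the words that are live
   at the start of the block.  With CHECK_PARTIAL false, a whole
   register use is ignored once the pseudo is marked live, which mimics
   the original check.  With it true, a partially live pseudo is
   widened to all of its words.  */
static unsigned int
scan_backwards (const struct use *uses, int n_uses, bool check_partial)
{
  unsigned int live_words = 0;
  bool live = false;            /* plays the role of allocnos_live */
  bool subregs_used = false;    /* plays the role of live_subregs_used > 0 */

  for (int i = n_uses - 1; i >= 0; i--)
    {
      if (uses[i].is_subreg)
        {
          live_words |= uses[i].words;
          subregs_used = live_words != ALL_WORDS;
          live = true;
        }
      else if ((check_partial && subregs_used) || !live)
        {
          /* Whole-register read: every word becomes live.  */
          live_words = ALL_WORDS;
          subregs_used = false;
          live = true;
        }
    }
  return live_words;
}
```

Feeding this a whole-register use followed (in program order) by a subreg use of word 0 gives only `WORD0` with the old-style check and `ALL_WORDS` with the extra test, which is the shape of the condition added in the third stanza of the patch.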
[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf
--- Comment #18 from zadeck at naturalbridge dot com 2007-10-10 03:39 ---
Subject: Re: [4.3 Regression] Revision 128957 miscompiles 481.wrf

HJ,

Sorry about the committing snafu. I should have posted the irc log of seongbae's comments to the log for the bug. Also I had a meeting in the city tonight, so there was not time to commit it between when seongbae gave the final approval and when i had to catch my train.

I have committed the corrected patch as revision 129193. It looks like you had left the testcase when you reverted so there is no test case in this patch.

This patch was tested on ia-64, ppc-32, x86-{64,32}.

Kenny

> This patch fixes pr33669 <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669>.
>
> The failure only happens if you have a block with 2 or more uses of a
> multiword pseudo register that is local to this block and has been
> allocated by local_alloc. The uses must be in a particular form: the
> last use must be a subreg use that only used some of the hard registers and
> a previous non subreg use of the multiword register.
>
> When all of this happens, the code did not properly expand this to a
> whole multiregister when the second to last use is encountered in the
> backwards scan.
>
> I.e. a lot of things have to happen to get this to fail.
>
> I have tested this patch on ia-64, x86-{64,32} and ppc-32.
>

2007-10-07  Kenneth Zadeck <[EMAIL PROTECTED]>

        PR middle-end/33669
        * ra-conflict.c (record_one_conflict_between_regnos,
        set_conflicts_for_earlyclobber, global_conflicts): Improved logging.
        (global_conflicts): Enhanced incorrect check.

Index: ra-conflict.c
===================================================================
--- ra-conflict.c       (revision 129192)
+++ ra-conflict.c       (working copy)
@@ -196,7 +196,7 @@ record_one_conflict_between_regnos (enum
   int allocno2 = reg_allocno[r2];
 
   if (dump_file)
-    fprintf (dump_file, " rocbr adding %d<=>%d\n", r1, r2);
+    fprintf (dump_file, "rocbr adding %d<=>%d\n", r1, r2);
 
   if (allocno1 >= 0 && allocno2 >= 0)
     set_conflict (allocno1, allocno2);
@@ -401,9 +401,6 @@ set_conflicts_for_earlyclobber (rtx insn
                                       recog_data.operand[use + 1]);
         }
     }
-
-  if (dump_file)
-    fprintf (dump_file, " finished early clobber conflicts.\n");
 }
 
 
@@ -983,12 +980,12 @@ global_conflicts (void)
                   set_renumbers_live (&renumbers_live, live_subregs, live_subregs_used,
                                       allocnum, renumber);
                 }
-
-              else if (!sparseset_bit_p (allocnos_live, allocnum))
+              else if (live_subregs_used[allocnum] > 0
+                       || !sparseset_bit_p (allocnos_live, allocnum))
                 {
                   if (dump_file)
-                    fprintf (dump_file, "dying pseudo\n");
-
+                    fprintf (dump_file, "%sdying pseudo\n",
+                             (live_subregs_used[allocnum] > 0) ? "partially ": "");
                   /* Resetting the live_subregs_used is effectively
                      saying do not use the subregs because we are
                      reading the whole pseudo.  */
@@ -1071,6 +1068,8 @@ global_conflicts (void)
              FIXME: We should consider either adding a new kind of
              clobber, or adding a flag to the clobber distinguish
              these two cases.  */
+          if (dump_file && VEC_length (df_ref_t, clobbers))
+            fprintf (dump_file, " clobber conflicts\n");
           for (k = VEC_length (df_ref_t, clobbers) - 1; k >= 0; k--)
             {
               struct df_ref *def = VEC_index (df_ref_t, clobbers, k);
@@ -1132,6 +1131,8 @@ global_conflicts (void)
           if (GET_CODE (PATTERN (insn)) == PARALLEL && multiple_sets (insn))
             {
               int j;
+              if (dump_file)
+                fprintf (dump_file, " multiple sets\n");
               for (j = VEC_length (df_ref_t, dying_regs) - 1; j >= 0; j--)
                 {
                   int used_in_output = 0;
@@ -1166,7 +1167,7 @@ global_conflicts (void)
             }
         }
 
-      /* Add the renumbers live to the hard_regs_live for the next few
+      /* Add the renumbers live to the hard_regs_live for the next few
          calls.  All of this gets recomputed at the top of the loop so
          there is no harm.  */
       IOR_HARD_REG_SET (hard_regs_live, renumbers_live);

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669
[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf
--- Comment #19 from zadeck at naturalbridge dot com 2007-10-10 03:41 ---
patch committed to fix this.

--
zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669
[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer
--- Comment #12 from zadeck at naturalbridge dot com 2007-10-10 11:41 --- I will look at it today. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676
[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer
--
zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
  BugsThisDependsOn|33669                       |
         AssignedTo|unassigned at gcc dot gnu   |zadeck at naturalbridge dot
                   |dot org                     |com
             Status|NEW                         |ASSIGNED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676
[Bug middle-end/33662] [4.3 Regression] Wrong register allocation on SH
--- Comment #4 from zadeck at naturalbridge dot com 2007-10-10 13:33 --- Subject: Re: [4.3 Regression] Wrong register allocation on SH kkojima at gcc dot gnu dot org wrote: > --- Comment #3 from kkojima at gcc dot gnu dot org 2007-10-10 13:28 > --- > Not fixed by r129192. I see > > FAIL: gcc.c-torture/execute/pr33669.c execution, -O1 > FAIL: gcc.c-torture/execute/pr33669.c execution, -O2 > FAIL: gcc.c-torture/execute/pr33669.c execution, -Os > > on sh4-unknown-linux-gnu with r129192. > > > i am so embarrassed. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33662
[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer
--- Comment #14 from zadeck at naturalbridge dot com 2007-10-11 11:43 --- Subject: Re: libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer ebotcazou at gcc dot gnu dot org wrote: > --- Comment #13 from ebotcazou at gcc dot gnu dot org 2007-10-11 11:14 > --- > >> Revision 128957 causes this regression. >> > > There is a suspect non-documented hunk in the commit: > > * reload1.c (compute_use_by_pseudos): Change DF_RA_LIVE > usage to DF_LIVE usage. > > --- trunk/gcc/reload1.c 2007/10/02 12:47:13 128956 > +++ trunk/gcc/reload1.c 2007/10/02 13:10:07 128957 > @@ -548,7 +548,7 @@ >if (r < 0) > { > /* reload_combine uses the information from > -DF_RA_LIVE_IN (BASIC_BLOCK), which might still > +DF_LIVE_IN (BASIC_BLOCK), which might still > contain registers that have not actually been allocated > since they have an equivalence. */ > gcc_assert (reload_completed); > @@ -1158,10 +1158,7 @@ > >if (! frame_pointer_needed) > FOR_EACH_BB (bb) > - { > - bitmap_clear_bit (df_get_live_in (bb), HARD_FRAME_POINTER_REGNUM); > - bitmap_clear_bit (df_get_live_top (bb), HARD_FRAME_POINTER_REGNUM); > - } > + bitmap_clear_bit (df_get_live_in (bb), HARD_FRAME_POINTER_REGNUM); > >/* Come here (with failure set nonzero) if we can't get enough spill > regs. */ > > > That is fine, there are no top sets anymore. the problem is the code that builds the reload insn chain. the new code uses the cfg and does not add the label or the jump table that lives between basic blocks to the chain. I will post a patch as soon as my tests finish. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676
[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer
--- Comment #16 from zadeck at naturalbridge dot com 2007-10-11 12:40 --- Subject: Re: libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer ebotcazou at gcc dot gnu dot org wrote: > --- Comment #15 from ebotcazou at gcc dot gnu dot org 2007-10-11 12:24 > --- > >> That is fine, there are no top sets anymore. >> > > Thanks for the explanation, please fix the ChangeLog though. > I will, sorry for the oversight. > >> the problem is the code that builds the reload insn chain. the new code >> uses the cfg and does not add the label or the jump table that lives >> between basic blocks to the chain. I will post a patch as soon as my >> tests finish. >> > > OK. > > > -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676
[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer
--- Comment #17 from zadeck at naturalbridge dot com 2007-10-11 16:21 --- Subject: Re: libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer 2007-10-11 Kenneth Zadeck <[EMAIL PROTECTED]> PR middle-end/33676 * global.c (build_insn_chain): Include insn that occur between basic blocks. 2007-10-11 Kenneth Zadeck <[EMAIL PROTECTED]> PR middle-end/33676 * gcc.c-torture/gcc.dg/torture/pr33676.c: New. When I rewrote this code to use backward scanning rather than forwards scanning, I converted it to properly use the cfg, since it is generally considered outmoded to just scan the insns. However, the reload_insn_chain actually needs the insns that appear between basic blocks, in particular the labels in front of branch tables. I added code here to check for insns that may be in front of a basic block after scanning that block. There are a lot of ways that I could have done this, for instance, I could have just written in terms of the PREV_INSN as the old code was. I think that in doing it the way that i have done it, it is obvious what needs to be done if someone really does get rid of the branch tables between the blocks. This has been bootstrapped and regression tested on x86-{64,32} ppc-32, and ia-64. However it is not clear to me how many platforms use this kind of table branch. The bug appears to only be on the -march=i586, so the reviewers may wish to comment on my choice of dg options on the test. Ok to commit? 
Kenny Index: testsuite/gcc.dg/torture/pr33676.c === --- testsuite/gcc.dg/torture/pr33676.c (revision 0) +++ testsuite/gcc.dg/torture/pr33676.c (revision 0) @@ -0,0 +1,53 @@ +/* { dg-do run } */ +/* { dg-options "-march=i586 -fomit-frame-pointer" { target { { i?86-*-* x86_64-*-* } && ilp32 } } } */ + +// Small testcase, compile with "-march=i586 -O0 -fomit-frame-pointer": + +__attribute__((noreturn,noinline)) void abrt (const char *fi, const char *fu) +{ + __builtin_abort (); +} + +__attribute__((noinline)) int f (int k) +{ + return k; +} + +__attribute__((noinline)) int g (int t, int k) +{ + int b; + + switch (t) +{ +case 0: + abrt (__FILE__, __FUNCTION__); + +case 1: + b = f (k); + break; + +case 2: + b = f (k); + break; + +case 3: + b = f (k); + break; + +case 4: + b = f (k); + break; + +default: + abrt (__FILE__, __FUNCTION__); +} + + return b; +} + +int main (void) +{ + if (g (3, 1337) != 1337) + abrt (__FILE__, __FUNCTION__); + return 0; +} Index: global.c === --- global.c(revision 129224) +++ global.c(working copy) @@ -1575,6 +1575,37 @@ build_insn_chain (void) } } } + + /* FIXME!! The following code is a disaster. Reload needs to see the +labels and jump tables that are just hanging out in between +the basic blocks. See pr33676. */ + + insn = BB_HEAD (bb); + + /* Skip over the barriers and cruft. */ + while (insn && (BARRIER_P (insn) || NOTE_P (insn) || BLOCK_FOR_INSN (insn) == bb)) + insn = PREV_INSN (insn); + + /* Look for labels and jump tables. */ + while (insn) + { + if (!NOTE_P (insn) && !BARRIER_P (insn)) + { + if (BLOCK_FOR_INSN (insn)) + break; + + c = new_insn_chain (); + c->next = next; + next = c; + *p = c; + p = &c->prev; + + c->insn = insn; + c->block = bb->index; + bitmap_copy (&c->live_throughout, live_relevant_regs); + } + insn = PREV_INSN (insn); + } } for (i = 0; i < (unsigned int)max_regno; i++) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676
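The backward walk described in the patch above can be modelled outside of GCC. What follows is only a sketch under stated assumptions: the `toy_*` names, the `block == -1` convention for "not in any basic block", and the output array are hypothetical stand-ins, not GCC's rtl structures or its `insn_chain` list.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for insn kinds; these are NOT GCC's real rtl codes. */
enum toy_kind { TOY_INSN, TOY_NOTE, TOY_BARRIER, TOY_LABEL, TOY_JUMP_TABLE };

struct toy_insn {
    enum toy_kind kind;
    int block;              /* owning basic block index, or -1 if between blocks */
    struct toy_insn *prev;  /* previous insn in the function's insn stream */
};

/* Starting at the head of block BB, walk backwards and collect the insns
   that sit between basic blocks (for example labels and jump tables).
   Returns the number of insns stored in OUT, which holds at most MAX.  */
static size_t
toy_collect_between_blocks(struct toy_insn *head, int bb,
                           struct toy_insn **out, size_t max)
{
    struct toy_insn *insn = head;
    size_t n = 0;

    /* Skip notes, barriers, and anything that still belongs to BB. */
    while (insn && (insn->kind == TOY_BARRIER || insn->kind == TOY_NOTE
                    || insn->block == bb))
        insn = insn->prev;

    /* Collect until we reach an insn owned by some other block. */
    while (insn) {
        if (insn->kind != TOY_NOTE && insn->kind != TOY_BARRIER) {
            if (insn->block >= 0)
                break;
            assert(n < max);
            out[n++] = insn;
        }
        insn = insn->prev;
    }
    return n;
}
```

The two loops mirror the shape of the patch: the first steps out of the block being scanned, the second picks up whatever is left stranded before the previous block ends.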
[Bug middle-end/33662] [4.3 Regression] Wrong register allocation on SH
--- Comment #7 from zadeck at naturalbridge dot com 2007-10-11 21:50 --- kazumoto, there was a set of miscommunications associated with the final patch for pr33669. hj had checked in an earlier version of the patch and that testcase and i asked him to revert it because there were issues with it. He only reverted the code and left the testcase in. You tested against version 129192 and i checked in the corrected patch as 129193. given that, pr33669.c should have failed. seongbae has verified that pr33669.c and the testcase here no longer fail on the current trunk with sh-elf. I am going to assume that this is closed unless you find some other issue. Sorry for the mess up. Kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33662
[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer
--- Comment #20 from zadeck at naturalbridge dot com 2007-10-11 22:35 --- Subject: Re: libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer zadeck at naturalbridge dot com wrote: > --- Comment #17 from zadeck at naturalbridge dot com 2007-10-11 16:21 > --- > Subject: Re: libgfortran bootstrap failure: selected_int_kind.f90:22: > Segmentation fault, wrong code with -fomit-frame-pointer > > > > When I rewrote this code to use backward scanning rather than forwards > scanning, I converted it to properly use the cfg, since it is generally > considered outmoded to just scan the insns. > > However, the reload_insn_chain actually needs the insns that appear > between basic blocks, in particular the labels in front of branch > tables. I added code here to check for insns that may be in front of a > basic block after scanning that block. > > There are a lot of ways that I could have done this, for instance, I > could have just written in terms of the PREV_INSN as the old code was. > I think that in doing it the way that i have done it, it is obvious what > needs to be done if someone really does get rid of the branch tables > between the blocks. > > This has been bootstrapped and regression tested on x86-{64,32} ppc-32, > and ia-64. However it is not clear to me how many platforms use this > kind of table branch. The bug appears to only be on the -march=i586, so > the reviewers may wish to comment on my choice of dg options on the test. > > 2007-10-11 Kenneth Zadeck <[EMAIL PROTECTED]> PR middle-end/33676 * global.c (build_insn_chain): Include insn that occur between basic blocks. 2007-10-11 Kenneth Zadeck <[EMAIL PROTECTED]> PR middle-end/33676 * gcc.dg/torture/pr33676.c: New. bootstrapped and regression tested on x86-32 x86-64, ppc-32 and ia-64. committed as revision 129244. 
Kenny Index: ChangeLog === --- ChangeLog (revision 129243) +++ ChangeLog (working copy) @@ -1,3 +1,9 @@ +2007-10-11 Kenneth Zadeck <[EMAIL PROTECTED]> + + PR middle-end/33676 + * global.c (build_insn_chain): Include insn that occur between + basic blocks. + 2007-10-11 Tom Tromey <[EMAIL PROTECTED]> * gengtype-yacc.y: Delete. Index: testsuite/gcc.dg/torture/pr33676.c === --- testsuite/gcc.dg/torture/pr33676.c (revision 0) +++ testsuite/gcc.dg/torture/pr33676.c (revision 0) @@ -0,0 +1,51 @@ +/* { dg-do run } */ +/* { dg-options "-march=i586 -fomit-frame-pointer" { target { { i?86-*-* x86_64-*-* } && ilp32 } } } */ + +__attribute__((noreturn,noinline)) void abrt (const char *fi, const char *fu) +{ + __builtin_abort (); +} + +__attribute__((noinline)) int f (int k) +{ + return k; +} + +__attribute__((noinline)) int g (int t, int k) +{ + int b; + + switch (t) +{ +case 0: + abrt (__FILE__, __FUNCTION__); + +case 1: + b = f (k); + break; + +case 2: + b = f (k); + break; + +case 3: + b = f (k); + break; + +case 4: + b = f (k); + break; + +default: + abrt (__FILE__, __FUNCTION__); +} + + return b; +} + +int main (void) +{ + if (g (3, 1337) != 1337) + abrt (__FILE__, __FUNCTION__); + return 0; +} Index: testsuite/ChangeLog === --- testsuite/ChangeLog (revision 129243) +++ testsuite/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2007-10-11 Kenneth Zadeck <[EMAIL PROTECTED]> + + PR middle-end/33676 + * gcc.dg/torture/pr33676.c: New. + 2007-10-11 Paolo Carlini <[EMAIL PROTECTED]> PR c++/31441 Index: global.c === --- global.c(revision 129243) +++ global.c(working copy) @@ -1575,6 +1575,41 @@ build_insn_chain (void) } } } + + /* FIXME!! The following code is a disaster. Reload needs to see the +labels and jump tables that are just hanging out in between +the basic blocks. See pr33676. */ + + insn = BB_HEAD (bb); + + /* Skip over the barriers and cruft. 
*/ + while (insn && (BARRIER_P (insn) || NOTE_P (insn) || BLOCK_FOR_INSN (insn) == bb)) + insn = PREV_INSN (insn); + + /* While we add anything except barriers and notes, the focus is +to get the labels and jump tables into the +reload_insn_chain. */ + while (insn) + { + if (!NOTE_P (insn) && !BARRIER_P (insn)) + { + if (BLOCK_FOR_INSN (insn)) + break; + + c = new_insn_chain (); + c->next = next; + next = c; +
[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer
--- Comment #24 from zadeck at naturalbridge dot com 2007-10-12 14:38 --- Subject: Re: libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer Eric Botcazou wrote: >> 2007-10-11 Kenneth Zadeck <[EMAIL PROTECTED]> >> >> PR middle-end/33676 >> * global.c (build_insn_chain): Include insn that occur between >> basic blocks. >> > > Who approved this patch? > > >> However, the reload_insn_chain actually needs the insns that appear >> between basic blocks, in particular the labels in front of branch >> tables. I added code here to check for insns that may be in front of a >> basic block after scanning that block. >> >> There are a lot of ways that I could have done this, for instance, I >> could have just written in terms of the PREV_INSN as the old code was. >> I think that in doing it the way that i have done it, it is obvious what >> needs to be done if someone really does get rid of the branch tables >> between the blocks. >> > > Sure, but the code in build_insn_chain is now more convoluted than in the > original version (and twice as big). And, please, fix the formatting. > > it was approved by seongbae, a register allocation reviewer. The reason that it is longer is that it is more precise. The code to properly handle subregs, as well as properly dealing with registers live thru insns, accounts for most of the expansion over the old code. formatting fixes committed as revision 129262. kenny Index: global.c === --- global.c(revision 129260) +++ global.c(working copy) @@ -1358,6 +1358,8 @@ mark_elimination (int from, int to) } } +/* Print chain C to FILE. */ + static void print_insn_chain (FILE *file, struct insn_chain *c) { @@ -1366,6 +1368,9 @@ print_insn_chain (FILE *file, struct ins bitmap_print (file, &c->dead_or_set, "dead_or_set: ", "\n"); } + +/* Print all reload_insn_chains to FILE. 
*/ + static void print_insn_chains (FILE *file) { @@ -1373,8 +1378,11 @@ print_insn_chains (FILE *file) for (c = reload_insn_chain; c ; c = c->next) print_insn_chain (file, c); } + + /* Walk the insns of the current function and build reload_insn_chain, and record register life information. */ + static void build_insn_chain (void) { @@ -1450,7 +1458,7 @@ build_insn_chain (void) { if (regno < FIRST_PSEUDO_REGISTER) { - if (! fixed_regs[regno]) + if (!fixed_regs[regno]) bitmap_set_bit (&c->dead_or_set, regno); } else if (reg_renumber[regno] >= 0) @@ -1461,16 +1469,20 @@ build_insn_chain (void) && (!DF_REF_FLAGS_IS_SET (def, DF_REF_CONDITIONAL))) { rtx reg = DF_REF_REG (def); + /* We can model subregs, but not if they are wrapped in ZERO_EXTRACTS. */ if (GET_CODE (reg) == SUBREG && !DF_REF_FLAGS_IS_SET (def, DF_REF_EXTRACT)) { unsigned int start = SUBREG_BYTE (reg); - unsigned int last = start + GET_MODE_SIZE (GET_MODE (reg)); + unsigned int last = start + + GET_MODE_SIZE (GET_MODE (reg)); - ra_init_live_subregs (bitmap_bit_p (live_relevant_regs, regno), - live_subregs, live_subregs_used, + ra_init_live_subregs (bitmap_bit_p (live_relevant_regs, + regno), + live_subregs, + live_subregs_used, regno, reg); /* Ignore the paradoxical bits. */ if ((int)last > live_subregs_used[regno]) @@ -1535,7 +1547,7 @@ build_insn_chain (void) { if (regno < FIRST_PSEUDO_REGISTER) { - if (! fixed_regs[regno]) + if (!fixed_regs[regno]) bitmap_set_bit (&c->dead_or_set, regno); } else if (reg_renumber[regno] >= 0) @@ -1548,10 +1560,13 @@ build_insn_chain (void) && !DF_REF_FLAGS_IS_SET (use, DF_REF_EXTRACT))
[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer
--- Comment #22 from zadeck at naturalbridge dot com 2007-10-12 11:59 --- it seems to be clean now. -- zadeck at naturalbridge dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676
[Bug rtl-optimization/33644] [4.3 Regression] ICE in local_cprop_pass with -ftrapv for crafty
--- Comment #2 from zadeck at naturalbridge dot com 2007-10-15 13:11 --- Subject: Re: [4.3 Regression] ICE in local_cprop_pass with -ftrapv for crafty > On Sun, Oct 14, 2007 at 12:29:44PM -0400, Kenneth Zadeck wrote: > > > I have not looked at this bug. I am happy to if you want. I am sure > > > that it will be trivial to modify the pass that moved/created the insn > > > in the middle of the libcall to inherit the LIB_CALL_ID from the > > > previous insn. > > That is not desirable, if anything in this case the insn should be > added before the whole libcall sequence rather than before the insn > that actually needs it. Otherwise, useless insns added to the libcall > sequences wouldn't be ever DCEd. > While it might be easy to modify the instantiate_virtual_regs, there > are dozens of other passes that do similar things, so at least for 4.3 it is > highly unlikely they will be all modified. > Jakub Jakub, i will fix this by moving the insn before the libcall. It may take me a day or so because i am under the weather. But i will do it soon. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33644
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #3 from zadeck at naturalbridge dot com 2007-10-17 11:25 --- Subject: Re: valgrind error with -O2 for linux kernel code bergner at gcc dot gnu dot org wrote: > --- Comment #2 from bergner at gcc dot gnu dot org 2007-10-17 04:46 > --- > Although valgrind is correct that we are doing an uninitialized read, the code > is actually working as designed and is correct. > > When we allocate a sparseset, we only need to set set->members to 0 to clear > the set. The arrays set->sparse[] and set->dense[] are not and do not need to > be initialized. To test a value "n" for membership in "set", it needs to > satisfy two properties: > >set->sparse[n] < set->members > > and > >set->dense[set->sparse[n]] == n > > The uninitialized read occurs when "n" is not (and never has been) a member of > "set". In this case, set->sparse[n] will be uninitialized and could be any > value. If set->sparse[n] happens to be >= set->members, we luckily (but > correctly) return that "n" is not a member of the set. If the uninitialized > set->sparse[n] is < set->members, we continue on to verify that > set->dense[set->sparse[n]] == n. This test cannot be true since all > set->dense[i] entries for i < set->members are initialized and "n" is not a > member of the set. So yes we do some uninitialized accesses to the sparse > array, but that's ok. It's also a benefit of sparseset, given that we don't > have to memset/clear the whole sparseset data structure before using it, so > it's fast. > > > peter, i think that this is clever and nice but it is not going to fly. people will be running valgrind and this will hit them over and over again. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
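The two-part membership test quoted above is the classic sparse-set trick. Here is a minimal sketch of it, assuming a layout with `sparse`, `dense` and `members` fields as the comment describes; the `toy_sparseset_*` names and function signatures are hypothetical and are not GCC's actual sparseset API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_sparseset {
    unsigned *sparse;   /* sparse[n] = position of n in dense[], if n is a member */
    unsigned *dense;    /* dense[0..members-1] = the members, in insertion order */
    unsigned members;   /* number of elements currently in the set */
    unsigned size;      /* valid element values are 0..size-1 */
};

/* Allocate a set; note that sparse[] and dense[] are deliberately
   left uninitialized, only MEMBERS is set.  */
static struct toy_sparseset *toy_sparseset_alloc(unsigned size)
{
    assert(size > 0);
    struct toy_sparseset *s = malloc(sizeof *s);
    if (!s)
        return NULL;
    s->sparse = malloc(size * sizeof *s->sparse);
    s->dense = malloc(size * sizeof *s->dense);
    if (!s->sparse || !s->dense) {
        free(s->sparse);
        free(s->dense);
        free(s);
        return NULL;
    }
    s->members = 0;
    s->size = size;
    return s;
}

/* Membership needs both checks: the index must be in range, and the
   dense slot it points at must point back at N.  */
static bool toy_sparseset_contains(const struct toy_sparseset *s, unsigned n)
{
    assert(n < s->size);
    unsigned idx = s->sparse[n];   /* may be garbage if n was never added */
    return idx < s->members && s->dense[idx] == n;
}

static void toy_sparseset_add(struct toy_sparseset *s, unsigned n)
{
    if (toy_sparseset_contains(s, n))
        return;
    s->sparse[n] = s->members;
    s->dense[s->members] = n;
    s->members++;
}

/* Clearing is O(1): stale entries in sparse[] and dense[] are harmless. */
static void toy_sparseset_clear(struct toy_sparseset *s)
{
    s->members = 0;
}

static void toy_sparseset_free(struct toy_sparseset *s)
{
    free(s->sparse);
    free(s->dense);
    free(s);
}
```

The read of `s->sparse[n]` in `toy_sparseset_contains` is exactly the access a memory checker would flag on a freshly allocated set, which is the trade-off being argued about in this comment.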
[Bug middle-end/37448] [4.3 Regression] gcc 4.3.1 cannot compile big function
--- Comment #23 from zadeck at naturalbridge dot com 2008-09-27 12:44 --- I do not believe honza. My measurements at -O0 on x86-64 are about 15 refs per insn. This is based on the following stats. (These can be reproduced using a patch that i am about to submit). ;;total ref usage 8428419{7601408d,827011u,0e} in 570685{406804 regular + 163881 call} insns. This yields about 15 refs per insn. While this number is large, it is reasonable considering that slightly less than 30% of the insns are call instructions. Call instructions have a lot of clobbers. It is possible that some mechanism could be devised to share these refs, but this will mess up things like building chains so it is certainly not something that is going to be easy to do. The df patch that i have submitted makes modest progress on reducing the size of df-refs. Hopefully bonzini will finish reviewing this soon. I should also point out that honza's alloc pool stats were completely bogus. I have submitted a patch that fixes the way stats are accumulated for alloc-pools. We can account for all of the df-refs and the peak usage according to the new alloc-pool stats is very close to the number used by the largest function. Once those patches are installed, I will consider this bugzilla resolved with respect to the df issues. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37448
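The two figures quoted in the comment above can be checked directly from the stats line. This assumes the `d`, `u` and `e` suffixes there count defs, uses and a third ref category, which is a reading of the output and not something the comment states.

```latex
\begin{align*}
\text{total refs}  &= 7601408 + 827011 + 0 = 8428419 \\
\text{total insns} &= 406804 + 163881 = 570685 \\
\frac{\text{total refs}}{\text{total insns}} &= \frac{8428419}{570685} \approx 14.8 \\
\frac{\text{call insns}}{\text{total insns}} &= \frac{163881}{570685} \approx 0.287
\end{align*}
```

So "about 15 refs per insn" and "slightly less than 30%" call instructions both follow from the reported counts.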
[Bug target/37808] [4.4 Regression]: Revision 141067 breaks Linux/x86
--- Comment #3 from zadeck at naturalbridge dot com 2008-10-12 04:56 --- Created an attachment (id=16485) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16485&action=view) possible patch to fix the problem I am pretty sure that this fixes it, but i need to do more testing. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37808
[Bug target/37808] [4.4 Regression]: Revision 141067 breaks Linux/x86
--- Comment #8 from zadeck at naturalbridge dot com 2008-10-12 21:13 --- Subject: Re: [4.4 Regression]: Revision 141067 breaks Linux/x86 andreast at gcc dot gnu dot org wrote: > --- Comment #7 from andreast at gcc dot gnu dot org 2008-10-12 20:31 > --- > I see a failure on sparc-solaris8/10 too. Configury of stage2 fails. > Applying the mentioned patch cures compilation. > My sparc config is with multilib. 32-bit/64-bit. > > > The problem is that the bb is no longer kept in the df-ref, and is instead extracted from the insn. This particular problem was caused by insns being deleted in a pass that defers rescanning but that also changes register numbers. The fix checks to make sure the insn is still in a basic block before trying to mark the block as being dirty. 2008-10-12 Kenneth Zadeck <[EMAIL PROTECTED]> PR middle-end/37808 * df-scan.c (df_ref_change_reg_with_loc_1): Added test to make sure that ref has valid bb. Tested by me on both x86-32 and x86-64. Also tested by andreast on sparc-solaris and by keating. OK to commit? kenny Index: df-scan.c === --- df-scan.c (revision 141071) +++ df-scan.c (working copy) @@ -1980,7 +1980,8 @@ df_ref_change_reg_with_loc_1 (struct df_ DF_REF_PREV_REG (new_df->reg_chain) = the_ref; new_df->reg_chain = the_ref; new_df->n_refs++; - df_set_bb_dirty (DF_REF_BB (the_ref)); + if (DF_REF_BB (the_ref)) + df_set_bb_dirty (DF_REF_BB (the_ref)); /* Need to sort the record again that the ref was in because the regno is a sorting key. First, find the right -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37808
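The failure mode described above (a ref whose block is derived from its insn, and an insn that has already been deleted) can be shown with a small model. This is a sketch under assumptions: the `toy_*` types and the convention that a deleted insn has a NULL block pointer are hypothetical and only imitate the situation; they are not the df-scan.c data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_block {
    int index;
    bool dirty;              /* needs rescanning */
};

struct toy_rtx_insn {
    struct toy_block *block; /* NULL once the insn has been deleted */
};

struct toy_ref {
    struct toy_rtx_insn *insn;
    unsigned regno;
};

/* The block is no longer stored in the ref; it is looked up through
   the insn, so it can legitimately be NULL.  */
static struct toy_block *toy_ref_block(const struct toy_ref *ref)
{
    return ref->insn ? ref->insn->block : NULL;
}

/* Renumber the register of REF and mark its block dirty, but only if
   the ref still has a block to mark.  */
static void toy_ref_change_reg(struct toy_ref *ref, unsigned new_regno)
{
    ref->regno = new_regno;

    struct toy_block *bb = toy_ref_block(ref);
    if (bb)
        bb->dirty = true;
}
```

With rescanning deferred, refs to deleted insns stay around until the rescan happens, which is why the guard is needed here and not an assertion.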
[Bug target/37808] [4.4 Regression]: Revision 141067 breaks Linux/x86
--- Comment #11 from zadeck at naturalbridge dot com 2008-10-12 21:19 --- Subject: Re: [4.4 Regression]: Revision 141067 breaks Linux/x86 Richard Guenther wrote: > On Sun, Oct 12, 2008 at 11:12 PM, Kenneth Zadeck > <[EMAIL PROTECTED]> wrote: > >> andreast at gcc dot gnu dot org wrote: >> >>> --- Comment #7 from andreast at gcc dot gnu dot org 2008-10-12 20:31 >>> --- >>> I see a failure on sparc-solaris8/10 too. Configury of stage2 fails. >>> Applying the mentioned patch cures compilation. >>> My sparc config is with multilib. 32-bit/64-bit. >>> >>> >>> >>> >> The problem is that the bb is no longer kept in the df-ref, and is >> instead extracted from the insn. >> This particular problem was caused by insns being deleted in a pass that >> defers rescanning but that also changes register numbers. The fix >> checks to make sure the insn is still in a basic block before trying to >> mark the block as being dirty. >> > > Ok. I think it's odd that we keep refs to deleted insns - but that's probably > because of the deferred re-scan, right? > > Thanks, > Richard. > yes, this only is because of the deferred rescan. committed as revision 14178. kenny > >> 2008-10-12 Kenneth Zadeck <[EMAIL PROTECTED]> >> >>PR middle-end/37808 >>* df-scan.c (df_ref_change_reg_with_loc_1): Added test to make >>sure that ref has valid bb. >> >> Tested by me on both x86-32 and x86-64. Also tested by andreast on >> sparc-solaris and by keating. >> >> OK to commit? >> >> kenny >> >> -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37808
[Bug target/37808] [4.4 Regression]: Revision 141067 breaks Linux/x86
--- Comment #12 from zadeck at naturalbridge dot com 2008-10-12 21:22 --- fixed with the above patch. -- zadeck at naturalbridge dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37808
[Bug target/37378] [4.4 Regression] Revision 139827 causes Divide_X
--- Comment #20 from zadeck at naturalbridge dot com 2008-10-24 18:44 --- Subject: Re: [4.4 Regression] Revision 139827 causes Divide_X jakub at gcc dot gnu dot org wrote: > --- Comment #19 from jakub at gcc dot gnu dot org 2008-10-24 18:09 > --- > This hunk in df-scan.c confuses me: > > /* These registers are live everywhere. */ > if (!reload_completed) > { > #ifdef EH_USES > /* The ia-64, the only machine that uses this, does not define these > until after reload. */ > for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) > if (EH_USES (i)) > { > bitmap_set_bit (entry_block_defs, i); > } > #endif > > Indeed, ia64 is the only port that defines EH_USES ever to non-zero value, and > only if reload_completed. So this is a nice nop, but supposedly just changing > the guarding condition to if (reload_completed) could fix this up. > > > I cannot justify the existing code, either by looking at it or what used to be in flow.c. I do agree that the existing code is a noop and should be either fixed or deleted. I must admit, that i think that the proper solution is going to have to be one that adds the eh_uses onto the uses of instructions that can trap because the block of code referenced here only affects the forwards dataflow problem. However, this problem is really not so much about dataflow analysis as it is about the meaning of these target specific macros. Whatever the solution is, i think that it should be at least blessed by iant, or jim wilson rather than just a dataflow maintainer. I would also point out that dealing with the EH_USES is not going to make any difference to the "similar" problem that happens on the cris. Kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37378
[Bug middle-end/34400] [4.3 regression] bad interaction between DF and SJLJ exceptions
--- Comment #50 from zadeck at naturalbridge dot com 2008-01-17 21:06 --- Subject: [4.3 regression] bad interaction between DF and SJLJ exceptions This is the second of three patches to fix 34400. This patch also makes some progress on 26854 but more work is required that is not going to be done in 4.3 to fix the problems here. This patch uses the output of the df_lr problem to make the df_live problem converge faster. This not only saves time but also space since the size of the df_live bitmaps never grows and the space of our bitmaps is proportional to the number of 1 bits. This has been tested on several platforms and along with the patch just committed cuts the time on the 34400 problems significantly. I believe that this patch also has some modest improvement on bootstrap time, i.e regular programs. The change to df_live_reset is a slightly related latent bug fix. Ok to commit? Kenny 2008-01-17 Kenneth Zadeck <[EMAIL PROTECTED]> Steven Bosscher <[EMAIL PROTECTED]> PR rtl-optimization/26854 PR rtl-optimization/34400 * df-problems.c (df_live_scratch): New scratch bitmap. (df_live_alloc): Allocate df_live_scratch when doing df_live. (df_live_reset): Clear the proper bitmaps. (df_live_bb_local_compute): Only process the artificial defs once since the order is not important. (df_live_init): Init the df_live sets only with the variables found live by df_lr. (df_live_transfer_function): Use the df_lr sets to prune the df_live sets as they are being computed. (df_live_free): Free df_live_scratch. Index: df-problems.c === --- df-problems.c (revision 130752) +++ df-problems.c (working copy) @@ -1323,6 +1323,8 @@ struct df_live_problem_data bitmap *out; }; +/* Scratch var used by transfer functions. */ +static bitmap df_live_scratch; /* Set basic block info. 
*/ @@ -1366,6 +1368,8 @@ df_live_alloc (bitmap all_blocks ATTRIBU if (!df_live->block_pool) df_live->block_pool = create_alloc_pool ("df_live_block pool", sizeof (struct df_live_bb_info), 100); + if (!df_live_scratch) +df_live_scratch = BITMAP_ALLOC (NULL); df_grow_bb_info (df_live); @@ -1401,7 +1405,7 @@ df_live_reset (bitmap all_blocks) EXECUTE_IF_SET_IN_BITMAP (all_blocks, 0, bb_index, bi) { - struct df_lr_bb_info *bb_info = df_lr_get_bb_info (bb_index); + struct df_live_bb_info *bb_info = df_live_get_bb_info (bb_index); gcc_assert (bb_info); bitmap_clear (bb_info->in); bitmap_clear (bb_info->out); @@ -1420,13 +1424,6 @@ df_live_bb_local_compute (unsigned int b struct df_ref **def_rec; int luid = 0; - for (def_rec = df_get_artificial_defs (bb_index); *def_rec; def_rec++) -{ - struct df_ref *def = *def_rec; - if (DF_REF_FLAGS (def) & DF_REF_AT_TOP) - bitmap_set_bit (bb_info->gen, DF_REF_REGNO (def)); -} - FOR_BB_INSNS (bb, insn) { unsigned int uid = INSN_UID (insn); @@ -1467,8 +1464,7 @@ df_live_bb_local_compute (unsigned int b for (def_rec = df_get_artificial_defs (bb_index); *def_rec; def_rec++) { struct df_ref *def = *def_rec; - if ((DF_REF_FLAGS (def) & DF_REF_AT_TOP) == 0) - bitmap_set_bit (bb_info->gen, DF_REF_REGNO (def)); + bitmap_set_bit (bb_info->gen, DF_REF_REGNO (def)); } } @@ -1504,8 +1500,11 @@ df_live_init (bitmap all_blocks) EXECUTE_IF_SET_IN_BITMAP (all_blocks, 0, bb_index, bi) { struct df_live_bb_info *bb_info = df_live_get_bb_info (bb_index); + struct df_lr_bb_info *bb_lr_info = df_lr_get_bb_info (bb_index); - bitmap_copy (bb_info->out, bb_info->gen); + /* No register may reach a location where it is not used. Thus +we trim the rr result to the places where it is used. 
*/ + bitmap_and (bb_info->out, bb_info->gen, bb_lr_info->out); bitmap_clear (bb_info->in); } } @@ -1531,12 +1530,18 @@ static bool df_live_transfer_function (int bb_index) { struct df_live_bb_info *bb_info = df_live_get_bb_info (bb_index); + struct df_lr_bb_info *bb_lr_info = df_lr_get_bb_info (bb_index); bitmap in = bb_info->in; bitmap out = bb_info->out; bitmap gen = bb_info->gen; bitmap kill = bb_info->kill; - return bitmap_ior_and_compl (out, gen, in, kill); + bitmap_and (df_live_scratch, gen, bb_lr_info->out); + /* No register may reach a location where it is not used. Thus + we trim the rr result to the places where it is used. */ + bitmap_and_into (in, bb_lr_info->in); + + return bitmap_ior_and_compl (out, df_live_scratch, in, kill); } @@ -1591,6 +1596,9 @@ df_live_free (void) free_alloc_pool (df_live->block_pool); df_live->block_info_size = 0; free (df_live->block_info); + +
[Bug tree-optimization/26854] Inordinate compile times on large routines
--- Comment #50 from zadeck at naturalbridge dot com 2008-01-17 21:20 --- Subject: Mark, Am I allowed to set the target milestone for a patch or is that your job? 26854 is not going to get fixed for 4.3. We made a lot of progress on it with the patches to 34400, but the largest remaining problem is the space that the current representation of def-use and use-def chains requires. I should be able to almost cut this in half if we move to something like a vec rather than a linked list. But this is a big patch and i do not want to start this until stage I. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
[Bug tree-optimization/26854] Inordinate compile times on large routines
--- Comment #52 from zadeck at naturalbridge dot com 2008-01-17 21:46 --- Subject: Re: Inordinate compile times on large routines rguenth at gcc dot gnu dot org wrote: > --- Comment #51 from rguenth at gcc dot gnu dot org 2008-01-17 21:43 > --- > As this isn't even marked as a regression, you can fix it whenever you like ;) > > Only regressions have a target milestone before they are actually fixed, > though. > > > just between you and me this is most likely a regression, on the other hand, i think that people who write functions this large should be thrown into a pit. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
[Bug middle-end/34400] [4.3 regression] bad interaction between DF and SJLJ exceptions
--- Comment #53 from zadeck at naturalbridge dot com 2008-01-17 22:37 --- Subject: Re: [4.3 regression] bad interaction between DF and SJLJ exceptions seongbae dot park at gmail dot com wrote: > --- Comment #52 from seongbae dot park at gmail dot com 2008-01-17 22:31 > --- > Subject: Re: [4.3 regression] bad interaction between DF and SJLJ exceptions > > I just talked to Kenny on the phone, and my suggestion is wrong > since it changes the return value - doing my naive suggestion > would lead to infinite loop, as the transfer function will almost always > return true, even when the out set didn't change. > Can you add a comment to that effect there ? > Also please add a comment above df_live_scratch definition > that this is an optimization to reduce memory allocation overhead > for the scratch. > > will do. > Can you explain why the hunk in df_live_bb_local_compute() is correct ? > As this seems to change what DF_REF_AT_TOP means for artificial defs... > > In the old code we went thru the artificial defs twice, once for the defs at the bottom and once for the defs at the top. This is a waste of time. we only need to go thru them once since, for this problem, the processing is order independent. > Seongbae > > On Jan 17, 2008 1:31 PM, Seongbae Park (박성배, 朴成培) > <[EMAIL PROTECTED]> wrote: > >> In df_live_transfer_function: >> >> Doesn't look like we need df_live_scratch - can't we do: >> >> bitmap_and (out, gen, bb_lr_info->out); >> bitmap_and_into (in, bb_lr_info->in); >> return bitmap_ior_and_compl_into (out, in, kill); >> >> ? >> >> Seongbae >> >> >> On Jan 17, 2008 1:05 PM, Kenneth Zadeck <[EMAIL PROTECTED]> wrote: >> >>> This is the second of three patches to fix 34400. This patch also makes >>> some progress on 26854 but more work is required that is not going to be >>> done in 4.3 to fix the problems here. >>> >>> This patch uses the output of the df_lr problem to make the df_live >>> problem converge faster. 
>>> This not only saves time but also space since the size of the df_live >>> bitmaps never grows and the space of our bitmaps is proportional to the >>> number of 1 bits. >>> >>> This has been tested on several platforms and along with the patch just >>> committed cuts the time on the 34400 problems significantly. I believe >>> that this patch also has some modest improvement on bootstrap time, i.e >>> regular programs. >>> >>> The change to df_live_reset is a slightly related latent bug fix. >>> >>> Ok to commit? >>> >>> Kenny >>> >>> >>> 2008-01-17 Kenneth Zadeck <[EMAIL PROTECTED]> >>> Steven Bosscher <[EMAIL PROTECTED]> >>> >>> PR rtl-optimization/26854 >>> PR rtl-optimization/34400 >>> * df-problems.c (df_live_scratch): New scratch bitmap. >>> (df_live_alloc): Allocate df_live_scratch when doing df_live. >>> (df_live_reset): Clear the proper bitmaps. >>> (df_live_bb_local_compute): Only process the artificial defs once >>> since the order is not important. >>> (df_live_init): Init the df_live sets only with the variables >>> found live by df_lr. >>> (df_live_transfer_function): Use the df_lr sets to prune the >>> df_live sets as they are being computed. >>> (df_live_free): Free df_live_scratch. >>> >>> >>> >> >> -- >> #pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com"; >> >> > > > -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400
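The point about the return value in the exchange above is easy to miss. The sketch below models the sets as 64-bit masks to show why the scratch version reports convergence and the scratch-free version does not. It assumes, as the discussion implies, that the transfer function must return true only when the block's out set actually changed; the `toy_*` names and the struct are hypothetical and stand in for GCC's bitmap operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_live_bb {
    uint64_t in, out, gen, kill;   /* df_live-style sets for one block */
    uint64_t lr_in, lr_out;        /* df_lr-style liveness sets for the same block */
};

/* With a scratch value: build the new OUT on the side, then compare it
   against the previous OUT, so "changed" means what the solver expects.  */
static bool toy_transfer(struct toy_live_bb *bb)
{
    uint64_t scratch = bb->gen & bb->lr_out;   /* prune GEN by liveness */
    bb->in &= bb->lr_in;                       /* prune IN by liveness */

    uint64_t new_out = scratch | (bb->in & ~bb->kill);
    bool changed = new_out != bb->out;
    bb->out = new_out;
    return changed;
}

/* Without a scratch value: OUT is overwritten first, so the comparison is
   against the pruned GEN and not against the previous OUT.  */
static bool toy_transfer_naive(struct toy_live_bb *bb)
{
    bb->out = bb->gen & bb->lr_out;
    bb->in &= bb->lr_in;

    uint64_t before = bb->out;
    bb->out |= bb->in & ~bb->kill;
    return bb->out != before;
}
```

In this model the naive version returns true whenever anything flows in from IN that is not already in the pruned GEN, even at a fixed point, so an iterative solver driven by it would keep re-queuing the block.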
[Bug tree-optimization/26854] Inordinate compile times on large routines
--- Comment #55 from zadeck at naturalbridge dot com 2008-01-17 22:57 --- Subject: Re: Inordinate compile times on large routines lucier at math dot purdue dot edu wrote: > --- Comment #54 from lucier at math dot purdue dot edu 2008-01-17 22:39 > --- > Created an attachment (id=14963) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14963&action=view) > memory details for 131610 > > This is the detailed memory usage for the compiler > > euler-5% /pkgs/gcc-mainline/bin/gcc -v > Using built-in specs. > Target: x86_64-unknown-linux-gnu > Configured with: ../../mainline/configure --prefix=/pkgs/gcc-mainline > --enable-languages=c --enable-checking=release --with-gmp=/pkgs/gmp-4.2.2 > --with-mpfr=/pkgs/gmp-4.2.2 --enable-gather-detailed-mem-stats > Thread model: posix > gcc version 4.3.0 20080117 (experimental) [trunk revision 131610] (GCC) > > The maximum memory I observed in top was 10.2 GB. > > Kenny, I can't tell whether your patch from > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400#c50 > > has been committed; will that improve the situation, too? > > > it could, but it is not the big issue here, the big issue is the size of the def use chains. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
[Bug tree-optimization/26854] Inordinate compile times on large routines
--- Comment #57 from zadeck at naturalbridge dot com 2008-01-18 02:10 --- Subject: Re: Inordinate compile times on large routines lucier at math dot purdue dot edu wrote: > --- Comment #56 from lucier at math dot purdue dot edu 2008-01-18 01:38 > --- > gcc is now 5-6 times faster than it was nearly two years ago when this was > first reported; many changes have made significant improvements in cpu time. > > But Steven Bosscher's patch from December still improved things more on this > test case. > > In particular, on 12/20/2007, without the patch, CPU time from > > http://gcc.gnu.org/bugzilla/attachment.cgi?id=14799 > > was > > TOTAL : 300.21 19.16 319.52 778432 kB > > After Steven Bosscher's patch > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400#c28 > > it was > > TOTAL : 210.97 15.80 226.88 778432 kB > > and today it's > > TOTAL : 281.08 18.03 299.41 776514 kB > > Would it still be a good idea to apply Steven's patch? > > > the plan is to apply all of the patches, they each deal with a different problem and the improvement should be additive. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
[Bug middle-end/34400] [4.3 regression] bad interaction between DF and SJLJ exceptions
--- Comment #56 from zadeck at naturalbridge dot com 2008-01-19 13:09 --- Subject: Re: [4.3 regression] bad interaction between DF and SJLJ exceptions Let me commit the patch first. Sent from my iPod On Jan 19, 2008, at 4:41 AM, "steven at gcc dot gnu dot org" <[EMAIL PROTECTED] > wrote: > > > --- Comment #55 from steven at gcc dot gnu dot org 2008-01-19 > 09:41 --- > IMHO we can close this one now as fixed. Objections to that, anyone? > > > -- > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400 > > --- You are receiving this mail because: --- > You are on the CC list for the bug, or are watching someone who is. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400
[Bug middle-end/34874] New: struct reorg valgrind failure
FAIL: gcc.dg/struct/wo_prof_malloc_size_var.c (internal compiler error)
FAIL: gcc.dg/struct/wo_prof_malloc_size_var.c (test for excess errors)

valgrind --tool=memcheck --db-attach=yes --error-limit=no /home/zadeck/gbB2/gcc/cc1 -fpreprocessed wo_prof_malloc_size_var.i -quiet -dumpbase wo_prof_malloc_size_var.i -mtune=generic -auxbase wo_prof_malloc_size_var -O3 -version -fipa-struct-reorg -fdump-ipa-all -fwhole-program -fipa-type-escape -fno-show-column -o wo_prof_malloc_size_var.s

==27272== Invalid read of size 8
==27272==    at 0xEACB12: htab_traverse_noresize (hashtab.c:747)
==27272==    by 0xEACBA5: htab_traverse (hashtab.c:765)
==27272==    by 0xBB65D2: check_cond_exprs (ipa-struct-reorg.c:3547)
==27272==    by 0xBB6FD3: collect_data_accesses (ipa-struct-reorg.c:3830)
==27272==    by 0xBB7281: reorg_structs (ipa-struct-reorg.c:3944)
==27272==    by 0xBB72A0: reorg_structs_drive (ipa-struct-reorg.c:3967)
==27272==    by 0x7B7476: execute_one_pass (passes.c:1118)
==27272==    by 0x7B7656: execute_ipa_pass_list (passes.c:1187)
==27272==    by 0xB98102: ipa_passes (cgraphunit.c:1340)
==27272==    by 0xB98215: cgraph_optimize (cgraphunit.c:1387)
==27272==    by 0x431628: c_write_global_declarations (c-decl.c:8079)
==27272==    by 0x87CAB6: compile_file (toplev.c:1055)
==27272==  Address 0x587A700 is 16 bytes inside a block of size 104 free'd
==27272==    at 0x4C2191B: free (in /usr/lib64/valgrind/amd64-linux/vgpreload_memcheck.so)
==27272==    by 0xEAC5D4: htab_expand (hashtab.c:550)
==27272==    by 0xEACB94: htab_traverse (hashtab.c:763)
==27272==    by 0xBB014D: free_accesses (ipa-struct-reorg.c:1674)
==27272==    by 0xBB1192: free_data_struct (ipa-struct-reorg.c:2111)
==27272==    by 0xBB20C8: remove_structure (ipa-struct-reorg.c:2353)
==27272==    by 0xBB5268: safe_cond_expr_check (ipa-struct-reorg.c:3090)
==27272==    by 0xEACB34: htab_traverse_noresize (hashtab.c:750)
==27272==    by 0xEACBA5: htab_traverse (hashtab.c:765)
==27272==    by 0xBB65D2: check_cond_exprs (ipa-struct-reorg.c:3547)
==27272==    by 0xBB6FD3: collect_data_accesses (ipa-struct-reorg.c:3830)
==27272==    by 0xBB7281: reorg_structs (ipa-struct-reorg.c:3944)

--
Summary: struct reorg valgrind failure
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: zadeck at naturalbridge dot com
GCC host triplet: x86-64-linux-gni
GCC target triplet: x86_64-*-*

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34874
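One reading of the trace above is that a callback running inside an outer htab_traverse_noresize reached a nested htab_traverse, whose htab_expand freed the slot array the outer walk was still reading. The C sketch below illustrates that general hazard and one common way around it: collect the entries to delete during the walk, then remove them after the walk has finished. It is not libiberty's hashtab and not the fix applied for this PR; `struct table`, `table_remove`, `remove_odd_values` and the other names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature table, not libiberty's htab.  The slot array is
   reallocated on every removal, standing in for a resize.  */
struct table {
  int *slots;
  size_t count;
};

typedef void (*visit_fn)(int value, void *data);

void table_init(struct table *t, const int *values, size_t n) {
  t->slots = malloc((n ? n : 1) * sizeof *t->slots);
  assert(t->slots != NULL);
  for (size_t i = 0; i < n; i++)
    t->slots[i] = values[i];
  t->count = n;
}

void table_free(struct table *t) {
  free(t->slots);
  t->slots = NULL;
  t->count = 0;
}

/* Drop every slot equal to VALUE by copying the survivors into a fresh
   array and freeing the old one.  Calling this from a visit_fn would
   leave table_traverse reading freed memory.  */
void table_remove(struct table *t, int value) {
  int *fresh = malloc((t->count ? t->count : 1) * sizeof *fresh);
  size_t kept = 0;
  assert(fresh != NULL);
  for (size_t i = 0; i < t->count; i++)
    if (t->slots[i] != value)
      fresh[kept++] = t->slots[i];
  free(t->slots);
  t->slots = fresh;
  t->count = kept;
}

/* Walk the slots in order; the callback must not modify the table.  */
void table_traverse(const struct table *t, visit_fn fn, void *data) {
  for (size_t i = 0; i < t->count; i++)
    fn(t->slots[i], data);
}

/* Values scheduled for removal once the walk is over.  */
struct pending {
  int *values;
  size_t count;
};

/* Callback: only records odd values, never touches the table.  */
void collect_odd(int value, void *data) {
  struct pending *p = data;
  if (value % 2 != 0)
    p->values[p->count++] = value;
}

/* Deferred removal: first walk and collect, then mutate.  */
void remove_odd_values(struct table *t) {
  struct pending p;
  p.values = malloc((t->count ? t->count : 1) * sizeof *p.values);
  p.count = 0;
  assert(p.values != NULL);
  table_traverse(t, collect_odd, &p);
  for (size_t i = 0; i < p.count; i++)
    table_remove(t, p.values[i]);
  free(p.values);
}
```

For a table initialized with 1, 2, 3, 4, 5, calling `remove_odd_values` leaves the slots 2 and 4, and no slot array is freed while a walk is in progress. Whether a deferral like this fits ipa-struct-reorg.c is a separate question that the trace alone does not answer.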
[Bug middle-end/34874] struct reorg valgrind failure
--- Comment #1 from zadeck at naturalbridge dot com 2008-01-19 20:13 --- I am about to commit the last fix for PR 34400, and at least on my machine, this patch will make this failure disappear from the test suite. however, the bug is still there if you look with valgrind. pinskia, i am sorry, i am about to leave for the day. I want to close 34400 and i did not get to do a dup check to see if this was already there. kenny. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34874
[Bug middle-end/34874] struct reorg valgrind failure
--- Comment #2 from zadeck at naturalbridge dot com 2008-01-20 01:43 --- actually the commit for 34400 does not seem to affect this bug. but the bug does have that nice heisenbug quality to it. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34874
[Bug middle-end/34400] [4.3 regression] bad interaction between DF and SJLJ exceptions
--- Comment #58 from zadeck at naturalbridge dot com 2008-01-20 02:13 --- The three patches that have been committed seem to have brought this under control. -- zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400
[Bug middle-end/34884] [4.3 Regression] gfortran.dg/array_constructor_9.f90
--- Comment #6 from zadeck at naturalbridge dot com 2008-01-20 13:53 --- I need more info to reproduce this bug. I bootstrapped and regression tested on x86_64-unknown-linux-gnu with suse 10.3 and using --enable-languages=c,c++,fortran --disable-multilib before committing the patch and got === gfortran Summary === # of expected passes 23538 # of expected failures 4 # of unsupported tests 18 i am not doubting that the failure is related to this patch. Given all of the rest of the info, it smells like this patch is responsible, but i do not get the failure on my config. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34884
[Bug middle-end/34884] [4.3 Regression] gfortran.dg/array_constructor_9.f90
--- Comment #8 from zadeck at naturalbridge dot com 2008-01-20 15:24 --- Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 dominiq at lps dot ens dot fr wrote: > --- Comment #7 from dominiq at lps dot ens dot fr 2008-01-20 14:39 > --- > >> I need a more info to reproduce this bug. >> > > I have tried to give all the info I have been able to gather on my own. My > config is: > > Configured with: ../gcc-4.3-work/configure --prefix=/opt/gcc/gcc4.3w > --mandir=/opt/gcc/gcc4.3w/share/man --infodir=/opt/gcc/gcc4.3w/share/info > --build=i686-apple-darwin9 --enable-languages=c,c++,fortran,objc,obj-c++,java > --with-gmp=/sw --with-libiconv-prefix=/usr --with-system-zlib > --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib > > As far as I can tell, the bug appears after the tree optimization, but at this > point I don't know what I should dump. Having looked at the test-suite > results, > the problem appears on 32 bit x86 platforms. From > > >> --disable-multilib >> > > I infer that you cannot try with -m32, isn't it? > > > the first comment of the bug says linux/intel64. your config string looks like you are building on a mac "darwin" box. That would be the difference. I build on a real linux box that cannot run darwin. could you please send me two tar files: one tar file from the release without this patch, containing the test case compiled with the "-da" option, and one from the release with the patch, compiled with the same option. This option will produce a large number of dump files and from those dumps i will fix the bug. Thanks in advance. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34884
[Bug middle-end/34884] [4.3 Regression] gfortran.dg/array_constructor_9.f90
--- Comment #10 from zadeck at naturalbridge dot com 2008-01-20 15:39 --- Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 dominiq at lps dot ens dot fr wrote: > --- Comment #9 from dominiq at lps dot ens dot fr 2008-01-20 15:30 > --- > >> you are building on a mac "darwin" box >> > > Yes indeed, but the bug is also present for i686-pc-linux-gnu, see for > instance: > > http://gcc.gnu.org/ml/gcc-testresults/2008-01/msg00914.html > > > i will build this on a 32 bit box. that is my problem. sorry, thanks. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34884
[Bug middle-end/34884] [4.3 Regression] gfortran.dg/array_constructor_9.f90
--- Comment #12 from zadeck at naturalbridge dot com 2008-01-20 15:52 --- Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90 dominiq at lps dot ens dot fr wrote: > --- Comment #11 from dominiq at lps dot ens dot fr 2008-01-20 15:47 > --- > I have put the results of the compilation with -da with the patch at > > http://www.lps.ens.fr/~dominiq/gcc/tmp_fresh.tar.bz2 > > All the files will be in a directory tmp_fresh. Do you still need the same > without the patch? It will take some time to reverse the patch and to do the > rebuilding. > > > let me try to build a 32 bit compiler. that appears to be the problem. it will be easier if i can get it on my machine. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34884
[Bug tree-optimization/34472] [4.3 Regression] gcc.dg/struct/wo_prof_malloc_size_var.c doesn't work
--- Comment #9 from zadeck at naturalbridge dot com 2008-01-20 15:29 --- olga, even if the test case does not normally ice on your system, you should be able to see the bug if you run the test with valgrind. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34472
[Bug tree-optimization/34472] [4.3 Regression] gcc.dg/struct/wo_prof_malloc_size_var.c doesn't work
--- Comment #11 from zadeck at naturalbridge dot com 2008-01-20 16:34 --- Subject: Re: [4.3 Regression] gcc.dg/struct/wo_prof_malloc_size_var.c doesn't work olga at gcc dot gnu dot org wrote: > --- Comment #10 from olga at gcc dot gnu dot org 2008-01-20 16:28 --- > (In reply to comment #9) > >> olga, >> even if the test case does not normally ice on your system, you be able to >> see >> the bug if you run the test with valgrind. >> > > Kenny, > > Thank you a lot for information. I was not aware about valgrid. Does it help > also with segfaults? > > The patch in comment #4 solves the ICE, but on some system it generates the > execution failures (PR 34534 and PR 34483). Can you see what it makes on your > system? > > Thank you a lot, > Olga > > > > generally it does. it is not perfect. it is very good at finding faults with malloc'ed memory. did you actually try valgrind with this bug? if you need some help, hop on irc and i will talk you thru it. kenny -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34472
[Bug fortran/34884] [4.3 Regression] gfortran.dg/array_constructor_9.f90
--- Comment #14 from zadeck at naturalbridge dot com 2008-01-20 18:30 --- confirmed on my machine, i will have my best people work on it. kenny -- zadeck at naturalbridge dot com changed:

           What    |Removed                          |Added
         AssignedTo|unassigned at gcc dot gnu dot org|zadeck at naturalbridge dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34884