[Bug rtl-optimization/26855] [4.2 Regression] ICE in add_deps_for_def with -fmodulo-sched -maltivec

2006-04-26 Thread zadeck at naturalbridge dot com


--- Comment #3 from zadeck at naturalbridge dot com  2006-04-26 14:50 
---
Yes Janis, it is quite likely that that patch will fix this problem.
This looks like exactly the same failure as the other bug that this patch
was submitted for.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26855



[Bug rtl-optimization/26855] [4.2 Regression] ICE in add_deps_for_def with -fmodulo-sched -maltivec

2006-04-26 Thread zadeck at naturalbridge dot com


--- Comment #5 from zadeck at naturalbridge dot com  2006-04-26 20:51 
---
Subject: Re:  [4.2 Regression] ICE in add_deps_for_def
 with -fmodulo-sched -maltivec

janis at gcc dot gnu dot org wrote:
> --- Comment #4 from janis at gcc dot gnu dot org  2006-04-26 17:48 ---
> The patch doesn't apply cleanly now, which isn't surprising, but it also
> doesn't apply to mainline sources as of 2006-03-28, when it was submitted. 
> What date or revision can I start with to try this patch, without porting it
> forward to today's sources?
>
>
>   
I will redo the patch tomorrow on the way home from california.
kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26855



[Bug rtl-optimization/26855] [4.2 Regression] ICE in add_deps_for_def with -fmodulo-sched -maltivec

2006-04-28 Thread zadeck at naturalbridge dot com


--- Comment #8 from zadeck at naturalbridge dot com  2006-04-29 04:23 
---
Subject: Re:  [4.2 Regression] ICE in add_deps_for_def
 with -fmodulo-sched -maltivec

janis at gcc dot gnu dot org wrote:
> --- Comment #7 from janis at gcc dot gnu dot org  2006-04-29 00:02 ---
> I tried the patch at http://gcc.gnu.org/ml/gcc-patches/2006-04/msg01061.html 
> on
> powerpc64-linux and used the resulting compilers with "-O2 -fmodulo-sched" to
> build SPEC CPU2000 and run with the small, test input, and also built and ran
> the special version of HMMER (which uses AltiVec macros) with those same
> options.  I still get lots of failures: some tests ICE in the build, others 
> get
> runtime failures.  I got failures with different tests when I moved the
> compiler install tree to a different system, or when I ran it as a different
> user.  There's something very flaky going on.
>
>
>   
Janis,

I have not tried SPEC on my powerpc system.  Could you send me a
SPEC config file, any scripts you use, and your special version of
HMMER?  I can build this over the weekend on my G5.

Kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26855



[Bug rtl-optimization/20972] Register allocator/reload uses auto-inc register in non-addressing operand

2006-06-16 Thread zadeck at naturalbridge dot com


--- Comment #10 from zadeck at naturalbridge dot com  2006-06-17 04:14 
---
(In reply to comment #9)
> The bug is in flow.c and fixed by the new df.c rewrite of dataflow.  Ken and I
> tripped over the same problem.
> 

While I thought this earlier, I do not believe it now.  There is a problem in
flow in that it fails to generate reg-dead notes for dead index regs in auto-inc
insns, but this is a separate problem.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20972



[Bug rtl-optimization/36365] [4.3/4.4 Regression] Hang in df_analyze

2008-12-06 Thread zadeck at naturalbridge dot com


--- Comment #13 from zadeck at naturalbridge dot com  2008-12-06 22:33 
---
Subject: Re:  [4.3/4.4 Regression] Hang in df_analyze

steven at gcc dot gnu dot org wrote:
> --- Comment #12 from steven at gcc dot gnu dot org  2008-12-06 21:25 
> ---
> Patch here:
> http://gcc.gnu.org/ml/gcc-patches/2008-12/msg00409.html
>
> Approval mail never made it through, but you can see traces of it here:
> http://gcc.gnu.org/ml/gcc-patches/2008-12/msg00410.html
>
>
>   
just to make it official, approved.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36365



[Bug rtl-optimization/38532] New: dse broken for frame related stores

2008-12-15 Thread zadeck at naturalbridge dot com
Some time ago, rth changed reload so that calls to
dse_record_singleton_alias_set and dse_invalidate_singleton_alias_set were
removed.   I believe that this was an accidental side effect of fixing some
other bug.  These calls identified these addresses as being "special", in the
sense that the values died at the end of the function.   

I had discussed this with vlad, because his method of allocating stack slots
was different than the old ra's and he was supposed to add these calls into
where ira allocates stack slots. 

As of this morning's trunk, this has not been done. So I am adding this
bugzilla as a reminder.  

Kenny


-- 
           Summary: dse broken for frame related stores
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Keywords: ra
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: vmakarov at gcc dot gnu dot org
        ReportedBy: zadeck at naturalbridge dot com
 GCC build triplet: all
  GCC host triplet: all
GCC target triplet: all


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38532



[Bug target/30271] -mstrict-align can an store extra for struct agrument passing

2008-12-15 Thread zadeck at naturalbridge dot com


--- Comment #9 from zadeck at naturalbridge dot com  2008-12-15 15:32 
---
Andrew, 

What is your point here?

1) Is it your claim that anything that is arg_pointer_rtx related would
automatically qualify as being safe enough to remove dead stores to?

or

2) Is it your claim that we could generalize the game proposed in comment #7
to cover the arg_pointer_rtx's also?

Kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30271



[Bug c++/37922] [4.3/4.4 Regression] code generation error

2008-12-16 Thread zadeck at naturalbridge dot com


--- Comment #16 from zadeck at naturalbridge dot com  2008-12-16 18:43 
---
and how would you ask that question in a machine independent way?

I am going to find the shift sequence and, if it sets or clobbers any
currently live hard reg, I will reject the sequence.
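
The rejection test described above boils down to a set intersection.  A minimal
sketch, assuming hard registers are modeled as bits of a 32-bit mask;
hard_reg_mask, hard_reg_bit and shift_sequence_ok are hypothetical names used
for illustration only, not GCC's own API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per hard register, at most 32 registers.  */
typedef uint32_t hard_reg_mask;

/* Return the mask bit that stands for hard register REGNO.  */
static hard_reg_mask
hard_reg_bit (unsigned regno)
{
  assert (regno < 32);
  return (hard_reg_mask) 1 << regno;
}

/* Return true if a candidate sequence may be used: it must not set or
   clobber any hard register that is live at the point of use.  */
static bool
shift_sequence_ok (hard_reg_mask set_or_clobbered, hard_reg_mask live)
{
  return (set_or_clobbered & live) == 0;
}
```

The check itself is machine independent; only the computation of the two
masks would need target knowledge.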

I am working on a fix now.
kenny


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                           |Added
----------------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu dot org |zadeck at naturalbridge dot com
             Status|NEW                               |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37922



[Bug c++/37922] [4.3/4.4 Regression] code generation error

2008-12-18 Thread zadeck at naturalbridge dot com


--- Comment #20 from zadeck at naturalbridge dot com  2008-12-18 14:23 
---
committed patch to fix this.


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                           |Added
----------------------------------------------------------------------------------
             Status|ASSIGNED                          |RESOLVED
         Resolution|                                  |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37922



[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806

2008-12-29 Thread zadeck at naturalbridge dot com


--- Comment #3 from zadeck at naturalbridge dot com  2008-12-29 23:40 
---
additional info.

gcc.c-torture/compile/930523-1.c

on x86-32.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805



[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806

2009-01-01 Thread zadeck at naturalbridge dot com


--- Comment #4 from zadeck at naturalbridge dot com  2009-01-02 00:38 
---
Subject: Re:  [ira] error in start_allocno_priorities,
 at ira-color.c:1806

2009-01-01  Kenneth Zadeck 

PR rtl-optimization/35805
* df-problems.c (df_lr_finalize): Add recursive call to resolve lr
problem if fast dce is able to remove any instructions.
* dce.c (dce_process_block): Fix dump message.

This patch fixes the problem.  The comment in the patch describes the
issue.  Since this was not really a failure, it would be hard to make
this issue into a testcase.

Ok to commit?

Bootstrapped and regression tested on x86*.

Kenny
Index: df-problems.c
===
--- df-problems.c   (revision 142954)
+++ df-problems.c   (working copy)
@@ -1001,22 +1001,32 @@ df_lr_transfer_function (int bb_index)
 /* Run the fast dce as a side effect of building LR.  */

 static void
-df_lr_finalize (bitmap all_blocks ATTRIBUTE_UNUSED)
+df_lr_finalize (bitmap all_blocks)
 {
   if (df->changeable_flags & DF_LR_RUN_DCE)
 {
   run_fast_df_dce ();
-  if (df_lr->problem_data && df_lr->solutions_dirty)
+
+  /* If dce deletes some instructions, we need to recompute the lr
+solution before proceeding further.  The problem is that fast
+dce is a pessimestic dataflow algorithm.  In the case where
+it deletes a statement S inside of a loop, the uses inside of
+S may not be deleted from the dataflow solution because they
+were carried around the loop.  While it is conservatively
+correct to leave these extra bits, the standards of df
+require that we maintain the best possible (least fixed
+point) solution.  The only way to do that is to redo the
+iteration from the beginning.  See PR35805 for an
+example.  */
+  if (df_lr->solutions_dirty)
{
- /* If we are here, then it is because we are both verifying
- the solution and the dce changed the function.  In that case
- the verification info built will be wrong.  So we leave the
- dirty flag true so that the verifier will skip the checking
- part and just clean up.*/
- df_lr->solutions_dirty = true;
+ df_clear_flags (DF_LR_RUN_DCE);
+ df_lr_alloc (all_blocks);
+ df_lr_local_compute (all_blocks);
+ df_worklist_dataflow (df_lr, all_blocks, df->postorder, df->n_blocks);
+ df_lr_finalize (all_blocks);
+ df_set_flags (DF_LR_RUN_DCE);
}
-  else
-   df_lr->solutions_dirty = false;
 }
   else
 df_lr->solutions_dirty = false;
Index: dce.c
===
--- dce.c   (revision 142954)
+++ dce.c   (working copy)
@@ -601,7 +601,7 @@ dce_process_block (basic_block bb, bool

   if (dump_file)
 {
-  fprintf (dump_file, "processing block %d live out = ", bb->index);
+  fprintf (dump_file, "processing block %d lr out = ", bb->index);
   df_print_regset (dump_file, DF_LR_OUT (bb));
 }



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805



[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806

2009-01-02 Thread zadeck at naturalbridge dot com


--- Comment #6 from zadeck at naturalbridge dot com  2009-01-02 14:09 
---
Subject: Re:  [ira] error in start_allocno_priorities,
 at ira-color.c:1806

Paolo Bonzini wrote:
> Kenneth Zadeck wrote:
>   
>> 2009-01-01  Kenneth Zadeck 
>>
>> PR rtl-optimization/35805
>> * df-problems.c (df_lr_finalize): Add recursive call to resolve lr
>> problem if fast dce is able to remove any instructions.
>> * dce.c (dce_process_block): Fix dump message.
>>
>> This patch fixes the problem.  The comment in the patch describes the
>> issue.  Since this was not really a failure, it would be hard to make
>> this issue into a testcase.
>> 
>
> IIUC the bugzilla comment trail, this caused
> gcc.c-torture/compile/930523-1.c to fail with --enable-checking=df;
> that's already a testcase.
>
>   
>> Ok to commit?
>> 
>
> Hmmm... I am not sure I like this patch, for two reasons.
>
> 1) it might incur a compile-time penalty for the sake of verification,
> even with df checking disabled.  OTOH having possibly different code for
> checking and non-checking compilation is even worse.
>
>   
There is a compile time penalty here but it is not for the sake of
verification.   It is for the sake of getting the best answer going
forward, into the computation of live.

There was a deeper bug here.   The code that was removed which cleared
the solutions_dirty flag is really wrong, because it lets the
conservative solution go forward and the next call to df_analyze will
not even try to redo anything and thus improve the solution. That was
how vlad saw the extra bits even though he was calling df_analyze before
using the bits.

On the other hand, if you do not clear that flag in the old way, the
verifier will fail.

> 2) there are already provisions in dce.c to redo the analysis.  But they
> do not get to the least fixed point because they just rebuild the local
> bitmaps and iterate from the existing solution.  Instead of iterating
> "while (global_changed)", we could try doing only one iteration (it's a
> fast DCE after all, and the pessimistic dataflow makes me guess that
> subsequent DCE iterations won't find much?) and zap the solution there.
>  This has the advantage that we can skip the recomputation if
> global_changed is false.
>
> Did I miss anything?
>
>   
I think so.   The global changed flag allows it to delete the case:

loop:
  ... <- x  // This is dead.
 x- <- ...
go to loop

It just is not going to get rid of it if there is no kill of x inside
the loop.

Anyway, the loop inside the fast dce code will only cause one extra
iteration of the blocks, and because of that it is still pessimistic.
>   


> Paolo
>   


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805



[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806

2009-01-02 Thread zadeck at naturalbridge dot com


--- Comment #8 from zadeck at naturalbridge dot com  2009-01-02 15:20 
---
Subject: Re:  [ira] error in start_allocno_priorities,
 at ira-color.c:1806

Paolo Bonzini wrote:
>> I think so.   The global changed flag allows it to delete the case:
>>
>> loop:
>>  ... <- x  // This is dead.
>>  x- <- ...
>> go to loop
>>
>> it just is not going to get rid of it if there is no kill of x inside
>> the loop.
>> 
>
> I just don't think it's acceptable to load each and every "fast DCE"
> with the burden of a full df solution.  We need to find a way to limit
> this to the cases when it is needed, or at least not to be too
> conservative in ascertaining *when* it is needed.
>   
I am not doing it for each and every dce, only when the dce
actually deletes code.

If there was a faster way to determine if the solution was too
conservative than redoing it, you would have an effective incremental
dataflow analysis algorithm.   I strongly believe that such a technique
does not exist.
> Hence my first and foremost question is: does it happen that the
> solution is wrong and global_changed never became true?
>
>   
The example in the PR exhibits this property.  The problem is that
deleting the use of pseudo 69 does not cause bit 69 to ever get turned
off because it was live at the bottom of the loop (since it had been
propagated around the loop to start with).  Hence, when you get to the
top of the loop, there are no changes at all with respect to pseudo 69
and local_changed would not have been set.  (I do not know if it is
really true for the example that local_changed is not set, because the
deletion of the kill on the set side of the insn could have caused that
to happen.  But the point is that with respect to position 69, the use
in the deleted insn would not have caused local_changed to be set.)
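
The effect described above can be reproduced with a toy liveness solver.  This
is a minimal sketch under stated assumptions, not the df code: a three-block
CFG (entry, a one-block loop, exit), registers as bits in an unsigned word,
and hypothetical names (lr_problem, lr_iterate, loop_live_in_after_delete).
The loop body holds a single insn S, "y <- x", with y never used.  After S is
deleted, iterating from the old solution keeps x live around the back edge,
while restarting from the empty solution reaches the least fixed point.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { N_BLOCKS = 3 };            /* 0: entry, 1: loop body, 2: exit */
#define REG_X 1u
#define REG_Y 2u

struct lr_problem
{
  unsigned use[N_BLOCKS], def[N_BLOCKS];   /* local sets */
  unsigned in[N_BLOCKS], out[N_BLOCKS];    /* current solution */
};

/* succ[b][s] is true when there is an edge b -> s.
   Edges: 0 -> 1, 1 -> 1 (back edge), 1 -> 2.  */
static const bool succ[N_BLOCKS][N_BLOCKS] = {
  { false, true,  false },
  { false, true,  true  },
  { false, false, false },
};

/* Iterate the backward liveness equations to a fixed point, starting
   from whatever is currently stored in P->in and P->out.  */
static void
lr_iterate (struct lr_problem *p)
{
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (int b = N_BLOCKS - 1; b >= 0; b--)
        {
          unsigned out = 0;
          for (int s = 0; s < N_BLOCKS; s++)
            if (succ[b][s])
              out |= p->in[s];
          unsigned in = p->use[b] | (out & ~p->def[b]);
          if (in != p->in[b] || out != p->out[b])
            changed = true;
          p->in[b] = in;
          p->out[b] = out;
        }
    }
}

/* Solve, delete S from the loop body, and solve again.  Return the
   live-in set of the loop body after the second solve.  */
static unsigned
loop_live_in_after_delete (bool restart_from_scratch)
{
  struct lr_problem p;
  memset (&p, 0, sizeof p);
  p.def[0] = REG_X;               /* entry block sets x */
  p.use[1] = REG_X;               /* S uses x ...  */
  p.def[1] = REG_Y;               /* ... and sets y, which is never used */
  lr_iterate (&p);
  assert (p.in[1] == REG_X);      /* x is live around the loop */

  p.use[1] = 0;                   /* dce deletes S */
  p.def[1] = 0;
  if (restart_from_scratch)
    for (int b = 0; b < N_BLOCKS; b++)
      p.in[b] = p.out[b] = 0;
  lr_iterate (&p);
  return p.in[1];
}
```

In this model the stale bit feeds itself through the back edge, so no amount
of further iteration from the old solution will clear it.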

> If the answer is "definitely no", then an alternative, preferable
> patch would be to move the code you added to df-problems.c into dce.c,
> so that the full analysis (including rebuilding the bitmaps and
> iterating possibly many times) is not run if it was to yield the same
> answer that was before in the bitmaps.
>
> Paolo
>   


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805



[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806

2009-01-02 Thread zadeck at naturalbridge dot com


--- Comment #9 from zadeck at naturalbridge dot com  2009-01-02 15:34 
---
Subject: Re:  [ira] error in start_allocno_priorities,
 at ira-color.c:1806

On looking at the code, there is an issue with the first patch.   I
should have been clearing the solutions_dirty flag at the start of the
function.   However, I do not think that this is the issue that you are
complaining about.

What this corrects is the case where the solution was dirty before the
first call to df_analyze and dce finds nothing to delete.   In that
case, the code would have redone the lr solution for no reason. 
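
The difference between the two versions of the patch can be seen in a small
model of the control flow.  This is only a sketch: lr_model, model_fast_dce,
model_lr_finalize and count_resolves are hypothetical stand-ins, and the
"resolves" counter stands in for the alloc / local compute / worklist solve
sequence.  With clear_first false it behaves like the first patch, with
clear_first true like the corrected one.

```c
#include <assert.h>
#include <stdbool.h>

struct lr_model
{
  bool run_dce;          /* stands in for the DF_LR_RUN_DCE flag */
  bool solutions_dirty;  /* stands in for df_lr->solutions_dirty */
  int dead_insns;        /* insns the next dce run is able to delete */
  int resolves;          /* number of full LR recomputations */
};

/* Delete whatever is dead; mark the solution dirty if anything went.  */
static void
model_fast_dce (struct lr_model *m)
{
  if (m->dead_insns > 0)
    {
      m->dead_insns = 0;
      m->solutions_dirty = true;
    }
}

static void
model_lr_finalize (struct lr_model *m, bool clear_first)
{
  if (clear_first)
    m->solutions_dirty = false;
  if (m->run_dce)
    {
      model_fast_dce (m);
      if (m->solutions_dirty)
        {
          m->run_dce = false;     /* the nested call must not rerun dce */
          m->resolves++;          /* recompute the LR solution */
          model_lr_finalize (m, clear_first);
          m->run_dce = true;
        }
    }
  else
    m->solutions_dirty = false;
}

/* Run one finalize and report how many recomputations it triggered.  */
static int
count_resolves (bool clear_first, bool dirty_on_entry, int dead_insns)
{
  struct lr_model m = { true, dirty_on_entry, dead_insns, 0 };
  assert (dead_insns >= 0);
  model_lr_finalize (&m, clear_first);
  assert (!m.solutions_dirty && m.run_dce);
  return m.resolves;
}
```

In the model, a solution that was already dirty on entry costs one needless
recomputation unless the flag is cleared first; when dce really deletes
something, both variants recompute exactly once.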

I will test this patch, but we still need to resolve your issues with my
approach.

Kenny


Index: ChangeLog
===
--- ChangeLog   (revision 142954)
+++ ChangeLog   (working copy)
@@ -1,3 +1,10 @@
+2009-01-01  Kenneth Zadeck 
+
+   PR rtl-optimization/35805
+   * df-problems.c (df_lr_finalize): Add recursive call to resolve lr
+   problem if fast dce is able to remove any instructions.
+   * dce.c (dce_process_block): Fix dump message.
+   
 2008-12-29  Seongbae Park  

* tree-profile.c (tree_init_ic_make_global_vars): Make static
Index: df-problems.c
===
--- df-problems.c   (revision 142954)
+++ df-problems.c   (working copy)
@@ -1001,25 +1001,34 @@ df_lr_transfer_function (int bb_index)
 /* Run the fast dce as a side effect of building LR.  */

 static void
-df_lr_finalize (bitmap all_blocks ATTRIBUTE_UNUSED)
+df_lr_finalize (bitmap all_blocks)
 {
+  df_lr->solutions_dirty = false;
   if (df->changeable_flags & DF_LR_RUN_DCE)
 {
   run_fast_df_dce ();
-  if (df_lr->problem_data && df_lr->solutions_dirty)
+
+  /* If dce deletes some instructions, we need to recompute the lr
+solution before proceeding further.  The problem is that fast
+dce is a pessimestic dataflow algorithm.  In the case where
+it deletes a statement S inside of a loop, the uses inside of
+S may not be deleted from the dataflow solution because they
+were carried around the loop.  While it is conservatively
+correct to leave these extra bits, the standards of df
+   

[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806

2009-01-02 Thread zadeck at naturalbridge dot com


--- Comment #11 from zadeck at naturalbridge dot com  2009-01-02 18:21 
---
Subject: Re:  [ira] error in start_allocno_priorities,
 at ira-color.c:1806

Paolo Bonzini wrote:
>> I will test this patch, but we still need to resolve your issues with my
>> approach.
>> 
>
> The problem is that you're really doubling the cost of computing the
> live registers.  I know that previously it was wrong, but at this point
> there's no difference with the full-blown pass...  Despite the idea of
> DF_LR_RUN_DCE being that it was "free", now it would do the same work as
> a pass_fast_rtl_dce modulo some O(#bbs) work.
>   
You are being too pessimistic.  Most of the time, dce finds nothing.
If DCE finds nothing, then the second pass does not run.

I considered just fixing the verification part (not clearing the
solutions_dirty flag) and letting the next call to df_analyze clean
things up.  In this way it would be like every other pass and leave
things dirty until the next pass that needed the info. 

StevenB talked me out of this because he considered it wrong to have the
client pass get conservative info.  I agreed with him but I am willing
to change my mind if you really want to push your case.  

> At this point, if your patch costs say 0.3%, and removing all traces of
> DF_LR_RUN_DCE (instead scheduling a dozen more pass_fast_rtl_dce in
> passes.c) costs 0.5%, I'd rather see the latter, at least it's easier to
> look for opportunities to remove some useless DCE.
>
> If it wasn't for verification, we could just decide that DF_LR_RUN_DCE
> is only for passes that can tolerate a little inaccurate info...
>
>   
This was in fact my argument to stevenb.  The point is that the live
info which is run after it will generally hide this conservativeness.
On the other hand, we do have standards that we always use the best info.
As I pointed out on irc, the only reason that vlad noticed this at
all was that he uses the wrong sets in his code (and he was running at
O1 in this case).  At O2 and above he should be using the DF_LIVE sets.

Kenny

> Paolo
>   


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805



[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806

2009-01-02 Thread zadeck at naturalbridge dot com


--- Comment #14 from zadeck at naturalbridge dot com  2009-01-02 18:54 
---
Subject: Re:  [ira] error in start_allocno_priorities,
 at ira-color.c:1806

Steven Bosscher wrote:
> On Fri, Jan 2, 2009 at 7:37 PM, Paolo Bonzini  wrote:
>   
>>>> At this point, if your patch costs say 0.3%, and removing all traces of
>>>> DF_LR_RUN_DCE (instead scheduling a dozen more pass_fast_rtl_dce in
>>>> passes.c) costs 0.5%, I'd rather see the latter, at least it's easier to
>>>> look for opportunities to remove some useless DCE.
>>>> 
>> I'll try to do this for 4.5.
>> 
>
> It might be more worthwhile to just "fix" IRA to use DF_LIVE (which
> Vlad should have done in the first place). Then we wouldn't need
> Kenny's patch and DF_LR_RUN_DCE would still be essentially free.
>
> Gr.
> Steven
There is the issue of correctness vs rot.   I actually think that one of
the reasons that flow was so bad was that people went down this long
slippery slope of well it is good enough here ... and we really can get
away with it not being right here ... and after a while, all you have is
garbage.

The problem with this game is that it is not maintainable.   Those kinds
of decisions tend to get forgotten and lost as the personnel supporting
the compiler change.  Even if it is a fractional percentage slower,
the fact that you do not have to reason about it as the compiler evolves
is actually quite important.

Thus, I plan to both fix this bug and add another one for vlad to fix
the sets that he uses. 

Kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805



[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806

2009-01-02 Thread zadeck at naturalbridge dot com


--- Comment #16 from zadeck at naturalbridge dot com  2009-01-03 00:35 
---
Subject: Re:  [ira] error in start_allocno_priorities,
 at ira-color.c:1806

2009-01-02  Kenneth Zadeck 

PR rtl-optimization/35805
* df-problems.c (df_lr_finalize): Add recursive call to resolve lr
problem if fast dce is able to remove any instructions.
* dce.c (dce_process_block): Fix dump message.

Rebootstrapped and regression tested on x86*.

Committed as revision 143027.

Kenny
Index: ChangeLog
===
--- ChangeLog   (revision 142954)
+++ ChangeLog   (working copy)
@@ -1,3 +1,10 @@
+2009-01-01  Kenneth Zadeck 
+
+   PR rtl-optimization/35805
+   * df-problems.c (df_lr_finalize): Add recursive call to resolve lr
+   problem if fast dce is able to remove any instructions.
+   * dce.c (dce_process_block): Fix dump message.
+   
 2008-12-29  Seongbae Park  

* tree-profile.c (tree_init_ic_make_global_vars): Make static
Index: df-problems.c
===
--- df-problems.c   (revision 142954)
+++ df-problems.c   (working copy)
@@ -1001,25 +1001,34 @@ df_lr_transfer_function (int bb_index)
 /* Run the fast dce as a side effect of building LR.  */

 static void
-df_lr_finalize (bitmap all_blocks ATTRIBUTE_UNUSED)
+df_lr_finalize (bitmap all_blocks)
 {
+  df_lr->solutions_dirty = false;
   if (df->changeable_flags & DF_LR_RUN_DCE)
 {
   run_fast_df_dce ();
-  if (df_lr->problem_data && df_lr->solutions_dirty)
+
+  /* If dce deletes some instructions, we need to recompute the lr
+solution before proceeding further.  The problem is that fast
+dce is a pessimestic dataflow algorithm.  In the case where
+it deletes a statement S inside of a loop, the uses inside of
+S may not be deleted from the dataflow solution because they
+were carried around the loop.  While it is conservatively
+correct to leave these extra bits, the standards of df
+require that we maintain the best possible (least fixed
+point) solution.  The only way to do that is to redo the
+iteration from the beginning.  See PR35805 for an
+example.  */
+  if (df_lr->solutions_dirty)
{
- /* If we are here, then it is because we are both verifying
- the solution and the dce changed the function.  In that case
- the verification info built will be wrong.  So we leave the
- dirty flag true so that the verifier will skip the checking
- part and just clean up.*/
- df_lr->solutions_dirty = true;
+ df_clear_flags (DF_LR_RUN_DCE);
+ df_lr_alloc (all_blocks);
+ df_lr_local_compute (all_blocks);
+ df_worklist_dataflow (df_lr, all_blocks, df->postorder, df->n_blocks);
+ df_lr_finalize (all_blocks);
+ df_set_flags (DF_LR_RUN_DCE);
}
-  else
-   df_lr->solutions_dirty = false;
 }
-  else
-df_lr->solutions_dirty = false;
 }


Index: dce.c
===
--- dce.c   (revision 1429

[Bug rtl-optimization/38711] New: ira should not be using df-lr except at -O1.

2009-01-02 Thread zadeck at naturalbridge dot com
IRA should be using the DF-LIVE sets when they are available (typically at O2
and above); they are smaller than the DF-LR sets.

The proper sets can be conveniently accessed using the df_get_live_[in,out]
functions, which use DF-LIVE if it is available and fall back to DF-LR if it is
not.
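
The fallback behaviour described above can be sketched as follows.  This is a
simplified model, not the df implementation: bb_sets, df_model and the
model_get_live_in / model_get_live_out helpers are hypothetical, and register
sets are plain bit masks rather than bitmaps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-block sets in the model; each bit stands for one register.  */
struct bb_sets
{
  unsigned lr_in, lr_out;       /* DF-LR: always computed */
  unsigned live_in, live_out;   /* DF-LIVE: only valid when it was run */
};

/* Whether the DF-LIVE problem is available (typically O2 and above).  */
struct df_model
{
  bool have_live;
};

/* Prefer the DF-LIVE set and fall back to DF-LR when it is missing.  */
static unsigned
model_get_live_in (const struct df_model *df, const struct bb_sets *bb)
{
  assert (df != NULL && bb != NULL);
  return df->have_live ? bb->live_in : bb->lr_in;
}

static unsigned
model_get_live_out (const struct df_model *df, const struct bb_sets *bb)
{
  assert (df != NULL && bb != NULL);
  return df->have_live ? bb->live_out : bb->lr_out;
}
```

A client written against accessors of this shape does not need to know the
optimization level; it picks up the smaller sets whenever they exist.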


-- 
           Summary: ira should not be using df-lr except at -O1.
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: ra
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: vmakarov at gcc dot gnu dot org
        ReportedBy: zadeck at naturalbridge dot com
 GCC build triplet: all
  GCC host triplet: all
GCC target triplet: all


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38711



[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806

2009-01-02 Thread zadeck at naturalbridge dot com


--- Comment #17 from zadeck at naturalbridge dot com  2009-01-03 01:05 
---
patch committed to fix this.


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                           |Added
----------------------------------------------------------------------------------
             Status|UNCONFIRMED                       |RESOLVED
         Resolution|                                  |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805



[Bug rtl-optimization/38774] [4.4 Regression] ice in df_refs_verify, at df-scan.c:4307

2009-01-09 Thread zadeck at naturalbridge dot com


--- Comment #2 from zadeck at naturalbridge dot com  2009-01-09 12:41 
---
i will have my best people work on it.


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                           |Added
----------------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu dot org |zadeck at naturalbridge dot com
             Status|NEW                               |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38774



[Bug rtl-optimization/38774] [4.4 Regression] ice in df_refs_verify, at df-scan.c:4307

2009-01-09 Thread zadeck at naturalbridge dot com


--- Comment #3 from zadeck at naturalbridge dot com  2009-01-10 01:57 
---
Created an attachment (id=17068)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17068&action=view)
patch to cause df to verify after every pass

this is a combine bug.  The df verification fails after combine makes some
modification to the cc arg of insn 28 in the foo function that bypasses the
implicit and explicit calls to mark the insn as being changed. 

I am looking into trying to figure out what path thru combine is doing this.  
However, if some combine expert (or just someone who wants to prove that they
have better skill with the debugger than I do) wants to get there first, be my
guest.   I have attached a patch that improves some of the debugging and causes
df to verify after every pass.   This patch causes the failure to move from
being in ira, to the start of if conversion after combine.  



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38774



[Bug rtl-optimization/36365] [4.3 Regression] Hang in df_analyze

2009-01-24 Thread zadeck at naturalbridge dot com


--- Comment #17 from zadeck at naturalbridge dot com  2009-01-24 20:28 
---
Subject: Re:  [4.3 Regression] Hang in df_analyze

rguenth at gcc dot gnu dot org wrote:
> --- Comment #16 from rguenth at gcc dot gnu dot org  2009-01-24 10:20 
> ---
> GCC 4.3.3 is being released, adjusting target milestone.
>
>
>   
steven,

did you fix this and forget to close it?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36365



[Bug middle-end/35854] [4.3/4.4 Regression] life passes dump option still documented

2009-01-28 Thread zadeck at naturalbridge dot com


--- Comment #4 from zadeck at naturalbridge dot com  2009-01-28 16:03 
---
Subject: Re:  [4.3/4.4 Regression] life passes dump
 option still documented

rguenth at gcc dot gnu dot org wrote:
> --- Comment #3 from rguenth at gcc dot gnu dot org  2009-01-24 10:20 
> ---
> GCC 4.3.3 is being released, adjusting target milestone.
>
>
>   
This may be more of a change than is acceptable right now for 4.4.   If so
I will sit on this patch until 4.5 opens up.   The patch is basically a
complete rewrite of the part of invoke.texi that deals with dump options
for the rtl passes.   This section had badly rotted.

I started from a grep of the sources looking for "rtl_opt_pass" and
documented all of the passes that i found in mostly alphabetical
order.   Where the old version documented several passes together, I
kept that unless things had changed.   In total there were about a half
dozen passes that were no longer there and about a dozen new passes that
had not been documented.  

I did make some changes in the code, which is the reason that this may
not be acceptable to 4.4.  The changes are pretty harmless:  all of them
involve either removing the pass name or changing it.  

1) Pass names that contained dashes had the dashes changed to
underscores.   About half used dashes and half underscores and I went
with underscores to avoid a possible ambiguity with the options parsing.

2) I also removed the pass name from 6 passes that do not print anything
or dump the code. 

3) Files that contained multiple passes with names of the form xx,
xx2... were renamed xx1, xx2.
This latter change caused a test suite failure, which was fixed.
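
The renaming in (1) is mechanical.  A minimal sketch, assuming a
hypothetical helper called normalize_pass_name (it is not part of the
patch or of GCC), could look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: copy NAME into OUT,
   replacing every '-' with '_'.  OUT has room for OUT_SIZE bytes.
   Returns 0 on success, -1 if OUT is too small.  */
int
normalize_pass_name (const char *name, char *out, size_t out_size)
{
  size_t i;

  assert (name != NULL && out != NULL);
  for (i = 0; name[i] != '\0'; i++)
    {
      /* Keep one byte free for the terminating NUL.  */
      if (i + 1 >= out_size)
        return -1;
      out[i] = (name[i] == '-') ? '_' : name[i];
    }
  if (i >= out_size)
    return -1;
  out[i] = '\0';
  return 0;
}
```

Applied to "auto-inc-dec" this gives "auto_inc_dec", which is the form
used for the renamed passes.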

All of these changes are pretty minor.  The only possible failure these
can cause are in the test suite where dump files are scanned.  

I tested this on x86 and ppc both 32 and 64.  It is possible that there
are platform specific regression tests that scan for dump files that
were not caught on these four targets.

I also left in lreg and greg.  These are at the end and need to be
deleted along with those passes.

I have enclosed a copy of the new text.  The diff is unreadable. 

ok for 4.4 or should i wait for 4.5?

Kenny


2009-01-28  Kenneth Zadeck 

PR middle-end/35854
* doc/invoke.texi (rtl debug options): Complete rewrite.
* auto-inc-dec.c (pass_inc_dec): Rename pass from "auto-inc-dec"
to "auto_inc_dec".
* df-core.c (df_pass_initialize_opt, df_pass_initialize_no_opt,
df_pass_finish): Removed pass name.
* mode-switching.c (pass_mode_switching): Rename pass from
"mode-sw" to "mode_sw".
* except.c (pass_convert_to_eh_ranges): Rename pass from
"eh-ranges" to "eh_ranges".
* regclass.c (pass_regclass_init, pass_subregs_of_mode_init,
pass_subregs_of_mode_finish): Removed pass name.
* lower-subreg.c (pass_lower_subreg): Renamed pass from "subreg"
to "subreg1".


2009-01-28  Kenneth Zadeck 

PR middle-end/35854
* gcc.dg/lower-subreg-1.c: Renamed dump pass from "subreg" to
"subreg1"   


==
@item -d@var{letters}
@itemx -fdump-rtl-@var{pass}
@opindex d
Says to make debugging dumps during compilation at times specified by
@var{letters}.  This is used for debugging the RTL-based passes of the
compiler.  The file names for most of the dumps are made by appending a
pass number and a word to the @var{dumpname}.  @var{dumpname} is generated
from the name of the output file, if explicitly specified and it is not
an executable, otherwise it is the basename of the source file. These
switches may have different effects when @option{-E} is used for
preprocessing.

Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
@option{-d} option @var{letters}.  Here are the possible
letters for use in @var{pass} and @var{letters}, and their meanings:

@table @gcctabopt

@item -fdump-rtl-alignments
@opindex fdump-rtl-alignments
Dump after branch alignments have been computed.

@item -fdump-rtl-asmcons
@opindex fdump-rtl-asmcons
Dump after fixing rtl statements that have unsatisfied in/out constraints.

@item -fdump-rtl-auto_inc_dec
@opindex fdump-rtl-auto_inc_dec
Dump after auto-inc-dec discovery.  This pass is only run on
architectures that have auto inc or auto dec instructions.

@item -fdump-rtl-barriers
@opindex fdump-rtl-barriers
Dump after cleaning up the barrier instructions.

@item -fdump-rtl-bbpart
@opindex fdump-rtl-bbpart
Dump after partitioning hot and cold basic blocks.

@item -fdump-rtl-bbro
@opindex fdump-rtl-bbro
Dump after block reordering.

@item -fdump-rtl-btl1
@itemx -fdump-rtl-btl2
@opindex fdump-rtl-btl1
@opindex fdump-rtl-btl2
@option{-fdump-rtl-btl1} and @option{-fdump-rtl-btl2} enable dumping
after the two branch
target load optimization passes.

@item -fdump-rtl-bypass
@opindex fdump-rtl-bypass
Dump after jump bypassing and control flow optimizatio

[Bug middle-end/35854] [4.3/4.4 Regression] life passes dump option still documented

2009-01-29 Thread zadeck at naturalbridge dot com


--- Comment #7 from zadeck at naturalbridge dot com  2009-01-29 14:38 
---
Subject: Re:  [4.3/4.4 Regression] life passes dump
  option still documented

Richard Guenther wrote:
> On Wed, Jan 28, 2009 at 5:03 PM, Kenneth Zadeck
>  wrote:
>   
>> rguenth at gcc dot gnu dot org wrote:
>> 
>>> --- Comment #3 from rguenth at gcc dot gnu dot org  2009-01-24 10:20 
>>> ---
>>> GCC 4.3.3 is being released, adjusting target milestone.
>>>
>>>
>>>
>>>   
>> This may be more a change than is acceptable right now for 4.4.   If so
>> I will sit on this patch until 4.5 opens up.   The patch is basically a
>> complete rewrite of the part of invoke.texi that deals with dump options
>> for the rtl pass.   This section had badly rotted.
>>
>> I started from a grep of the sources looking for "rtl_opt_pass" and
>> documented all of the passes that i found in mostly alphabetical
>> order.   Where the old version documented several passes together, I
>> kept that unless things had changed.   In total there were about a half
>> dozen passes that were no longer there and about a dozen new passes that
>> had not been documented.
>>
>> I did make some changes in the code, which is the reason that this may
>> not be acceptable to 4.4.  The changes are pretty harmless:  all of them
>> involve either removing the pass name or changing it.
>>
>> 1) Pass names that contained dashes had the dashes changed to
>> underscores.   About half used dashes and half underscores and I went
>> with underscores to avoid a possible ambiguity with the options parsing.
>>
>> 2) I also removed the pass name from 6 passes that do not print anything
>> or dump the code.
>> 
>
> I think this change is against what was asked for in the past.  We want to have
> pass names for all passes.
>
>   
>> 3) Files that contained multiple passes with names of the form xx,
>> xx2... were renamed xx1, xx2.
>> This latter change caused a test suite failure, which was fixed.
>>
>> All of these changes are pretty minor.  The only possible failure these
>> can cause are in the test suite where dump files are scanned.
>>
>> I tested this on x86 and ppc both 32 and 64.  It is possible that there
>> are platform specific regression tests that scan for dump files that
>> were not caught on these four targets.
>>
>> I also left in lreg and greg.  These are at the end and need to be
>> deleted along with those passes.
>>
>> I have enclosed a copy of the new text.  The diff is unreadable.
>>
>> ok for 4.4 or should i wait for 4.5?
>> 
>
> This is ok for 4.4 if you remove the parts that remove pass names.  Please
> wait a day for comments from others.
>
> Thanks,
> Richard.
>
>   
>
I put those pass names back, but I documented them as producing no
output.   I also removed the lreg and greg part since the RA removal
patch has been approved. 

committed as revision 143756

kenny

2009-01-29  Kenneth Zadeck 

 PR middle-end/35854
* doc/invoke.texi (rtl debug options): Complete rewrite.
* auto-inc-dec.c (pass_inc_dec): Rename pass from "auto-inc-dec"
to "auto_inc_dec".
* mode-switching.c (pass_mode_switching): Rename pass from
"mode-sw" to "mode_sw".
* except.c (pass_convert_to_eh_ranges): Rename pass from
"eh-ranges" to "eh_ranges".
* lower-subreg.c (pass_lower_subreg): Renamed pass from "subreg"
to "subreg1".

2009-01-29  Kenneth Zadeck 

 PR middle-end/35854
* gcc.dg/lower-subreg-1.c: Renamed dump pass from "subreg" to
"subreg1"   






Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 143754)
+++ doc/invoke.texi (working copy)
@@ -4545,172 +4545,275 @@ preprocessing.

 Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
 @option{-d} option @var{letters}.  Here are the possible
-letters for use in @var{letters} and @var{pass}, and their meanings:
+letters for use in @var{pass} and @var{letters}, and their meanings:

 @table @gcctabopt
-@item -dA
-@opindex dA
-Annotate the assembler output with miscellaneous debugging information.
+
+@item -fdump-rtl-alignments
+@opindex fdump-rtl-alignments
+Dump after branch alignments have been computed.
+
+@item -fdump-rtl-asmcons
+@opindex fdump-rtl-asmcons
+Dump after fixing rtl statements that have unsatisfied in/out constraints.
+
+@item -fdump-rtl-auto_inc_dec
+@opindex fdump-rtl-auto_inc_dec
+Dump after auto-inc-dec discovery.  This pass is only run on
+architectures that hav

[Bug middle-end/35854] [4.3/4.4 Regression] life passes dump option still documented

2009-01-29 Thread zadeck at naturalbridge dot com


--- Comment #8 from zadeck at naturalbridge dot com  2009-01-29 14:42 
---
patch committed.
closed for 4.4.

richi said not to backport to 4.3 on irc.


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35854



[Bug rtl-optimization/25483] [4.2 Regression] ICE on valid code with -O2 -fmove-loop-invariants

2005-12-19 Thread zadeck at naturalbridge dot com


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |zadeck at naturalbridge dot
                   |dot org                     |com
             Status|NEW                         |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25483



[Bug rtl-optimization/25483] [4.2 Regression] ICE on valid code with -O2 -fmove-loop-invariants

2005-12-19 Thread zadeck at naturalbridge dot com


--- Comment #7 from zadeck at naturalbridge dot com  2005-12-19 19:43 
---
I had messed up the original change to df.c.


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25483



[Bug rtl-optimization/25799] [4.2 Regression] cc1 stalled with -O1 -fmodulo-sched

2006-01-16 Thread zadeck at naturalbridge dot com


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |zadeck at naturalbridge dot
                   |dot org                     |com
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2006-01-16 19:11:26
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25799



[Bug rtl-optimization/25799] [4.2 Regression] cc1 stalled with -O1 -fmodulo-sched

2006-01-19 Thread zadeck at naturalbridge dot com


--- Comment #8 from zadeck at naturalbridge dot com  2006-01-20 01:33 
---
2006-01-19  Kenneth Zadeck <[EMAIL PROTECTED]>

PR rtl-optimization/25799 
* df-problems.c (df_ru_confluence_n, df_rd_confluence_n):
Corrected confluence operator to remove bits from op2 before oring
with op1 rather than removing bits from op1.
(df_ru_transfer_function): Corrected test on wrong bitmap which
caused infinite loop.  Both of these problems were introduced in
the dataflow rewrite.
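
The first fix above is easiest to see on plain machine words.  The
following is only a sketch: the function names, the word-array
representation and the REMOVED parameter are assumptions made for
illustration, not the df-problems.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the corrected confluence step.  OP1 is the
   set being accumulated, OP2 is the set flowing in along the edge, and
   REMOVED holds the bits that must not cross the edge.  Each set is an
   array of NWORDS words.  */
void
confluence_fixed (unsigned long *op1, const unsigned long *op2,
                  const unsigned long *removed, size_t nwords)
{
  size_t i;

  assert (op1 != NULL && op2 != NULL && removed != NULL);
  for (i = 0; i < nwords; i++)
    /* Filter the incoming set first, then merge it.  */
    op1[i] |= op2[i] & ~removed[i];
}

/* The shape of the mistake: the filter is applied to OP1 instead, so
   bits already accumulated are dropped and OP2 is merged unfiltered.  */
void
confluence_buggy (unsigned long *op1, const unsigned long *op2,
                  const unsigned long *removed, size_t nwords)
{
  size_t i;

  assert (op1 != NULL && op2 != NULL && removed != NULL);
  for (i = 0; i < nwords; i++)
    op1[i] = (op1[i] & ~removed[i]) | op2[i];
}
```

With op1 = 0x1, op2 = 0x6 and removed = 0x4, the corrected form yields
0x3, while the buggy form lets bit 0x4 through.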


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25799



[Bug target/29083] useless clrlwi instruction produced for 16-bit bitfield

2006-09-14 Thread zadeck at naturalbridge dot com


--- Comment #2 from zadeck at naturalbridge dot com  2006-09-14 12:51 
---
Subject: Re:  useless clrlwi instruction produced for 16-bit
 bitfield

bonzini at gnu dot org wrote:
> --- Comment #1 from bonzini at gnu dot org  2006-09-14 12:07 ---
> The sole difference in the IR is
>
> ;; if ((int) node->x == a) goto ; else (void) 0;
> (insn 19 18 20 (set (reg:HI 125)
> (mem/s/j:HI (reg/v/f:SI 123 [ node ])
>   [2 .x+0 S2 A32])) -1 (nil)
>
> ;; if ((int) MEM[base: (short unsigned int *) node] == a) goto ; else
> (void) 0;
> (insn 20 19 21 (set (reg:HI 125)
> (mem/s:HI (reg/v/f:SI 123 [ node ])
>   [3 .x+0 S2 A8])) -1 (nil)
>  (nil))
>
> (COMPONENT_REF vs. TARGET_MEM_REF, the first produces A32 and the second A8)
>
>
> 
>
>
> It's actually flow's fault, because it fails to recognize a PRE_MODIFY 
> address,
> and things go downhill from there: life1 dump is
>
>16 r121:SI=r121:SI+0x1 |17 r122:SI=r122:SI+0x1
>18 r123:SI=r123:SI-0x4 |20 r126:HI=[--r124:SI]
>19 r125:HI=[r123:SI]   |   REG_INC: r124:SI
>20 r124:SI=zero_extend(r125:HI)|21 r125:SI=zero_extend(r126:HI)
>   REG_DEAD: r125:HI   |   REG_DEAD: r126:HI
>21 r126:CC=cmp(r124:SI,r121:SI)|22 r127:CC=cmp(r125:SI,r122:SI)
>   REG_DEAD: r124:SI   |   REG_DEAD: r125:SI
>22 pc={(r126:CC==0x0)?L13:pc}  |23 pc={(r127:CC==0x0)?L14:pc}
>   REG_DEAD: r126:CC   |   REG_DEAD: r127:CC
>   REG_BR_PROB: 0x22c4 REG_BR_PROB: 0x22c4
>24 NOTE_INSN_BASIC_BLOCK   |25 NOTE_INSN_BASIC_BLOCK
>28 NOTE_INSN_FUNCTION_END  |29 NOTE_INSN_FUNCTION_END
>31 r3:SI=r121:SI   |32 r3:SI=r122:SI
>   REG_DEAD: r121:SI   |   REG_DEAD: r122:SI
>37 use r3:SI   |38 use r3:SI
>
> while combine dump is
>
>14 NOTE_INSN_BASIC_BLOCK   |15 NOTE_INSN_BASIC_BLOCK
>16 r121:SI=r121:SI+0x1 |17 r122:SI=r122:SI+0x1
>18 NOTE_INSN_DELETED   |20 NOTE_INSN_DELETED
>19 {r125:HI=[r123:SI-0x4];r123:SI= |21 r125:SI=zero_extend([--r124:SI]
>20 r124:SI=zero_extend(r125:HI)|   REG_INC: r124:SI
>   REG_DEAD: r125:HI   |22 r127:CC=cmp(r125:SI,r122:SI)
>21 r126:CC=cmp(r124:SI,r121:SI)|   REG_DEAD: r125:SI
>   REG_DEAD: r124:SI
>
> where it has synthesized a movsi_movhi_update1, but then failed to implement
> the merged.
>
> Could this be fixed on dataflow-branch?
>
>
>   
The current flow does not recognize any pre modify cases. What flow does
do is recognize pre_increment, which is a subset of pre_modify that has
the restriction that the width of the load be equal to the amount of the
increment.  By changing the type of x, you made the example fit into the
restrictions of the current code.

The post side of things in flow is a little more general than the pre
side because this was hacked for the ia-64.

My code on the dataflow branch knows what the machine is capable of
doing and would get this case, since the ppc is capable of much more
general updates. 
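
The restriction described above can be written as a small predicate.
This is a sketch with hypothetical names (it is not the code in flow.c);
sizes and steps are in bytes.

```c
#include <assert.h>

enum pre_form { PRE_FORM_INC, PRE_FORM_DEC, PRE_FORM_MODIFY_ONLY };

/* Hypothetical classifier: DELTA is the amount added to the address
   register before the access, ACCESS_SIZE is the width of the load.
   Only a step equal to the access width fits the pre-increment or
   pre-decrement form; anything else needs the general pre_modify.  */
enum pre_form
classify_pre_update (long delta, long access_size)
{
  assert (access_size > 0 && delta != 0);
  if (delta == access_size)
    return PRE_FORM_INC;
  if (delta == -access_size)
    return PRE_FORM_DEC;
  return PRE_FORM_MODIFY_ONLY;
}
```

In the dump quoted above the address is stepped by 0x4 while the load is
HImode (2 bytes), so classify_pre_update (-4, 2) falls into the
pre_modify-only case.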

Kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29083



[Bug debug/31412] [4.3] inf loop/long compile time, time spent in var-tracking.c

2007-04-03 Thread zadeck at naturalbridge dot com


--- Comment #14 from zadeck at naturalbridge dot com  2007-04-03 16:47 
---
Subject: Re:  [4.3] inf loop/long compile time, time spent
 in var-tracking.c

steven at gcc dot gnu dot org wrote:
> --- Comment #13 from steven at gcc dot gnu dot org  2007-04-03 16:40 
> ---
> So this may be a non-monotonous dataflow problem...?
>
> Do we have the dataflow equations of the var-tracking problem somewhere?  It'd
> be interesting to check them against the actual implementation.
>
>
>   
this is a pretty complex problem.  I gave it a cursory once over and it
looks like the problem may not terminate if the location (stack offset)
of a variable is not the same on all paths into a block.  (the code may
be different than the comments and i did just scan this) I assume that
this case has a "bug" where a variable appears to be at a different
location coming across an exception edge. 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31412



[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90

2007-07-16 Thread zadeck at naturalbridge dot com


--- Comment #17 from zadeck at naturalbridge dot com  2007-07-16 23:26 
---
Subject: Re:  [4.3 regression]: gfortran.dg/auto_array_1.f90

hjl at lucon dot org wrote:
> --- Comment #16 from hjl at lucon dot org  2007-07-16 19:27 ---
> revision 125923 works. Kenny, it looks like your patch
>
> http://gcc.gnu.org/ml/gcc-patches/2007-06/msg01557.html
>
> causes this regression. Can you look into it? Thanks.
>
>
>   
I will look into this as soon as the bootstrap starts working again on
the ia-64.


Kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32749



[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90

2007-07-26 Thread zadeck at naturalbridge dot com


--- Comment #21 from zadeck at naturalbridge dot com  2007-07-26 17:35 
---
Subject: Re:  [4.3 regression]: gfortran.dg/auto_array_1.f90

Seongbae Park wrote:
> On 7/26/07, Kenneth Zadeck <[EMAIL PROTECTED]> wrote:
>> This patch extends the fix in
>> http://gcc.gnu.org/ml/gcc-patches/2007-06/msg01557.html
>> to handle the case of clobbers inside conditional calls.
>>
>> This problem caused the regression of gfortran.dg/matmul_3.f90 on the
>> ia-64 in addition to the regression cited in this pr.
>>
>> Tested on ppc-32, ia-64 and x86-64.
>>
>> 2007-07-26 Kenneth Zadeck <[EMAIL PROTECTED]>
>>
>> PR middle-end/32749
>>
>> * df-problems.c (df_note_bb_compute): Handle case of clobber
>> inside conditional call.
>>
>> ok to commit?
>
> This change is OK.
> Though I wonder if we need to do similar checking
> for the regular insn case below.
No, the checking is done in df_create_unused_note. The only reason you
have to do it here is that you are not calling that.

thanks

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32749



[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90

2007-07-26 Thread zadeck at naturalbridge dot com


--- Comment #19 from zadeck at naturalbridge dot com  2007-07-26 11:51 
---
Subject: Re:  [4.3 regression]: gfortran.dg/auto_array_1.f90

This patch extends the fix in
http://gcc.gnu.org/ml/gcc-patches/2007-06/msg01557.html
to handle the case of clobbers inside conditional calls.

This problem caused the regression of gfortran.dg/matmul_3.f90 on the
ia-64 in addition to the regression cited in this pr.

Tested on ppc-32, ia-64 and x86-64.

2007-07-26  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/32749

* df-problems.c (df_note_bb_compute): Handle case of clobber
inside conditional call.  


ok to commit?

kenny
Index: df-problems.c
===
--- df-problems.c   (revision 126918)
+++ df-problems.c   (working copy)
@@ -3989,7 +3989,7 @@ df_note_bb_compute (unsigned int bb_inde
  /* However a may or must clobber still needs to kill the
 reg so that REG_DEAD notes are later placed
 appropriately.  */ 
- else 
+ else if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
bitmap_clear_bit (live, DF_REF_REGNO (def));
}
}


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32749



[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90

2007-07-25 Thread zadeck at naturalbridge dot com


--- Comment #18 from zadeck at naturalbridge dot com  2007-07-25 18:41 
---
i am testing a patch.


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |zadeck at naturalbridge dot
                   |dot org                     |com
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2007-07-13 00:25:37         |2007-07-25 18:41:41
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32749



[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90

2007-07-27 Thread zadeck at naturalbridge dot com


--- Comment #25 from zadeck at naturalbridge dot com  2007-07-27 17:29 
---
Subject: Re:  [4.3 regression]: gfortran.dg/auto_array_1.f90

This patch rearranges the updating of the local dataflow info when
building reg_dead notes.  The need for this was that processing was not
correctly handled for clobbers that occurred within conditional call
insns.  A rare case but one that at least happens on the ia-64.
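
The per-def update that the patch moves into df_note_bb_compute can be
sketched on plain bit masks.  The flag values, the struct and the
function name here are illustrative assumptions; the real DF_REF_*
flags and bitmaps are defined in the df sources.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Illustrative flag values only; the real DF_REF_* constants differ.  */
enum
{
  REF_MUST_CLOBBER = 1 << 0,
  REF_MAY_CLOBBER = 1 << 1,
  REF_PARTIAL = 1 << 2,
  REF_CONDITIONAL = 1 << 3
};

struct note_sets
{
  unsigned long live;        /* registers live at this point */
  unsigned long do_not_gen;  /* registers that must not get a new note */
};

/* Hypothetical per-def update for register REGNO with flags FLAGS.  */
void
process_def (struct note_sets *sets, unsigned int regno, int flags)
{
  unsigned long bit;

  assert (sets != NULL && regno < sizeof (unsigned long) * CHAR_BIT);
  bit = 1UL << regno;

  /* A real def (not a clobber) goes into do_not_gen.  */
  if (!(flags & (REF_MUST_CLOBBER | REF_MAY_CLOBBER)))
    sets->do_not_gen |= bit;

  /* Only a full, unconditional def kills the register.  */
  if (!(flags & (REF_PARTIAL | REF_CONDITIONAL)))
    sets->live &= ~bit;
}
```

A clobber inside a conditional call carries both a clobber flag and the
conditional flag, so it leaves both sets untouched.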

This patch not only fixes the regressions listed in pr32749, but also
fixes the gfortran.dg/matmul_3.f90 regression on the ia-64. 

This patch was bootstrapped and regression tested yesterday on x86-64
and ia-64 and was again bootstrapped this morning on x86-64 (just to
make sure there were no interactions with richard sandiford's fixes to
closely related code that was just committed.)

Committed as revision 126987.

Kenny

2007-07-26  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/32749

* df-problems.c (df_create_unused_note): Removed do_not_gen parm
and the updating of the live and do_not_gen sets.
(df_note_bb_compute): Added updating of live and do_not_gen sets
for regular defs so that the case of clobber inside conditional
call is processed correctly.

Index: df-problems.c
===
--- df-problems.c   (revision 126979)
+++ df-problems.c   (working copy)
@@ -3868,13 +3868,12 @@ df_set_dead_notes_for_mw (rtx insn, rtx 
 }


-/* Create a REG_UNUSED note if necessary for DEF in INSN updating LIVE
-   and DO_NOT_GEN.  Do not generate notes for registers in artificial
-   uses.  */
+/* Create a REG_UNUSED note if necessary for DEF in INSN updating
+   LIVE.  Do not generate notes for registers in ARTIFICIAL_USES.  */

 static rtx
 df_create_unused_note (rtx insn, rtx old, struct df_ref *def, 
-  bitmap live, bitmap do_not_gen, bitmap artificial_uses)
+  bitmap live, bitmap artificial_uses)
 {
   unsigned int dregno = DF_REF_REGNO (def);

@@ -3899,12 +3898,6 @@ df_create_unused_note (rtx insn, rtx old
 #endif
 }

-  if (!(DF_REF_FLAGS (def) & (DF_REF_MUST_CLOBBER + DF_REF_MAY_CLOBBER)))
-bitmap_set_bit (do_not_gen, dregno);
-  
-  /* Kill this register if it is not a subreg store or conditional store.  */
-  if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
-bitmap_clear_bit (live, dregno);
   return old;
 }

@@ -3915,7 +3908,7 @@ df_create_unused_note (rtx insn, rtx old

 static void
 df_note_bb_compute (unsigned int bb_index, 
- bitmap live, bitmap do_not_gen, bitmap artificial_uses)
+   bitmap live, bitmap do_not_gen, bitmap artificial_uses)
 {
   basic_block bb = BASIC_BLOCK (bb_index);
   rtx insn;
@@ -4012,17 +4005,17 @@ df_note_bb_compute (unsigned int bb_inde
  for (def_rec = DF_INSN_UID_DEFS (uid); *def_rec; def_rec++)
{
  struct df_ref *def = *def_rec;
- if (!(DF_REF_FLAGS (def) & (DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER)))
-   old_unused_notes
- = df_create_unused_note (insn, old_unused_notes, 
-  def, live, do_not_gen, 
-  artificial_uses);
-
- /* However a may or must clobber still needs to kill the
-reg so that REG_DEAD notes are later placed
-appropriately.  */ 
- else 
-   bitmap_clear_bit (live, DF_REF_REGNO (def));
+ unsigned int dregno = DF_REF_REGNO (def);
+ if (!DF_REF_FLAGS_IS_SET (def, DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER))
+   {
+ old_unused_notes
+   = df_create_unused_note (insn, old_unused_notes, 
+def, live, artificial_uses);
+ bitmap_set_bit (do_not_gen, dregno);
+   }
+
+ if (!DF_REF_FLAGS_IS_SET (def, DF_REF_PARTIAL | DF_REF_CONDITIONAL))
+   bitmap_clear_bit (live, dregno);
}
}
   else
@@ -4043,10 +4036,16 @@ df_note_bb_compute (unsigned int bb_inde
  for (def_rec = DF_INSN_UID_DEFS (uid); *def_rec; def_rec++)
{
  struct df_ref *def = *def_rec;
+ unsigned int dregno = DF_REF_REGNO (def);
  old_unused_notes
= df_create_unused_note (insn, old_unused_notes, 
-def, live, do_not_gen, 
-artificial_uses);
+def, live, artificial_uses);
+
+ if (!DF_REF_FLAGS_IS_SET (def, DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER))
+   bitmap_set_bit (do_not_gen, dregno);
+
+ if (!DF_REF_FLAGS_IS_SET (def, DF_REF_PARTIAL | DF_REF_CONDITIONAL))
+   bitmap_clear_bit (live, dregno);
}
}



-- 


http://gcc.gnu.org/b

[Bug middle-end/32749] [4.3 regression]: gfortran.dg/auto_array_1.f90

2007-07-27 Thread zadeck at naturalbridge dot com


--- Comment #26 from zadeck at naturalbridge dot com  2007-07-27 17:33 
---
revision 126987


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32749



[Bug target/32431] [4.3 Regression] ICE in df_refs_verify, at df-scan.c:4066

2007-08-02 Thread zadeck at naturalbridge dot com


--- Comment #3 from zadeck at naturalbridge dot com  2007-08-02 19:19 
---
Given that the rtl passes are moving to not allow illegally shared rtl, i do
not believe that the resolution of this bug has anything to do with the
dataflow port.

If this bug is to be resolved, it will be done by cleaning up this back end.  


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|zadeck at naturalbridge dot |
                   |com                         |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32431



[Bug rtl-optimization/32300] [4.3 Regression] ICE with -O2 -fsee

2007-08-17 Thread zadeck at naturalbridge dot com


--- Comment #10 from zadeck at naturalbridge dot com  2007-08-17 12:48 
---
Subject: Re:  [4.3 Regression] ICE with -O2 -fsee

wouter dot vermaelen at scarlet dot be wrote:
> --- Comment #9 from wouter dot vermaelen at scarlet dot be  2007-08-17 
> 12:44 ---
> Here is a simpler testcase:
>
> int f(int i) { return 100LL / (1 + i); }
>
>
>   
thanks,

everyone knows what the problems with see.c are; it is simply a matter
of having the authors fix their code.  Virtually anything that invokes
this pass will cause it to fail.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32300



[Bug target/33151] Invalid insn with pre_inc

2007-08-23 Thread zadeck at naturalbridge dot com


--- Comment #4 from zadeck at naturalbridge dot com  2007-08-23 18:59 
---
Subject: Re:  Invalid insn with pre_inc

pinskia at gcc dot gnu dot org wrote:
> --- Comment #3 from pinskia at gcc dot gnu dot org  2007-08-22 22:41 
> ---
> I think we need a new predicate for this rtl instruction, currently we just
> have:
>(clobber (match_operand:DF 4 "memory_operand" "=o"))
>
>
>   
After thinking about this last night, i believe that this problem should
be solved at the machine description level, not by changing
auto-inc-dec.c.  Auto-inc-dec.c uses all of the standard interfaces to
keep from generating invalid rtl.  So it seems proper to have the md
level not allow the creation of this insn. 

Kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33151



[Bug middle-end/32758] [4.3 Regression] ecj1 hangs

2007-08-29 Thread zadeck at naturalbridge dot com


--- Comment #30 from zadeck at naturalbridge dot com  2007-08-29 15:34 
---
Subject: Re:  [4.3 Regression] ecj1 hangs

bonzini at gnu dot org wrote:
> --- Comment #29 from bonzini at gnu dot org  2007-08-29 14:16 ---
> (When I said "post your first patch", I meant the first one from comment #26;
> if my "fixing the mess" works, it'll not be necessary anymore).
>
>
>   
For some reason, I was not copied on any of the postings for this patch
until this morning.

First, thank you, Jakub and Andreas, for doing this.

I think that it is obvious that you have spotted the exact problem: in
some way, shape, form, or fashion, the artificial uses at the end of the
block need to be re-added into the live set after the processing of each
insn in the block.

There are two ways of doing this (assume that you have a local variable
called artificial_uses_fixup which is a pointer to either
df->eh_block_artificial_uses or
df->regular_block_artificial_uses depending on if the block has eh preds) :

1) you can explicitly OR artificial_uses_fixup into local_live after
processing each insn.
2) you can test artificial_uses_fixup along with local_live when setting
needed.

As noted, (1) has the problem that it may cause an infinite loop.  This
infinite loop could be fixed by changing the equation for block_changed
to be

!bitmap_equal (local_live, DF_LR_IN (BB) || artificial_uses_fixup)

i.e. the infinite loop is because DF_LR_IN may be deficient in some of
the bits in artificial_uses_fixup for basically the same reason that
caused the bug in the first place.

I personally think that solution (1) is preferable to (2) because it is
fewer bitmap operations, even though it will require an extra temp bitmap
to hold the or.

But either patch is a reasonable approach. 
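
Option (1) together with the adjusted block_changed test can be sketched
on single machine words.  All names here are hypothetical, and the "||"
in the equation above is read as a set union, written "|" in C.

```c
#include <assert.h>

typedef unsigned long regset;

/* Option (1): OR the artificial uses back into the live set after an
   insn has been processed.  */
regset
readd_artificial_uses (regset local_live, regset artificial_uses_fixup)
{
  return local_live | artificial_uses_fixup;
}

/* The plain test: reports a change whenever LR_IN lacks fixup bits that
   LOCAL_LIVE now carries, which is how the loop never settles.  */
int
block_changed_naive (regset local_live, regset lr_in)
{
  return local_live != lr_in;
}

/* The adjusted test: compare against LR_IN with the fixup bits added.
   LOCAL_LIVE is expected to contain the fixup bits already.  */
int
block_changed_fixed (regset local_live, regset lr_in,
                     regset artificial_uses_fixup)
{
  assert ((local_live & artificial_uses_fixup) == artificial_uses_fixup);
  return local_live != (lr_in | artificial_uses_fixup);
}
```

With lr_in = 0x3 and a fixup set of 0x4, the live set becomes 0x7; the
naive test keeps reporting a change while the adjusted test does not.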

As far as why there are all of the df_simulate functions that do things
in different ways, the answer is that the code has evolved and sometimes
things get missed.

The addition of the df->eh_block_artificial_uses and
df->regular_block_artificial_uses sets is fairly recent and it would
most likely be useful to replace walks of artificial_uses with them. 

Kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32758



[Bug rtl-optimization/33224] failing rtl iv analysis (maybe due to df)

2007-08-30 Thread zadeck at naturalbridge dot com


--- Comment #4 from zadeck at naturalbridge dot com  2007-08-30 14:43 
---
Subject: Re:  failing rtl iv analysis (maybe due
 to df)

dorit at gcc dot gnu dot org wrote:
> --- Comment #3 from dorit at gcc dot gnu dot org  2007-08-30 08:12 ---
> (In reply to comment #2)
>   
>> I suspect this might be due to not updating the rd information after 
>> unrolling.
>> Can you check if 
>> analyze_insns_in_loop() (which calls df_analyze()) is being called just 
>> before
>> the problematic unrolling ?
>> 
>
> it looks like it's called just before the unroller actually transforms
> something, but not before the (failing) analysis. But when I add a call to it 
> in
> decide_peel_completely the analysis still fails.
>
>
>   
dorit,

i am having trouble exactly reproducing this example because you did not
give the svn revision and so all of the numbers are a little bit
different. 

However, I am going to submit a patch which improves the dump
information a lot for these passes and we should talk about it after we
can get on the same page.

However, from looking at your posting, there are some issues that you
may want to look at before we talk:

The reaching defs problem makes a scan for all of the defs in the blocks
in the region.  Once all of the defs are found, they are sorted where
the primary key is the regno. 
The id's (DF_REF_ID) are then assigned based on this sorting.  The
reaching defs problem actually depends on all of the defs for a regno
being contiguous.

The DF_REF_IDs are not stable between calls to df_set_blocks and any def
outside of the region has an undefined DF_REF_ID.

In your posting you have:

> Below is the output of df_ref_debug for adef in each iteration of the loop in
> latch_dominating_def:

> d40 reg 187 bb 3 insn 255 flag 0x0 type 0x0 loc 0xf7da4608(0xf7d9a4e0) chain { }
> d93 reg 187 bb 2 insn 40 flag 0x0 type 0x0 loc 0xf7d89cc8(0xf7d9a4e0) chain { }

The number after the first "d" is the DF_REF_ID.  Note that they are not
contiguous.  Given the sorting that occurred, they must be contiguous.  I
assume from this that someone is holding on to old id's.  This is not
correct.

If you are going to play the game with df_set_blocks, you are allowed to
hold onto a def, but not the DF_REF_ID; you cannot look at the DF_REF_ID
for a def that is not in the blocks set by df_set_blocks.
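
The id assignment and the contiguity property described above can be
sketched as follows.  The struct and function names are hypothetical and
much simpler than the real df_ref machinery; the check assumes that ids
are distinct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a def: just a register number and an id.  */
struct def_ref
{
  unsigned int regno;
  unsigned int id;
};

static int
compare_regno (const void *a, const void *b)
{
  const struct def_ref *x = a;
  const struct def_ref *y = b;
  return (x->regno > y->regno) - (x->regno < y->regno);
}

/* Sort the defs of the region with regno as the primary key, then hand
   out ids in that order.  All defs of one regno get adjacent ids.  */
void
assign_ref_ids (struct def_ref *defs, size_t n)
{
  size_t i;

  assert (defs != NULL || n == 0);
  qsort (defs, n, sizeof defs[0], compare_regno);
  for (i = 0; i < n; i++)
    defs[i].id = (unsigned int) i;
}

/* Return nonzero if the ids of all defs of REGNO form one unbroken
   range.  */
int
ids_contiguous_for_regno (const struct def_ref *defs, size_t n,
                          unsigned int regno)
{
  unsigned int min = 0, max = 0;
  size_t count = 0, i;

  for (i = 0; i < n; i++)
    if (defs[i].regno == regno)
      {
        if (count == 0 || defs[i].id < min)
          min = defs[i].id;
        if (count == 0 || defs[i].id > max)
          max = defs[i].id;
        count++;
      }
  return count == 0 || (size_t) (max - min) + 1 == count;
}
```

Two defs of reg 187 with ids 40 and 93, as in the quoted output, fail
this check, which is the sign that at least one id is stale.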

Kenny
Index: df-core.c
===
--- df-core.c   (revision 127917)
+++ df-core.c   (working copy)
@@ -1761,6 +1761,7 @@ df_print_regset (FILE *file, bitmap r)


 /* Dump dataflow info.  */
+
 void
 df_dump (FILE *file)
 {
@@ -1778,6 +1779,33 @@ df_dump (FILE *file)
 }


+/* Dump dataflow info for df->blocks_to_analyze.  */
+
+void
+df_dump_region (FILE *file)
+{
+  if (df->blocks_to_analyze)
+{
+  bitmap_iterator bi;
+  unsigned int bb_index;
+
+  fprintf (file, "\n\nstarting region dump\n");
+  df_dump_start (file);
+  
+  EXECUTE_IF_SET_IN_BITMAP (df->blocks_to_analyze, 0, bb_index, bi) 
+   {
+ basic_block bb = BASIC_BLOCK (bb_index);
+ 
+ df_print_bb_index (bb, file);
+ df_dump_top (bb, file);
+ df_dump_bottom (bb, file);
+   }
+  fprintf (file, "\n");
+}
+  else df_dump (file);
+}
+
+
 /* Dump the introductory information for each problem defined.  */

 void
Index: df.h
===
--- df.h(revision 127917)
+++ df.h(working copy)
@@ -836,6 +836,7 @@ extern bool df_reg_used (rtx, rtx);
 extern void df_worklist_dataflow (struct dataflow *,bitmap, int *, int);
 extern void df_print_regset (FILE *file, bitmap r);
 extern void df_dump (FILE *);
+extern void df_dump_region (FILE *);
 extern void df_dump_start (FILE *);
 extern void df_dump_top (basic_block, FILE *);
 extern void df_dump_bottom (basic_block, FILE *);
Index: loop-invariant.c
===
--- loop-invariant.c(revision 127917)
+++ loop-invariant.c(working copy)
@@ -644,6 +644,7 @@ find_defs (struct loop *loop, basic_bloc

   if (dump_file)
 {
+  df_dump_region (dump_file);
   fprintf (dump_file, "*starting processing of loop  **\n");
   print_rtl_with_bb (dump_file, get_insns ());
   fprintf (dump_file, "*ending processing of loop  **\n");
Index: loop-iv.c
===================================================================
--- loop-iv.c   (revision 127917)
+++ loop-iv.c   (working copy)
@@ -280,7 +280,7 @@ iv_analysis_loop_init (struct loop *loop
   df_set_blocks (blocks);
   df_analyze ();
   if (dump_file)
-df_dump (dump_file);
+df_dump_region (dump_file);

   check_iv_ref_table_size ();
   BITMAP_FREE (blocks);


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33224



[Bug rtl-optimization/33224] failing rtl iv analysis (maybe due to df)

2007-08-30 Thread zadeck at naturalbridge dot com


--- Comment #8 from zadeck at naturalbridge dot com  2007-08-30 18:51 
---
Subject: Re:  failing rtl iv analysis (maybe due
 to df)

rakdver at kam dot mff dot cuni dot cz wrote:
> --- Comment #7 from rakdver at kam dot mff dot cuni dot cz  2007-08-30 
> 18:09 ---
> Subject: Re:  failing rtl iv analysis (maybe due to df)
>
>   
>> The only thing that you are allowed to do with the DF_REF_ID is to get
>> it from a df_def
>> AFTER YOU ARE SURE THAT THE DEF IS IN THE REGION
>> 
>
> OK, this might be the problem; the code takes the defs from the reg->def
> lists, and checks whether the defs are set in the reaching def bitmaps.
> Naturally, it assumes that when the region is set by df_set_blocks, the
> reaching def bitmaps will only contain the defs that belong to the
> region (which used to be true before your changes).
>
>   
And it is still true now.  The set of bits in the bitmaps are EXACTLY
the set of defs inside the region.  The thing that has changed is that
the location (slot) in the bitmap is only defined after the calls to
df_set_blocks and df_analyze, i.e. the slots in the bitvectors are moved
around by these calls. 

In your example, you asked about 2 defs.  One of those defs is in the
region and one of them is outside the region.  It is not that the bits
are zero for a def outside of the region; there is no slot in the
bitvectors that corresponds to that def.  You are not allowed to look in
the bitmap for the def outside of the region or ask any questions at all
if they involve the DF_REF_ID.  For the def that is in the region, you
can ask, but you cannot use the DF_REF_ID that it had before the call to
df_set_blocks.  That old one is trash.

What has changed, and this was a very old change, from the time that
danny still worked at ibm, was that the DF_REF_ID's are not stable and
the slots change after setting the blocks in the region.  One of the
first df patches that was committed by us was to reorganize the bits so
that all of the refs for a single reg were contiguous.  This gave a
factor of 7 speedup over the old code because it allowed for the use of
new bitmap operations that worked over dense range indexes.  I assume
that this code has not really worked since then. 
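
To make the rule concrete, here is a minimal C sketch of region-relative
renumbering.  It is a toy model only: struct toy_def, toy_set_region and
toy_def_reaches are made-up names, and the real df code is organized
differently.  It shows why an id saved before the region was set is
meaningless afterwards, and why a def outside the region has no slot to ask
about.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NO_SLOT (-1)

/* Toy stand-in for a def; the fields and names are hypothetical.  */
struct toy_def
{
  int regno;   /* register being defined */
  int bb;      /* index of the containing basic block */
  int slot;    /* bit position in the toy bitmaps, or TOY_NO_SLOT */
};

/* Renumber the defs for a region.  Defs inside the region get dense
   slots, grouped by register; defs outside get no slot at all.
   Returns the number of slots in use.  */
int
toy_set_region (struct toy_def *defs, int n_defs,
                const bool *bb_in_region, int n_regs)
{
  int next = 0;
  int r, i;

  for (i = 0; i < n_defs; i++)
    defs[i].slot = TOY_NO_SLOT;

  for (r = 0; r < n_regs; r++)
    for (i = 0; i < n_defs; i++)
      if (defs[i].regno == r && bb_in_region[defs[i].bb])
        defs[i].slot = next++;

  return next;
}

/* Query a toy reaching-defs bitmap.  Asking about a def that has no
   slot is a caller error, mirroring the rule stated in the text.  */
bool
toy_def_reaches (unsigned int bits, const struct toy_def *def)
{
  assert (def->slot >= 0 && def->slot < 32);
  return (bits >> def->slot) & 1u;
}
```

In this model, calling toy_set_region again with a smaller region moves the
surviving defs to new slots, so a slot number saved earlier points at the
wrong bit.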

> Anyway, it would be nice to have some documentation for df (there
> is only a short notice in
> http://gcc.gnu.org/onlinedocs/gccint/Liveness-information.html#Liveness-information,
> which appears wrong given the importance of this api), in particular
> pointing out such non-obvious traps would be great.
>
>
>   
This is only an issue if you use df_set_blocks, and the only passes that
use it are Zdenek's loop passes.  If I had my way (and infinite
free time) I would get rid of df_set_blocks anyway.  The information
that it provides is generally wrong since it ignores information that
enters a block from the outside, but if you are very careful to only ask
a very limited range of questions, as Zdenek did, it can give you what
you want relatively inexpensively.

Furthermore, it has been a real pain to keep it correct as the rest of
df has evolved. 



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33224



[Bug rtl-optimization/33224] failing rtl iv analysis (maybe due to df)

2007-08-30 Thread zadeck at naturalbridge dot com


--- Comment #9 from zadeck at naturalbridge dot com  2007-08-30 18:57 
---
Subject: Re:  failing rtl iv analysis (maybe due
 to df)

zadeck at naturalbridge dot com wrote:
> --- Comment #8 from zadeck at naturalbridge dot com  2007-08-30 18:51 
> ---
> Subject: Re:  failing rtl iv analysis (maybe due
>  to df)
>
> rakdver at kam dot mff dot cuni dot cz wrote:
>   
>> --- Comment #7 from rakdver at kam dot mff dot cuni dot cz  2007-08-30 
>> 18:09 ---
>> Subject: Re:  failing rtl iv analysis (maybe due to df)
>>
>>   
>> 
>>> The only thing that you are allowed to do with the DF_REF_ID is to get
>>> it from a df_def
>>> AFTER YOU ARE SURE THAT THE DEF IS IN THE REGION
>>> 
>>>   
>> OK, this might be the problem; the code takes the defs from the reg->def
>> lists, and checks whether the defs are set in the reaching def bitmaps.
>> Naturally, it assumes that when the region is set by df_set_blocks, the
>> reaching def bitmaps will only contain the defs that belong to the
>> region (which used to be true before your changes).
>>
>>   
>> 
> And it is still true now.  The set of bits in the bitmaps are EXACTLY
> the set of defs inside the region.  The thing that has changed is that
> the location (slot) in the bitmap is only defined after the calls to
> df_set_blocks and df_analyze, i.e. the slots in the bitvectors are moved
> around by these calls. 
>
> In your example, you asked about 2 defs.  One of those defs is in the
> region and one of them is outside the region.  It is not that the bits
> are zero for a def outside of the region; there is no slot in the
> bitvectors that corresponds to that def.  You are not allowed to look in
> the bitmap for the def outside of the region or ask any questions at all
> if they involve the DF_REF_ID.  For the def that is in the region, you
> can ask, but you cannot use the DF_REF_ID that it had before the call to
> df_set_blocks.  That old one is trash.
>
> What has changed, and this was a very old change, from the time that
> danny still worked at ibm, was that the DF_REF_ID's are not stable and
> the slots change after setting the blocks in the region.  One of the
> first df patches that was committed by us was to reorganize the bits so
> that all of the refs for a single reg were contiguous.  This gave a
> factor of 7 speedup over the old code because it allowed for the use of
> new bitmap operations that worked over dense range indexes.  I assume
> that this code has not really worked since then. 
>
>   
>> Anyway, it would be nice to have some documentation for df (there
>> is only a short notice in
>> http://gcc.gnu.org/onlinedocs/gccint/Liveness-information.html#Liveness-information,
>> which appears wrong given the importance of this api), in particular
>> pointing out such non-obvious traps would be great.
>>
>>
>>   
>> 
> This is only an issue if you use df_set_blocks, and the only passes that
> use it are Zdenek's loop passes.  If I had my way (and infinite
> free time) I would get rid of df_set_blocks anyway.  The information
> that it provides is generally wrong since it ignores information that
> enters a block from the outside, but if you are very careful to only ask
> a very limited range of questions, as Zdenek did, it can give you what
> you want relatively inexpensively.
>
> Furthermore, it has been a real pain to keep it correct as the rest of
> df has evolved. 
>
>
>
>   
sorry zdenek, i misread who this was from, i would not have referred to
you in the third person if i had read it correctly.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33224



[Bug rtl-optimization/33224] failing rtl iv analysis (maybe due to df)

2007-08-30 Thread zadeck at naturalbridge dot com


--- Comment #11 from zadeck at naturalbridge dot com  2007-08-30 21:46 
---
Subject: Re:  failing rtl iv analysis (maybe due
 to df)

rakdver at gcc dot gnu dot org wrote:
> --- Comment #10 from rakdver at gcc dot gnu dot org  2007-08-30 20:05 
> ---
> I know how to fix the problem, now.
>
>
>   
thanks

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33224



[Bug bootstrap/32161] stage1 libgcc is being built unoptimized

2007-08-31 Thread zadeck at naturalbridge dot com


--- Comment #3 from zadeck at naturalbridge dot com  2007-08-31 21:34 
---
At least on the x86-32, libgcc is currently being built optimized, but the
options are slightly different.  The stage1 build does not use
-fomit-frame-pointer.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32161



[Bug rtl-optimization/32300] [4.3 Regression] ICE with -O2 -fsee

2007-09-04 Thread zadeck at naturalbridge dot com


--- Comment #13 from zadeck at naturalbridge dot com  2007-09-05 01:24 
---
Subject: Re:  [4.3 Regression] ICE with -O2 -fsee

jakub at gcc dot gnu dot org wrote:
> --- Comment #12 from jakub at gcc dot gnu dot org  2007-09-04 23:37 
> ---
> Fixed.
>
>
>   
jakub

thanks for doing this.  The changes to df are fine, but i think that it
exceeds my authority to approve more than that.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32300



[Bug target/32481] ICE in df_refs_verify, at df-scan.c:4058

2007-10-04 Thread zadeck at naturalbridge dot com


--- Comment #11 from zadeck at naturalbridge dot com  2007-10-04 20:51 
---
spark fixed this in comment #10.


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32481



[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr

2007-10-05 Thread zadeck at naturalbridge dot com


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |zadeck at naturalbridge dot
                   |dot org                     |com
             Status|NEW                         |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638



[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr

2007-10-05 Thread zadeck at naturalbridge dot com


--- Comment #12 from zadeck at naturalbridge dot com  2007-10-05 13:02 
---
Subject: Re:  [4.3 regression]: wrong code with
 -fforce-addr

rguenth at gcc dot gnu dot org wrote:
> --- Comment #11 from rguenth at gcc dot gnu dot org  2007-10-05 12:36 
> ---
> But powf is pure/const, so the call is not a use.
>
>
>   
that is the reason that the call did not kill the potential set of dead
stores.

I will look at this later today.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638



[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr

2007-10-05 Thread zadeck at naturalbridge dot com


--- Comment #15 from zadeck at naturalbridge dot com  2007-10-05 20:17 
---
Subject: Re:  [4.3 regression]: wrong code with
 -fforce-addr

kargl at gcc dot gnu dot org wrote:
> --- Comment #13 from kargl at gcc dot gnu dot org  2007-10-05 17:50 
> ---
> (In reply to comment #9)
>   
>>> Hope this helps.
>>>   
>> Sure, I've got the problem. The problem is actually in RTL optimization,
>> where the dse1 pass removes the wrong insn.
>>
>> Surprisingly, the problem is in line 61 of comunpack.f:
>>
>> -->   bscale = 2.0**real(idrstmpl(2))
>>   dscale = 10.0**real(-idrstmpl(3))
>>
>> 
>
> This was meant for Manfred instead of Uros, but it does contain the
> relevant info.  Manfred, you told me elsewhere that you use -fforce-addr
> to achieve better performance.  Whoever wrote this code should be
> flogged.  idrstmpl is an INTEGER variable, and gfortran can generate
> much faster code for integer exponents than calling __builtin_powf.
>
> Try changing the lines to
>
>   bscale = 2.0**idrstmpl(2)
>   dscale = 10.0**(-idrstmpl(3))
>
> This, of course, doesn't fix the underlying bug.
>
>
>   
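
For readers wondering why an integer exponent helps: a power with an integer
exponent can be computed with a handful of multiplies instead of a library
call.  The C sketch below shows the general repeated-squaring technique.
toy_powi is a made-up name, and this is not claimed to be the code that
gfortran actually emits.

```c
#include <assert.h>

/* Hypothetical illustration: raise BASE to an integer power by
   repeated squaring, using O(log |exp|) multiplies.  */
float
toy_powi (float base, int exp)
{
  /* Take the magnitude in unsigned arithmetic so INT_MIN is safe.  */
  unsigned int n = exp < 0 ? 0u - (unsigned int) exp : (unsigned int) exp;
  float result = 1.0f;
  float b = base;

  /* A zero base with a negative exponent would divide by zero.  */
  assert (base != 0.0f || exp >= 0);

  while (n != 0)
    {
      if (n & 1u)
        result *= b;   /* this bit of the exponent is set */
      b *= b;          /* square for the next bit */
      n >>= 1;
    }
  return exp < 0 ? 1.0f / result : result;
}
```

For example, toy_powi (2.0f, 10) needs only a few multiplies, whereas the
quoted source converts the exponent to real and goes through powf.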
neither richi nor i can reproduce the problem.

./xgcc -B. -O2 -march=pentium4 -c mova2i.c -DLINUX
./gfortran -fforce-addr -B. -B../i686-pc-linux-gnu/libgfortran/.libs -O2 \
  -march=pentium4 -o main main.f comunpack.f rdieee.f gbytesc.f mova2i.o

and i get the same thing with and without -fforce-addr.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638



[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf

2007-10-05 Thread zadeck at naturalbridge dot com


--- Comment #7 from zadeck at naturalbridge dot com  2007-10-06 04:11 
---
Subject: Re:  [4.3 Regression]  Revision 128957
 miscompiles 481.wrf

hjl at lucon dot org wrote:
> --- Comment #5 from hjl at lucon dot org  2007-10-06 02:07 ---
> Kenny, does your patch
>
> http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00124.html
>
> handle cases where the number of consecutive hard regs needed to hold some
> mode is > 1 correctly? IA32 needs 2 hard registers to hold long long and your
> patch miscompiles the testcase in comment #4.
>
>
>   
I will look into it.  It should do this correctly.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669



[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr

2007-10-06 Thread zadeck at naturalbridge dot com


--- Comment #17 from zadeck at naturalbridge dot com  2007-10-06 12:27 
---
Subject: Re:  [4.3 regression]: wrong code with
 -fforce-addr

ubizjak at gmail dot com wrote:
> --- Comment #16 from ubizjak at gmail dot com  2007-10-06 06:49 ---
> (In reply to comment #14)
>   
>> The testcase works for me, that is, it produces the expected output good.out.
>> 
>
> Uh, you have to un-comment the line 315 of the comunpack.f test. The testcase,
> as attached, produces good code. Un-commenting line 315, you will get:
>
>
>   
you are making this into something of a scavenger hunt.

Kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638



[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr

2007-10-06 Thread zadeck at naturalbridge dot com


--- Comment #18 from zadeck at naturalbridge dot com  2007-10-06 13:07 
---
Subject: Re:  [4.3 regression]: wrong code with
 -fforce-addr

ubizjak at gmail dot com wrote:
> --- Comment #16 from ubizjak at gmail dot com  2007-10-06 06:49 ---
> (In reply to comment #14)
>   
>> The testcase works for me, that is, it produces the expected output good.out.
>> 
>
> Uh, you have to un-comment the line 315 of the comunpack.f test. The testcase,
> as attached, produces good code. Un-commenting line 315, you will get:
>
> .L80:
>         movl    $0x40000000, (%esp)
>         call    powf
>         fstps   -152(%ebp)
>         negl    -136(%ebp)
>         fildl   -136(%ebp)
>         fstps   4(%esp)
>         movl    $0x41200000, (%esp)
>         call    powf
>
> Note that only one argument is loaded to the stack before the first powf.
>
> Without -fforce-addr on the un-commented testcase, we get:
>
> .L80:
>         fildl   -132(%ebp)
>         fstps   4(%esp)
>         movl    $0x40000000, (%esp)
>         call    powf
>         fstps   -140(%ebp)
>         negl    -128(%ebp)
>         fildl   -128(%ebp)
>         fstps   4(%esp)
>         movl    $0x41200000, (%esp)
>         call    powf
>
>
>   
ian,

As you may remember, the dse code assumes that it can "see" all of the
stores that are frame_related.  It appears that with the -fforce-addr
option this is not true.  In this particular example, a frame related
pointer gets loaded into register 755 very early on (in a different
block), and since const calls only disqualify frame-related stores
(since they may push params onto the stack), the parameter push is
considered dead.

My question to you: is the proper fix to check flag_force_addr and, if it
is set, just assume that every store may be frame related, or is there
some sort of tea leaf that i might have access to that tells me that reg
755 is used in this way?
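
To show the failure mode in isolation, here is a small C toy model.  It is
only a sketch under stated assumptions: the insn representation, the
seen_as_frame flag and toy_dse are all invented for the example and are much
simpler than dse.c.  A store through a temporary register is modeled as a
store that the pass fails to recognize as frame related, so a const call does
not count as reading it.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SLOTS 8

enum toy_kind { TOY_STORE, TOY_CONST_CALL };

/* Hypothetical insn record for the toy pass.  */
struct toy_insn
{
  enum toy_kind kind;
  int slot;            /* frame slot written (stores only) */
  bool seen_as_frame;  /* did the pass recognize a frame address? */
  bool deleted;        /* set when the pass removes the store */
};

/* Forward scan that deletes a store when the same slot is stored again
   with no recorded read in between.  With ASSUME_ALL_FRAME set, every
   store is treated as possibly frame related (the conservative option).
   Returns the number of stores deleted.  */
int
toy_dse (struct toy_insn *insns, int n, bool assume_all_frame)
{
  int pending[TOY_SLOTS];  /* last unread store per slot, or -1 */
  int n_deleted = 0;
  int i, s;

  for (s = 0; s < TOY_SLOTS; s++)
    pending[s] = -1;

  for (i = 0; i < n; i++)
    {
      if (insns[i].kind == TOY_CONST_CALL)
        {
          /* A const call may read arguments pushed on the stack, so
             stores known to be frame related count as used.  */
          for (s = 0; s < TOY_SLOTS; s++)
            if (pending[s] >= 0
                && (assume_all_frame || insns[pending[s]].seen_as_frame))
              pending[s] = -1;
        }
      else
        {
          s = insns[i].slot;
          assert (s >= 0 && s < TOY_SLOTS);
          if (pending[s] >= 0)
            {
              /* Overwritten before any recorded read: looks dead.  */
              insns[pending[s]].deleted = true;
              n_deleted++;
            }
          pending[s] = i;
        }
    }
  return n_deleted;
}
```

With the sequence store, call, store, call on one slot, where the first store
is not recognized as frame related, the toy pass deletes the first parameter
store, which matches the shape of the wrong code in this PR.  Setting
assume_all_frame keeps both stores.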

(Note that you have to jump through a few hoops to recreate this, since
comunpack.f is in a separate attachment from the rest of the code and
you have to uncomment line 315 to recreate the bug.)


Kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638



[Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr

2007-10-06 Thread zadeck at naturalbridge dot com


--- Comment #20 from zadeck at naturalbridge dot com  2007-10-06 21:20 
---
Subject: Re:  [4.3 regression]: wrong code with
 -fforce-addr

ubizjak at gmail dot com wrote:
> --- Comment #19 from ubizjak at gmail dot com  2007-10-06 19:58 ---
> In dse.c, scan_insn(), we have:
>
>   if ((GET_CODE (PATTERN (insn)) == CLOBBER)
>   || volatile_refs_p (PATTERN (insn))
>   || (flag_non_call_exceptions && may_trap_p (PATTERN (insn)))
>   || (RTX_FRAME_RELATED_P (insn))
>   || find_reg_note (insn, REG_FRAME_RELATED_EXPR, NULL_RTX))
> insn_info->cannot_delete = true;
>
> And since the docs say that:
>
> `RTX_FRAME_RELATED_P (X)'
>  Nonzero in an `insn', `call_insn', `jump_insn', `barrier', or
>  `set' which is part of a function prologue and sets the stack
>  pointer, sets the frame pointer, or saves a register.  This flag
>  should also be set on an instruction that sets up a temporary
>  register to use in place of the frame pointer.  Stored in the
>  `frame_related' field and printed as `/f'.
>
> I wonder if the insn that stores to (or uses(?)) this temporary register (in
> place of the frame pointer) should also be marked as frame related insn?
>
> So, all the insns in the sequence of
>
> set tmpreg, FP + const
> ...
> store (tmpreg)
>
> should be marked as frame related insns.
>
>
>   
i was not referring to the frame_related flag, though i guess it could
be taken over for this purpose.  Note that the frame_related flag is for
use by the prologue and this is not.  This is just a register that
happens to point into the frame, which i think is only ever created if
you say -fforce-addr.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33638



[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf

2007-10-06 Thread zadeck at naturalbridge dot com


--- Comment #9 from zadeck at naturalbridge dot com  2007-10-07 03:18 
---
Subject: Re:  [4.3 Regression]  Revision 128957
 miscompiles 481.wrf

hj,

here is a fix.  I will most likely post the patch on monday after i get
it really tested on a bunch of platforms.  The fix is in the third
stanza, the rest is better logging.

The failure only happens if you have a block with 2 or more uses of a
multiword pseudo register that is local to this block and has been
allocated by local_alloc.  The uses must be in a particular form: the
last use is a subreg use that uses only some of the hard registers, and
an earlier use is a non-subreg use of the whole multiword register.

When all of this happens, the code does not properly expand this to a
whole multiregister when the second to last use is encountered in the
backwards scan.

I.e. a lot of things have to happen to get this to fail.
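
A toy C model of the situation may help.  This is a sketch only: struct
toy_pseudo and the toy_use_* functions are invented names, and the real
bookkeeping in ra-conflict.c is more involved.  The scan runs backwards, so
the last use in program order is seen first.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy liveness record for one multiword pseudo during a backward scan.
   All names here are hypothetical.  */
struct toy_pseudo
{
  unsigned int n_words;     /* hard registers the pseudo occupies */
  bool live;                /* some later use has been seen */
  unsigned int live_words;  /* bit i set: word i is live */
};

/* Mask with one bit per word of the pseudo.  */
unsigned int
toy_all_words (const struct toy_pseudo *p)
{
  assert (p->n_words > 0 && p->n_words < 32);
  return (1u << p->n_words) - 1u;
}

/* A subreg use makes only one word live.  */
void
toy_use_subreg (struct toy_pseudo *p, unsigned int word)
{
  assert (word < p->n_words);
  p->live = true;
  p->live_words |= 1u << word;
}

/* A use of the whole register.  With CHECK_LIVE_ONLY set this models
   the failure: a pseudo already marked live is left alone, so a
   partial mask is never widened to the whole register.  */
void
toy_use_whole (struct toy_pseudo *p, bool check_live_only)
{
  if (check_live_only && p->live)
    return;
  p->live = true;
  p->live_words = toy_all_words (p);
}
```

For a two-word pseudo, a subreg use of word 0 followed (in scan order) by a
whole use leaves only word 0 live when check_live_only is set; without it both
words end up live, which is what the earlier whole use requires.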

Thanks for the small test case, that really helped.

Kenny



Index: ra-conflict.c
===================================================================
--- ra-conflict.c(revision 129036)
+++ ra-conflict.c(working copy)
@@ -76,7 +76,7 @@ record_one_conflict_between_regnos (enum
 enum machine_mode mode2, int r2)
 {
   if (dump_file)
-fprintf (dump_file, "  rocbr adding %d<=>%d\n", r1, r2);
+fprintf (dump_file, "rocbr adding %d<=>%d\n", r1, r2);
   if (reg_allocno[r1] >= 0 && reg_allocno[r2] >= 0)
 {
   int tr1 = reg_allocno[r1];
@@ -293,9 +293,6 @@ set_conflicts_for_earlyclobber (rtx insn
 recog_data.operand[use + 1]);
 }
 }
-
-  if (dump_file)
-fprintf (dump_file, "  finished early clobber conflicts.\n");
 }


@@ -876,7 +873,7 @@ global_conflicts (void)
 allocnum, renumber);
 }

-  else if (GET_ALLOCNO_LIVE (allocnos_live, allocnum) == 0)
+  else
 {
   if (dump_file)
 fprintf (dump_file, "dying pseudo\n");
@@ -963,6 +960,8 @@ global_conflicts (void)
  FIXME: We should consider either adding a new kind of
  clobber, or adding a flag to the clobber distinguish
  these two cases.  */
+  if (dump_file && VEC_length (df_ref_t, clobbers))
+fprintf (dump_file, "  clobber conflicts\n");
   for (k = VEC_length (df_ref_t, clobbers) - 1; k >= 0; k--)
 {
   struct df_ref *def = VEC_index (df_ref_t, clobbers, k);
@@ -1024,6 +1023,8 @@ global_conflicts (void)
   if (GET_CODE (PATTERN (insn)) == PARALLEL && multiple_sets (insn))
 {
   int j;
+  if (dump_file)
+fprintf (dump_file, "  multiple sets\n");
   for (j = VEC_length (df_ref_t, dying_regs) - 1; j >= 0; j--)
 {
   int used_in_output = 0;


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669



[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf

2007-10-07 Thread zadeck at naturalbridge dot com


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |zadeck at naturalbridge dot
                   |dot org                     |com
             Status|NEW                         |ASSIGNED
 GCC target triplet|                            |linux/ia32
   Last reconfirmed|2007-10-07 09:41:07         |2007-10-07 11:36:14
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669



[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf

2007-10-07 Thread zadeck at naturalbridge dot com


--- Comment #10 from zadeck at naturalbridge dot com  2007-10-07 21:57 
---
Subject: Re:  [4.3 Regression]  Revision 128957
 miscompiles 481.wrf

This patch fixes pr33669.

The failure only happens if you have a block with 2 or more uses of a
multiword pseudo register that is local to this block and has been
allocated by local_alloc.  The uses must be in a particular form: the
last use must be a subreg use that uses only some of the hard registers, and
an earlier use must be a non-subreg use of the whole multiword register.

When all of this happens, the code did not properly expand this to a
whole multiregister when the second to last use is encountered in the
backwards scan.

I.e. a lot of things have to happen to get this to fail.

I have tested this patch on ia-64, x86-{64,32} and ppc-32.

Ok for commit?

Kenny

2007-10-07  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/33669
* ra-conflict.c (record_one_conflict_between_regnos,
set_conflicts_for_earlyclobber, global_conflicts): Improved logging.
(global_conflicts): Removed incorrect check.

2007-10-07  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/33669
* gcc.c-torture/execute/pr33669.c: New.


Index: ra-conflict.c
===================================================================
--- ra-conflict.c   (revision 129053)
+++ ra-conflict.c   (working copy)
@@ -196,7 +196,7 @@ record_one_conflict_between_regnos (enum
   int allocno2 = reg_allocno[r2];

   if (dump_file)
-fprintf (dump_file, "  rocbr adding %d<=>%d\n", r1, r2);
+fprintf (dump_file, "rocbr adding %d<=>%d\n", r1, r2);

   if (allocno1 >= 0 && allocno2 >= 0)
 set_conflict (allocno1, allocno2);
@@ -401,9 +401,6 @@ set_conflicts_for_earlyclobber (rtx insn
recog_data.operand[use + 1]);
}
}
-
-  if (dump_file) 
-fprintf (dump_file, "  finished early clobber conflicts.\n");
 }


@@ -984,7 +981,7 @@ global_conflicts (void)
allocnum, renumber);
}

- else if (!sparseset_bit_p (allocnos_live, allocnum))
+ else
{
  if (dump_file)
fprintf (dump_file, "dying pseudo\n");
@@ -1071,6 +1068,8 @@ global_conflicts (void)
 FIXME: We should consider either adding a new kind of
 clobber, or adding a flag to the clobber distinguish
 these two cases.  */
+ if (dump_file && VEC_length (df_ref_t, clobbers))
+   fprintf (dump_file, "  clobber conflicts\n");
  for (k = VEC_length (df_ref_t, clobbers) - 1; k >= 0; k--)
{
  struct df_ref *def = VEC_index (df_ref_t, clobbers, k);
@@ -1132,6 +1131,8 @@ global_conflicts (void)
  if (GET_CODE (PATTERN (insn)) == PARALLEL && multiple_sets (insn))
{ 
  int j;
+ if (dump_file)
+   fprintf (dump_file, "  multiple sets\n");
  for (j = VEC_length (df_ref_t, dying_regs) - 1; j >= 0; j--)
{
  int used_in_output = 0;
Index: testsuite/gcc.c-torture/execute/pr33669.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr33669.c   (revision 0)
+++ testsuite/gcc.c-torture/execute/pr33669.c   (revision 0)
@@ -0,0 +1,40 @@
+extern void abort (void);
+
+typedef struct foo_t
+{ 
+  unsigned int blksz;
+  unsigned int bf_cnt; 
+} foo_t;
+
+#define _RNDUP(x, unit)  ((((x) + (unit) - 1) / (unit)) * (unit))
+#define _RNDDOWN(x, unit)  ((x) - ((x)%(unit)))
+
+long long
+foo (foo_t *const pxp,  long long offset, unsigned int extent)
+{
+  long long blkoffset = _RNDDOWN(offset, (long long )pxp->blksz);
+  unsigned int diff = (unsigned int)(offset - blkoffset);
+  unsigned int blkextent = _RNDUP(diff + extent, pxp->blksz);
+
+  if (pxp->blksz < blkextent)
+return -1LL;
+
+  if (pxp->bf_cnt > pxp->blksz)
+pxp->bf_cnt = pxp->blksz;
+
+  return blkoffset;
+}
+
+int
+main ()
+{
+  foo_t x;
+  long long xx;
+
+  x.blksz = 8192;
+  x.bf_cnt = 0;
+  xx = foo (&x, 0, 4096);
+  if (xx != 0LL)
+abort ();
+  return 0;
+}


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669



[Bug middle-end/33662] [4.3 Regression] Wrong register allocation on SH

2007-10-07 Thread zadeck at naturalbridge dot com


--- Comment #2 from zadeck at naturalbridge dot com  2007-10-08 03:53 
---


*** This bug has been marked as a duplicate of 33669 ***


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33662



[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf

2007-10-07 Thread zadeck at naturalbridge dot com


--- Comment #11 from zadeck at naturalbridge dot com  2007-10-08 03:53 
---
*** Bug 33662 has been marked as a duplicate of this bug. ***


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kkojima at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669



[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf

2007-10-09 Thread zadeck at naturalbridge dot com


--- Comment #14 from zadeck at naturalbridge dot com  2007-10-09 15:32 
---
Subject: Re:  [4.3 Regression]  Revision 128957
 miscompiles 481.wrf

hjl at gcc dot gnu dot org wrote:
> --- Comment #13 from hjl at gcc dot gnu dot org  2007-10-09 14:00 ---
> Subject: Bug 33669
>
> Author: hjl
> Date: Tue Oct  9 14:00:11 2007
> New Revision: 129166
>
> URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129166
> Log:
> gcc/
>
> 2007-10-09  Kenneth Zadeck <[EMAIL PROTECTED]>
>
> PR middle-end/33669
> * ra-conflict.c (record_one_conflict_between_regnos,
> set_conflicts_for_earlyclobber, global_conflicts): Improved
> logging.
> (global_conflicts): Removed incorrect check.
>
> gcc/testsuite/
>
> 2007-10-09  Kenneth Zadeck <[EMAIL PROTECTED]>
>
> PR middle-end/33669
> * gcc.c-torture/execute/pr33669.c: New.
>
> Added:
> trunk/gcc/testsuite/gcc.c-torture/execute/pr33669.c
> Modified:
> trunk/gcc/ChangeLog
> trunk/gcc/ra-conflict.c
> trunk/gcc/testsuite/ChangeLog
>
>
>   
please back this out.  i have a different patch that i have finished
testing. this one is too conservative.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669



[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf

2007-10-09 Thread zadeck at naturalbridge dot com


--- Comment #15 from zadeck at naturalbridge dot com  2007-10-09 15:41 
---
Subject: Re:  [4.3 Regression]  Revision 128957
 miscompiles 481.wrf

This patch fixes the problem in a slightly different way.  The other
patch was too conservative in that it ended up setting the added flag
too often, which causes some downstream quality issues.
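
The three versions of the condition can be compared side by side.  The C
sketch below restates them as standalone predicates; the function names are
made up, and the two parameters stand in for the allocnos_live bit and the
live_subregs_used[allocnum] count that appear in the diffs.

```c
#include <assert.h>
#include <stdbool.h>

/* Original check: only a pseudo that is not yet live is handled.  */
bool
toy_reset_original (bool live, int subregs_used)
{
  assert (subregs_used >= 0);
  return !live;
}

/* First patch: the check is dropped, so every use is handled.  */
bool
toy_reset_first_patch (bool live, int subregs_used)
{
  assert (subregs_used >= 0);
  (void) live;
  return true;
}

/* Second patch: handle a pseudo that is either not live or only
   partially live through subreg uses.  */
bool
toy_reset_second_patch (bool live, int subregs_used)
{
  assert (subregs_used >= 0);
  return subregs_used > 0 || !live;
}
```

The case that separates the two patches is a pseudo that is already fully
live with no subreg tracking: the first patch still fires there, the second
does not.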

I just finished testing this on x86-64, x86-32, ppc-32 and ia-64

kenny


2007-10-07  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/33669
* ra-conflict.c (record_one_conflict_between_regnos,
set_conflicts_for_earlyclobber, global_conflicts): Improved logging.
(global_conflicts): Removed incorrect check.

2007-10-07  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/33669
* gcc.c-torture/execute/pr33669.c: New.


Index: ra-conflict.c
===================================================================
--- ra-conflict.c   (revision 129053)
+++ ra-conflict.c   (working copy)
@@ -196,7 +196,7 @@ record_one_conflict_between_regnos (enum
   int allocno2 = reg_allocno[r2];

   if (dump_file)
-fprintf (dump_file, "  rocbr adding %d<=>%d\n", r1, r2);
+fprintf (dump_file, "rocbr adding %d<=>%d\n", r1, r2);

   if (allocno1 >= 0 && allocno2 >= 0)
 set_conflict (allocno1, allocno2);
@@ -401,9 +401,6 @@ set_conflicts_for_earlyclobber (rtx insn
recog_data.operand[use + 1]);
}
}
-
-  if (dump_file) 
-fprintf (dump_file, "  finished early clobber conflicts.\n");
 }


@@ -983,12 +980,12 @@ global_conflicts (void)
set_renumbers_live (&renumbers_live, live_subregs,
live_subregs_used, 
allocnum, renumber);
}
- 
- else if (!sparseset_bit_p (allocnos_live, allocnum))
+ else if (live_subregs_used[allocnum] > 0
+  || !sparseset_bit_p (allocnos_live, allocnum))
{
  if (dump_file)
-   fprintf (dump_file, "dying pseudo\n");
- 
+   fprintf (dump_file, "%sdying pseudo\n", 
+(live_subregs_used[allocnum] > 0) ? "partially ": "");
  /* Resetting the live_subregs_used is
 effectively saying do not use the subregs
 because we are reading the whole pseudo.  */
@@ -1071,6 +1068,8 @@ global_conflicts (void)
 FIXME: We should consider either adding a new kind of
 clobber, or adding a flag to the clobber distinguish
 these two cases.  */
+ if (dump_file && VEC_length (df_ref_t, clobbers))
+   fprintf (dump_file, "  clobber conflicts\n");
  for (k = VEC_length (df_ref_t, clobbers) - 1; k >= 0; k--)
{
  struct df_ref *def = VEC_index (df_ref_t, clobbers, k);
@@ -1132,6 +1131,8 @@ global_conflicts (void)
  if (GET_CODE (PATTERN (insn)) == PARALLEL && multiple_sets (insn))
{ 
  int j;
+ if (dump_file)
+   fprintf (dump_file, "  multiple sets\n");
  for (j = VEC_length (df_ref_t, dying_regs) - 1; j >= 0; j--)
{
  int used_in_output = 0;
@@ -1166,7 +1167,7 @@ global_conflicts (void)
}
}

-   /* Add the renumbers live to the hard_regs_live for the next few
+  /* Add the renumbers live to the hard_regs_live for the next few
 calls.  All of this gets recomputed at the top of the loop so
 there is no harm.  */
   IOR_HARD_REG_SET (hard_regs_live, renumbers_live);
Index: testsuite/gcc.c-torture/execute/pr33669.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr33669.c   (revision 0)
+++ testsuite/gcc.c-torture/execute/pr33669.c   (revision 0)
@@ -0,0 +1,40 @@
+extern void abort (void);
+
+typedef struct foo_t
+{ 
+  unsigned int blksz;
+  unsigned int bf_cnt; 
+} foo_t;
+
+#define _RNDUP(x, unit)  ((((x) + (unit) - 1) / (unit)) * (unit))
+#define _RNDDOWN(x, unit)  ((x) - ((x)%(unit)))
+
+long long
+foo (foo_t *const pxp,  long long offset, unsigned int extent)
+{
+  long long blkoffset = _RNDDOWN(offset, (long long )pxp->blksz);
+  unsigned int diff = (unsigned int)(offset - blkoffset);
+  unsigned int blkextent = _RNDUP(diff + extent, pxp->blksz);
+
+  if (pxp->blksz < blkextent)
+return -1LL;
+
+  if (pxp->bf_cnt > pxp->blksz)
+pxp->bf_cnt = pxp->blksz;
+
+  return blkoffset;
+}
+
+int
+main ()
+{
+  foo_t x;
+  long long xx;
+
+  x.blksz = 8192;
+  x.bf_cnt = 0;
+  xx = foo (&x, 0, 4096);
+  if (xx != 0LL)
+abort ();
+  return 0;
+}


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669



[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf

2007-10-09 Thread zadeck at naturalbridge dot com


--- Comment #18 from zadeck at naturalbridge dot com  2007-10-10 03:39 
---
Subject: Re:  [4.3 Regression]  Revision 128957
 miscompiles 481.wrf

HJ,

Sorry about the committing snafu.  I should have posted the irc log of
seonbae's comments to the log for the bug.  Also I had a meeting in the
city tonight, so there was not time to commit it between when seonbae
gave the final approval and when i had to catch my train.

I have committed the corrected patch as revision 129193.  It looks like
you had left the testcase when you reverted so there is no test case in
this patch.

This patch was tested on ia-64, ppc-32, x86-{64,32}.

Kenny

> This patch fixes pr33669 <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669>.
>
> The failure only happens if you have a block with 2 or more uses of a
> multiword pseudo register that is local to this block and has been
> allocated by local_alloc.  The uses must be in a particular form: the
> last use must be a subreg use that uses only some of the hard registers, and
> an earlier use must be a non-subreg use of the whole multiword register.
>
> When all of this happens, the code did not properly expand this to a
> whole multiregister when the second to last use is encountered in the
> backwards scan.
>
> I.e. a lot of things have to happen to get this to fail.
>
> I have tested this patch on ia-64, x86-{64,32} and ppc-32.
>



2007-10-07  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/33669
* ra-conflict.c (record_one_conflict_between_regnos,
set_conflicts_for_earlyclobber, global_conflicts): Improved logging.
(global_conflicts): Enhanced incorrect check.


Index: ra-conflict.c
===
--- ra-conflict.c   (revision 129192)
+++ ra-conflict.c   (working copy)
@@ -196,7 +196,7 @@ record_one_conflict_between_regnos (enum
   int allocno2 = reg_allocno[r2];

   if (dump_file)
-fprintf (dump_file, "  rocbr adding %d<=>%d\n", r1, r2);
+fprintf (dump_file, "rocbr adding %d<=>%d\n", r1, r2);

   if (allocno1 >= 0 && allocno2 >= 0)
 set_conflict (allocno1, allocno2);
@@ -401,9 +401,6 @@ set_conflicts_for_earlyclobber (rtx insn
recog_data.operand[use + 1]);
}
}
-
-  if (dump_file) 
-fprintf (dump_file, "  finished early clobber conflicts.\n");
 }


@@ -983,12 +980,12 @@ global_conflicts (void)
set_renumbers_live (&renumbers_live, live_subregs,
live_subregs_used, 
allocnum, renumber);
}
- 
- else if (!sparseset_bit_p (allocnos_live, allocnum))
+ else if (live_subregs_used[allocnum] > 0
+  || !sparseset_bit_p (allocnos_live, allocnum))
{
  if (dump_file)
-   fprintf (dump_file, "dying pseudo\n");
- 
+   fprintf (dump_file, "%sdying pseudo\n", 
+(live_subregs_used[allocnum] > 0) ? "partially ": "");
  /* Resetting the live_subregs_used is
 effectively saying do not use the subregs
 because we are reading the whole pseudo.  */
@@ -1071,6 +1068,8 @@ global_conflicts (void)
 FIXME: We should consider either adding a new kind of
 clobber, or adding a flag to the clobber distinguish
 these two cases.  */
+ if (dump_file && VEC_length (df_ref_t, clobbers))
+   fprintf (dump_file, "  clobber conflicts\n");
  for (k = VEC_length (df_ref_t, clobbers) - 1; k >= 0; k--)
{
  struct df_ref *def = VEC_index (df_ref_t, clobbers, k);
@@ -1132,6 +1131,8 @@ global_conflicts (void)
  if (GET_CODE (PATTERN (insn)) == PARALLEL && multiple_sets (insn))
{ 
  int j;
+ if (dump_file)
+   fprintf (dump_file, "  multiple sets\n");
  for (j = VEC_length (df_ref_t, dying_regs) - 1; j >= 0; j--)
{
  int used_in_output = 0;
@@ -1166,7 +1167,7 @@ global_conflicts (void)
}
}

-   /* Add the renumbers live to the hard_regs_live for the next few
+  /* Add the renumbers live to the hard_regs_live for the next few
 calls.  All of this gets recomputed at the top of the loop so
 there is no harm.  */
   IOR_HARD_REG_SET (hard_regs_live, renumbers_live);


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669



[Bug rtl-optimization/33669] [4.3 Regression] Revision 128957 miscompiles 481.wrf

2007-10-09 Thread zadeck at naturalbridge dot com


--- Comment #19 from zadeck at naturalbridge dot com  2007-10-10 03:41 
---
patch committed to fix this.


-- 

zadeck at naturalbridge dot com changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33669



[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer

2007-10-10 Thread zadeck at naturalbridge dot com


--- Comment #12 from zadeck at naturalbridge dot com  2007-10-10 11:41 
---
I will look at it today.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676



[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer

2007-10-10 Thread zadeck at naturalbridge dot com


-- 

zadeck at naturalbridge dot com changed:

   What|Removed |Added

  BugsThisDependsOn|33669   |
 AssignedTo|unassigned at gcc dot gnu   |zadeck at naturalbridge dot
   |dot org |com
 Status|NEW |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676



[Bug middle-end/33662] [4.3 Regression] Wrong register allocation on SH

2007-10-10 Thread zadeck at naturalbridge dot com


--- Comment #4 from zadeck at naturalbridge dot com  2007-10-10 13:33 
---
Subject: Re:  [4.3 Regression] Wrong register allocation
 on SH

kkojima at gcc dot gnu dot org wrote:
> --- Comment #3 from kkojima at gcc dot gnu dot org  2007-10-10 13:28 
> ---
> Not fixed by r129192.  I see
>
> FAIL: gcc.c-torture/execute/pr33669.c execution,  -O1
> FAIL: gcc.c-torture/execute/pr33669.c execution,  -O2
> FAIL: gcc.c-torture/execute/pr33669.c execution,  -Os
>
> on sh4-unknown-linux-gnu with r129192.
>
>
>   
i am so embarrassed.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33662



[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer

2007-10-11 Thread zadeck at naturalbridge dot com


--- Comment #14 from zadeck at naturalbridge dot com  2007-10-11 11:43 
---
Subject: Re:  libgfortran bootstrap failure: selected_int_kind.f90:22:
 Segmentation fault, wrong code with -fomit-frame-pointer

ebotcazou at gcc dot gnu dot org wrote:
> --- Comment #13 from ebotcazou at gcc dot gnu dot org  2007-10-11 11:14 
> ---
>   
>> Revision 128957 causes this regression.
>> 
>
> There is a suspect non-documented hunk in the commit:
>
> * reload1.c (compute_use_by_pseudos): Change DF_RA_LIVE
> usage to DF_LIVE usage.
>
> --- trunk/gcc/reload1.c 2007/10/02 12:47:13 128956
> +++ trunk/gcc/reload1.c 2007/10/02 13:10:07 128957
> @@ -548,7 +548,7 @@
>if (r < 0)
> {
>   /* reload_combine uses the information from
> -DF_RA_LIVE_IN (BASIC_BLOCK), which might still
> +DF_LIVE_IN (BASIC_BLOCK), which might still
>  contain registers that have not actually been allocated
>  since they have an equivalence.  */
>   gcc_assert (reload_completed);
> @@ -1158,10 +1158,7 @@
>
>if (! frame_pointer_needed)
>  FOR_EACH_BB (bb)
> -  {
> -   bitmap_clear_bit (df_get_live_in (bb), HARD_FRAME_POINTER_REGNUM);
> -   bitmap_clear_bit (df_get_live_top (bb), HARD_FRAME_POINTER_REGNUM);
> -  }
> +  bitmap_clear_bit (df_get_live_in (bb), HARD_FRAME_POINTER_REGNUM);
>
>/* Come here (with failure set nonzero) if we can't get enough spill
>   regs.  */
>
>
>   

That is fine, there are no top sets anymore.

the problem is the code that builds the reload insn chain.  the new code
uses the cfg and does not add the label or the jump table that lives
between basic blocks to the chain.  I will post a patch as soon as my
tests finish.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676



[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer

2007-10-11 Thread zadeck at naturalbridge dot com


--- Comment #16 from zadeck at naturalbridge dot com  2007-10-11 12:40 
---
Subject: Re:  libgfortran bootstrap failure: selected_int_kind.f90:22:
 Segmentation fault, wrong code with -fomit-frame-pointer

ebotcazou at gcc dot gnu dot org wrote:
> --- Comment #15 from ebotcazou at gcc dot gnu dot org  2007-10-11 12:24 
> ---
>   
>> That is fine, there are no top sets anymore.
>> 
>
> Thanks for the explanation, please fix the ChangeLog though.
>   
I will, sorry for the oversight.
>   
>> the problem is the code that builds the reload insn chain.  the new code
>> uses the cfg and does not add the label or the jump table that lives
>> between basic blocks to the chain.  I will post a patch as soon as my
>> tests finish.
>> 
>
> OK.
>
>
>   


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676



[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer

2007-10-11 Thread zadeck at naturalbridge dot com


--- Comment #17 from zadeck at naturalbridge dot com  2007-10-11 16:21 
---
Subject: Re:  libgfortran bootstrap failure: selected_int_kind.f90:22:
 Segmentation fault, wrong code with -fomit-frame-pointer

2007-10-11  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/33676
* global.c (build_insn_chain): Include insn that occur between
basic blocks.

2007-10-11  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/33676
* gcc.c-torture/gcc.dg/torture/pr33676.c: New.



When I rewrote this code to use backward scanning rather than forwards
scanning, I converted it to properly use the cfg, since it is generally
considered outmoded to just scan the insns.

However, the reload_insn_chain actually needs the insns that appear
between basic blocks, in particular the labels in front of branch
tables.  I added code here to check for insns that may be in front of a
basic block after scanning that block. 

There are a lot of ways that I could have done this, for instance, I
could have just written in terms of the PREV_INSN as the old code was. 
I think that in doing it the way that i have done it, it is obvious what
needs to be done if someone really does get rid of the branch tables
between the blocks.
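
As an illustration only, the backward walk described above can be sketched on a toy insn list.  This is not the code in global.c: the names toy_insn and toy_collect_between_blocks, the kind enum, and the use of -1 for "belongs to no basic block" are all assumptions made for this sketch.  It starts at the head of a block, steps past notes and barriers, and collects whatever else it finds until it reaches an insn owned by another block.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for RTL insns; hypothetical, not GCC's data structures.  */
enum toy_kind { TOY_INSN, TOY_NOTE, TOY_BARRIER, TOY_LABEL, TOY_JUMP_TABLE };

struct toy_insn
{
  enum toy_kind kind;
  int block;                /* Owning basic block index, or -1 for none.  */
  struct toy_insn *prev;    /* Previous insn in the function, or NULL.  */
};

/* Notes and barriers are never added to the chain.  */
static int
toy_skippable (const struct toy_insn *insn)
{
  return insn->kind == TOY_NOTE || insn->kind == TOY_BARRIER;
}

/* HEAD is the first insn of block BB.  Walk backwards from it and store
   in OUT (capacity MAX) every insn that sits between BB and the block
   before it.  Return the number of insns stored.  */
static size_t
toy_collect_between_blocks (struct toy_insn *head, int bb,
                            struct toy_insn **out, size_t max)
{
  struct toy_insn *insn = head;
  size_t n = 0;

  /* Step out of BB itself, skipping notes and barriers on the way.  */
  while (insn && (toy_skippable (insn) || insn->block == bb))
    insn = insn->prev;

  /* Collect labels and jump tables until an insn owned by a block.  */
  while (insn)
    {
      if (!toy_skippable (insn))
        {
          if (insn->block >= 0)
            break;
          assert (n < max);
          out[n++] = insn;
        }
      insn = insn->prev;
    }
  return n;
}
```

In this model a label and a jump table that sit between block 0 and block 1 are returned when the walk starts from the head of block 1, and nothing is returned for the first block of the function.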

This has been bootstrapped and regression tested on x86-{64,32} ppc-32,
and ia-64.  However it is not clear to me how many platforms use this
kind of table branch.  The bug appears to only be on the -march=i586, so
the reviewers may wish to comment on my choice of dg options on the test. 


Ok to commit?

Kenny
Index: testsuite/gcc.dg/torture/pr33676.c
===
--- testsuite/gcc.dg/torture/pr33676.c  (revision 0)
+++ testsuite/gcc.dg/torture/pr33676.c  (revision 0)
@@ -0,0 +1,53 @@
+/* { dg-do run } */ 
+/* { dg-options "-march=i586 -fomit-frame-pointer" { target { { i?86-*-* x86_64-*-* } && ilp32 } } } */
+
+// Small testcase, compile with "-march=i586 -O0 -fomit-frame-pointer":
+
+__attribute__((noreturn,noinline)) void abrt (const char *fi, const char *fu)
+{
+  __builtin_abort ();
+}
+
+__attribute__((noinline)) int f (int k)
+{
+  return k;
+}
+
+__attribute__((noinline)) int g (int t, int k)
+{
+  int b;
+
+  switch (t)
+{
+case 0:
+  abrt (__FILE__, __FUNCTION__);
+
+case 1:
+  b = f (k);
+  break;
+
+case 2:
+  b = f (k);
+  break;
+
+case 3:
+  b = f (k);
+  break;
+
+case 4:
+  b = f (k);
+  break;
+
+default:
+  abrt (__FILE__, __FUNCTION__);
+}
+
+  return b;
+}
+
+int main (void)
+{
+  if (g (3, 1337) != 1337)
+  abrt (__FILE__, __FUNCTION__);
+  return 0;
+}
Index: global.c
===
--- global.c(revision 129224)
+++ global.c(working copy)
@@ -1575,6 +1575,37 @@ build_insn_chain (void)
  }
}
}
+
+  /* FIXME!! The following code is a disaster.  Reload needs to see the
+labels and jump tables that are just hanging out in between
+the basic blocks.  See pr33676.  */
+
+  insn = BB_HEAD (bb);
+
+  /* Skip over the barriers and cruft.  */
+  while (insn && (BARRIER_P (insn) || NOTE_P (insn) || BLOCK_FOR_INSN (insn) == bb))
+   insn = PREV_INSN (insn);
+   
+  /* Look for labels and jump tables.  */
+  while (insn)
+   {
+ if (!NOTE_P (insn) && !BARRIER_P (insn))
+   {
+ if (BLOCK_FOR_INSN (insn))
+   break;
+
+ c = new_insn_chain ();
+ c->next = next;
+ next = c;
+ *p = c;
+ p = &c->prev;
+ 
+ c->insn = insn;
+ c->block = bb->index;
+ bitmap_copy (&c->live_throughout, live_relevant_regs);
+   } 
+ insn = PREV_INSN (insn);
+   }
 }

   for (i = 0; i < (unsigned int)max_regno; i++)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676



[Bug middle-end/33662] [4.3 Regression] Wrong register allocation on SH

2007-10-11 Thread zadeck at naturalbridge dot com


--- Comment #7 from zadeck at naturalbridge dot com  2007-10-11 21:50 
---
kazumoto, 

there was a set of miscommunications associated with the final patch for
pr33669.

hj had checked in an earlier version of the patch and that testcase and i asked
him to revert it because there were issues with it.  He only reverted the code
and left the testcase in.  You tested against version 129192 and i checked in
the corrected patch as 129193.

given that, pr33669.c should have failed.  seongbae has verified that pr33669.c
and the testcase here no longer fail on the current trunk with sh-elf.

I am going to assume that this is closed unless you find some other issue. 
Sorry for the mess up.

Kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33662



[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer

2007-10-11 Thread zadeck at naturalbridge dot com


--- Comment #20 from zadeck at naturalbridge dot com  2007-10-11 22:35 
---
Subject: Re:  libgfortran bootstrap failure: selected_int_kind.f90:22:
 Segmentation fault, wrong code with -fomit-frame-pointer

zadeck at naturalbridge dot com wrote:
> --- Comment #17 from zadeck at naturalbridge dot com  2007-10-11 16:21 
> ---
> Subject: Re:  libgfortran bootstrap failure: selected_int_kind.f90:22:
>  Segmentation fault, wrong code with -fomit-frame-pointer
>
>
>
> When I rewrote this code to use backward scanning rather than forwards
> scanning, I converted it to properly use the cfg, since it is generally
> considered outmoded to just scan the insns.
>
> However, the reload_insn_chain actually needs the insns that appear
> between basic blocks, in particular the labels in front of branch
> tables.  I added code here to check for insns that may be in front of a
> basic block after scanning that block. 
>
> There are a lot of ways that I could have done this, for instance, I
> could have just written in terms of the PREV_INSN as the old code was. 
> I think that in doing it the way that i have done it, it is obvious what
> needs to be done if someone really does get rid of the branch tables
> between the blocks.
>
> This has been bootstrapped and regression tested on x86-{64,32} ppc-32,
> and ia-64.  However it is not clear to me how many platforms use this
> kind of table branch.  The bug appears to only be on the -march=i586, so
> the reviewers may wish to comment on my choice of dg options on the test. 
>
>   
2007-10-11  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/33676
* global.c (build_insn_chain): Include insn that occur between
basic blocks.

2007-10-11  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/33676
* gcc.dg/torture/pr33676.c: New.

bootstrapped and regression tested on x86-32 x86-64, ppc-32 and ia-64.

committed as revision 129244.

Kenny
Index: ChangeLog
===
--- ChangeLog   (revision 129243)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2007-10-11  Kenneth Zadeck <[EMAIL PROTECTED]>
+
+   PR middle-end/33676
+   * global.c (build_insn_chain): Include insn that occur between
+   basic blocks.
+   
 2007-10-11  Tom Tromey  <[EMAIL PROTECTED]>

* gengtype-yacc.y: Delete.
Index: testsuite/gcc.dg/torture/pr33676.c
===
--- testsuite/gcc.dg/torture/pr33676.c  (revision 0)
+++ testsuite/gcc.dg/torture/pr33676.c  (revision 0)
@@ -0,0 +1,51 @@
+/* { dg-do run } */ 
+/* { dg-options "-march=i586 -fomit-frame-pointer" { target { { i?86-*-* x86_64-*-* } && ilp32 } } } */
+
+__attribute__((noreturn,noinline)) void abrt (const char *fi, const char *fu)
+{
+  __builtin_abort ();
+}
+
+__attribute__((noinline)) int f (int k)
+{
+  return k;
+}
+
+__attribute__((noinline)) int g (int t, int k)
+{
+  int b;
+
+  switch (t)
+{
+case 0:
+  abrt (__FILE__, __FUNCTION__);
+
+case 1:
+  b = f (k);
+  break;
+
+case 2:
+  b = f (k);
+  break;
+
+case 3:
+  b = f (k);
+  break;
+
+case 4:
+  b = f (k);
+  break;
+
+default:
+  abrt (__FILE__, __FUNCTION__);
+}
+
+  return b;
+}
+
+int main (void)
+{
+  if (g (3, 1337) != 1337)
+  abrt (__FILE__, __FUNCTION__);
+  return 0;
+}
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 129243)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2007-10-11  Kenneth Zadeck <[EMAIL PROTECTED]>
+
+   PR middle-end/33676
+   * gcc.dg/torture/pr33676.c: New.
+
 2007-10-11  Paolo Carlini  <[EMAIL PROTECTED]>

PR c++/31441
Index: global.c
===
--- global.c(revision 129243)
+++ global.c(working copy)
@@ -1575,6 +1575,41 @@ build_insn_chain (void)
  }
}
}
+
+  /* FIXME!! The following code is a disaster.  Reload needs to see the
+labels and jump tables that are just hanging out in between
+the basic blocks.  See pr33676.  */
+
+  insn = BB_HEAD (bb);
+
+  /* Skip over the barriers and cruft.  */
+  while (insn && (BARRIER_P (insn) || NOTE_P (insn) || BLOCK_FOR_INSN (insn) == bb))
+   insn = PREV_INSN (insn);
+   
+  /* While we add anything except barriers and notes, the focus is
+to get the labels and jump tables into the
+reload_insn_chain.  */
+  while (insn)
+   {
+ if (!NOTE_P (insn) && !BARRIER_P (insn))
+   {
+ if (BLOCK_FOR_INSN (insn))
+   break;
+
+ c = new_insn_chain ();
+ c->next = next;
+ next = c;
+   

[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer

2007-10-12 Thread zadeck at naturalbridge dot com


--- Comment #24 from zadeck at naturalbridge dot com  2007-10-12 14:38 
---
Subject: Re:  libgfortran bootstrap failure: selected_int_kind.f90:22:
  Segmentation fault, wrong code with -fomit-frame-pointer

Eric Botcazou wrote:
>> 2007-10-11  Kenneth Zadeck <[EMAIL PROTECTED]>
>>
>> PR middle-end/33676
>> * global.c (build_insn_chain): Include insn that occur between
>> basic blocks.
>> 
>
> Who approved this patch?
>
>   
>> However, the reload_insn_chain actually needs the insns that appear
>> between basic blocks, in particular the labels in front of branch
>> tables.  I added code here to check for insns that may be in front of a
>> basic block after scanning that block.
>>
>> There are a lot of ways that I could have done this, for instance, I
>> could have just written in terms of the PREV_INSN as the old code was.
>> I think that in doing it the way that i have done it, it is obvious what
>> needs to be done if someone really does get rid of the branch tables
>> between the blocks.
>> 
>
> Sure, but the code in build_insn_chain is now more convoluted than in the 
> original version (and twice as big).  And, please, fix the formatting.
>
>   
it was approved by seongbae, a register allocation reviewer.  The
reason that it is longer is that it is more precise.  The code to
properly handle subregs, as well as properly dealing with registers live
thru insns,  accounts for most of the expansion over the old code.

formatting fixes committed as revision 129262.

kenny
Index: global.c
===
--- global.c(revision 129260)
+++ global.c(working copy)
@@ -1358,6 +1358,8 @@ mark_elimination (int from, int to)
 }
 }

+/* Print chain C to FILE.  */
+
 static void
 print_insn_chain (FILE *file, struct insn_chain *c)
 {
@@ -1366,6 +1368,9 @@ print_insn_chain (FILE *file, struct ins
   bitmap_print (file, &c->dead_or_set, "dead_or_set: ", "\n");
 }

+
+/* Print all reload_insn_chains to FILE.  */
+
 static void
 print_insn_chains (FILE *file)
 {
@@ -1373,8 +1378,11 @@ print_insn_chains (FILE *file)
   for (c = reload_insn_chain; c ; c = c->next)
 print_insn_chain (file, c);
 }
+
+
 /* Walk the insns of the current function and build reload_insn_chain,
and record register life information.  */
+
 static void
 build_insn_chain (void)
 {
@@ -1450,7 +1458,7 @@ build_insn_chain (void)
  {
if (regno < FIRST_PSEUDO_REGISTER)
  {
-   if (! fixed_regs[regno])
+   if (!fixed_regs[regno])
  bitmap_set_bit (&c->dead_or_set, regno);
  }
else if (reg_renumber[regno] >= 0)
@@ -1461,16 +1469,20 @@ build_insn_chain (void)
&& (!DF_REF_FLAGS_IS_SET (def, DF_REF_CONDITIONAL)))
  {
rtx reg = DF_REF_REG (def);
+
/* We can model subregs, but not if they are
   wrapped in ZERO_EXTRACTS.  */
if (GET_CODE (reg) == SUBREG
&& !DF_REF_FLAGS_IS_SET (def, DF_REF_EXTRACT))
  {
unsigned int start = SUBREG_BYTE (reg);
-   unsigned int last = start + GET_MODE_SIZE (GET_MODE (reg));
+   unsigned int last = start 
+ + GET_MODE_SIZE (GET_MODE (reg));

-   ra_init_live_subregs (bitmap_bit_p (live_relevant_regs, regno), 
- live_subregs, live_subregs_used,
+   ra_init_live_subregs (bitmap_bit_p (live_relevant_regs, 
+   regno), 
+ live_subregs, 
+ live_subregs_used,
  regno, reg);
/* Ignore the paradoxical bits.  */
if ((int)last > live_subregs_used[regno])
@@ -1535,7 +1547,7 @@ build_insn_chain (void)
  {
if (regno < FIRST_PSEUDO_REGISTER)
  {
-   if (! fixed_regs[regno])
+   if (!fixed_regs[regno])
  bitmap_set_bit (&c->dead_or_set, regno);
  }
else if (reg_renumber[regno] >= 0)
@@ -1548,10 +1560,13 @@ build_insn_chain (void)
&& !DF_REF_FLAGS_IS_SET (use, DF_REF_EXTRACT)) 
   

[Bug rtl-optimization/33676] libgfortran bootstrap failure: selected_int_kind.f90:22: Segmentation fault, wrong code with -fomit-frame-pointer

2007-10-12 Thread zadeck at naturalbridge dot com


--- Comment #22 from zadeck at naturalbridge dot com  2007-10-12 11:59 
---
it seems to be clean now.


-- 

zadeck at naturalbridge dot com changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33676



[Bug rtl-optimization/33644] [4.3 Regression] ICE in local_cprop_pass with -ftrapv for crafty

2007-10-15 Thread zadeck at naturalbridge dot com


--- Comment #2 from zadeck at naturalbridge dot com  2007-10-15 13:11 
---
Subject: Re:  [4.3 Regression] ICE in local_cprop_pass
 with -ftrapv for crafty

> On Sun, Oct 14, 2007 at 12:29:44PM -0400, Kenneth Zadeck wrote:

> > > I have not looked at this bug.  I am happy to if you want.  I am sure
> > > that it will be trivial to modify the pass that moved/created the insn
> > > in the middle of the libcall to inherit the LIB_CALL_ID from the
> > > previous insn. 
>   

> That is not desirable, if anything in this case the insn should be
> added before the whole libcall sequence rather than before the insn
> that actually needs it.  Otherwise, useless insns added to the libcall
> sequences wouldn't be ever DCEd.

> While it might be easy to modify the instantiate_virtual_regs, there
> are dozens of other passes that do similar things, so at least for 4.3 it is
> highly unlikely they will be all modified.

>   Jakub


Jakub, i will fix this by moving the insn before the libcall.  It may take me a
day or so because i am under the weather.  But i will do it soon.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33644



[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code

2007-10-17 Thread zadeck at naturalbridge dot com


--- Comment #3 from zadeck at naturalbridge dot com  2007-10-17 11:25 
---
Subject: Re:  valgrind error with -O2 for linux
 kernel code

bergner at gcc dot gnu dot org wrote:
> --- Comment #2 from bergner at gcc dot gnu dot org  2007-10-17 04:46 
> ---
> Although valgrind is correct that we are doing an uninitialized read, the code
> is actually working as designed and is correct.
>
> When we allocate a sparseset, we only need to set set->members to 0 to clear
> the set.  The arrays set->sparse[] and set->dense[] are not and do not need to
> be initialized.  To test a value "n" for membership in "set", it needs to
> satisfy two properties:
>
>set->sparse[n] < set->members
>
> and
>
>set->dense[set->sparse[n]] == n
>
> The uninitialized read occurs when "n" is not (and never has been) a member of
> "set".  In this case, set->sparse[n] will be uninitialized and could be any
> value.  If set->sparse[n] happens to be >= set->members, we luckily (but
> correctly) return that "n" is not a member of the set.  If the uninitialized
> set->sparse[n] is < set->members, we continue on to verify that
> set->dense[set->sparse[n]] == n.  This test cannot be true since all
> set->dense[i] entries for i < set->members are initialized and "n" is not a
> member of the set.  So yes we do some uninitialized accesses to the sparse
> array, but that's ok.  It's also a benefit of sparseset, given that we don't
> have to memset/clear the whole sparseset data structure before using it, so
> it's fast.
>
>
>   
peter,

i think that this is clever and nice but it is not going to fly.  people
will be running valgrind and this will hit them over and over again. 
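
For readers who have not seen the trick, the membership test peter describes can be sketched as below.  This is a minimal sketch, not GCC's sparseset implementation: the toy_sparseset names are made up for this example, and the arrays are allocated with calloc (an assumption of the sketch, and one possible way to keep valgrind quiet) even though the algorithm itself only requires members to be set to 0.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy sparse set over the universe 0..size-1.  Field names follow the
   description above; everything else is hypothetical.  */
struct toy_sparseset
{
  unsigned int *sparse;   /* sparse[n]: claimed position of n in dense[].  */
  unsigned int *dense;    /* dense[0..members-1]: the elements, packed.  */
  unsigned int members;   /* Number of elements currently in the set.  */
  unsigned int size;      /* Size of the universe.  */
};

/* Allocate an empty set.  calloc gives both arrays defined contents, so
   a memory checker has nothing to report; plain malloc would also be
   correct for the algorithm.  */
static struct toy_sparseset *
toy_sparseset_alloc (unsigned int size)
{
  struct toy_sparseset *s = malloc (sizeof *s);
  assert (size > 0 && s != NULL);
  s->sparse = calloc (size, sizeof *s->sparse);
  s->dense = calloc (size, sizeof *s->dense);
  assert (s->sparse != NULL && s->dense != NULL);
  s->members = 0;
  s->size = size;
  return s;
}

/* Clearing is O(1): stale sparse[] and dense[] entries are harmless.  */
static void
toy_sparseset_clear (struct toy_sparseset *s)
{
  s->members = 0;
}

/* N is a member only if both properties from the text hold.  */
static int
toy_sparseset_bit_p (const struct toy_sparseset *s, unsigned int n)
{
  unsigned int idx;
  assert (n < s->size);
  idx = s->sparse[n];
  return idx < s->members && s->dense[idx] == n;
}

/* Add N by appending it to dense[] and recording its position.  */
static void
toy_sparseset_set_bit (struct toy_sparseset *s, unsigned int n)
{
  if (toy_sparseset_bit_p (s, n))
    return;
  s->sparse[n] = s->members;
  s->dense[s->members++] = n;
}

/* Release the set and its arrays.  */
static void
toy_sparseset_free (struct toy_sparseset *s)
{
  free (s->sparse);
  free (s->dense);
  free (s);
}
```

The second half of the test in toy_sparseset_bit_p is what rejects a stale or never-written sparse[] entry: its index may look valid, but the dense[] slot it points at holds some other element.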


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796



[Bug middle-end/37448] [4.3 Regression] gcc 4.3.1 cannot compile big function

2008-09-27 Thread zadeck at naturalbridge dot com


--- Comment #23 from zadeck at naturalbridge dot com  2008-09-27 12:44 
---
I do not believe honza.  

My measurements at -O0 on x86-42 are about 15 refs per insn.  
This is based on the following stats.  (These can be reproduced using a patch
that i am about to submit).

;;total ref usage 8428419{7601408d,827011u,0e} in 570685{406804 regular +
163881 call} insns.

This yields about 15 refs per insn.   While this number is large, it is
reasonable considering that slightly less than 30% of the insns are call
instructions.   Call instructions have a lot of clobbers.   It is possible that
some mechanism could be devised to share these refs, but this will mess up
things like building chains so it is certainly not something that is going to
be easy to do.
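
The arithmetic behind those two figures can be checked with a few lines of C.  The counts are copied from the stats line above; reading the "d", "u" and "e" suffixes as defs, uses and uses inside notes is an assumption of this sketch, as are the helper names.

```c
#include <assert.h>

/* Counts copied from the ";;total ref usage" line.  */
enum
{
  DEFS = 7601408,          /* "d" (assumed: defs).  */
  USES = 827011,           /* "u" (assumed: uses).  */
  EQ_USES = 0,             /* "e" (assumed: uses inside notes).  */
  REGULAR_INSNS = 406804,
  CALL_INSNS = 163881
};

/* Sum of all refs; matches the 8428419 total in the stats line.  */
static long
total_refs (void)
{
  return (long) DEFS + USES + EQ_USES;
}

/* Sum of all insns; matches the 570685 total in the stats line.  */
static long
total_insns (void)
{
  return (long) REGULAR_INSNS + CALL_INSNS;
}

/* Average number of refs attached to each insn.  */
static double
refs_per_insn (void)
{
  assert (total_insns () > 0);
  return (double) total_refs () / (double) total_insns ();
}

/* Fraction of insns that are calls.  */
static double
call_fraction (void)
{
  assert (total_insns () > 0);
  return (double) CALL_INSNS / (double) total_insns ();
}
```

With these inputs the ratio comes out a little under 15 refs per insn, and calls make up a little under 29% of the insns, which is consistent with the "about 15" and "slightly less than 30%" quoted above.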

The df patch that i have submitted makes modest progress on reducing the size
of df-refs.   Hopefully bonzini will finish reviewing this soon.

I should also point out that honza's alloc pool stats were completely bogus.  I
have submitted a patch that fixes the way stats are accumulated for
alloc-pools.  We can account for all of the df-refs and the peak usage
according to the new alloc-pool stats is very close to the number used by the
largest function.

Once those patches are installed, I will consider this bugzilla resolved with
respect to the df issues.  


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37448



[Bug target/37808] [4.4 Regression]: Revision 141067 breaks Linux/x86

2008-10-11 Thread zadeck at naturalbridge dot com


--- Comment #3 from zadeck at naturalbridge dot com  2008-10-12 04:56 
---
Created an attachment (id=16485)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16485&action=view)
possible patch to fix the problem

I am pretty sure that this fixes it, but i need to do more testing.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37808



[Bug target/37808] [4.4 Regression]: Revision 141067 breaks Linux/x86

2008-10-12 Thread zadeck at naturalbridge dot com


--- Comment #8 from zadeck at naturalbridge dot com  2008-10-12 21:13 
---
Subject: Re:  [4.4 Regression]: Revision 141067 breaks Linux/x86

andreast at gcc dot gnu dot org wrote:
> --- Comment #7 from andreast at gcc dot gnu dot org  2008-10-12 20:31 
> ---
> I see a failure on sparc-solaris8/10 too. Configury of stage2 fails.
> Applying the mentioned patch cures compilation.
> My sparc config is with multilib. 32-bit/64-bit.
>
>
>   
The problem is that the bb is no longer kept in the df-ref, and is
instead extracted from the insn.
This particular problem was caused by insns being deleted in a pass that
defers rescanning but that also changes register numbers.   The fix
checks to make sure the insn is still in a basic block before trying to
mark the block as being dirty.
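
A toy model of the situation, for illustration only: the block is derived from the insn rather than stored in the ref, so a ref whose insn has been deleted yields no block, and the dirty-marking has to tolerate that.  The mini_ names below are hypothetical and are not the df-scan.c structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a basic block, an insn and a df ref.  */
struct mini_bb { int dirty; };
struct mini_insn { struct mini_bb *bb; };   /* NULL once deleted.  */
struct mini_ref { struct mini_insn *insn; };

/* The block is looked up through the insn, not cached in the ref.  */
static struct mini_bb *
mini_ref_bb (const struct mini_ref *ref)
{
  assert (ref != NULL && ref->insn != NULL);
  return ref->insn->bb;
}

/* Mark the ref's block dirty.  Return 1 if a block was marked, 0 if
   the insn no longer belongs to any block.  */
static int
mini_mark_ref_dirty (const struct mini_ref *ref)
{
  struct mini_bb *bb = mini_ref_bb (ref);
  if (bb == NULL)
    return 0;
  bb->dirty = 1;
  return 1;
}
```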

2008-10-12  Kenneth Zadeck <[EMAIL PROTECTED]>

PR middle-end/37808
* df-scan.c (df_ref_change_reg_with_loc_1): Added test to make
sure that ref has valid bb.

Tested by me on both x86-32 and x86-64.   Also tested by andreast on
sparc-solaris and by keating.

OK to commit?

kenny
Index: df-scan.c
===
--- df-scan.c   (revision 141071)
+++ df-scan.c   (working copy)
@@ -1980,7 +1980,8 @@ df_ref_change_reg_with_loc_1 (struct df_
DF_REF_PREV_REG (new_df->reg_chain) = the_ref;
  new_df->reg_chain = the_ref;
  new_df->n_refs++;
- df_set_bb_dirty (DF_REF_BB (the_ref));
+ if (DF_REF_BB (the_ref))
+   df_set_bb_dirty (DF_REF_BB (the_ref));

  /* Need to sort the record again that the ref was in because
 the regno is a sorting key.  First, find the right


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37808



[Bug target/37808] [4.4 Regression]: Revision 141067 breaks Linux/x86

2008-10-12 Thread zadeck at naturalbridge dot com


--- Comment #11 from zadeck at naturalbridge dot com  2008-10-12 21:19 
---
Subject: Re:  [4.4 Regression]: Revision 141067 breaks Linux/x86

Richard Guenther wrote:
> On Sun, Oct 12, 2008 at 11:12 PM, Kenneth Zadeck
> <[EMAIL PROTECTED]> wrote:
>   
>> andreast at gcc dot gnu dot org wrote:
>> 
>>> --- Comment #7 from andreast at gcc dot gnu dot org  2008-10-12 20:31 
>>> ---
>>> I see a failure on sparc-solaris8/10 too. Configury of stage2 fails.
>>> Applying the mentioned patch cures compilation.
>>> My sparc config is with multilib. 32-bit/64-bit.
>>>
>>>
>>>
>>>   
>> The problem is that the bb is no longer kept in the df-ref, and is
>> instead extracted from the insn.
>> This particular problem was caused by insns being deleted in a pass that
>> defers rescanning but that also changes register numbers.   The fix
>> checks to make sure the insn is still in a basic block before trying to
>> mark the block as being dirty.
>> 
>
> Ok.  I think it's odd that we keep refs to deleted insns - but that's probably
> because of the deferred re-scan, right?
>
> Thanks,
> Richard.
>   
yes, this only is because of the deferred rescan.

committed as revision 14178.

kenny
>   
>> 2008-10-12  Kenneth Zadeck <[EMAIL PROTECTED]>
>>
>>PR middle-end/37808
>>* df-scan.c (df_ref_change_reg_with_loc_1): Added test to make
>>sure that ref has valid bb.
>>
>> Tested by me on both x86-32 and x86-64.   Also tested by andreast on
>> sparc-solaris and by keating.
>>
>> OK to commit?
>>
>> kenny
>>
>> 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37808



[Bug target/37808] [4.4 Regression]: Revision 141067 breaks Linux/x86

2008-10-12 Thread zadeck at naturalbridge dot com


--- Comment #12 from zadeck at naturalbridge dot com  2008-10-12 21:22 
---
fixed with the above patch.


-- 

zadeck at naturalbridge dot com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37808



[Bug target/37378] [4.4 Regression] Revision 139827 causes Divide_X

2008-10-24 Thread zadeck at naturalbridge dot com


--- Comment #20 from zadeck at naturalbridge dot com  2008-10-24 18:44 
---
Subject: Re:  [4.4 Regression] Revision 139827 causes Divide_X

jakub at gcc dot gnu dot org wrote:
> --- Comment #19 from jakub at gcc dot gnu dot org  2008-10-24 18:09 
> ---
> This hunk in df-scan.c confuses me:
>
>   /* These registers are live everywhere.  */
>   if (!reload_completed)
> {
> #ifdef EH_USES
>   /* The ia-64, the only machine that uses this, does not define these 
>  until after reload.  */
>   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> if (EH_USES (i))
>   {
> bitmap_set_bit (entry_block_defs, i);
>   }
> #endif
>
> Indeed, ia64 is the only port that defines EH_USES ever to non-zero value, and
> only if reload_completed.  So this is a nice nop, but supposedly just changing
> the guarding condition to if (reload_completed) could fix this up.
>
>
>   
I cannot justify the existing code, either by looking at it or what used
to be in flow.c. 
I do agree that the existing code is a noop and should be either fixed
or deleted. 

I must admit that i think the proper solution is going to have to be
one that adds the eh_uses onto the uses of instructions that can
trap, because the block of code referenced here only affects the forwards
dataflow problem. 

However, this problem is really not so much about dataflow analysis as
it is about the meaning of these target specific macros.   Whatever the
solution is, i think that it should at least be blessed by iant or jim
wilson rather than just a dataflow maintainer. 

I would also point out that dealing with the EH_USES is not going to
make any difference to the "similar" problem that happens on the cris.

Kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37378



[Bug middle-end/34400] [4.3 regression] bad interaction between DF and SJLJ exceptions

2008-01-17 Thread zadeck at naturalbridge dot com


--- Comment #50 from zadeck at naturalbridge dot com  2008-01-17 21:06 
---
Subject:  [4.3 regression] bad interaction between DF
 and SJLJ exceptions

This is the second of three patches to fix 34400.  This patch also makes
some progress on 26854 but more work is required that is not going to be
done in 4.3 to fix the problems here. 

This patch uses the output of the df_lr problem to make the df_live
problem converge faster. 
This not only saves time but also space since the size of the df_live
bitmaps never grows and the space of our bitmaps is proportional to the
number of 1 bits.

This has been tested on several platforms and along with the patch just
committed cuts the time on the 34400 problems significantly.  I believe
that this patch also has some modest improvement on bootstrap time, i.e.
regular programs.

The change to df_live_reset is a slightly related latent bug fix.

Ok to commit?

Kenny


2008-01-17  Kenneth Zadeck  <[EMAIL PROTECTED]>
Steven Bosscher  <[EMAIL PROTECTED]>

PR rtl-optimization/26854
PR rtl-optimization/34400
* df-problems.c (df_live_scratch): New scratch bitmap.
(df_live_alloc): Allocate df_live_scratch when doing df_live.
(df_live_reset): Clear the proper bitmaps.
(df_live_bb_local_compute): Only process the artificial defs once
since the order is not important.
(df_live_init): Init the df_live sets only with the variables
found live by df_lr.
(df_live_transfer_function): Use the df_lr sets to prune the
df_live sets as they are being computed.  
(df_live_free): Free df_live_scratch.

Index: df-problems.c
===
--- df-problems.c   (revision 130752)
+++ df-problems.c   (working copy)
@@ -1323,6 +1323,8 @@ struct df_live_problem_data
   bitmap *out;
 };

+/* Scratch var used by transfer functions.  */
+static bitmap df_live_scratch;

 /* Set basic block info.  */

@@ -1366,6 +1368,8 @@ df_live_alloc (bitmap all_blocks ATTRIBU
   if (!df_live->block_pool)
 df_live->block_pool = create_alloc_pool ("df_live_block pool", 
   sizeof (struct df_live_bb_info),
100);
+  if (!df_live_scratch)
+df_live_scratch = BITMAP_ALLOC (NULL);

   df_grow_bb_info (df_live);

@@ -1401,7 +1405,7 @@ df_live_reset (bitmap all_blocks)

   EXECUTE_IF_SET_IN_BITMAP (all_blocks, 0, bb_index, bi)
 {
-  struct df_lr_bb_info *bb_info = df_lr_get_bb_info (bb_index);
+  struct df_live_bb_info *bb_info = df_live_get_bb_info (bb_index);
   gcc_assert (bb_info);
   bitmap_clear (bb_info->in);
   bitmap_clear (bb_info->out);
@@ -1420,13 +1424,6 @@ df_live_bb_local_compute (unsigned int b
   struct df_ref **def_rec;
   int luid = 0;

-  for (def_rec = df_get_artificial_defs (bb_index); *def_rec; def_rec++)
-{
-  struct df_ref *def = *def_rec;
-  if (DF_REF_FLAGS (def) & DF_REF_AT_TOP)
-   bitmap_set_bit (bb_info->gen, DF_REF_REGNO (def));
-}
-
   FOR_BB_INSNS (bb, insn)
 {
   unsigned int uid = INSN_UID (insn);
@@ -1467,8 +1464,7 @@ df_live_bb_local_compute (unsigned int b
   for (def_rec = df_get_artificial_defs (bb_index); *def_rec; def_rec++)
 {
   struct df_ref *def = *def_rec;
-  if ((DF_REF_FLAGS (def) & DF_REF_AT_TOP) == 0)
-   bitmap_set_bit (bb_info->gen, DF_REF_REGNO (def));
+  bitmap_set_bit (bb_info->gen, DF_REF_REGNO (def));
 }
 }

@@ -1504,8 +1500,11 @@ df_live_init (bitmap all_blocks)
   EXECUTE_IF_SET_IN_BITMAP (all_blocks, 0, bb_index, bi)
 {
   struct df_live_bb_info *bb_info = df_live_get_bb_info (bb_index);
+  struct df_lr_bb_info *bb_lr_info = df_lr_get_bb_info (bb_index);

-  bitmap_copy (bb_info->out, bb_info->gen);
+  /* No register may reach a location where it is not used.  Thus
+we trim the rr result to the places where it is used.  */
+  bitmap_and (bb_info->out, bb_info->gen, bb_lr_info->out);
   bitmap_clear (bb_info->in);
 }
 }
@@ -1531,12 +1530,18 @@ static bool
 df_live_transfer_function (int bb_index)
 {
   struct df_live_bb_info *bb_info = df_live_get_bb_info (bb_index);
+  struct df_lr_bb_info *bb_lr_info = df_lr_get_bb_info (bb_index);
   bitmap in = bb_info->in;
   bitmap out = bb_info->out;
   bitmap gen = bb_info->gen;
   bitmap kill = bb_info->kill;

-  return bitmap_ior_and_compl (out, gen, in, kill);
+  bitmap_and (df_live_scratch, gen, bb_lr_info->out);
+  /* No register may reach a location where it is not used.  Thus
+ we trim the rr result to the places where it is used.  */
+  bitmap_and_into (in, bb_lr_info->in);
+
+  return bitmap_ior_and_compl (out, df_live_scratch, in, kill);
 }


@@ -1591,6 +1596,9 @@ df_live_free (void)
   free_alloc_pool (df_live->block_pool);
   df_live->block_info_size = 0;
   free (df_live->block_info);
+
+

[Bug tree-optimization/26854] Inordinate compile times on large routines

2008-01-17 Thread zadeck at naturalbridge dot com


--- Comment #50 from zadeck at naturalbridge dot com  2008-01-17 21:20 
---
Subject: 

Mark,

Am I allowed to set the target milestone for a patch or is that your job?

26854 is not going to get fixed for 4.3. We made a lot of progress on it
with the patches to 34400, but the largest remaining problem is the space
that the current representation of def-use and use-def chains requires. 
I should be able to almost cut this in half if we move to something like
a vec rather than a linked list.
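
A rough illustration of where the "almost half" could come from, assuming
a chain element that holds one pointer to the ref plus one "next" pointer.
The struct and helper names below are hypothetical and are not taken from
the df sources.

```c
#include <assert.h>
#include <stddef.h>

struct ref;   /* opaque stand-in for a def or a use */

/* Linked-list chain: every element pays for a "next" pointer.  */
struct chain_link
{
  struct ref *ref;
  struct chain_link *next;
};

/* Bytes needed for a chain of N refs kept as a linked list.  */
static size_t
list_chain_bytes (size_t n)
{
  return n * sizeof (struct chain_link);
}

/* Bytes needed for the same chain kept as a length plus a contiguous
   array of pointers (a vec-like layout).  */
static size_t
vec_chain_bytes (size_t n)
{
  assert (n <= ((size_t) -1 - sizeof (size_t)) / sizeof (struct ref *));
  return sizeof (size_t) + n * sizeof (struct ref *);
}
```

For long chains the per-chain header is negligible, so the array layout
approaches half the size of the list layout; allocator overhead per list
node, which this sketch ignores, would widen the gap further.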

But this is a big patch and i do not want to start this until stage I. 

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854



[Bug tree-optimization/26854] Inordinate compile times on large routines

2008-01-17 Thread zadeck at naturalbridge dot com


--- Comment #52 from zadeck at naturalbridge dot com  2008-01-17 21:46 
---
Subject: Re:  Inordinate compile times on large
 routines

rguenth at gcc dot gnu dot org wrote:
> --- Comment #51 from rguenth at gcc dot gnu dot org  2008-01-17 21:43 
> ---
> As this isn't even marked as a regression, you can fix it whenever you like ;)
>
> Only regressions have a target milestone before they are actually fixed,
> though.
>
>
>   
just between you and me this is most likely a regression, on the other
hand, i think that people who write functions this large should be
thrown into a pit.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854



[Bug middle-end/34400] [4.3 regression] bad interaction between DF and SJLJ exceptions

2008-01-17 Thread zadeck at naturalbridge dot com


--- Comment #53 from zadeck at naturalbridge dot com  2008-01-17 22:37 
---
Subject: Re:  [4.3 regression] bad interaction between
 DF and SJLJ exceptions

seongbae dot park at gmail dot com wrote:
> --- Comment #52 from seongbae dot park at gmail dot com  2008-01-17 22:31 
> ---
> Subject: Re:  [4.3 regression] bad interaction between DF and SJLJ exceptions
>
> I just talked to Kenny on the phone, and my suggestion is wrong
> since it changes the return value - doing my naive suggestion
> would lead to infinite loop, as the transfer function will almost always
> return true, even when the out set didn't change.
> Can you add a comment to that effect there ?
> Also please add a comment above df_live_scratch definition
> that this is an optimization to reduce memory allocation overhead
> for the scratch.
>
>   
will do.

> Can you explain why the hunk in df_live_bb_local_compute() is correct ?
> As this seems to change what DF_REF_AT_TOP means for artificial defs...
>
>   
In the old code we went thru the artificial defs twice, once for the
defs at the bottom and once for the defs at the top.  This is a waste of
time.  we only need to go thru them once since, for this problem, the
processing is order independent.


> Seongbae
>
> On Jan 17, 2008 1:31 PM, Seongbae Park (박성배, 朴成培)
> <[EMAIL PROTECTED]> wrote:
>   
>> In df_live_transfer_function:
>>
>> Doesn't look like we need df_live_scratch - can't we do:
>>
>> bitmap_and (out, gen, bb_lr_info->out);
>> bitmap_and_into (in, bb_lr_info->in);
>> return bitmap_ior_and_compl_into (out, in, kill);
>>
>> ?
>>
>> Seongbae
>>
>>
>> On Jan 17, 2008 1:05 PM, Kenneth Zadeck <[EMAIL PROTECTED]> wrote:
>> 
>>> This is the second of three patches to fix 34400.  This patch also makes
>>> some progress on 26854 but more work is required that is not going to be
>>> done in 4.3 to fix the problems here.
>>>
>>> This patch uses the output of the df_lr problem to make the df_live
>>> problem converge faster.
>>> This not only saves time but also space since the size of the df_live
>>> bitmaps never grows and the space of our bitmaps is proportional to the
>>> number of 1 bits.
>>>
>>> This has been tested on several platforms and along with the patch just
>>> committed cuts the time on the 34400 problems significantly.  I believe
>>> that this patch also has some modest improvement on bootstrap time, i.e
>>> regular programs.
>>>
>>> The change to df_live_reset is a slightly related latent bug fix.
>>>
>>> Ok to commit?
>>>
>>> Kenny
>>>
>>>
>>> 2008-01-17  Kenneth Zadeck  <[EMAIL PROTECTED]>
>>> Steven Bosscher  <[EMAIL PROTECTED]>
>>>
>>> PR rtl-optimization/26854
>>> PR rtl-optimization/34400
>>> * df-problems.c (df_live_scratch): New scratch bitmap.
>>> (df_live_alloc): Allocate df_live_scratch when doing df_live.
>>> (df_live_reset): Clear the proper bitmaps.
>>> (df_live_bb_local_compute): Only process the artificial defs once
>>> since the order is not important.
>>> (df_live_init): Init the df_live sets only with the variables
>>> found live by df_lr.
>>> (df_live_transfer_function): Use the df_lr sets to prune the
>>> df_live sets as they are being computed.
>>> (df_live_free): Free df_live_scratch.
>>>
>>>
>>>   
>>
>> --
>> #pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com";
>>
>> 
>
>
>   


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400



[Bug tree-optimization/26854] Inordinate compile times on large routines

2008-01-17 Thread zadeck at naturalbridge dot com


--- Comment #55 from zadeck at naturalbridge dot com  2008-01-17 22:57 
---
Subject: Re:  Inordinate compile times on large
 routines

lucier at math dot purdue dot edu wrote:
> --- Comment #54 from lucier at math dot purdue dot edu  2008-01-17 22:39 
> ---
> Created an attachment (id=14963)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14963&action=view)
> memory details for 131610
>
> This is the detailed memory usage for the compiler
>
> euler-5% /pkgs/gcc-mainline/bin/gcc -v
> Using built-in specs.
> Target: x86_64-unknown-linux-gnu
> Configured with: ../../mainline/configure --prefix=/pkgs/gcc-mainline
> --enable-languages=c --enable-checking=release --with-gmp=/pkgs/gmp-4.2.2
> --with-mpfr=/pkgs/gmp-4.2.2 --enable-gather-detailed-mem-stats
> Thread model: posix
> gcc version 4.3.0 20080117 (experimental) [trunk revision 131610] (GCC) 
>
> The maximum memory I observed in top was 10.2 GB.
>
> Kenny, I can't tell whether your patch from
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400#c50
>
> has been committed; will that improve the situation, too?
>
>
>   
it could, but it is not the big issue here, the big issue is the size of
the def use chains.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854



[Bug tree-optimization/26854] Inordinate compile times on large routines

2008-01-17 Thread zadeck at naturalbridge dot com


--- Comment #57 from zadeck at naturalbridge dot com  2008-01-18 02:10 
---
Subject: Re:  Inordinate compile times on large
 routines

lucier at math dot purdue dot edu wrote:
> --- Comment #56 from lucier at math dot purdue dot edu  2008-01-18 01:38 
> ---
> gcc is now 5-6 times faster than it was nearly two years ago when this was
> first reported; many changes have made significant improvements in cpu time.
>
> But Steven Bosscher's patch from December still improved things more on this
> test case.
>
> In particular, on 12/20/2007, without the patch, CPU time from
>
> http://gcc.gnu.org/bugzilla/attachment.cgi?id=14799
>
> was
>
>  TOTAL : 300.21    19.16   319.52   778432 kB
>
> After Steven Bosscher's patch
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400#c28
>
> it was
>
>  TOTAL : 210.97    15.80   226.88   778432 kB
>
> and today it's
>
>  TOTAL : 281.08    18.03   299.41   776514 kB
>
> Would it still be a good idea to apply Steven's patch?
>
>
>   
the plan is to apply all of the patches,  they each deal with a
different problem and the improvement should be additive.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854



[Bug middle-end/34400] [4.3 regression] bad interaction between DF and SJLJ exceptions

2008-01-19 Thread zadeck at naturalbridge dot com


--- Comment #56 from zadeck at naturalbridge dot com  2008-01-19 13:09 
---
Subject: Re:  [4.3 regression] bad interaction between DF and SJLJ exceptions

Let me commit the patch first.

Sent from my iPod

On Jan 19, 2008, at 4:41 AM, "steven at gcc dot gnu dot org"
<[EMAIL PROTECTED]> wrote:

>
>
> --- Comment #55 from steven at gcc dot gnu dot org  2008-01-19  
> 09:41 ---
> IMHO we can close this one now as fixed.  Objections to that, anyone?
>
>
> -- 
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400
>
> --- You are receiving this mail because: ---
> You are on the CC list for the bug, or are watching someone who is.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400



[Bug middle-end/34874] New: struct reorg valgrind failure

2008-01-19 Thread zadeck at naturalbridge dot com
FAIL: gcc.dg/struct/wo_prof_malloc_size_var.c (internal compiler error)
FAIL: gcc.dg/struct/wo_prof_malloc_size_var.c (test for excess errors)


 valgrind --tool=memcheck --db-attach=yes --error-limit=no 
/home/zadeck/gbB2/gcc/cc1 -fpreprocessed wo_prof_malloc_size_var.i -quiet
-dumpbase wo_prof_malloc_size_var.i -mtune=generic -auxbase
wo_prof_malloc_size_var -O3 -version -fipa-struct-reorg -fdump-ipa-all
-fwhole-program -fipa-type-escape -fno-show-column -o wo_prof_malloc_size_var.s


==27272== Invalid read of size 8
==27272==at 0xEACB12: htab_traverse_noresize (hashtab.c:747)
==27272==by 0xEACBA5: htab_traverse (hashtab.c:765)
==27272==by 0xBB65D2: check_cond_exprs (ipa-struct-reorg.c:3547)
==27272==by 0xBB6FD3: collect_data_accesses (ipa-struct-reorg.c:3830)
==27272==by 0xBB7281: reorg_structs (ipa-struct-reorg.c:3944)
==27272==by 0xBB72A0: reorg_structs_drive (ipa-struct-reorg.c:3967)
==27272==by 0x7B7476: execute_one_pass (passes.c:1118)
==27272==by 0x7B7656: execute_ipa_pass_list (passes.c:1187)
==27272==by 0xB98102: ipa_passes (cgraphunit.c:1340)
==27272==by 0xB98215: cgraph_optimize (cgraphunit.c:1387)
==27272==by 0x431628: c_write_global_declarations (c-decl.c:8079)
==27272==by 0x87CAB6: compile_file (toplev.c:1055)
==27272==  Address 0x587A700 is 16 bytes inside a block of size 104 free'd
==27272==at 0x4C2191B: free (in
/usr/lib64/valgrind/amd64-linux/vgpreload_memcheck.so)
==27272==by 0xEAC5D4: htab_expand (hashtab.c:550)
==27272==by 0xEACB94: htab_traverse (hashtab.c:763)
==27272==by 0xBB014D: free_accesses (ipa-struct-reorg.c:1674)
==27272==by 0xBB1192: free_data_struct (ipa-struct-reorg.c:2111)
==27272==by 0xBB20C8: remove_structure (ipa-struct-reorg.c:2353)
==27272==by 0xBB5268: safe_cond_expr_check (ipa-struct-reorg.c:3090)
==27272==by 0xEACB34: htab_traverse_noresize (hashtab.c:750)
==27272==by 0xEACBA5: htab_traverse (hashtab.c:765)
==27272==by 0xBB65D2: check_cond_exprs (ipa-struct-reorg.c:3547)
==27272==by 0xBB6FD3: collect_data_accesses (ipa-struct-reorg.c:3830)
==27272==by 0xBB7281: reorg_structs (ipa-struct-reorg.c:3944)


-- 
   Summary: struct reorg valgrind failure
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: zadeck at naturalbridge dot com
  GCC host triplet: x86_64-linux-gnu
GCC target triplet: x86_64-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34874



[Bug middle-end/34874] struct reorg valgrind failure

2008-01-19 Thread zadeck at naturalbridge dot com


--- Comment #1 from zadeck at naturalbridge dot com  2008-01-19 20:13 
---
I am about to commit the last fix for PR 34400 and, at least on my machine, this
patch will make this failure disappear from the test suite.  However, the bug is
still there if you look with valgrind.  

pinskia, i am sorry, i am about to leave for the day, I want to close 34400, and
i did not get to do a dup check to see if this was already there.  

kenny.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34874



[Bug middle-end/34874] struct reorg valgrind failure

2008-01-19 Thread zadeck at naturalbridge dot com


--- Comment #2 from zadeck at naturalbridge dot com  2008-01-20 01:43 
---
actually the commit for 34400 does not seem to affect this bug. 
but the bug does have that nice heisenbug quality to it.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34874



[Bug middle-end/34400] [4.3 regression] bad interaction between DF and SJLJ exceptions

2008-01-19 Thread zadeck at naturalbridge dot com


--- Comment #58 from zadeck at naturalbridge dot com  2008-01-20 02:13 
---
The three patches that have been committed seem to have brought this under
control.


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400



[Bug middle-end/34884] [4.3 Regression] gfortran.dg/array_constructor_9.f90

2008-01-20 Thread zadeck at naturalbridge dot com


--- Comment #6 from zadeck at naturalbridge dot com  2008-01-20 13:53 
---
I need more info to reproduce this bug.  I bootstrapped and regression tested
on x86_64-unknown-linux-gnu with suse 10.3 and using
--enable-languages=c,c++,fortran  --disable-multilib before committing the
patch and got 


=== gfortran Summary ===

# of expected passes            23538
# of expected failures          4
# of unsupported tests          18

i am not doubting that the failure is related to this patch.  Given all of the rest
of the info, it smells like this patch is responsible, but i do not get the
failure on my config.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34884



[Bug middle-end/34884] [4.3 Regression] gfortran.dg/array_constructor_9.f90

2008-01-20 Thread zadeck at naturalbridge dot com


--- Comment #8 from zadeck at naturalbridge dot com  2008-01-20 15:24 
---
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

dominiq at lps dot ens dot fr wrote:
> --- Comment #7 from dominiq at lps dot ens dot fr  2008-01-20 14:39 
> ---
>   
>> I need a more info to reproduce this bug.
>> 
>
> I have tried to give all the info I have been able to gather on my own. My
> config is:
>
> Configured with: ../gcc-4.3-work/configure --prefix=/opt/gcc/gcc4.3w
> --mandir=/opt/gcc/gcc4.3w/share/man --infodir=/opt/gcc/gcc4.3w/share/info
> --build=i686-apple-darwin9 --enable-languages=c,c++,fortran,objc,obj-c++,java
> --with-gmp=/sw --with-libiconv-prefix=/usr --with-system-zlib
> --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib
>
> As far as I can tell, the bug appears after the tree optimization, but at this
> point I don't know what I should dump. Having looked at the test-suite 
> results,
> the problem appears on 32 bit x86 platforms. From
>
>   
>> --disable-multilib
>> 
>
> I infer that you cannot try with -m32, isn't it?
>
>
>   
the first comment of the bug says linux/intel64. 

your config string looks like you are building on a mac "darwin" box. 
That would be the difference. I build on a real linux box that cannot
run darwin. 

could you please send me two tar files:
one tar file from the release without this patch containing the test
case compiled with the "-da" option and one from the release with the patch
with the same option.  This option will produce a large number of dump
files and from those dumps i will fix the bug. 

Thanks in advance.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34884



[Bug middle-end/34884] [4.3 Regression] gfortran.dg/array_constructor_9.f90

2008-01-20 Thread zadeck at naturalbridge dot com


--- Comment #10 from zadeck at naturalbridge dot com  2008-01-20 15:39 
---
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

dominiq at lps dot ens dot fr wrote:
> --- Comment #9 from dominiq at lps dot ens dot fr  2008-01-20 15:30 
> ---
>   
>> you are building on a mac "darwin" box
>> 
>
> Yes indeed, but the bug is also present for i686-pc-linux-gnu, see for
> instance:
>
> http://gcc.gnu.org/ml/gcc-testresults/2008-01/msg00914.html
>
>
>   
i will build this on a 32 bit box.  that is my problem. sorry, thanks.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34884



[Bug middle-end/34884] [4.3 Regression] gfortran.dg/array_constructor_9.f90

2008-01-20 Thread zadeck at naturalbridge dot com


--- Comment #12 from zadeck at naturalbridge dot com  2008-01-20 15:52 
---
Subject: Re:  [4.3 Regression] gfortran.dg/array_constructor_9.f90

dominiq at lps dot ens dot fr wrote:
> --- Comment #11 from dominiq at lps dot ens dot fr  2008-01-20 15:47 
> ---
> I have put the results of the compilation with -da with the patch at
>
> http://www.lps.ens.fr/~dominiq/gcc/tmp_fresh.tar.bz2
>
> All the files will be in a directory tmp_fresh.  Do you still need the same
> without the patch? It will take some time to reverse the patch and to do the
> rebuilding.
>
>
>   
let me try to build a 32 bit compiler.  that appears to be the problem.
it will be easier if i can get it on my machine. 

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34884



[Bug tree-optimization/34472] [4.3 Regression] gcc.dg/struct/wo_prof_malloc_size_var.c doesn't work

2008-01-20 Thread zadeck at naturalbridge dot com


--- Comment #9 from zadeck at naturalbridge dot com  2008-01-20 15:29 
---
olga, 

even if the test case does not normally ice on your system, you should be able to see
the bug if you run the test with valgrind.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34472



[Bug tree-optimization/34472] [4.3 Regression] gcc.dg/struct/wo_prof_malloc_size_var.c doesn't work

2008-01-20 Thread zadeck at naturalbridge dot com


--- Comment #11 from zadeck at naturalbridge dot com  2008-01-20 16:34 
---
Subject: Re:  [4.3 Regression] gcc.dg/struct/wo_prof_malloc_size_var.c
 doesn't work

olga at gcc dot gnu dot org wrote:
> --- Comment #10 from olga at gcc dot gnu dot org  2008-01-20 16:28 ---
> (In reply to comment #9)
>   
>> olga, 
>> even if the test case does not normally ice on your system, you be able to 
>> see
>> the bug if you run the test with valgrind.
>> 
>
> Kenny,
>
> Thank you a lot for the information. I was not aware of valgrind. Does it help
> also with segfaults?
>
> The patch in comment #4 solves the ICE, but on some system it generates the
> execution failures (PR 34534 and PR 34483). Can you see what it makes on your
> system?
>
> Thank you a lot,
> Olga
>
>
>
>   
generally it does. it is not perfect.   it is very good at finding
faults with malloc'ed memory.
did you actually try valgrind with this bug?  if you need some help, hop
on irc and i will talk you thru it.

kenny


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34472



[Bug fortran/34884] [4.3 Regression] gfortran.dg/array_constructor_9.f90

2008-01-20 Thread zadeck at naturalbridge dot com


--- Comment #14 from zadeck at naturalbridge dot com  2008-01-20 18:30 
---
confirmed on my machine,
i will have my best people work on it.

kenny


-- 

zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |zadeck at naturalbridge dot
                   |dot org                     |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34884


