[Bug rtl-optimization/32283] Missed induction variable optimization

2007-09-05 Thread ramana dot radhakrishnan at celunite dot com


--- Comment #9 from ramana dot radhakrishnan at celunite dot com  
2007-09-05 11:46 ---
The testcase in Comment #8 works OK and generates auto-increments. I'd still
be interested in looking at why the volatile case cannot work.

Adding Zdenek to the CC for this case. 


-- 

ramana dot radhakrishnan at celunite dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rakdver at atrey dot karlin
                   |                            |dot mff dot cuni dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32283



[Bug tree-optimization/33404] New: Predictive commoning + ivopts possibly introducing extra sign extensions.

2007-09-12 Thread ramana dot radhakrishnan at celunite dot com
Hi,

There's a difference in the code generated between -O2 and -O3 in the case below.



void fred(long in, short *out1)
{
  int i;
  for (i = 0; i < 100; i++)
    out1[i+1] = out1[i] * in;
}

With -O2 we generate the following at expand time:

fred (in, out1)
{
  unsigned int ivtmp.24;

:
  ivtmp.24 = (unsigned int) out1;

:
  MEM[index: ivtmp.24, offset: 2] = (short int) (in * (long int) MEM[index: ivtmp.24]);
  ivtmp.24 = ivtmp.24 + 2;
  if (ivtmp.24 != (unsigned int) (out1 + 200))
goto ;
  else
goto ;

:
  return;

}

With -O3 we generate:

fred (in, out1)
{
  unsigned int ivtmp.23;
  short int D__lsm0.18;
  long int D.1212;

:
  D__lsm0.18 = *out1;
  ivtmp.23 = 1;

:
  D.1212 = (long int) D__lsm0.18 * in;
  D__lsm0.18 = (short int) D.1212;
  MEM[base: out1, index: ivtmp.23 * 2] = D__lsm0.18;
  ivtmp.23 = ivtmp.23 + 1;
  if (ivtmp.23 != 101)
goto ;
  else
goto ;

:
  return;

}
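
A source-level sketch of the two shapes may help when comparing the dumps. The
-O2 form reloads out1[i] from memory on every iteration, while the -O3 form
carries the last stored value in a scalar, as D__lsm0.18 does above. The names
fred_reload and fred_carried are hypothetical, and this only approximates the
shape of the dumps; it is not the output of the passes themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Shape of the -O2 dump: every iteration loads out1[i] back from memory. */
void fred_reload(long in, short *out1)
{
    assert(out1 != NULL);
    for (int i = 0; i < 100; i++)
        out1[i + 1] = (short) (in * (long) out1[i]);
}

/* Shape of the -O3 dump: the previously stored value is carried in a
   scalar (D__lsm0.18 in the dump), so it is narrowed to short and
   widened to long again on each trip round the loop. */
void fred_carried(long in, short *out1)
{
    assert(out1 != NULL);
    short carried = out1[0];
    for (unsigned int iv = 1; iv != 101; iv++) {
        long product = (long) carried * in;
        carried = (short) product;
        out1[iv] = carried;
    }
}
```

Both functions should leave the same 101 values in out1. The narrowing and
re-widening of the carried value is where an extra sign extension may show up
on a target such as ARM.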


-- 
   Summary: Predictive commoning + ivopts possibly introducing extra
sign extensions.
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ramana dot radhakrishnan at celunite dot com
 GCC build triplet: i686-linux-gnu
  GCC host triplet: i686-linux-gnu
GCC target triplet: arm-none-eabi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33404



[Bug tree-optimization/33508] New: tree struct aliasing goes into a loop marking call clobbers.

2007-09-20 Thread ramana dot radhakrishnan at celunite dot com
l
 11 kB ( 0%) ggc
tree PRE              :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.23 ( 0%) wall     192 kB ( 1%) ggc
tree FRE              :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.42 ( 0%) wall     113 kB ( 0%) ggc
tree code sinking     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall       5 kB ( 0%) ggc
tree forward propagate:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall      40 kB ( 0%) ggc
tree conservative DCE :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
tree loop bounds      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       4 kB ( 0%) ggc
tree iv optimization  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      38 kB ( 0%) ggc
tree loop init        :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall      12 kB ( 0%) ggc
tree SSA to normal    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.52 ( 0%) wall      31 kB ( 0%) ggc
tree SSA verifier     :   4.07 ( 0%) usr   0.27 ( 2%) sys   5.43 ( 0%) wall      99 kB ( 0%) ggc
tree STMT verifier    :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall       0 kB ( 0%) ggc
dominance computation :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
expand                :   0.96 ( 0%) usr   0.06 ( 0%) sys   9.88 ( 0%) wall   25260 kB (66%) ggc
forward prop          :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      32 kB ( 0%) ggc
CSE                   :   0.06 ( 0%) usr   0.01 ( 0%) sys   0.46 ( 0%) wall      18 kB ( 0%) ggc
dead code elimination :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
dead store elim1      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall      17 kB ( 0%) ggc
dead store elim2      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      24 kB ( 0%) ggc
loop analysis         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      15 kB ( 0%) ggc
CPROP 1               :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      15 kB ( 0%) ggc
PRE                   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall       6 kB ( 0%) ggc
CPROP 2               :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall      15 kB ( 0%) ggc
bypass jumps          :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      14 kB ( 0%) ggc
auto inc dec          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       7 kB ( 0%) ggc
CSE 2                 :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall       8 kB ( 0%) ggc
combiner              :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall      37 kB ( 0%) ggc
regmove               :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       2 kB ( 0%) ggc
scheduling            :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.24 ( 0%) wall      39 kB ( 0%) ggc
local alloc           :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall      98 kB ( 0%) ggc
global alloc          :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall      35 kB ( 0%) ggc
reload CSE regs       :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall      52 kB ( 0%) ggc
thread pro- & epilogue:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      34 kB ( 0%) ggc
peephole 2            :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
rename registers      :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
scheduling 2          :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall       7 kB ( 0%) ggc
machine dep reorg     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       4 kB ( 0%) ggc
final                 :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.74 ( 0%) wall       0 kB ( 0%) ggc
symout                :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall       0 kB ( 0%) ggc
TOTAL                 :3617.35            13.09          4205.33             38504 kB


-- 
   Summary: tree struct aliasing goes into a loop marking call
clobbers.
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ramana dot radhakrishnan at celunite dot com
 GCC build triplet: i686-linux-gnu
  GCC host triplet: i686-linux-gnu
GCC target triplet: arm-none-eabi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33508



[Bug tree-optimization/33508] tree struct aliasing goes into a loop marking call clobbers.

2007-09-20 Thread ramana dot radhakrishnan at celunite dot com


--- Comment #1 from ramana dot radhakrishnan at celunite dot com  
2007-09-20 10:44 ---
Created an attachment (id=14229)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14229&action=view)
testcase. 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33508



[Bug tree-optimization/33508] [4.3 Regression] tree struct aliasing goes into a loop marking call clobbers.

2007-09-20 Thread ramana dot radhakrishnan at celunite dot com


--- Comment #6 from ramana dot radhakrishnan at celunite dot com  
2007-09-20 13:52 ---
(In reply to comment #4)
> Created an attachment (id=14230)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14230&action=view)
> patch fixing the problem
> 
> This fixes it.  The idea is to keep track of which parent vars we need to add
> all subvars to the call clobbered list in a bitmap and process them after the
> first walk.
> 

Yep, it does. Thanks for the quick fix. I am testing it now and will let you
know in a bit.
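
For readers following along, the idea described in comment #4 can be sketched
roughly as below: the first walk only records parent variables in a bitmap,
and the subvars are added in a second pass, so each parent is expanded once
however often it is seen. All names here (struct var, mark_call_clobbers,
MAX_VARS) are hypothetical, and this is a toy model rather than the attached
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VARS 64  /* hypothetical limit so one 64-bit word is the bitmap */

struct var {
    int parent;     /* index of the parent variable, or -1 if none */
    int clobbered;  /* non-zero once the variable is call-clobbered */
};

/* Marks every variable listed in seen[] as clobbered, then marks the
   subvars of each such variable.  Returns how many parents were expanded. */
static size_t mark_call_clobbers(struct var *vars, size_t n_vars,
                                 const size_t *seen, size_t n_seen)
{
    uint64_t pending = 0;   /* bit p set: subvars of p still to be added */
    size_t expansions = 0;

    assert(n_vars <= MAX_VARS);

    /* First walk: record parents in the bitmap instead of visiting
       their subvars on the spot. */
    for (size_t i = 0; i < n_seen; i++) {
        size_t v = seen[i];
        assert(v < n_vars);
        vars[v].clobbered = 1;
        pending |= (uint64_t) 1 << v;
    }

    /* Second pass: expand each recorded parent exactly once. */
    for (size_t p = 0; p < n_vars; p++) {
        if (!(pending & ((uint64_t) 1 << p)))
            continue;
        expansions++;
        for (size_t s = 0; s < n_vars; s++)
            if (vars[s].parent == (int) p)
                vars[s].clobbered = 1;
    }
    return expansions;
}
```

If the same parent shows up three times in the first walk, the second pass
still visits its subvars only once, which is the property that avoids the
repeated work.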


(In reply to comment #5)
> 4.2 doesn't have this extra loop.
> 


-- 

ramana dot radhakrishnan at celunite dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|4.3.0                       |unknown


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33508



[Bug rtl-optimization/34849] Missed autoincrement opportunities thanks to a different basic block structure.

2008-01-18 Thread ramana dot radhakrishnan at celunite dot com


--- Comment #3 from ramana dot radhakrishnan at celunite dot com  
2008-01-18 14:37 ---
Add CC


-- 

ramana dot radhakrishnan at celunite dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pranav dot bhandarkar at
                   |                            |gmail dot com, dave at
                   |                            |icerasemi dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34849



[Bug rtl-optimization/34849] Missed autoincrement opportunities thanks to a different basic block structure.

2008-01-18 Thread ramana dot radhakrishnan at celunite dot com


--- Comment #2 from ramana dot radhakrishnan at celunite dot com  
2008-01-18 14:35 ---
(In reply to comment #1)
> Which optimization level?
-O2.

> 
> Why does cross-jumping not optimize this?
> 

Will check on cross-jumping and get back. 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34849



[Bug rtl-optimization/34849] New: Missed autoincrement opportunities thanks to a different basic block structure.

2008-01-18 Thread ramana dot radhakrishnan at celunite dot com
Whilst investigating a missed optimization opportunity in comparison to gcc 3.4,
I came across this case.

void foo (int n, int in[], int res[])
{
  int i;
  for (i=0; i:
  if (n > 0)
goto ;
  else
goto ;

:
  i = 0;
  ivtmp.19 = 0;

:
  if (MEM[base: in, index: ivtmp.19] != 0)
goto ;
  else
goto ;

:
  MEM[base: res, index: ivtmp.19] = 4660;
  goto ;

:
  MEM[base: res, index: ivtmp.19] = 39030;

:
  i = i + 1;
  ivtmp.19 = ivtmp.19 + 4;
  if (n > i)
goto ;
  else
goto ;

:
  return;

}

As the dump shows, ivtmp.19 is a candidate for post-increment addressing modes.


GCC 3.4 did not create a separate basic block for the else case. That block
was merged with the tail block of the loop, so auto-inc could be generated for
the else case, though not for the if side. This can be reproduced with today's
head of 4.3.0.
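
The loop body of the testcase did not survive in this archive copy, so the
following is only a plausible reconstruction inferred from the dump: the
constants 0x1234 and 0x9876 correspond to the 4660 and 39030 stored above, and
the names foo_indexed and foo_postinc are hypothetical. The second function
writes the post-increment form out by hand to show what the addressing mode
would express.

```c
#include <assert.h>
#include <stddef.h>

/* Indexed form, matching the shape of the dump: both arms store to
   res[i], and the induction variable is bumped in the shared tail. */
void foo_indexed(int n, const int in[], int res[])
{
    for (int i = 0; i < n; i++) {
        if (in[i])
            res[i] = 0x1234;  /* 4660 in the dump */
        else
            res[i] = 0x9876;  /* 39030 in the dump */
    }
}

/* Hand-written post-increment form: each access advances its pointer,
   which is what a post-increment addressing mode does in one insn. */
void foo_postinc(int n, const int *in, int *res)
{
    assert(n <= 0 || (in != NULL && res != NULL));
    for (int i = 0; i < n; i++) {
        int value = *in++ ? 0x1234 : 0x9876;
        *res++ = value;
    }
}
```

Both functions should fill res identically. The report is about the compiler
failing to reach the second shape when the two stores sit in separate basic
blocks.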


-- 
   Summary: Missed autoincrement opportunities thanks to a different
basic block structure.
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ramana dot radhakrishnan at celunite dot com
 GCC build triplet: i686-linux-gnu
  GCC host triplet: i686-linux-gnu
GCC target triplet: arm-none-eabi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34849



[Bug c++/32716] [4.2/4.3 Regression] Wrong code generation. Alias and C++ virtual bases problem.

2007-07-10 Thread ramana dot radhakrishnan at celunite dot com


--- Comment #4 from ramana dot radhakrishnan at celunite dot com  
2007-07-10 15:14 ---
(In reply to comment #3)
> Fixed with "take3.diff".
> 

Did you forget to attach take3.diff?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32716



[Bug tree-optimization/32721] New: CCP removes volatile qualifiers.

2007-07-10 Thread ramana dot radhakrishnan at celunite dot com
With today's trunk on a private port, consider the following testcase.


volatile int spinlock[2];
void main (void)
{
  volatile int * spinlock0;
  volatile int * spinlock1;
  spinlock0 = &spinlock[0];
  spinlock1 = &spinlock[1];

  *spinlock0 = 0;
  *spinlock1 = 0;
  while (*spinlock0);
}

CCP folds this into the following form:
Simulating block 4

Simulating block 3

Substituing values and folding statements

Folded statement: *spinlock0_1 = 0;
into: spinlock[0] = 0;

Folded statement: *spinlock1_2 = 0;
into: spinlock[1] = 0;

Folded statement: D.1498_3 = *spinlock0_1;
into: D.1498_3 = spinlock[0];

main ()
{
  volatile int * spinlock1;
  volatile int * spinlock0;
  int D.1498;

:
  spinlock0_1 = &spinlock[0];
  spinlock1_2 = &spinlock[1];
  spinlock[0] = 0;
  spinlock[1] = 0;

:
  D.1498_3 = spinlock[0]; ---> This folding seems to be wrong. 
  if (D.1498_3 != 0)
goto ;
  else
goto ;

:
  return;

}


-- 
   Summary: CCP removes volatile qualifiers.
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: ramana dot radhakrishnan at celunite dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32721



[Bug tree-optimization/32721] CCP removes volatile qualifiers.

2007-07-10 Thread ramana dot radhakrishnan at celunite dot com


--- Comment #3 from ramana dot radhakrishnan at celunite dot com  
2007-07-10 20:14 ---
(In reply to comment #2)
> As the decl is volatile as well this is clearly a bogus optimization.
> 

Putting a breakpoint on evaluate_stmt in tree-ssa-ccp.c shows that stmt_ann of
the stmt does not have has_volatile_ops set to true. 


(gdb) p stmt 
(gdb) pt
 
sizes-gimplified unsigned SI
size 
unit size 
align 32 symtab 0 alias set -1 canonical type 0xb7d44bd0>
visited var  def_stmt

version 1>
arg 1 
constant invariant
arg 0 
arg 0 
arg 1 >>
try.c:5>
(gdb) p *(stmt->base->ann)
$17 = {common = {type = STMT_ANN, aux = 0x0, value_handle = 0x0}, vdecl = {
common = {type = STMT_ANN, aux = 0x0, value_handle = 0x0}, 
out_of_ssa_tag = 0, base_var_processed = 0, used = 0, 
need_phi_state = NEED_PHI_STATE_NO, in_vuse_list = 1, in_vdef_list = 0, 
is_heapvar = 0, call_clobbered = 1, noalias_state = NO_ALIAS_GLOBAL, 
mpt = 0xb7cb6804, symbol_mem_tag = 0x0, partition = 0, base_index = 0, 
current_def = 0x0, subvars = 0x0, escape_mask = 3084210192}, fdecl = {
common = {type = STMT_ANN, aux = 0x0, value_handle = 0x0}, 
reference_vars_info = 0xb7eed528}, stmt = {common = {type = STMT_ANN, 
  aux = 0x0, value_handle = 0x0}, bb = 0xb7eed528, operands = {
  def_ops = 0xb7cb6804, use_ops = 0x0, vdef_ops = 0x0, vuse_ops = 0x0, 
  stores = 0x0, loads = 0x0}, addresses_taken = 0xb7d55010, uid = 0, 
references_memory = 0, modified = 0, has_volatile_ops = 0, 
makes_clobbering_call = 0}}

Shouldn't has_volatile_ops be set to true in this case, since the stmt
has a volatile operand here?
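
To make the expectation concrete, here is a toy model of the invariant I would
expect to hold. It is not GCC's data structure: struct operand, struct stmt,
rescan_volatile_ops and fold_operand are hypothetical names. The point is only
that a statement-level flag should be the OR of its operands' volatility, and
that folding *spinlock0_1 into spinlock[0] should not clear it, because the
decl is volatile too.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OPS 4  /* hypothetical operand limit for this sketch */

struct operand {
    const char *text;   /* printable form, e.g. "spinlock[0]" */
    bool is_volatile;   /* true if the reference is volatile-qualified */
};

struct stmt {
    struct operand ops[MAX_OPS];
    size_t n_ops;
    bool has_volatile_ops;
};

/* Recomputes the statement-level flag from the operands. */
static void rescan_volatile_ops(struct stmt *s)
{
    assert(s != NULL && s->n_ops <= MAX_OPS);
    s->has_volatile_ops = false;
    for (size_t i = 0; i < s->n_ops; i++)
        if (s->ops[i].is_volatile)
            s->has_volatile_ops = true;
}

/* Replaces one operand, as a folder would, and rescans so the flag
   always reflects the operands currently in the statement. */
static void fold_operand(struct stmt *s, size_t idx, struct operand replacement)
{
    assert(s != NULL && idx < s->n_ops);
    s->ops[idx] = replacement;
    rescan_volatile_ops(s);
}
```

In this model a pass that checks has_volatile_ops before treating the load as
an ordinary memory read would still see the flag set after the fold.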


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32721



[Bug rtl-optimization/32283] Missed induction variable optimization

2007-11-27 Thread ramana dot radhakrishnan at celunite dot com


--- Comment #14 from ramana dot radhakrishnan at celunite dot com  
2007-11-27 11:00 ---
(In reply to comment #13)
> This patch sometimes confuses loop2_doloop.  On ia64 this prevents use of
> countable loop branch machine idiom (br.cloop).  On the example used in this
> thread loop2_doloop complains:
> 
> Loop 1 is simple:
>   simple exit 5 -> 6
>   infinite if: (expr_list:REG_DEP_TRUE (subreg:SI (and:DI (plus:DI (minus:DI
> (reg:DI 391)
> (reg:DI 370 [ ivtmp.16 ]))
> (const:DI (plus:DI (symbol_ref:DI ("a") [flags 0x2]  0x2abd7000 a>)
> (const_int -2 [0xfffe]
> (const_int 1 [0x1])) 0)
> (nil))   
>   number of iterations: (lshiftrt:DI (plus:DI (minus:DI (reg:DI 392)
> (reg:DI 370 [ ivtmp.16 ]))
> (const_int -2 [0xfffe]))
> (const_int 1 [0x1]))
>   upper bound: -1
> Doloop: Possible infinite iteration case.
> Doloop: The loop is not suitable.
> 
> The "infinite if" condition is:
> ((r391 - r370) + ('a' - 2)) & 1 == 1
> where r370 is &(a[i]) and r391 is len*sizeof(a[0]), so that r391+'a' is
> &a[len].  Of course, such "infinite if" condition is always false, but
> loop2_doloop does not see that.
> 

This is pretty much the same case that makes things worse on the private port
I work on after this patch.

Debugging this on ia64 and on my port shows that the detection fails at the
same point, though mine fails for the SI case rather than the DI case.

The infinite if case is detected in loop-iv.c : iv_number_of_iterations when it
can't simplify the above mentioned expression. I looked through tree-ssa-ivopts
but I can't see how this can get fixed there unless we change it in loop-iv.c.

I wonder whether we could use DF info to recursively find the reaching
definitions of r391 and r370 at this insn and substitute their RHS into the
expression to simplify it further.

However, whether that effort would be worthwhile depends on the number of such
cases that we detect in any useful benchmark. My guess is that since this is a
pretty normal construct we'd find it in quite a number of self-respecting
loops.
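
A small sketch of why the condition quoted in comment #13 can never hold once
the definitions are substituted. It assumes a has 2-byte elements, which I am
inferring from the const_int -2 and the shift by 1 in the dump; the function
name infinite_if and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluates ((r391 - r370) + ('a' - 2)) & 1 == 1 with the definitions
   substituted: r370 = &a[i] and r391 = len * sizeof(a[0]), where the
   element size is assumed to be 2 bytes. */
static int infinite_if(uintptr_t a_base, uintptr_t i, uintptr_t len)
{
    uintptr_t r370 = a_base + 2 * i;  /* &a[i] */
    uintptr_t r391 = 2 * len;         /* len * sizeof(a[0]) */

    /* After substitution this is 2 * (len - i - 1), so the low bit is
       always clear, even when the unsigned arithmetic wraps. */
    return (((r391 - r370) + (a_base - 2)) & 1) == 1;
}
```

The base address cancels out, which is exactly the simplification that is not
visible while r391 and r370 are still opaque registers.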


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32283