Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86

2005-06-24 Thread fjahanian


On Jun 24, 2005, at 3:16 PM, Andrew Pinski wrote:


I wonder why combine can do the simplification, though, which is why we
still produce good code for the simple testcase:
void f1(double *d,float *f2)
{
  *f2 = 0.0;
  *d = 0.0;
}

It is hard to produce a simple test case exhibiting the same
problem (-O1 producing better code than -O2). Yes, small test cases
move the desired simplification to other phases.
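
One way to check the -O1 versus -O2 difference on the small testcase is to compare the assembly each level produces. The sketch below restates f1 in a self-contained file. The file name f1.c and the helper run_f1 are assumptions added for illustration; they are not part of the original report.

```c
/* f1.c: the small testcase from this thread, plus a hypothetical helper. */

/* Stores zero through two pointers of different floating-point types. */
void f1(double *d, float *f2)
{
  *f2 = 0.0;
  *d = 0.0;
}

/* Hypothetical helper (not from the thread): calls f1 on two locals and
   returns 1 if both ended up zero, so the stores can be checked at run time. */
int run_f1(void)
{
  double d = 1.0;
  float f = 1.0f;
  f1(&d, &f);
  return d == 0.0 && f == 0.0f;
}
```

Compiling once with `gcc -O1 -S f1.c -o f1-O1.s` and once with `gcc -O2 -S f1.c -o f1-O2.s`, then diffing the two outputs, shows whether the stores are simplified at both levels. As noted above, a case this small may well look fine at either level.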


- fariborz


Thanks,
Andrew Pinski






Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86

2005-06-30 Thread fjahanian


On Jun 24, 2005, at 5:06 PM, Steven Bosscher wrote:


On Saturday 25 June 2005 01:48, fjahanian wrote:


On Jun 24, 2005, at 3:16 PM, Andrew Pinski wrote:


I wonder why combine can do the simplification, though, which is why we
still produce good code for the simple testcase:
void f1(double *d,float *f2)
{
  *f2 = 0.0;
  *d = 0.0;
}



It is hard to produce a simple test case exhibiting the same
problem (-O1 producing better code than -O2). Yes, small test cases
move the desired simplification to other phases.



It often helps if you know what function your poorer code is in.  You
could e.g. try to make the .optimized dump of that function compilable
and see if the problem shows up there again.  Then work your way down
to something small.


Yes, I am planning to do this. My first question, though, was whether
the RTL generated by -O2, which does not get simplified, is correct and
should be optimized in one of the RTL optimizers. If not, then the focus
shifts to the tree optimizers.


- Thanks, fariborz



Gr.
Steven






Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86

2005-06-30 Thread fjahanian


On Jun 24, 2005, at 5:20 PM, fjahanian wrote:



On Jun 24, 2005, at 5:06 PM, Steven Bosscher wrote:



On Saturday 25 June 2005 01:48, fjahanian wrote:



On Jun 24, 2005, at 3:16 PM, Andrew Pinski wrote:



I wonder why combine can do the simplification, though, which is why we
still produce good code for the simple testcase:
void f1(double *d,float *f2)
{
  *f2 = 0.0;
  *d = 0.0;
}




It is hard to produce a simple test case exhibiting the same
problem (-O1 producing better code than -O2). Yes, small test cases
move the desired simplification to other phases.




It often helps if you know what function your poorer code is in.  You
could e.g. try to make the .optimized dump of that function compilable
and see if the problem shows up there again.  Then work your way down
to something small.



Yes, I am planning to do this. My first question, though, was whether
the RTL generated by -O2, which does not get simplified, is correct and
should be optimized in one of the RTL optimizers. If not, then the focus
shifts to the tree optimizers.


This email went through late and was superseded by earlier exchanges.
It turned out to be all RTL-related issues.


- fariborz



- Thanks, fariborz




Gr.
Steven










Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86

2005-06-30 Thread fjahanian


On Jun 27, 2005, at 2:50 PM, Fariborz Jahanian wrote:



On Jun 27, 2005, at 12:56 PM, Richard Henderson wrote:



Hmm.  I would suspect this is obsolete now.  We'll have forced
everything into "registers" (or something equivalent that we
can work with) during tree optimization.  Any CSEs that can be
made should have been made.




I will do a sanity check followed by SPEC runs (x86 and ppc darwin)
and see if behavior changes by obsoleting -fforce-mem in -O2 (or
higher).
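
For readers unfamiliar with the flag: the GCC manual describes -fforce-mem as forcing memory operands to be copied into registers before arithmetic is done on them. A source-level analogy, offered only as a sketch, is shown below. The function names sum_direct and sum_forced are hypothetical, and the real transformation happens on RTL inside the compiler, not in the C source.

```c
/* Memory operands used directly in the arithmetic expression. */
int sum_direct(const int *a, const int *b)
{
  return *a + *b + *a;
}

/* The same computation with each memory operand first copied into a
   temporary, roughly what forcing operands into registers looks like
   when written out by hand (an analogy, not GCC's implementation). */
int sum_forced(const int *a, const int *b)
{
  int t1 = *a;  /* load once; later uses share the temporary */
  int t2 = *b;
  return t1 + t2 + t1;
}
```

Both functions compute the same value. The second form just makes the loads explicit, which is the kind of work the tree optimizers are said above to have already done by the time RTL is generated.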


Bootstrapped and dejagnu tested on apple-x86-darwin and
apple-ppc-darwin.


We also observed that on ppc, SPEC did not show any performance
change either way. On apple-x86-darwin, 252.eon improved by 7% as
expected, with no noticeable change in other benchmarks. One caveat
to all this is that it may expose optimization bugs which were
previously hidden by the inclusion of -fforce-mem.


OK for check-in?

- fariborz

ChangeLog:

2005-06-30  Fariborz Jahanian <[EMAIL PROTECTED]>

  * opts.c (decode_options): Don't set -fforce-mem with -O2 and more.



Index: opts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/opts.c,v
retrieving revision 1.114
diff -c -p -r1.114 opts.c
*** opts.c  24 Jun 2005 03:09:45 -  1.114
--- opts.c  30 Jun 2005 15:55:15 -
*************** decode_options (unsigned int argc, const
*** 559,565 ****
        flag_rerun_cse_after_loop = 1;
        flag_rerun_loop_opt = 1;
        flag_caller_saves = 1;
-       flag_force_mem = 1;
        flag_peephole2 = 1;
  #ifdef INSN_SCHEDULING
        flag_schedule_insns = 1;
--- 559,564 ----
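
The effect of the patch can be pictured with a toy model of how per-level defaults are chosen. This is a sketch under stated assumptions, not GCC's actual decode_options: the struct toy_flags and the function toy_set_defaults are hypothetical names, and only the flags visible in the diff context are modelled.

```c
/* Toy model of optimization-level defaults; NOT the real opts.c. */
struct toy_flags {
  int rerun_cse_after_loop;
  int rerun_loop_opt;
  int caller_saves;
  int force_mem;      /* stays 0 unless requested explicitly */
  int peephole2;
};

/* Hypothetical stand-in for the -O2 block touched by the patch.
   explicit_force_mem models the user passing -fforce-mem by hand. */
struct toy_flags toy_set_defaults(int optimize, int explicit_force_mem)
{
  struct toy_flags f = {0, 0, 0, 0, 0};

  if (optimize >= 2) {
    f.rerun_cse_after_loop = 1;
    f.rerun_loop_opt = 1;
    f.caller_saves = 1;
    /* Before the patch, this block also set force_mem = 1. */
    f.peephole2 = 1;
  }
  if (explicit_force_mem)
    f.force_mem = 1;  /* an explicit request still turns the flag on */
  return f;
}
```

In this model, `toy_set_defaults(2, 0).force_mem` is 0 after the change, while passing the flag explicitly still enables it, which matches the intent stated in the ChangeLog entry.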



- Thanks, fariborz




r~