PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Mark Colby
This sounds like a dumb question I know. However the following code
snippet results in many more machine instructions under 4.4.2 than under
2.9.5 (I am running a cygwin->PowerPC cross):

  typedef unsigned int U32;
  typedef union
  {
    U32 R;
    struct
    {
      U32 BF1:2;
      U32 :8;
      U32 BF2:2;
      U32 BF3:2;
      U32 :18;
    } B;
  } TEST_t;
  U32 testFunc(void)
  {
    TEST_t t;
    t.R=0;
    t.B.BF1=2;
    t.B.BF2=3;
    t.B.BF3=1;
    return t.R;
  }

Output under 4.4.2 (powerpc-eabi-gcc-4-4-2 -O3 -S gcc-test.cpp -o
gcc-test-442.s):

  li 0,2
  li 3,0
  rlwimi 3,0,30,0,1
  li 0,3
  rlwimi 3,0,20,10,11
  li 0,1
  rlwimi 3,0,18,12,13
  blr

Output under 2.9.5 (powerpc-eabi-gcc-2-9-5 -O3 -S gcc-test.cpp -o
gcc-test-295.s):

  lis 3,0x8034
  blr
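
As a cross-check (not part of the original message), the 4.4.2 sequence can be traced by hand. The sketch below models rlwimi in portable C, assuming the usual PowerPC semantics: rotate the source left by SH, then insert only the bits selected by the mask MB..ME, where bit 0 is the most significant bit. All helper names are hypothetical. Under those assumptions both listings appear to produce the same value, 0x80340000; the 2.9.5 output has simply folded it into a single lis.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by sh bits (sh taken modulo 32). */
static uint32_t rotl32(uint32_t x, unsigned sh)
{
    sh &= 31u;
    return sh ? (x << sh) | (x >> (32u - sh)) : x;
}

/* Mask with bits mb..me set, using PowerPC numbering (bit 0 = MSB).
   When mb > me the mask wraps around the word. */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
    uint32_t from_mb = 0xFFFFFFFFu >> mb;
    uint32_t to_me = 0xFFFFFFFFu << (31u - me);
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* Model of "rlwimi ra,rs,sh,mb,me": rotate rs, insert under mask into ra. */
static uint32_t rlwimi(uint32_t ra, uint32_t rs,
                       unsigned sh, unsigned mb, unsigned me)
{
    uint32_t m = ppc_mask(mb, me);
    return (rotl32(rs, sh) & m) | (ra & ~m);
}

/* Replays the three li/rlwimi pairs from the 4.4.2 listing. */
uint32_t trace_gcc442_output(void)
{
    uint32_t r3 = 0;                    /* li 3,0                          */
    r3 = rlwimi(r3, 2, 30, 0, 1);       /* li 0,2 ; rlwimi 3,0,30,0,1      */
    r3 = rlwimi(r3, 3, 20, 10, 11);     /* li 0,3 ; rlwimi 3,0,20,10,11    */
    r3 = rlwimi(r3, 1, 18, 12, 13);     /* li 0,1 ; rlwimi 3,0,18,12,13    */
    return r3;
}

/* The 2.9.5 listing: lis loads the immediate into the upper halfword. */
uint32_t trace_gcc295_output(void)
{
    return (uint32_t)0x8034 << 16;      /* lis 3,0x8034 */
}
```

So the difference is in instruction count, not in the result: 4.4.2 performs the three inserts at run time instead of evaluating them at compile time.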

Is there any way to improve this behaviour? I have been using 2.9.5 very
successfully for years and am now looking at 4.4.2, but have many such
examples in my code (for clarity of commenting and maintainability).

I have also noticed that 4.4.2 seems to use significantly larger stack
frames, and consequently more register-stacking instructions than 2.9.5
for the same functions. Am I missing something? Many thanks if you can
shed any light on this.

Mark


*
This email has been checked by the altohiway Mailcontroller Service
*


RE: PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Mark Colby
> >> Is there any way to improve this behaviour? I have been using 2.9.5 very
> >> successfully for years and am now looking at 4.4.2, but have many such
> >> examples in my code (for clarity of commenting and maintainability).
> >
> > This is very strange.  On x86_64, gcc 4.4.1 generates
> >
> >        movl    $7170, %eax
> >        ret
> >
> > This optimization is done by the first RTL cse pass.  I can't understand
> > why it's not being done for your target.  I guess this will need a
> > powerpc expert.

Thanks Andrew for checking this on your system.
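
For anyone comparing the two constants: bit-field layout is implementation-defined, so the x86_64 value (7170) and the PowerPC value (0x80340000, from lis 3,0x8034) can both be the fully folded result of the same source. The sketch below packs the same field values in both directions. It assumes the PowerPC EABI allocates bit-fields from the most significant bit and x86_64 from the least significant bit, which is consistent with the outputs quoted in this thread; the function and array names are hypothetical.

```c
#include <stdint.h>

/* Fields in declaration order: BF1:2, pad:8, BF2:2, BF3:2, pad:18. */
enum { N_FIELDS = 5 };
static const unsigned widths[N_FIELDS] = { 2, 8, 2, 2, 18 };
static const uint32_t values[N_FIELDS] = { 2, 0, 3, 1, 0 };

/* Allocate fields starting from the most significant bit
   (assumed layout for the PowerPC target). */
uint32_t pack_msb_first(void)
{
    uint32_t word = 0;
    unsigned used = 0;
    for (int i = 0; i < N_FIELDS; i++) {
        used += widths[i];
        word |= values[i] << (32u - used);
    }
    return word;
}

/* Allocate fields starting from the least significant bit
   (assumed layout for x86_64). */
uint32_t pack_lsb_first(void)
{
    uint32_t word = 0;
    unsigned used = 0;
    for (int i = 0; i < N_FIELDS; i++) {
        word |= values[i] << used;
        used += widths[i];
    }
    return word;
}
```

Under these assumptions pack_msb_first() gives 0x80340000 and pack_lsb_first() gives 7170, matching the two compiler outputs.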

> Known bug, see http://gcc.gnu.org/PR22141
> 
> I hope Jakub will finish this work for gcc 4.5.
> 
> Ciao!
> Steven

Thanks Steven. At least I have a handle on it now. Fingers crossed for 4.5 :-)




RE: PowerPC : GCC2 optimises better than GCC4???

2010-01-04 Thread Mark Colby
> On Mon, Jan 04, 2010 at 12:18:50PM +0100, Steven Bosscher wrote:
> > > This optimization is done by the first RTL cse pass.  I can't understand
> > > why it's not being done for your target.  I guess this will need a
> > > powerpc expert.
> >
> > Known bug, see http://gcc.gnu.org/PR22141
> 
> That's unrelated.  PR22141 is about (lack of) merging of adjacent stores of
> constant values into memory, but there are no memory stores involved here,
> everything is in registers, so PR22141 patch will make zero difference here.
>
> IMHO we really should have some late tree pass that converts adjacent
> bitfield operations into integral operations on non-bitfields (likely with
> alias set of the whole containing aggregate), as at the RTL level many cases
> are simply too many instructions for combine etc. to optimize them properly,
> while at the tree level it could be simpler.

Ah. I take it that v2's optimisation was structured differently, as it does 
spot and take care of this case?




RE: PowerPC : GCC2 optimises better than GCC4???

2010-01-06 Thread Mark Colby
>>> Yabbut, how come RTL cse can handle it in x86_64, but PPC not?
>> 
>> Probably because the RTL on x86_64 uses and's and ior's, but PPC uses
>> set's of zero_extract's (insvsi).
>
> Aha!  Yes, that'll probably be it.  It should be easy to fix cse to
> recognize those too.
>
> Andrew
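
To make the distinction concrete, here is a rough source-level analogue of the and/ior form: each field update is a mask-clear followed by an OR on a plain integer, rather than a bit-field insert. This is only a sketch. The helper set_field, the BF*_SHIFT constants and testFunc_masks are hypothetical, and the shifts assume the MSB-first layout implied by the PowerPC output earlier in the thread. Written this way the function is ordinary integer arithmetic on constants, which a compiler would normally be expected to fold, though that has not been checked against 4.4.2 here.

```c
#include <stdint.h>

typedef uint32_t U32;

/* Assumed bit positions for an MSB-first layout of
   BF1:2, pad:8, BF2:2, BF3:2, pad:18. */
enum { BF1_SHIFT = 30, BF2_SHIFT = 20, BF3_SHIFT = 18, BF_WIDTH = 2 };

/* Clear the field with AND, then merge the new value with OR.
   Requires width < 32. */
static U32 set_field(U32 word, U32 value, unsigned shift, unsigned width)
{
    U32 mask = ((1u << width) - 1u) << shift;
    return (word & ~mask) | ((value << shift) & mask);
}

/* Same assignments as testFunc, expressed without bit-fields. */
U32 testFunc_masks(void)
{
    U32 r = 0;
    r = set_field(r, 2, BF1_SHIFT, BF_WIDTH);
    r = set_field(r, 3, BF2_SHIFT, BF_WIDTH);
    r = set_field(r, 1, BF3_SHIFT, BF_WIDTH);
    return r;
}
```

The trade-off is that the field positions are now hard-coded for one layout, so this loses some of the readability and portability that the bit-field version was chosen for.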

I'm not familiar with the gcc source yet, but just in case I get the time to 
look at this, could anyone give me a file/line ref to dive into and examine? 
Thanks for your attention on this.

Mark




RE: PowerPC : GCC2 optimises better than GCC4???

2010-01-06 Thread Mark Colby
>>> Aha!  Yes, that'll probably be it.  It should be easy to fix cse to
>>> recognize those too.

>> I'm not familiar with the gcc source yet, but just in case I get the
>> time to look at this, could anyone give me a file/line ref to dive
>> into and examine?

> Would you believe cse.c?  :-)

Ha! I'll look before asking next time :-)

> I can't find the line without investigating further.

> Andrew.

> P.S.  This is a nontrivial task if you don't know gcc, but might be a
> good place for a beginner to start.  OTOH, might be hard: no way to
> know without digging.

Many thanks. I'll take a look.

