PowerPC : GCC2 optimises better than GCC4???
This sounds like a dumb question, I know. However, the following code snippet results in many more machine instructions under 4.4.2 than under 2.9.5 (I am running a cygwin->PowerPC cross):

    typedef unsigned int U32;

    typedef union {
        U32 R;
        struct {
            U32 BF1:2;
            U32    :8;
            U32 BF2:2;
            U32 BF3:2;
            U32    :18;
        } B;
    } TEST_t;

    U32 testFunc(void)
    {
        TEST_t t;
        t.R = 0;
        t.B.BF1 = 2;
        t.B.BF2 = 3;
        t.B.BF3 = 1;
        return t.R;
    }

Output under 4.4.2 (powerpc-eabi-gcc-4-4-2 -O3 -S gcc-test.cpp -o gcc-test-442.s):

    li 0,2
    li 3,0
    rlwimi 3,0,30,0,1
    li 0,3
    rlwimi 3,0,20,10,11
    li 0,1
    rlwimi 3,0,18,12,13
    blr

Output under 2.9.5 (powerpc-eabi-gcc-2-9-5 -O3 -S gcc-test.cpp -o gcc-test-295.s):

    lis 3,0x8034
    blr

Is there any way to improve this behaviour? I have been using 2.9.5 very successfully for years and am now looking at 4.4.2, but have many such examples in my code (for clarity of commenting and maintainability).

I have also noticed that 4.4.2 seems to use significantly larger stack frames, and consequently more register-stacking instructions, than 2.9.5 for the same functions.

Am I missing something? Many thanks if you can shed any light on this.

Mark

* This email has been checked by the altohiway Mailcontroller Service *
RE: PowerPC : GCC2 optimises better than GCC4???
> >> Is there any way to improve this behaviour? I have been using 2.9.5 very
> >> successfully for years and am now looking at 4.4.2, but have many such
> >> examples in my code (for clarity of commenting and maintainability).
> >
> > This is very strange. On x86_64, gcc 4.4.1 generates
> >
> >     movl $7170, %eax
> >     ret
> >
> > This optimization is done by the first RTL cse pass. I can't understand
> > why it's not being done for your target. I guess this will need a
> > powerpc expert.

Thanks Andrew for checking this on your system.

> Known bug, see http://gcc.gnu.org/PR22141
>
> I hope Jakub will finish this work for gcc 4.5.
>
> Ciao!
> Steven

Thanks Steven. At least I have a handle on it now. Fingers crossed for 4.5 :-)
RE: PowerPC : GCC2 optimises better than GCC4???
> On Mon, Jan 04, 2010 at 12:18:50PM +0100, Steven Bosscher wrote:
> > > This optimization is done by the first RTL cse pass. I can't understand
> > > why it's not being done for your target. I guess this will need a
> > > powerpc expert.
> >
> > Known bug, see http://gcc.gnu.org/PR22141
>
> That's unrelated. PR22141 is about (lack of) merging of adjacent stores of
> constant values into memory, but there are no memory stores involved here,
> everything is in registers, so PR22141 patch will make zero difference here.
>
> IMHO we really should have some late tree pass that converts adjacent
> bitfield operations into integral operations on non-bitfields (likely with
> alias set of the whole containing aggregate), as at the RTL level many cases
> are simply too many instructions for combine etc. to optimize them properly,
> while at the tree level it could be simpler.

Ah. I take it that v2's optimisation was structured differently, as it does spot and take care of this case?
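As a source-level illustration of the idea in the quoted reply (doing the adjacent bitfield updates as one integral operation), here is a minimal sketch. FIELD_BE and testFuncMerged are hypothetical names, not anything from GCC or from the original code. The sketch assumes a 32-bit U32 and the MSB-first bitfield allocation implied by the rlwimi shifts in the 4.4.2 output (30, 20 and 18); bitfield layout is implementation-defined, so it only applies to a target with that layout.

```c
typedef unsigned int U32;

/* Hypothetical helper: place `val` in a field `width` bits wide that
 * starts `msb_off` bits below the most significant bit of a 32-bit word.
 * Assumes MSB-first allocation, as the PowerPC output in this thread implies. */
#define FIELD_BE(val, msb_off, width) \
    ((((U32)(val)) & ((1u << (width)) - 1u)) << (32u - (msb_off) - (width)))

/* Same fields as TEST_t, written as a single integral expression. */
U32 testFuncMerged(void)
{
    return FIELD_BE(2, 0, 2)    /* BF1: top two bits             */
         | FIELD_BE(3, 10, 2)   /* BF2: after BF1 and 8 pad bits */
         | FIELD_BE(1, 12, 2);  /* BF3: directly after BF2       */
}
```

Every operand is a compile-time constant, so the expression is an integer constant expression with the value 0x80340000, which matches the `lis 3,0x8034` produced by 2.9.5. The cost is that the field positions are now spelled out by hand instead of being taken from the struct declaration.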
RE: PowerPC : GCC2 optimises better than GCC4???
>>> Yabbut, how come RTL cse can handle it in x86_64, but PPC not?
>>
>> Probably because the RTL on x86_64 uses and's and ior's, but PPC uses
>> set's of zero_extract's (insvsi).
>
> Aha! Yes, that'll probably be it. It should be easy to fix cse to
> recognize those too.
>
> Andrew

I'm not familiar with the gcc source yet, but just in case I get the time to look at this, could anyone give me a file/line ref to dive into and examine?

Thanks for your attention on this.

Mark
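To make the "and's and ior's" remark concrete, here is a rough C rendering of a bitfield store expressed as mask-and-or. This is only a sketch: insert_field and testFuncAndIor are hypothetical names, the actual RTL is not shown in the thread, and the field positions assume LSB-first allocation, which is consistent with the 7170 constant quoted for x86_64.

```c
typedef unsigned int U32;

/* Hypothetical helper: overwrite the `width`-bit field at bit offset
 * `shift` (counted from the least significant bit) with `val`. */
static U32 insert_field(U32 word, U32 val, unsigned shift, unsigned width)
{
    U32 mask = ((1u << width) - 1u) << shift;
    return (word & ~mask) | ((val << shift) & mask);  /* and, then ior */
}

/* The three assignments from testFunc as a chain of and/ior steps. */
U32 testFuncAndIor(void)
{
    U32 r = 0;                      /* t.R = 0     */
    r = insert_field(r, 2, 0, 2);   /* t.B.BF1 = 2 */
    r = insert_field(r, 3, 10, 2);  /* t.B.BF2 = 3 */
    r = insert_field(r, 1, 12, 2);  /* t.B.BF3 = 1 */
    return r;
}
```

With a constant starting value each step has only constant inputs, so the chain reduces to 0x1C02, i.e. the 7170 in the x86_64 output. The PowerPC form instead writes into part of a register (the zero_extract destination), which is the shape the thread suggests cse does not currently recognise.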
RE: PowerPC : GCC2 optimises better than GCC4???
>>> Aha! Yes, that'll probably be it. It should be easy to fix cse to
>>> recognize those too.
>> I'm not familiar with the gcc source yet, but just in case I get the
>> time to look at this, could anyone give me a file/line ref to dive
>> into and examine?
> Would you believe cse.c? :-)

Ha! I'll look before asking next time :-)

> I can't find the line without investigating further.
> Andrew.
> P.S. This is a nontrivial task if you don't know gcc, but might be a
> good place for a beginner to start. OTOH, might be hard: no way to
> know without digging.

Many thanks. I'll take a look.