On 01/04/2010 10:51 AM, Mark Colby wrote:
> I know this sounds like a dumb question, but the following code
> snippet compiles to many more machine instructions under 4.4.2 than
> under 2.9.5 (I am running a cygwin->PowerPC cross compiler):
>
> typedef unsigned int U32;
> typedef union
> {
>     U32 R;
>     struct
>     {
>         U32 BF1:2;
>         U32 :8;
>         U32 BF2:2;
>         U32 BF3:2;
>         U32 :18;
>     } B;
> } TEST_t;
> U32 testFunc(void)
> {
>     TEST_t t;
>     t.R=0;
>     t.B.BF1=2;
>     t.B.BF2=3;
>     t.B.BF3=1;
>     return t.R;
> }
>
> Output under 4.4.2 (powerpc-eabi-gcc-4-4-2 -O3 -S gcc-test.cpp -o
> gcc-test-442.s):
>
> li 0,2
> li 3,0
> rlwimi 3,0,30,0,1
> li 0,3
> rlwimi 3,0,20,10,11
> li 0,1
> rlwimi 3,0,18,12,13
> blr
>
> Output under 2.9.5 (powerpc-eabi-gcc-2-9-5 -O3 -S gcc-test.cpp -o
> gcc-test-295.s):
>
> lis 3,0x8034
> blr
>
> Is there any way to improve this behaviour? I have been using 2.9.5 very
> successfully for years and am now looking at 4.4.2, but have many such
> examples in my code (for clarity of commenting and maintainability).

This is very strange. On x86_64, gcc 4.4.1 generates

    movl $7170, %eax
    ret

Folding the three bitfield stores into a single constant is done by the
first RTL cse pass. I can't understand why it's not being done for your
target. I guess this will need a PowerPC expert.
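
As a sanity check on the two constants, here is a minimal sketch that packs the same three fields by hand. The helper names pack_lsb_first and pack_msb_first are made up for illustration, and the layouts are assumptions about how GCC allocates bitfields on each target: first field in the least significant bits on little-endian x86_64, first field in the most significant bits on big-endian PowerPC. The shift counts for the second helper match the ones in the rlwimi instructions above.

```c
#include <stdint.h>

/* Field layout from TEST_t: BF1:2, 8 bits unused, BF2:2, BF3:2, 18 bits unused. */

/* Assumed little-endian (x86_64) allocation: BF1 starts at bit 0. */
static uint32_t pack_lsb_first(uint32_t bf1, uint32_t bf2, uint32_t bf3)
{
    return (bf1 & 3u)            /* bits 0-1   */
         | ((bf2 & 3u) << 10)    /* bits 10-11 */
         | ((bf3 & 3u) << 12);   /* bits 12-13 */
}

/* Assumed big-endian (PowerPC) allocation: BF1 starts at bit 31. */
static uint32_t pack_msb_first(uint32_t bf1, uint32_t bf2, uint32_t bf3)
{
    return ((bf1 & 3u) << 30)    /* bits 30-31 */
         | ((bf2 & 3u) << 20)    /* bits 20-21 */
         | ((bf3 & 3u) << 18);   /* bits 18-19 */
}
```

With the values from testFunc (2, 3, 1), the first helper gives 7170, the x86_64 immediate, and the second gives 0x80340000, which is what lis 3,0x8034 loads. So both compilers agree on the result; 4.4.2 on PowerPC just fails to fold it at compile time.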
Andrew.