http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47764



Uros Bizjak <ubizjak at gmail dot com> changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|arm-linux-androideabi       |
                 CC|                            |ubizjak at gmail dot com
          Component|target                      |rtl-optimization
      Known to fail|                            |4.7.0, 4.8.0



--- Comment #5 from Uros Bizjak <ubizjak at gmail dot com> 2013-01-24 07:25:04 UTC ---

This is a problem with rtl-optimization, gcse2 pass.



The following testcase also fails on x86_64 with 4.8 [1], which removes the
(!o,F) alternative.



With the following test compiled at -O3, the memory load is hoisted out of the
loop:



--cut here--

volatile double y;

void
test ()
{
  int z;

  for (z = 0; z < 1000; z++)
    y = 0.1;
}

--cut here--



_.210r.postreload:



   15: L15:
    8: NOTE_INSN_BASIC_BLOCK 3
   23: xmm0:DF=[`*.LC0']
   10: [`y']=xmm0:DF
      REG_DEAD xmm0:DF
   11: NOTE_INSN_DELETED
   12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
   13: pc={(flags:CCZ!=0)?L15:pc}
      REG_BR_PROB 0x26ab



_.211r.gcse2:



   26: xmm0:DF=[`*.LC0']
   15: L15:
    8: NOTE_INSN_BASIC_BLOCK 3
   10: [`y']=xmm0:DF
      REG_DEAD xmm0:DF
   11: NOTE_INSN_DELETED
   12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
   13: pc={(flags:CCZ!=0)?L15:pc}
      REG_BR_PROB 0x26ab



However, when the constant is changed to 0.0 (so that it can be loaded directly
into an %xmm register using the xorpd insn):



--cut here--

volatile double y;

void
test ()
{
  int z;

  for (z = 0; z < 1000; z++)
    y = 0.0;
}

--cut here--



gcc -O3:



_.211r.gcse2:



   15: L15:
    8: NOTE_INSN_BASIC_BLOCK 3
   10: xmm0:DF=0.0
   23: [`y']=xmm0:DF
      REG_DEAD xmm0:DF
   11: NOTE_INSN_DELETED
   12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
   13: pc={(flags:CCZ!=0)?L15:pc}
      REG_BR_PROB 0x26ab



The constant load remains inside the loop. It looks like the gcse2 pass only
considers loads from memory, but I see no reason why constant loads should not
be considered as well. It looks like an oversight to me.
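
As a rough illustration of the loop shape one would expect here, the hoisting
can be written by hand at the source level. This is only a sketch: the names
test_hoisted and zero are made up for the example, and since the compiler is
free to propagate the constant back into the loop, it shows the intended result
rather than a workaround. All 1000 volatile stores are kept; only the
materialization of the constant moves in front of the loop.

```c
volatile double y;

/* Hand-hoisted variant of the 0.0 testcase (illustrative only;
   test_hoisted and zero are hypothetical names).  */
void
test_hoisted (void)
{
  double zero = 0.0;   /* constant materialized once, outside the loop */
  int z;

  /* Each iteration still performs the volatile store to y.  */
  for (z = 0; z < 1000; z++)
    y = zero;
}
```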



The same happens with:



--cut here--

volatile long long y;

void
test ()
{
  int z;

  for (z = 0; z < 1000; z++)
    y = 0x123456789;
}

--cut here--



_.211r.gcse2:



   15: L15:
    8: NOTE_INSN_BASIC_BLOCK 3
   23: dx:DI=0x123456789
   24: [`y']=dx:DI
      REG_DEAD dx:DI
   11: NOTE_INSN_DELETED
   12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
   13: pc={(flags:CCZ!=0)?L15:pc}
      REG_BR_PROB 0x26ab



resulting in:



.L3:
        movabsq $4886718345, %rdx
        subl    $1, %eax
        movq    %rdx, y(%rip)
        jne     .L3



Reconfirmed as an rtl-optimization (gcse2 pass) problem.



[1] 4.8.0 20130124 (experimental) [trunk revision 195417]
