[Bug target/56309] -O3 optimizer generates conditional moves instead of compare and branch resulting in almost 2x slower code

steven at gcc dot gnu.org Thu, 14 Feb 2013 08:59:36 -0800


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309




Steven Bosscher <steven at gcc dot gnu.org> changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org



--- Comment #11 from Steven Bosscher <steven at gcc dot gnu.org> 2013-02-14 16:59:04 UTC ---

(In reply to comment #8)

> I wonder if instead of emitting this sequence
> 
>    shr    $0x20,%rdi
>    and    $0xffffffff,%ecx
>    cmp    %r8,%rdx
>    cmovbe %r11,%rdi
>    add    $0x1,%rax
>    cmp    %r8,%rdx
>    cmovbe %rdx,%rcx
> 
> it would do this instead
> 
>    shr    $0x20,%rdi
>    and    $0xffffffff,%ecx
>    add    $0x1,%rax
>    cmp    %r8,%rdx
>    cmovbe %r11,%rdi
>    cmovbe %rdx,%rcx



GCC fails to do so because the flags are clobbered between the two
cmovs (by the add, insn 199 below), which prevents code motion from
grouping them:



  197: r116:DI={(gtu(flags:CC,0))?r125:DI:r233:DI}
  199: {r110:DI=r110:DI+0x1;clobber flags:CC;}
  201: flags:CC=cmp(r124:DI,r235:DI)
  202: r221:DI={(gtu(flags:CC,0))?r126:DI:r124:DI}



If you make this change manually in your code (compile with -S, "fix"
the .s file, and assemble it), does that speed up your code?
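
For anyone who wants something small to experiment with, the source-level
shape behind this kind of sequence can be sketched roughly as below. This is
a hypothetical reduction, not the reporter's test case: the names
select_pair, select_result, keys and limit are made up for illustration, and
whether GCC actually turns it into cmovs with an add in between depends on
the compiler version and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduction, not the code from the bug report: two values
   are selected by the same unsigned comparison, and a counter is
   incremented between the two selects. */
typedef struct {
    uint64_t hi;     /* value picked by the first select  */
    uint64_t lo;     /* value picked by the second select */
    uint64_t count;  /* loop counter */
} select_result;

select_result select_pair(const uint64_t *keys, size_t n, uint64_t limit)
{
    select_result r = { 0, 0, 0 };

    assert(keys != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        uint64_t k = keys[i];

        /* First select: candidate for a cmov on (k <= limit). */
        r.hi = (k <= limit) ? (k >> 32) : r.hi;

        /* On x86 an add instruction writes the flags register, so if the
           increment is scheduled here the comparison result is lost. */
        r.count += 1;

        /* Second select on the same condition: the compare may have to
           be redone if the flags were clobbered in between. */
        r.lo = (k <= limit) ? (k & 0xffffffffu) : r.lo;
    }
    return r;
}
```

Compiling a file like this with -O3 -S and looking at the loop body should
show whether the compare is emitted once or twice on a given GCC version.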
