https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71461

            Bug ID: 71461
           Summary: missed optimization in conditional assignment
           Product: gcc
           Version: 6.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lhyatt at gmail dot com
  Target Milestone: ---

==================
int i, j;
void f1(bool b, int x) {
    (b ? i : j) = x;
}
void f2(bool b, int x) {
    *(b ? &i : &j) = x;
}
==================

On x86-64 at -O3, gcc generates different code for f1() and f2(): a branch
for f1() and a cmove for f2(). Both functions store x to the same object, so
can (and should) f1() be optimized to the same cmove sequence?

gcc (version 5, 6, or current trunk) outputs:
==================
f1(bool, int):
        testb   %dil, %dil
        jne     .L6
        movl    %esi, j(%rip)
        ret
.L6:
        movl    %esi, i(%rip)
        ret

f2(bool, int):
        testb   %dil, %dil
        movl    $j, %edx
        movl    $i, %eax
        cmove   %rdx, %rax
        movl    %esi, (%rax)
        ret
==================

vs clang:

==================
f1(bool, int):                                # @f1(bool, int)
        movl    $i, %eax
        movl    $j, %ecx
        testb   %dil, %dil
        cmovneq %rax, %rcx
        movl    %esi, (%rcx)
        retq

f2(bool, int):                                # @f2(bool, int)
        movl    $i, %eax
        movl    $j, %ecx
        testb   %dil, %dil
        cmovneq %rax, %rcx
        movl    %esi, (%rcx)
        retq
==================

In my tests, the branch version of f1() is about 2x-3x slower than f2() if the
branch is predicted correctly only about 50% of the time, and only about 1%
faster than f2() if it is always predicted correctly.
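
For anyone who wants to reproduce a comparison like this, here is a minimal
sketch of one way to time the two functions. It is not the harness behind the
numbers above. The helpers make_conditions() and time_ns() are hypothetical
names made up for illustration, and the noinline attribute is a GCC/Clang
extension, assumed here so that each call stays a real call.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

int i, j;

// noinline (GCC/Clang extension) keeps each call out of line, so the
// select-and-store sequence inside the function is what gets timed.
__attribute__((noinline)) void f1(bool b, int x) { (b ? i : j) = x; }
__attribute__((noinline)) void f2(bool b, int x) { *(b ? &i : &j) = x; }

// Hypothetical helper: n conditions, each true with probability p_true.
// p_true = 0.5 gives a hard-to-predict branch; p_true = 1.0 a trivial one.
std::vector<unsigned char> make_conditions(std::size_t n, double p_true) {
    std::mt19937 rng(12345);  // fixed seed so every run sees the same input
    std::bernoulli_distribution coin(p_true);
    std::vector<unsigned char> conds(n);
    for (auto &c : conds) c = coin(rng) ? 1 : 0;
    return conds;
}

// Hypothetical helper: wall-clock nanoseconds for reps passes over conds.
long long time_ns(void (*f)(bool, int),
                  const std::vector<unsigned char> &conds, int reps) {
    assert(reps > 0);
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r)
        for (std::size_t k = 0; k < conds.size(); ++k)
            f(conds[k] != 0, static_cast<int>(k));
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
               stop - start).count();
}
```

A run would compare time_ns(f1, conds, reps) against time_ns(f2, conds, reps)
for conds built with p_true = 0.5 and with p_true = 1.0. The condition vector
should be large (say, a million entries) so the predictor cannot simply
memorize the pattern across passes. Absolute numbers will vary with the CPU
and compiler version.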

FWIW, this version:

==================
void f3(bool b, int x) {
    if(b) i = x; else j = x;
}
==================

ends up using a branch on both gcc and clang.
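
To make the point explicit that all three forms have the same observable
behavior (so compiling them to the same code would be valid), here is a small
self-check. It is only a sketch: stores_correctly(), all_variants_agree() and
self_check() are hypothetical helpers added for illustration, not part of the
original test case.

```cpp
#include <cassert>

int i, j;

void f1(bool b, int x) { (b ? i : j) = x; }
void f2(bool b, int x) { *(b ? &i : &j) = x; }
void f3(bool b, int x) { if (b) i = x; else j = x; }

// Hypothetical helper: run one variant from a known state and report whether
// it wrote x to the selected global and left the other one untouched.
bool stores_correctly(void (*f)(bool, int), bool b, int x) {
    i = -1;
    j = -1;
    f(b, x);
    return b ? (i == x && j == -1) : (i == -1 && j == x);
}

// Returns true if f1, f2 and f3 all behave the same for both values of b.
bool all_variants_agree() {
    void (*variants[])(bool, int) = {f1, f2, f3};
    for (auto f : variants) {
        if (!stores_correctly(f, true, 42)) return false;
        if (!stores_correctly(f, false, 42)) return false;
    }
    return true;
}

// Aborts (in a build without NDEBUG) if any variant disagrees.
void self_check() { assert(all_variants_agree()); }
```

Calling self_check() from main() is enough to exercise it. This only checks
behavior, not the generated code, which still has to be inspected with -S or
a disassembler.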

Thanks...

-Lewis
