https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71461
Bug ID: 71461
Summary: missed optimization in conditional assignment
Product: gcc
Version: 6.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: lhyatt at gmail dot com
Target Milestone: ---

=====================
int i, j;
void f1(bool b, int x) {
    (b ? i : j) = x;
}
void f2(bool b, int x) {
    *(b ? &i : &j) = x;
}
==================

On x86-64 at -O3, gcc outputs different code for f1() vs f2(), using a branch
for f1() and a cmove for f2(). Is it something that can/should be optimized?

gcc (version 5, 6, or current trunk) outputs:

==================
f1(bool, int):
        testb   %dil, %dil
        jne     .L6
        movl    %esi, j(%rip)
        ret
.L6:
        movl    %esi, i(%rip)
        ret
f2(bool, int):
        testb   %dil, %dil
        movl    $j, %edx
        movl    $i, %eax
        cmove   %rdx, %rax
        movl    %esi, (%rax)
        ret
==================

vs clang:

==================
f1(bool, int):                          # @f1(bool, int)
        movl    $i, %eax
        movl    $j, %ecx
        testb   %dil, %dil
        cmovneq %rax, %rcx
        movl    %esi, (%rcx)
        retq
f2(bool, int):                          # @f2(bool, int)
        movl    $i, %eax
        movl    $j, %ecx
        testb   %dil, %dil
        cmovneq %rax, %rcx
        movl    %esi, (%rcx)
        retq
==================

In my tests, the branch version of f1() is about 2x-3x slower than f2() if the
branch is 50% predicted, and only about 1% faster than f2() if it is 100%
predicted.

FWIW, this version:

==================
void f3(bool b, int x) {
    if(b) i = x;
    else j = x;
}
==================

ends up using a branch on both gcc and clang.

Thanks...

-Lewis
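
For anyone wanting to reproduce the 50% vs. 100% predicted comparison, a harness along the following lines might work. The benchmark actually used for the numbers above is not shown in this report, so this is only a sketch: `make_conditions` and `time_ns` are hypothetical helpers, and the `noinline` attribute (a GCC/clang extension) is an assumption added to keep the compiler from folding f1() and f2() into the timing loop.

```cpp
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

int i, j;

// noinline is an assumption of this sketch: it keeps each function a real
// call so the code generated for f1/f2 is what actually gets timed.
__attribute__((noinline)) void f1(bool b, int x) { (b ? i : j) = x; }
__attribute__((noinline)) void f2(bool b, int x) { *(b ? &i : &j) = x; }

// Hypothetical helper: builds n conditions, each true with probability p_true.
// p_true = 0.5 gives an unpredictable branch, p_true = 1.0 an always-taken one.
std::vector<char> make_conditions(std::size_t n, double p_true, unsigned seed) {
    std::mt19937 gen(seed);
    std::bernoulli_distribution dist(p_true);
    std::vector<char> conds(n);
    for (auto &c : conds) c = dist(gen) ? 1 : 0;
    return conds;
}

// Hypothetical helper: calls fn once per condition and returns the elapsed
// wall-clock time in nanoseconds.
template <typename Fn>
long long time_ns(Fn fn, const std::vector<char> &conds) {
    auto start = std::chrono::steady_clock::now();
    int x = 0;
    for (char c : conds) fn(c != 0, ++x);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
        .count();
}
```

Comparing `time_ns(f1, conds)` against `time_ns(f2, conds)` for `conds = make_conditions(n, 0.5, seed)` and again for `make_conditions(n, 1.0, seed)` would cover the two cases. Absolute timings depend on the machine, compiler version, and flags, so the ratio between the two functions is the thing to look at.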