http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309
--- Comment #14 from arturomdn at gmail dot com 2013-02-14 17:30:54 UTC ---
I also did the experiment, with the same results... it got faster but not as
fast as the version with conditional branch instead of conditional moves:

./by-ref-O3
Took 6.65 seconds total.
./by-val-O3
Took 11.64 seconds total.
./by-val-fixed-O3
Took 9.94 seconds total.

--- by-val-O3.s 2013-02-14 11:27:28.109856000 -0600
+++ by-val-fixed-O3.s 2013-02-14 11:11:07.312317000 -0600
@@ -679,13 +679,12 @@
     shrq $32, %rdi
     .loc 4 25 0
     andl $4294967295, %ecx
+    addq $1, %rax
     cmpq %r8, %rdx
     cmovbe %r11, %rdi
 .LVL43:
     .loc 4 29 0
-    addq $1, %rax
 .LVL44:
-    cmpq %r8, %rdx
     cmovbe %rdx, %rcx
 .LVL45:
     .loc 4 21 0
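
For readers following the diff: addq writes the flags, so in by-val-O3.s the compare had to be repeated before the second cmovbe. With the addq hoisted above the cmpq, one compare feeds both conditional moves. Below is a rough C sketch of that shape next to the branchy form. It is only an illustrative reading of the assembly, not the source from this report; state_t, step_select, step_branch and states_equal are hypothetical names, and whether GCC emits cmov or a branch for either function depends on the version and flags.

```c
#include <stdint.h>

/* Hypothetical state, named only for this sketch: hi/lo/count stand in
   for the values held in %rdi, %rcx and %rax in the diff above. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
    uint64_t count;
} state_t;

/* Select form: one unsigned compare feeds both conditional updates,
   which is the shape the fixed assembly has (cmpq, then two cmovbe). */
static state_t step_select(state_t s, uint64_t x, uint64_t limit,
                           uint64_t alt_hi)
{
    s.count += 1;                          /* increment done before the compare */
    int below_or_equal = (x <= limit);     /* single compare */
    s.hi = below_or_equal ? alt_hi : s.hi; /* candidate for cmovbe */
    s.lo = below_or_equal ? x : s.lo;      /* candidate for cmovbe */
    return s;
}

/* Branch form: same result, written so a compare-and-branch is the
   natural code to generate. */
static state_t step_branch(state_t s, uint64_t x, uint64_t limit,
                           uint64_t alt_hi)
{
    s.count += 1;
    if (x <= limit) {
        s.hi = alt_hi;
        s.lo = x;
    }
    return s;
}

/* Returns 1 when all three fields match, 0 otherwise. */
static int states_equal(state_t a, state_t b)
{
    return a.hi == b.hi && a.lo == b.lo && a.count == b.count;
}
```

Both functions compute the same thing; the timings above are about which instruction sequence the compiler picks for it, so comparing the output of gcc -O3 -S on each is the way to check what a given GCC actually generates.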