------- Comment #3 from svfuerst at gmail dot com  2010-04-30 16:12 -------
Oops, you are right.  With that code, the 128-bit version needs an extra sbb
on the end (for some reason I was misreading the shr as a sar):

mov    %rsi,%rdx
shr    $0x3f,%rdx
lea    (%rdi,%rdx,1),%rax
and    $0x1,%eax
sub    %rdx,%rax
sbb    %rdx,%rdx
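
As a rough cross-check, the sequence above can be modeled in C. This is only a sketch: it assumes the signed 128-bit input arrives with its low half in %rdi and its high half in %rsi, that the result is returned in %rdx:%rax, and that the compiler provides the `__int128` extension (GCC and Clang do). The helper name `mod2_shr` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models the shr/lea/and/sub/sbb sequence on a
   signed 128-bit value split into lo (%rdi) and hi (%rsi). */
static __int128 mod2_shr(__int128 x)
{
    uint64_t lo = (uint64_t)x;                            /* %rdi */
    uint64_t hi = (uint64_t)((unsigned __int128)x >> 64); /* %rsi */

    uint64_t rdx = hi >> 63;        /* mov + shr $0x3f: sign bit, 0 or 1 */
    uint64_t rax = (lo + rdx) & 1;  /* lea + and $0x1 */
    uint64_t borrow = rax < rdx;    /* carry flag that sub will produce */
    rax -= rdx;                     /* sub %rdx,%rax */
    rdx = (uint64_t)0 - borrow;     /* sbb %rdx,%rdx: 0 or all ones */

    /* Reassemble %rdx:%rax as the signed 128-bit result. */
    return (__int128)(((unsigned __int128)rdx << 64) | rax);
}
```

Under those assumptions, `mod2_shr(x)` should agree with `x % 2` for positive and negative inputs alike; the final sbb is what sign-extends a result of -1 into the high word.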

However, if you use sar + add instead of shr + sub + sbb, it is one
instruction fewer:
mov    %rsi,%rdx
sar    $0x3f,%rdx
lea    (%rdi,%rdx,1),%rax
and    $0x1,%eax
add    %rdx,%rax
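
The sar variant can be modeled the same way. Again this is only a sketch, assuming the low half of the input is in %rdi, the high half is in %rsi, and the result is read back from %rdx:%rax; the `struct regs` type and the name `mod2_sar` are hypothetical. It returns both registers separately so that each half can be inspected on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the final register pair. */
struct regs { uint64_t rax, rdx; };

/* Hypothetical helper: models the sar/lea/and/add sequence. */
static struct regs mod2_sar(__int128 x)
{
    uint64_t lo = (uint64_t)x;                            /* %rdi */
    uint64_t hi = (uint64_t)((unsigned __int128)x >> 64); /* %rsi */

    uint64_t rdx = (hi >> 63) ? UINT64_MAX : 0; /* mov + sar $0x3f: 0 or all ones */
    uint64_t rax = (lo + rdx) & 1;              /* lea + and $0x1 */
    rax += rdx;                                 /* add %rdx,%rax; carry-out is not used */

    struct regs r = { rax, rdx };
    return r;
}
```

In this model, `rax` matches the low 64 bits of `x % 2` for every input. For negative even inputs, however, `rdx` is left at all ones while `rax` is 0, because nothing folds the carry from the add back into the high word. It therefore seems worth checking whether the high half is consumed before relying on the shorter sequence for a full 128-bit result.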


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43883
