http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46235

--- Comment #4 from Tony Poppleton <tony.poppleton at gmail dot com> 2011-01-28 18:08:15 UTC ---
As a quick test, I commented out the block with the following comment in
fold-const.c:
      /* If this is an EQ or NE comparison with zero and ARG0 is
         (1 << foo) & bar, convert it to (bar >> foo) & 1.  Both require
         two operations, but the latter can be done in one less insn
         on machines that have only two-operand insns or on which a
         constant cannot be the first operand.  */

This produces the following asm code:
        movl    $1, %edx
        movl    %edi, %eax
        movl    %esi, %ecx
        movl    %edx, %edi
        sall    %cl, %edi
        testl   %eax, %edi
        cmove   %edx, %eax
        ret
(using modified GCC 4.6.0 20110122)

So whilst I was hoping for an easy quick fix, it appears that the optimization
required to turn this into a "btl" test isn't performed by any later stage of
the compile.
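
For reference, the two forms named in the fold-const.c comment can be sketched
in C as below. This is only an illustration, not the test case from the bug
description: the function names are made up, unsigned operands are used to keep
the shifts well defined, and foo is assumed to be smaller than the bit width of
unsigned.

```c
#include <stdbool.h>

/* Hypothetical helper: the "(1 << foo) & bar" form compared against zero.
   Assumes foo is less than the width of unsigned. */
static bool bit_set_mask(unsigned bar, unsigned foo)
{
    return ((1u << foo) & bar) != 0;
}

/* Hypothetical helper: the "(bar >> foo) & 1" form that the folding
   produces.  Same assumption about foo. */
static bool bit_set_shift(unsigned bar, unsigned foo)
{
    return ((bar >> foo) & 1u) != 0;
}
```

Both helpers test bit number foo of bar, which is the operation "bt" performs
in a single instruction, so either form is a candidate for that conversion.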

Incidentally, judging by http://gmplib.org/~tege/x86-timing.pdf, "bt" appears
to be slow on the P4 architecture (8 cycles, if I am reading it correctly), so
the llvm code in the bug description isn't necessarily an optimization on this
arch.  Newer chips would probably still benefit, though.
