[Bug c/32433] New: Code for __builtin_ffs does not benefit from compiler optimizations

guillaume dot melquiond at ens-lyon dot fr Wed, 20 Jun 2007 10:30:50 -0700

When counting the number of trailing zeros using ffs, it is not uncommon to
want a 0-based result and a special value for the zero case. Example:


extern int ffs(int);
int my_ctz(int x) { return x == 0 ? 24 : ffs(x) - 1; }

The generated code (-O3 on x86-64) looks like:

.LFB2:
        testl   %edi, %edi
        movl    $24, %eax
        jne     .L6
        rep ; ret
        .p2align 4,,7
.L6:
        bsfl    %edi, %eax
        movl    $-1, %edx     <-
        cmove   %edx, %eax    <-
        addl    $1, %eax      <-
        subl    $1, %eax      <-
        ret

Notice that the last four lines are useless.

The optimal assembly code would be:

        bsfl    %edi, %eax
        movl    $24, %edx
        cmove   %edx, %eax
        ret

Tested on GCC 4.2.1 and 4.3.0 (20070613) for x86-32 and x86-64 target.

When using __builtin_ctz instead in my_ctz, the code is (almost) optimal. But
unfortunately there is no "ctz" posix function, while there is an "ffs" one.
That's why ffs is used in "portable" code.

Anyway, as a consequence of the code of __builtin_ctz being optimal, we can
force the compiler to generate optimal code by defining an ffs function by
hand:

int my_ffs(int x) { return x == 0 ? 0 : __builtin_ctz(x) + 1; }
int my_ctz(int x) { return x == 0 ? 24 : my_ffs(x) - 1; }

The compiler generates for my_ctz:

        bsfl    %edi, %edx
        movl    $24, %eax
        testl   %edi, %edi
        cmovne  %edx, %eax
        ret

So the definition of __builtin_ffs should really be modified on x86, so that it
expands to the code described in my_ffs. Compiler optimizations will then apply
and they will remove the useless tests and arithmetic operations.


-- 
           Summary: Code for __builtin_ffs does not benefit from compiler
                    optimizations
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: guillaume dot melquiond at ens-lyon dot fr


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32433

[Bug c/32433] New: Code for __builtin_ffs does not benefit from compiler optimizations

Reply via email to