------- Comment #3 from ubizjak at gmail dot com  2007-08-28 12:02 -------
Current mainline [GCC: (GNU) 4.3.0 20070828] generates:

test:
.LFB2:
        xorl    %eax, %eax
        xorl    %edx, %edx
        .align 16
.L2:
        addb    table(%rdx), %al
        addq    $1, %rdx
        cmpq    $10, %rdx
        jne     .L2
        movsbl  %al,%eax
        ret

Now, there is an optimization problem in the initialization code in the loop
header. To omit one iteration, it should start with:

.LFB2:
        movzbl  table(%rip), %ecx
        movl    $1, %edx
.L2:
        ...
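
For illustration, the same transformation written at the source level. This is
only a sketch: the original test case is not quoted in this comment, so the
table contents, the element type and the function names below are assumptions.
The first iteration is peeled off, the accumulator starts out as table[0], and
the loop runs 9 times instead of 10.

```c
/* Hypothetical reconstruction; table contents and names are assumed. */
signed char table[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

/* Shape of the code as currently generated: the accumulator is zeroed
   and the loop body runs 10 times. */
int test_plain(void)
{
    signed char val = 0;
    unsigned long i;

    for (i = 0; i < 10; i++)
        val += table[i];
    return val;
}

/* First iteration peeled: the accumulator is preloaded with table[0]
   and the loop body runs 9 times, starting at index 1. */
int test_peeled(void)
{
    signed char val = table[0];
    unsigned long i;

    for (i = 1; i < 10; i++)
        val += table[i];
    return val;
}
```

Both functions compute the same sum; the peeled form trades the xorl of the
accumulator for a load and saves one trip through the loop.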

BTW: By reversing the loop:

  for (i = 9; i; i--)
    val += table[i];

we could remove the comparison from the loop, but instead we produce (x86_64):

test:
.LFB2:
        xorl    %eax, %eax
        xorl    %edx, %edx
        .align 16
.L2:
        addb    table+9(%rdx), %al
        subq    $1, %rdx
        cmpq    $-9, %rdx
        jne     .L2
        movsbl  %al,%eax
        ret

However, using -m32 we get:

test:
        xorl    %eax, %eax
        movl    $9, %edx
        .align 16
.L2:
        addb    table(%edx), %al
        subl    $1, %edx
        jne     .L2
        movsbl  %al,%eax
        ret
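
The two code shapes above correspond roughly to the following C. Again only a
sketch: the table contents and the function names are assumptions, not the
original test case. The x86_64 output indexes from table+9 with a counter that
runs from 0 down to -8 and so needs an explicit compare against -9, while the
-m32 output counts the index itself down to zero, so the flags set by the
decrement are enough for the branch. Note that the reversed loop as written
never reads table[0].

```c
/* Hypothetical reconstruction; table contents and names are assumed. */
signed char table[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

/* -m32 shape: the index counts from 9 down to 1 and the loop exits
   when it reaches zero, so no separate comparison is required. */
int test_reversed(void)
{
    signed char val = 0;
    int i;

    for (i = 9; i; i--)
        val += table[i];
    return val;
}

/* x86_64 shape: the counter runs 0, -1, ..., -8 relative to table+9
   and has to be compared against -9 on every iteration. */
int test_reversed_offset(void)
{
    signed char val = 0;
    long k;

    for (k = 0; k != -9; k--)
        val += table[9 + k];
    return val;
}
```

Both variants visit table[9] down to table[1] and return the same value.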

In the reversed-loop case, we could generate:

test:
        movzbl  table+9, %eax
        movl    $8, %edx
        .align 16
.L2:
        addb    table(%edx), %al
        subl    $1, %edx
        jne     .L2
        movsbl  %al,%eax
        ret


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|char adding gives an extra  |char adding (in loops) gives
                   |move or two                 |an extra move or two


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31248
