http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #65 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #64)
> 
> would be simplified to this:
> 
>         mov.l   @(4,r4),r1
>         tst     r1,r1   // T = @(4,r4) == 0
> .L3:
>         bt/s    .L5
>         mov     #1,r1
>         cmp/hi  r1,r5
>         bf/s    .L9
>         mov     #0,r0
>         rts
>         nop
> .L2:
>         mov.l   @r4,r1
>         bra     .L3
>         tst     r1,r1   // T = @(r4) == 0

Sorry, I got confused.  The above is wrong: one of the T bit inversions can't
be eliminated in this case, because the two return paths test with opposite
polarity.  It should be:

        mov.l   @(4,r4),r1
.L3:
        tst     r1,r1
        bt/s    .L5
        mov     #1,r1
        cmp/hi  r1,r5
        bf/s    .L9
        mov     #0,r0
        rts
        nop
.L2:
        mov.l   @r4,r1
        tst     r1,r1
        bra     .L3
        movt    r1


Or, on SH2A (which has the nott insn):
        mov.l   @(4,r4),r1
        tst     r1,r1
.L3:
        bt/s    .L5
        mov     #1,r1
        cmp/hi  r1,r5
        bf/s    .L9
        mov     #0,r0
        rts
        nop
.L2:
        mov.l   @r4,r1
        tst     r1,r1
        bra     .L3
        nott
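
To double-check that the movt/tst pair and the SH2A nott leave the same T bit
at the bt/s, here is a small C model of the .L2 path.  It is only a sketch:
the sh_* helper names are made up for illustration, and it assumes the
original test case returns q[1] != 0 on the other path, so that T must be
(q[0] != 0) when the .L2 path reaches the branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers that model only the T bit effect of each insn. */

/* tst Rn,Rn: T = (Rn == 0) */
static bool sh_tst(uint32_t rn) { return rn == 0; }

/* movt Rn: Rn = T */
static uint32_t sh_movt(bool t) { return t ? 1u : 0u; }

/* nott (SH2A): T = !T */
static bool sh_nott(bool t) { return !t; }

/* Non-SH2A .L2 path: tst r1,r1; movt r1 (delay slot); then tst r1,r1 at .L3. */
static bool t_at_branch_via_movt(uint32_t q0)
{
  bool t = sh_tst(q0);
  uint32_t r1 = sh_movt(t);
  return sh_tst(r1);
}

/* SH2A .L2 path: tst r1,r1; nott (delay slot); .L3 goes straight to bt/s. */
static bool t_at_branch_via_nott(uint32_t q0)
{
  return sh_nott(sh_tst(q0));
}

/* Both sequences should yield T = (q0 != 0) for every input. */
static void check_t_bit_model(void)
{
  const uint32_t samples[] = { 0u, 1u, 2u, 0x80000000u, 0xffffffffu };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      uint32_t q0 = samples[i];
      assert(t_at_branch_via_movt(q0) == (q0 != 0));
      assert(t_at_branch_via_nott(q0) == (q0 != 0));
    }
}
```

Calling check_t_bit_model () from a test driver should pass without tripping
an assert, which matches the claim that exactly one inversion has to stay.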

However, my original 'optimized' asm snippet is valid if the reduced test case
is changed to:

static inline int
blk_oversized_queue (int* q)
{
  if (q[2])
    return q[1] == 0;   // instead of != 0
  return q[0] == 0;
}

The current trunk version eliminates the movt/tst insns and produces correct
code by accident.  It can be simplified even more:

        mov.l   @(4,r4),r1
.L3:
        tst     r1,r1
        bt/s    .L5
        mov     #1,r1
        cmp/hi  r1,r5
        bf/s    .L9
        mov     #0,r0
        rts
        nop
.L2:
        bra     .L3
        mov.l   @r4,r1
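
As a sanity check of the shared tst at .L3, the sequence above can be modelled
in C and compared against the modified test case.  This is a rough sketch
under assumptions: the snippet does not show how the two paths are selected,
so the model assumes the selection is on q[2] as in the C source, and the
function names other than blk_oversized_queue are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Modified reduced test case from this comment. */
static inline int blk_oversized_queue_modified(const int *q)
{
  if (q[2])
    return q[1] == 0;
  return q[0] == 0;
}

/* Original form, as implied by the "instead of != 0" remark. */
static inline int blk_oversized_queue_original(const int *q)
{
  if (q[2])
    return q[1] != 0;
  return q[0] == 0;
}

/* Model of the simplified asm: each path only loads r1 (q[1] or q[0]),
   and the single tst r1,r1 at .L3 sets T = (r1 == 0).  */
static bool t_after_shared_tst(const int *q)
{
  int r1 = q[2] ? q[1] : q[0];
  return r1 == 0;
}

/* For the modified test case T equals the function result on both paths. */
static void check_shared_tst(void)
{
  for (int q0 = 0; q0 <= 1; q0++)
    for (int q1 = 0; q1 <= 1; q1++)
      for (int q2 = 0; q2 <= 1; q2++)
        {
          const int q[3] = { q0, q1, q2 };
          assert(t_after_shared_tst(q)
                 == (blk_oversized_queue_modified(q) != 0));
        }
}
```

With the original test case the same model disagrees with itself: for
q = {0, 1, 1} the result is 1 while T is 0, and for q = {0, 0, 0} the result
is 1 while T is 1, so no single branch polarity would fit both paths.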

I'm trying to come up with a patch that implements T bit tracing in order to
handle such cases.
