http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244
--- Comment #65 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #64)
>
> would be simplified to this:
>
> mov.l @(4,r4),r1
> tst r1,r1 // T = @(4,r4) == 0
> .L3:
> bt/s .L5
> mov #1,r1
> cmp/hi r1,r5
> bf/s .L9
> mov #0,r0
> rts
> nop
> .L2:
> mov.l @r4,r1
> bra .L3
> tst r1,r1 // T = @(r4) == 0
Sorry, I got confused. The above is wrong. One of the T bit inversions can't
be eliminated in this case.
It should be:
mov.l @(4,r4),r1
.L3:
tst r1,r1
bt/s .L5
mov #1,r1
cmp/hi r1,r5
bf/s .L9
mov #0,r0
rts
nop
.L2:
mov.l @r4,r1
tst r1,r1
bra .L3
movt r1
Or, on SH2A:
mov.l @(4,r4),r1
tst r1,r1
.L3:
bt/s .L5
mov #1,r1
cmp/hi r1,r5
bf/s .L9
mov #0,r0
rts
nop
.L2:
mov.l @r4,r1
tst r1,r1
bra .L3
nott
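
As a rough cross-check of the two corrected sequences, here is a small C model of the T bit on each path. It is only a sketch under assumptions: the original reduced test case is not repeated in this comment, so blk_oversized_queue_orig below is inferred from the "instead of != 0" remark; the path selection on q[2] is taken from the C source rather than from the elided asm; and the helper names (sh_tst, sh_movt, sh_nott, t_withdrawn, t_movt, t_nott, check_sequences) are made up for illustration. The model assumes the bt/s at .L3 expects T = (result == 0), which is what both corrected sequences produce.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed original test case, inferred from the "instead of != 0" remark. */
static int blk_oversized_queue_orig(const int *q)
{
    if (q[2])
        return q[1] != 0;
    return q[0] == 0;
}

/* Toy models of the T bit insns used in the sequences above. */
static bool sh_tst(int rm, int rn) { return (rm & rn) == 0; } /* tst Rm,Rn */
static int  sh_movt(bool t)        { return t ? 1 : 0; }      /* movt Rn   */
static bool sh_nott(bool t)        { return !t; }             /* nott      */

/* Withdrawn sequence: a plain tst on each path, no inversion on .L2. */
static bool t_withdrawn(const int *q)
{
    return q[2] ? sh_tst(q[1], q[1]) : sh_tst(q[0], q[0]);
}

/* Corrected sequence: movt in the delay slot, shared tst at .L3. */
static bool t_movt(const int *q)
{
    int r1 = q[1];                          /* mov.l @(4,r4),r1      */
    if (!q[2])
        r1 = sh_movt(sh_tst(q[0], q[0]));   /* .L2: mov.l, tst, movt */
    return sh_tst(r1, r1);                  /* .L3: tst r1,r1        */
}

/* SH2A sequence: tst on each path, nott in the delay slot on .L2. */
static bool t_nott(const int *q)
{
    if (q[2])
        return sh_tst(q[1], q[1]);
    return sh_nott(sh_tst(q[0], q[0]));
}

/* Checks both corrected sequences over a few sample values and returns
   the number of inputs on which the withdrawn sequence gets T wrong. */
static int check_sequences(void)
{
    static const int vals[] = { 0, 1, 5, -1 };
    int withdrawn_mismatches = 0;

    for (int a = 0; a < 4; a++)
        for (int b = 0; b < 4; b++)
            for (int c = 0; c < 4; c++) {
                const int q[3] = { vals[a], vals[b], vals[c] };
                bool want = blk_oversized_queue_orig(q) == 0;

                assert(t_movt(q) == want);
                assert(t_nott(q) == want);
                if (t_withdrawn(q) != want)
                    withdrawn_mismatches++;
            }
    return withdrawn_mismatches;
}
```

Calling check_sequences() from a main function passes the asserts for both corrected sequences. The withdrawn sequence only disagrees on inputs with q[2] == 0, i.e. on the .L2 path, which is where the remaining T bit inversion is needed.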
However, my original 'optimized' asm snippet is valid if the reduced test case
is changed to:
static inline int
blk_oversized_queue (int* q)
{
  if (q[2])
    return q[1] == 0; // instead of != 0
  return q[0] == 0;
}
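
To illustrate why the modified test case needs no inversion, here is a minimal C sketch. Both return paths now compare the loaded word against zero with the same polarity, so a single tst after the paths merge yields a T bit that equals the function result on either path. The names blk_oversized_queue_mod, t_after_shared_tst and check_shared_tst are hypothetical, and the model only covers how T is computed, not the code that consumes it.

```c
#include <assert.h>
#include <stdbool.h>

/* Modified reduced test case, as given above. */
static int blk_oversized_queue_mod(const int *q)
{
    if (q[2])
        return q[1] == 0;
    return q[0] == 0;
}

/* Hypothetical model: each path only loads its word into r1,
   then one shared "tst r1,r1" sets T = (r1 == 0). */
static bool t_after_shared_tst(const int *q)
{
    int r1 = q[2] ? q[1] : q[0];   /* mov.l on either path */
    return (r1 & r1) == 0;         /* tst r1,r1            */
}

/* Checks that T equals the function result for a few sample values. */
static void check_shared_tst(void)
{
    static const int vals[] = { 0, 1, 5, -1 };

    for (int a = 0; a < 4; a++)
        for (int b = 0; b < 4; b++)
            for (int c = 0; c < 4; c++) {
                const int q[3] = { vals[a], vals[b], vals[c] };
                assert(t_after_shared_tst(q)
                       == (blk_oversized_queue_mod(q) != 0));
            }
}
```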
The current trunk version eliminates the movt/tst insns and produces correct
code by accident. It can be simplified even more:
mov.l @(4,r4),r1
.L3:
tst r1,r1
bt/s .L5
mov #1,r1
cmp/hi r1,r5
bf/s .L9
mov #0,r0
rts
nop
.L2:
bra .L3
mov.l @r4,r1
I'm trying to come up with a patch that implements T bit tracing to handle
these scenarios.