https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #28 from Bernd Edlinger <bernd.edlinger at hotmail dot de> ---
With my latest patch I bootstrapped a configuration with
--with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16
--with-float=hard
I noticed a single regression in gcc.target/arm/pr53447-*.c,
which is caused by disabling the adddi3 expansion.
void t0p(long long * p)
{
*p += 0x100000001;
}
used to get compiled to this at -O2:
ldrd r2, [r0]
adds r2, r2, #1
adc r3, r3, #1
strd r2, [r0]
bx lr
but without the adddi3 pattern I get this at -O2:
ldr r3, [r0]
ldr r1, [r0, #4]
cmn r3, #1
add r3, r3, #1
movcc r2, #0
movcs r2, #1
add r1, r1, #1
str r3, [r0]
add r3, r2, r1
str r3, [r0, #4]
bx lr
Note that the ldrd instructions are gone as well.
Unfortunately, the ldrd also goes away for the other di3 patterns:
void t0p(long long * p)
{
*p |= 0x100000001;
}
with iordi3 this was compiled to:
ldrd r2, [r0]
orr r2, r2, #1
orr r3, r3, #1
strd r2, [r0]
bx lr
and without iordi3:
ldm r0, {r2, r3}
orr r2, r2, #1
orr r3, r3, #1
stm r0, {r2, r3}
bx lr
but
void t0p(long long * p)
{
p[1] |= 0x100000001;
}
gets two loads instead:
ldr r2, [r0, #8]
ldr r3, [r0, #12]
orr r2, r2, #1
orr r3, r3, #1
str r2, [r0, #8]
str r3, [r0, #12]
bx lr
however:
void t0p(long long * p)
{
p[1] <<= 11;
}
gets compiled into this:
ldr r3, [r0, #12]
ldr r2, [r0, #8]
lsl r3, r3, #11
lsl r1, r2, #11
orr r3, r3, r2, lsr #21
str r1, [r0, #8]
str r3, [r0, #12]
bx lr
even before my patch.
I think this is the effect on the ldrd that you already mentioned,
and it gets worse when the expansion breaks the di registers up
into two si registers.
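
To illustrate what the split-up addition computes, here is a minimal C
sketch of a 64-bit add done on two 32-bit halves with an explicit carry.
This is not GCC code; the helper name add64_split and its parameters are
made up for illustration. It mirrors the cmn/movcc/movcs sequence in the
first example, where the carry is materialized in a register instead of
being consumed directly by adc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: add the 64-bit constant (c_hi:c_lo) to the 64-bit
   value held in the two 32-bit words *hi and *lo.  */
static void add64_split(uint32_t *lo, uint32_t *hi,
                        uint32_t c_lo, uint32_t c_hi)
{
    assert(lo && hi);

    uint32_t sum_lo = *lo + c_lo;      /* low word, wraps modulo 2^32 */
    uint32_t carry = sum_lo < c_lo;    /* 1 if the low-word add overflowed */

    *lo = sum_lo;
    *hi = *hi + c_hi + carry;          /* high word plus the explicit carry */
}
```

With *lo = 0xffffffff and *hi = 2, adding c_lo = 1 and c_hi = 1 (that is,
0x100000001) leaves *lo = 0 and *hi = 4. The adds/adc pair does the same
thing without ever holding the carry in a general-purpose register.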