https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63483
--- Comment #6 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #5)
> The bug is clearly in
>
> "
> > Btw, if the mem is MEM_READONLY_P how can it be part of
> > a {un}aligned_store sequence?
>
> This flag is copied from the original memory operand of the store by
> alpha_set_memflags to all memory operands in the expanded sequence."
>
> and thus should be fixed there. What is the "original" memory reference
> here? It seems for the read-modify-write you should use the target MEM
> but somehow a source MEM gets used? Or rather you should not use either
> MEMs flags for _all_ MEMs in the sequence but properly distinguish between
> MEMs generated for the load of the source and MEMs generated for the store
> to the destination.
No, this is wrong conclusion. I have new insight into this problem.
Please consider following test:
--cut here--
static char *a;
static int *b;
void foo (void)
{
a[1] = 1;
b[2] = 1;
}
int bar (void)
{
return a && b;
}
--cut here--
(e.g. changing "static char *b" to "static int *b". This way, the problematic
insn is generated as "native" read:
(insn 15 14 16 2 (set (reg/f:DI 78)
(mem/u/f/c:DI (lo_sum:DI (reg:DI 79)
(symbol_ref:DI ("b") [flags 0x6] <var_decl 0x2b1d67f5e090 b>))
[2 b+0 S8 A64])) rmw1.c:7 -1
(nil))
and this is again moved over
(insn 13 12 14 2 (set (mem:DI (and:DI (plus:DI (reg/f:DI 72)
(const_int 1 [0x1]))
(const_int -8 [0xfffffffffffffff8])) [0 S8 A64])
(reg:DI 77)) rmw1.c:6 -1
(nil))
(note that there is no "/u" suffix on the *store* of "a".)
With this testcase, we have following sequence before sched1 pass:
5: r73:DI=high(`a')
6: r72:DI=[r73:DI+low(`a')]
REG_DEAD r73:DI
7: r74:QI=0x1
8: r76:DI=[r72:DI+0x1&0xfffffffffffffff8]
9: r75:DI=r72:DI+0x1
10: r76:DI=!0xff<<r75:DI<<0x3&r76:DI
11: r77:DI=zero_extend(r74:QI)<<r75:DI<<0x3
REG_DEAD r75:DI
REG_DEAD r74:QI
REG_EQUAL 0x1<<r75:DI<<0x3
12: r77:DI=r77:DI|r76:DI
REG_DEAD r76:DI
13: [r72:DI+0x1&0xfffffffffffffff8]=r77:DI
REG_DEAD r77:DI
REG_DEAD r72:DI
14: r79:DI=high(`b')
15: r78:DI=[r79:DI+low(`b')]
REG_DEAD r79:DI
16: r80:SI=0x1
17: [r78:DI+0x8]=r80:SI
REG_DEAD r80:SI
REG_DEAD r78:DI
and sched1 moves (insn 15) all the way up, past (insn 13):
5: r73:DI=high(`a')
14: r79:DI=high(`b')
7: r74:QI=0x1
6: r72:DI=[r73:DI+low(`a')]
REG_DEAD r73:DI
16: r80:SI=0x1
15: r78:DI=[r79:DI+low(`b')]
REG_DEAD r79:DI
9: r75:DI=r72:DI+0x1
8: r76:DI=[r72:DI+0x1&0xfffffffffffffff8]
11: r77:DI=zero_extend(r74:QI)<<r75:DI<<0x3
REG_DEAD r75:DI
REG_DEAD r74:QI
REG_EQUAL 0x1<<r75:DI<<0x3
10: r76:DI=!0xff<<r75:DI<<0x3&r76:DI
12: r77:DI=r77:DI|r76:DI
REG_DEAD r76:DI
13: [r72:DI+0x1&0xfffffffffffffff8]=r77:DI
REG_DEAD r77:DI
REG_DEAD r72:DI
17: [r78:DI+0x8]=r80:SI
REG_DEAD r80:SI
REG_DEAD r78:DI
20: NOTE_INSN_DELETED
(this particular sequence won't end in a failure, but the insn shouldn't be
moved past possibly aliasing (insn 13) anyway.
Patched compiler results in:
5: r73:DI=high(`a')
7: r74:QI=0x1
14: r79:DI=high(`b')
6: r72:DI=[r73:DI+low(`a')]
REG_DEAD r73:DI
16: r80:SI=0x1
9: r75:DI=r72:DI+0x1
8: r76:DI=[r72:DI+0x1&0xfffffffffffffff8]
11: r77:DI=zero_extend(r74:QI)<<r75:DI<<0x3
REG_DEAD r75:DI
REG_DEAD r74:QI
REG_EQUAL 0x1<<r75:DI<<0x3
10: r76:DI=!0xff<<r75:DI<<0x3&r76:DI
12: r77:DI=r77:DI|r76:DI
REG_DEAD r76:DI
13: [r72:DI+0x1&0xfffffffffffffff8]=r77:DI
REG_DEAD r77:DI
REG_DEAD r72:DI
15: r78:DI=[r79:DI+low(`b')]
REG_DEAD r79:DI
17: [r78:DI+0x8]=r80:SI
REG_DEAD r80:SI
REG_DEAD r78:DI
As shown, the code further down in true_dependence_1 function blocks the move,
since the read is detected as aliased with the preceding store involving AND
alignment.
There is nothing that can be done in target-dependant code, since the "native"
read from "b" gets marked as "unchanging" by the generic middle-end code.
Please reconsider the "component" setting. There is nothing that can be fixed
in target-dependent code.