https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63483

--- Comment #6 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #5)
> The bug is clearly in
> 
> "
> > Btw, if the mem is MEM_READONLY_P how can it be part of
> > a {un}aligned_store sequence?
> 
> This flag is copied from the original memory operand of the store by
> alpha_set_memflags to all memory operands in the expanded sequence."
> 
> and thus should be fixed there.  What is the "original" memory reference
> here?  It seems for the read-modify-write you should use the target MEM
> but somehow a source MEM gets used?  Or rather you should not use either
> MEMs flags for _all_ MEMs in the sequence but properly distinguish between
> MEMs generated for the load of the source and MEMs generated for the store
> to the destination.

No, this is wrong conclusion. I have new insight into this problem.

Please consider following test:

--cut here--
static char *a;
static int *b;

void foo (void)
{
  a[1] = 1;
  b[2] = 1;
}

int bar (void)
{
  return a && b;
}
--cut here--

(e.g. changing "static char *b" to "static int *b". This way, the problematic
insn is generated as "native" read:

(insn 15 14 16 2 (set (reg/f:DI 78)
        (mem/u/f/c:DI (lo_sum:DI (reg:DI 79)
                (symbol_ref:DI ("b") [flags 0x6]  <var_decl 0x2b1d67f5e090 b>))
[2 b+0 S8 A64])) rmw1.c:7 -1
     (nil))

and this is again moved over

(insn 13 12 14 2 (set (mem:DI (and:DI (plus:DI (reg/f:DI 72)
                    (const_int 1 [0x1]))
                (const_int -8 [0xfffffffffffffff8])) [0  S8 A64])
        (reg:DI 77)) rmw1.c:6 -1
     (nil))

(note that there is no "/u" suffix on the *store* of "a".)

With this testcase, we have following sequence before sched1 pass:

    5: r73:DI=high(`a')
    6: r72:DI=[r73:DI+low(`a')]
      REG_DEAD r73:DI
    7: r74:QI=0x1
    8: r76:DI=[r72:DI+0x1&0xfffffffffffffff8]
    9: r75:DI=r72:DI+0x1
   10: r76:DI=!0xff<<r75:DI<<0x3&r76:DI
   11: r77:DI=zero_extend(r74:QI)<<r75:DI<<0x3
      REG_DEAD r75:DI
      REG_DEAD r74:QI
      REG_EQUAL 0x1<<r75:DI<<0x3
   12: r77:DI=r77:DI|r76:DI
      REG_DEAD r76:DI
   13: [r72:DI+0x1&0xfffffffffffffff8]=r77:DI
      REG_DEAD r77:DI
      REG_DEAD r72:DI
   14: r79:DI=high(`b')
   15: r78:DI=[r79:DI+low(`b')]
      REG_DEAD r79:DI
   16: r80:SI=0x1
   17: [r78:DI+0x8]=r80:SI
      REG_DEAD r80:SI
      REG_DEAD r78:DI

and sched1 moves (insn 15) all the way up, past (insn 13):

    5: r73:DI=high(`a')
   14: r79:DI=high(`b')
    7: r74:QI=0x1
    6: r72:DI=[r73:DI+low(`a')]
      REG_DEAD r73:DI
   16: r80:SI=0x1
   15: r78:DI=[r79:DI+low(`b')]
      REG_DEAD r79:DI
    9: r75:DI=r72:DI+0x1
    8: r76:DI=[r72:DI+0x1&0xfffffffffffffff8]
   11: r77:DI=zero_extend(r74:QI)<<r75:DI<<0x3
      REG_DEAD r75:DI
      REG_DEAD r74:QI
      REG_EQUAL 0x1<<r75:DI<<0x3
   10: r76:DI=!0xff<<r75:DI<<0x3&r76:DI
   12: r77:DI=r77:DI|r76:DI
      REG_DEAD r76:DI
   13: [r72:DI+0x1&0xfffffffffffffff8]=r77:DI
      REG_DEAD r77:DI
      REG_DEAD r72:DI
   17: [r78:DI+0x8]=r80:SI
      REG_DEAD r80:SI
      REG_DEAD r78:DI
   20: NOTE_INSN_DELETED

(this particular sequence won't end in a failure, but the insn shouldn't be
moved past possibly aliasing (insn 13) anyway.

Patched compiler results in:

    5: r73:DI=high(`a')
    7: r74:QI=0x1
   14: r79:DI=high(`b')
    6: r72:DI=[r73:DI+low(`a')]
      REG_DEAD r73:DI
   16: r80:SI=0x1
    9: r75:DI=r72:DI+0x1
    8: r76:DI=[r72:DI+0x1&0xfffffffffffffff8]
   11: r77:DI=zero_extend(r74:QI)<<r75:DI<<0x3
      REG_DEAD r75:DI
      REG_DEAD r74:QI
      REG_EQUAL 0x1<<r75:DI<<0x3
   10: r76:DI=!0xff<<r75:DI<<0x3&r76:DI
   12: r77:DI=r77:DI|r76:DI
      REG_DEAD r76:DI
   13: [r72:DI+0x1&0xfffffffffffffff8]=r77:DI
      REG_DEAD r77:DI
      REG_DEAD r72:DI
   15: r78:DI=[r79:DI+low(`b')]
      REG_DEAD r79:DI
   17: [r78:DI+0x8]=r80:SI
      REG_DEAD r80:SI
      REG_DEAD r78:DI

As shown, the code further down in true_dependence_1 function blocks the move,
since the read is detected as aliased with the preceding store involving AND
alignment.

There is nothing that can be done in target-dependant code, since the "native"
read from "b" gets marked as "unchanging" by the generic middle-end code.

Please reconsider the "component" setting. There is nothing that can be fixed
in target-dependent code.

Reply via email to