https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641
--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #8)
> Hi!
>
> (In reply to Richard Biener from comment #7)
> > Wow, and this time it's even combine coming into play!
>
> But it is just something that happens during the instruction combiner pass,
> not anything to do with its own code :-)
>
> > Trying 4 -> 11:
> > 4: r86:DI=r91:DI
> > REG_DEAD r91:DI
> > 11: [r86:DI]=[r86:DI]
> > REG_DEAD r86:DI
> > Failed to match this instruction:
> > (set (mem/j:DI (reg:DI 91) [2 pu_6(D)->x+0 S8 A64])
> > (mem/j:DI (reg:DI 91) [2 pu_6(D)->y+0 S8 A64]))
> > allowing combination of insns 4 and 11
> > original costs 4 + 4 = 8
> > replacement cost 4
> > deferring deletion of insn with uid = 4.
> > modifying insn i3 11: [r91:DI]=[r91:DI]
> > REG_DEAD r91:DI
> > deferring rescan insn with uid = 11.
> > deleting noop move 11
> >
> >
> > somewhere inside combine we'd have to realize that this isn't a noop move
>
> What makes you say it is not? It passes noop_move_p (since its pattern
> passes set_noop_p), after all.
>
>   if (MEM_P (dst) && MEM_P (src))
>     return (rtx_equal_p (dst, src)
>             && !side_effects_p (dst)
>             && !side_effects_p (src));
(*)
> > and then maybe not allow the combination in the first place since it
> > isn't recognizable?
>
> It passes recog() just fine, that is what
>
> > allowing combination of insns 4 and 11
>
> tells us.
>
> > That is, somehow we must anticipate the removal,
> > I suppose it is via
> >
> > /* Recognize all noop sets, these will be killed by followup pass. */
> > if (insn_code_number < 0 && GET_CODE (pat) == SET && set_noop_p (pat))
> > insn_code_number = NOOP_MOVE_INSN_CODE, num_clobbers_to_add = 0;
>
> It is not. We must already pass set_noop_p before that insn_code is ever
> set, as you show here.
Oh, so it's just a compile-time optimization then.
> > where set_noop_p for two MEMs simply dispatches to
> > rtx_equal_p && !side_effects_p.
>
> Yes. Which is completely correct, no? RTL describes what happens in the
> machine, not some layer whatever language drops on top of this, that has to
> be handled in the language frontend.
Even RTL is more than just "what happens in the machine", else we'd not
have MEM_ALIAS_SET or other side-band information usable by optimizers.
So then (*), aka
  if (MEM_P (dst) && MEM_P (src))
    return (rtx_equal_p (dst, src)
            && !side_effects_p (dst)
            && !side_effects_p (src));
is not correct, since the store changes the effective type of DST
with respect to TBAA (that's what all these bugs are about).  One might
argue the bug is in rtx_equal_p, but then the "side-effect" of altering
the effective type is part of the insn, not of the SRC or DST by themselves.
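To make the distinction concrete, here is a toy C model.  It is not GCC
code: struct toy_mem, its fields, and both predicates are made up for
illustration, and the second predicate is only a sketch of "look at the
side-band info too", not a proposed fix.  The first predicate mirrors the
shape of the check in (*); the second also compares a stand-in for the
member recorded on the MEM (->x vs. ->y in the dump above).

```c
#include <stdbool.h>

/* Toy stand-in for an RTL MEM; hypothetical, NOT GCC's rtx.  */
struct toy_mem
{
  int base_reg;          /* register holding the address, e.g. 91 */
  long offset;           /* constant offset from that register */
  int alias_set;         /* stand-in for MEM_ALIAS_SET */
  int member_id;         /* stand-in for side-band info such as MEM_EXPR */
  bool has_side_effects; /* stand-in for side_effects_p () */
};

/* Address-only comparison: the side-band fields are ignored here.  */
bool
toy_same_address (const struct toy_mem *a, const struct toy_mem *b)
{
  return a->base_reg == b->base_reg && a->offset == b->offset;
}

/* Same shape as the MEM/MEM case in (*): equal address, no side effects.  */
bool
toy_set_noop_p (const struct toy_mem *dst, const struct toy_mem *src)
{
  return (toy_same_address (dst, src)
          && !dst->has_side_effects
          && !src->has_side_effects);
}

/* Hypothetical stricter variant: additionally require that the store
   does not change which member the location is accessed as.  */
bool
toy_set_noop_tbaa_p (const struct toy_mem *dst, const struct toy_mem *src)
{
  return toy_set_noop_p (dst, src) && dst->member_id == src->member_id;
}
```

With dst = { 91, 0, 2, 0, false } and src = { 91, 0, 2, 1, false } (same
address and alias set, different member), the first predicate calls the
store a noop while the second does not.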
Note this is all a bit academic since the testcase is miscompiled in many
places during GIMPLE already. I was just noting that set_noop_p will
eventually trigger the same issue during the combine pass (as you say,
not a fault of combine, but of general infrastructure).