https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114591
--- Comment #14 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #13)
> (In reply to Hongtao Liu from comment #12)
> > short a;
> > short c;
> > short d;
> > void
> > foo (short b, short f)
> > {
> >   c = b + a;
> >   d = f + a;
> > }
> >
> > foo(short, short):
> >         addw    a(%rip), %di
> >         addw    a(%rip), %si
> >         movw    %di, c(%rip)
> >         movw    %si, d(%rip)
> >         ret
> >
> > This one has been bad since GCC 10.1, and there's no subreg. The problem is
> > that if the operand is used by more than one insn, and they all support a
> > separate m constraint, mem_cost is quite small (just 1; reg move cost is 2),
> > which makes RA more inclined to propagate memory across insns. I guess RA
> > assumes the separate m means the insn only supports memory_operand?
>
> I don't see this as problematic. IIRC, there was a discussion in the past
> that a couple (two?) memory accesses from the same location close to each
> other can be faster (so, -O2, not -Os) than preloading the value to the
> register first.
Someone just filed an issue similar to the above testcase (the one in comment
#12): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114688 :).
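
For anyone who wants to compare the two code shapes being discussed, here is a
minimal sketch. foo is the testcase from comment #12; foo_preload and
check_equivalent are hypothetical names added here for illustration only and
are not part of either bug report. The two functions are semantically
identical, so GCC may well emit the same code for both; compiling with
-O2 -S and reading the assembly shows what a given version actually does.

```c
#include <assert.h>

short a;
short c;
short d;

/* Testcase from comment #12, unchanged.  */
void
foo (short b, short f)
{
  c = b + a;
  d = f + a;
}

/* Hypothetical variant: copy a into a local first, which at the source
   level corresponds to preloading the value into a register.  The
   compiler is free to undo this.  */
void
foo_preload (short b, short f)
{
  short t = a;
  c = b + t;
  d = f + t;
}

/* Hypothetical helper: check that both variants store the same results.  */
void
check_equivalent (void)
{
  a = 3;
  foo (1, 2);
  short c1 = c, d1 = d;
  foo_preload (1, 2);
  assert (c == c1 && d == d1);
}
```

This only establishes that the two variants compute the same values; which
one is faster is exactly the open question in the quoted discussion.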