https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178
--- Comment #19 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 27 Jan 2022, hubicka at kam dot mff.cuni.cz wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178
>
> --- Comment #13 from hubicka at kam dot mff.cuni.cz ---
> > > According to znver2_cost
> > >
> > > the cost of sse_to_integer is a little bit less than fp_store; maybe
> > > increasing the sse_to_integer cost (above fp_store) can help the RA
> > > choose memory instead of a GPR.
> >
> > That sounds reasonable - GPR<->xmm is cheaper than GPR -> stack -> xmm,
> > but GPR<->xmm should be more expensive than GPR/xmm<->stack. As said
> > above, Zen2 can do reg -> mem, mem -> reg via renaming if 'mem' is
> > somewhat special, but modeling that doesn't seem to be necessary.
> >
> > We seem to have store costs of 8 and load costs of 6; I'll try bumping
> > the gpr<->xmm move cost to 8.
>
> I was simply following latencies here, so indeed the reg<->mem bypass is
> not really modelled. I recall doing a few experiments which were kind of
> inconclusive.

Yes, I think xmm->gpr->xmm vs. xmm->mem->xmm isn't really the issue here;
it's mem->gpr->xmm vs. mem->xmm, with all the constant-pool remats. Agner
lists a latency of 3 for gpr<->xmm and a latency of 4 for mem<->xmm, but
then there's forwarding (and the clever renaming trick), which likely makes
xmm->mem->xmm cheaper than 4 + 4, while xmm->gpr->xmm will really be 3 + 3
latency. gpr<->xmm also seems to be more resource constrained. In any case,
for moving xmm to gpr it doesn't make sense to go through memory, but it
doesn't seem worth spilling to xmm or gpr when we only use gpr / xmm later.
That leaves aside the odd and bogus code we generate for the .LC0
rematerialization; we should fix that, and fixing it will likely also fix
LBM.