https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178

--- Comment #10 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #8)
> So w/ -Ofast -march=znver2 I get a runtime of 130 seconds, when I add
> -mtune-ctrl=^inter_unit_moves_from_vec,^inter_unit_moves_to_vec then
> this improves to 114 seconds, with sink2 disabled I get 108 seconds
> and with the tune-ctrl on top I get 113 seconds.
> 
> Note that Zen2 is quite special in that it has the ability to handle
> load/store from the stack by mapping it to a register, effectively
> making them zero latency (zen3 lost this ability).
> 
> So while moves between GPRs and XMM might not be bad anymore _spilling_
> to a GPR (and I suppose XMM, too) is still a bad idea and the stack
> should be preferred.
> 

According to znver2_cost, the cost of sse_to_integer is slightly lower than
that of fp_store. Increasing the sse_to_integer cost (to more than fp_store)
may help the RA choose memory instead of a GPR for these spills.
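
To illustrate the reasoning only, here is a toy sketch of the comparison. It is
not GCC's actual RA code: the struct, the function name and any numbers passed
in are hypothetical, and the real decision involves many more inputs (register
pressure, frequencies, other cost entries). It only shows that once
sse_to_integer is no longer cheaper than fp_store, a "pick the cheaper option"
rule flips from GPR to memory.

```c
/* Toy model only: names and values are hypothetical, not taken from
   znver2_cost or from GCC's register allocator.  */
struct toy_costs
{
  int sse_to_integer; /* cost of moving an SSE register to a GPR */
  int fp_store;       /* cost of storing the value to memory */
};

enum spill_target
{
  SPILL_TO_GPR,
  SPILL_TO_MEMORY
};

/* Pick the cheaper spill target.  A GPR is chosen only when the
   SSE->integer move is strictly cheaper than the store; ties go to memory.  */
static enum spill_target
choose_spill_target (const struct toy_costs *c)
{
  return c->sse_to_integer < c->fp_store ? SPILL_TO_GPR : SPILL_TO_MEMORY;
}
```

With made-up costs such as {6, 8} this toy rule picks the GPR; after raising
the first value above the second, e.g. {10, 8}, it picks memory.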
