reload, and generic should think about it.

peter at cordes dot ca Fri, 19 May 2017 15:01:41 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80820


--- Comment #3 from Peter Cordes <peter at cordes dot ca> ---
Also, going the other direction is not symmetric.  On some CPUs, a store/reload
strategy for xmm->int might be better even if an ALU strategy for int->xmm is
best.
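
To make the two directions concrete, here is a rough C sketch of each strategy using SSE2 intrinsics. The function names are made up for illustration, and it assumes x86-64 (the 64-bit movq intrinsics are not available in 32-bit mode). The source only expresses intent: the compiler is free to turn either form into the other, which is exactly the tuning decision this bug is about.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* int -> xmm, ALU strategy: typically compiles to movq xmm, r64. */
static inline __m128i int_to_xmm_alu(int64_t x)
{
    return _mm_cvtsi64_si128(x);
}

/* int -> xmm, store/reload strategy: write the integer to a stack slot,
 * then do a 64-bit vector load that relies on store-forwarding. */
static inline __m128i int_to_xmm_mem(int64_t x)
{
    int64_t slot = x;
    return _mm_loadl_epi64((const __m128i *)&slot);
}

/* xmm -> int, ALU strategy: typically compiles to movq r64, xmm. */
static inline int64_t xmm_to_int_alu(__m128i v)
{
    return _mm_cvtsi128_si64(v);
}

/* xmm -> int, store/reload strategy: store the low 64 bits, then reload
 * them with an ordinary integer load. */
static inline int64_t xmm_to_int_mem(__m128i v)
{
    int64_t slot;
    _mm_storel_epi64((__m128i *)&slot, v);
    return slot;
}
```

All four produce the same values; only latency and port pressure differ, and the best choice for one direction need not match the best choice for the other.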

Also, the choice can depend on chunk size, since loads are cheap (2 per clock
on AMD since K8 and on Intel since SnB) and store-forwarding works.

Doing the first one with movd and the next with store/reload might be good,
too, on some CPUs, especially if there's some independent work that can happen
for the movd result.
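
A minimal sketch of that mixed idea for a two-element _mm_set_epi64x, assuming x86-64 and SSE2. The helper name set_epi64x_mixed is hypothetical, and a compiler may well pick different instructions than the source suggests.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Build the same vector as _mm_set_epi64x(hi, lo) using two strategies. */
static inline __m128i set_epi64x_mixed(int64_t hi, int64_t lo)
{
    /* First element via the ALU (movq xmm, r64), so work that depends
     * only on the low element can start early. */
    __m128i vlo = _mm_cvtsi64_si128(lo);

    /* Second element via store/reload, relying on store-forwarding. */
    int64_t slot = hi;
    __m128i vhi = _mm_loadl_epi64((const __m128i *)&slot);

    /* punpcklqdq: low 64 bits from vlo, high 64 bits from vhi. */
    return _mm_unpacklo_epi64(vlo, vhi);
}
```

Whether this beats two movq instructions plus a punpcklqdq, or a pure store/reload, depends on the microarchitecture and on what else is competing for the same execution ports.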

I also discussed some of this at the bottom of the first post in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80833.

[Bug target/80820] _mm_set_epi64x shouldn't store/reload for -mtune=haswell, Zen should avoid store/reload, and generic should think about it.
