https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #39 from Hongtao.liu <crazylht at gmail dot com> ---

> I'll see if I get around to prototype some argument classification
> in the vectorizer (looking how hard it is to use
> INIT_CUMULATIVE_ARGS in a context where we are not expanding to RTL),
> unfortunately stack passing is done by code in function.cc (plus
> extra target hooks of course), but it might be easy enough to figure
> alignment and size at least (and whether arguments are passed on
> the stack or not).

According to the Intel software optimization guide:
When using an unmasked store instruction, and load instruction after it, data
forwarding depends on ***load type, size and address offset from store
address***, and does not depend on the store address itself (i.e., the store
address does not have to be aligned to or fit into cache line, forwarding will
occur for nonaligned and even line-split stores).
The figure below describes all possible cases when data forwarding will occur.

I'm not sure whether we can get the store size in the vectorizer; how the
caller pushes the parameters onto the stack also matters for STLF
(store-to-load forwarding).
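
To make the concern concrete, here is a rough sketch of the kind of check the
vectorizer would need. It is only a simplified model, not the table from the
guide: it assumes forwarding can happen only when the bytes of the load are
fully contained in a single earlier store, and it ignores load type entirely.
The names `mem_access`, `load_contained_in_store` and `may_forward` are
hypothetical and are not existing GCC or Intel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one memory access.  Offsets are in bytes
   from a common base, e.g. the start of the outgoing argument area.  */
struct mem_access
{
  size_t offset;
  size_t size;
};

/* Simplified assumption: a load can be forwarded from a store only if
   every byte it reads was written by that one store.  Only size and
   relative offset matter here, not the absolute address or alignment.  */
static bool
load_contained_in_store (struct mem_access store, struct mem_access load)
{
  assert (store.size > 0 && load.size > 0);
  return (load.offset >= store.offset
          && load.offset + load.size <= store.offset + store.size);
}

/* Return true if at least one of the N earlier stores fully covers
   LOAD under the simplified model above.  */
static bool
may_forward (const struct mem_access *stores, size_t n,
             struct mem_access load)
{
  for (size_t i = 0; i < n; i++)
    if (load_contained_in_store (stores[i], load))
      return true;
  return false;
}
```

For example, if the caller pushes two 8-byte scalars at offsets 0 and 8 and
the callee reads them back with one 16-byte vector load at offset 0,
`may_forward` returns false under this model, while an 8-byte scalar load at
offset 8 returns true. That is why both the store size and the way the caller
lays out the arguments would have to be known.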
