https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #38 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 11 Mar 2022, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908
> 
> --- Comment #37 from Hongtao.liu <crazylht at gmail dot com> ---
> > There is not much value in the vectorization we do in this function
> > (when manually fixing the STLF issue the speed is as good as with the
> > scalar code).  We cost
> > 
> > ray.dir.x 1 times scalar_load costs 12 in body
> > ray.dir.y 1 times scalar_load costs 12 in body
> Still, from a target-related perspective, instead of adding cost for the
> STLF penalty, maybe we should just reduce the cost of scalar_load if it's
> from a parm_decl, because there's probably STLF.

That's an interesting idea - it could also improve the case where the
argument is passed in register(s) but we fail to realize that.
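
As a rough illustration of that cost tweak (not GCC code): the sketch
below discounts a scalar load whose base is a parameter. The types, the
function name and the discount value are all hypothetical; only the
baseline of 12 is taken from the cost dump quoted above.

```cpp
// Hypothetical stand-ins; none of these are GCC's real types or hooks.
enum class load_base { parm_decl, other };

struct scalar_load_info {
  load_base base;  // what the load's base address refers to
};

// Baseline taken from the "scalar_load costs 12" lines in the dump.
constexpr int base_scalar_load_cost = 12;
// Assumed discount; a real value would need per-target tuning.
constexpr int parm_load_discount = 4;

// Loads from a parameter are assumed to be served by store-to-load
// forwarding, so the scalar variant is costed as cheaper.
int scalar_load_cost(const scalar_load_info &ld) {
  if (ld.base == load_base::parm_decl)
    return base_scalar_load_cost - parm_load_discount;
  return base_scalar_load_cost;
}
```

With cheaper scalar loads from parameters, the scalar side of the
comparison shrinks and vectorizing such loads becomes less attractive,
which is the intended effect.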

I'll see if I get around to prototyping some argument classification
in the vectorizer (looking at how hard it is to use
INIT_CUMULATIVE_ARGS in a context where we are not expanding to RTL).
Unfortunately stack passing is done by code in function.cc (plus
extra target hooks, of course), but it might be easy enough to figure
out alignment and size at least (and whether arguments are passed on
the stack or not).
