https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #37 from Hongtao.liu <crazylht at gmail dot com> ---
> There is not much value in the vectorization we do in this function
> (when manually fixing the STLF issue the speed is as good as with the
> scalar code).  We cost
> 
> ray.dir.x 1 times scalar_load costs 12 in body
> ray.dir.y 1 times scalar_load costs 12 in body
Still, from a target-related perspective, instead of adding a cost for the
STLF penalty, maybe we should just reduce the cost of scalar_load if it's from
a parm_decl, because there's probably STLF.
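
To make the idea concrete, here is a rough sketch of such a discount as a
stand-alone toy function. It is not GCC's actual cost hook:
toy_scalar_load_cost, from_parm and the discounted value 6 are all made up
for illustration, and only the base cost of 12 is taken from the dump quoted
above.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in GCC. */
enum {
    SCALAR_LOAD_COST      = 12, /* cost reported in the quoted dump */
    PARM_SCALAR_LOAD_COST = 6   /* hypothetical discounted cost */
};

/* Return the cost of `count` scalar loads.  `from_parm` says whether the
   load reads directly from a function parameter (a parm_decl), where the
   caller's recent store is likely to be forwarded to the load. */
int toy_scalar_load_cost(bool from_parm, int count)
{
    int unit = from_parm ? PARM_SCALAR_LOAD_COST : SCALAR_LOAD_COST;
    return unit * count;
}
```

With these assumed numbers, the two loads of ray.dir.x and ray.dir.y would
cost 12 in total instead of 24 on the scalar side, so the vectorized version
has to be cheaper by a wider margin before it is chosen.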
