[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

rguenther at suse dot de via Gcc-bugs Fri, 18 Feb 2022 01:28:35 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #12 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 18 Feb 2022, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
> 
> --- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> True.
> So another option is to try to undo some of those short vectorization cases
> during isel, expansion or later, though e.g. for the negdi2 case it will go
> already during expansion into memory.

Yes, there are duplicates of this issue and it's really hard to
solve in general.  One possibility is to improve the costing side,
but currently the cost hooks just see

ix86_vector_costs::add_stmt_cost (this=0x41b88c0, count=1, 
kind=vec_construct, stmt_info=0x0, vectype=<vector_type 0x7ffff667a888>, 
misalign=0, where=vect_prologue)

so they have no idea about the feeding stmts.  The cost entry
is generated by vect_prologue_cost_for_slp, which knows the
scalar operands, but we do not pass the SLP node down to the cost
hooks (that's on my list, but my idea was to defer it until we
only have SLP nodes and could thus go w/o the stmt_info).

The other possibility is (for the original testcase) to anticipate
that RTL expansion will expand 'w' to a TImode register and take
that as a reason to pessimize vectorization (but we don't know how
it's going to be used, so that's probably a flawed attempt).
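
For reference, the shape of the code in question can be sketched as
below.  This is a simplified stand-in, not the libgcc2 source: the
typedefs assume a 64-bit little-endian target with __int128 support,
the name negdi2_sketch is made up, and the negation is done in
unsigned arithmetic to keep the sketch free of signed overflow.  The
point is only that 'w' is built from two word-sized scalar results
stored to adjacent slots, which is presumably the pattern SLP
vectorization picks up, and is then returned as a single double-word
value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for libgcc2's word and double-word types.  */
typedef int64_t Wtype;
typedef uint64_t UWtype;
typedef __int128 DWtype;	/* GCC/Clang extension */

/* Little-endian layout assumed: low word first.  */
struct DWstruct { Wtype low, high; };
typedef union { struct DWstruct s; DWtype ll; } DWunion;

/* Negate a double-word value one word at a time.  */
DWtype
negdi2_sketch (DWtype u)
{
  const DWunion uu = { .ll = u };
  /* Unsigned negation of the low word; a nonzero result means the
     high word has to absorb a borrow.  */
  const UWtype lo = -(UWtype) uu.s.low;
  /* Two scalar results land in adjacent slots of 'w'.  */
  const DWunion w = { { .low = (Wtype) lo,
			.high = (Wtype) (-(UWtype) uu.s.high - (lo > 0)) } };
  return w.ll;
}

/* Small usage check.  */
void
negdi2_sketch_check (void)
{
  assert (negdi2_sketch (5) == -5);
  assert (negdi2_sketch (0) == 0);
}
```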

The only short-term fixes are a) biasing the costing, which regresses
the from-memory case, or b) passing down the SLP node where available
and looking at the defs of the CTOR components, costing a gpr->xmm
move where it can be anticipated.

b) is more future-proof; if we'd take that at this point, I can
look into how intrusive it would be.
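
To illustrate what b) would buy us, here is a toy model in C.  It is
not GCC code: the enum, the function toy_vec_construct_cost and the
unit costs are all made up for illustration.  Passing NULL for the
defs models the current situation, where the hook sees only a
vec_construct and knows nothing about the feeding stmts; passing the
defs models having the SLP node available, so a gpr->xmm move can be
charged per element computed in a general register.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of where a CTOR element comes from.  */
enum elt_def { ELT_FROM_MEMORY, ELT_FROM_GPR };

/* Made-up unit costs, for illustration only.  */
enum { COST_ELT_INSERT = 1, COST_GPR_TO_XMM = 2 };

/* Cost of building a vector from N scalar elements.  DEFS == NULL
   means nothing is known about the defining stmts.  */
int
toy_vec_construct_cost (size_t n, const enum elt_def *defs)
{
  int cost = 0;
  for (size_t i = 0; i < n; i++)
    {
      cost += COST_ELT_INSERT;
      /* Only chargeable when the defs are visible to the hook.  */
      if (defs && defs[i] == ELT_FROM_GPR)
	cost += COST_GPR_TO_XMM;
    }
  return cost;
}

/* Small usage check: GPR-fed elements cost more than loaded ones.  */
void
toy_cost_check (void)
{
  const enum elt_def gpr[2] = { ELT_FROM_GPR, ELT_FROM_GPR };
  const enum elt_def mem[2] = { ELT_FROM_MEMORY, ELT_FROM_MEMORY };
  assert (toy_vec_construct_cost (2, gpr)
	  > toy_vec_construct_cost (2, mem));
}
```

With the defs visible, the from-memory case keeps its current cost
while the GPR-fed case gets more expensive, which is the difference
from a), where both cases would be biased alike.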

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
