https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88786
Bug ID: 88786
Summary: Expand vector copysign (and xorsign) operations in the
vectoriser
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
CC: rsandifo at gcc dot gnu.org
Target Milestone: ---
Currently every target defines the copysign optab for vector modes to emit very
similar RTL sequences that extract the sign bit. This leads to almost
identical code for AArch64 Advanced SIMD, SVE, AArch32 NEON etc.
We should teach the vectoriser to expand a vector copysign operation at the
tree level to benefit from more optimisations early on. Care needs to be taken
to make sure the xorsign optimisation (currently done late in widen_mult) still
triggers for vectorised code. This will allow us to remove a lot of duplicate
code in the MD patterns and only implement them if the target can actually do a
smarter sequence than the default.
This is similar in principle to the multiplication-by-constant expansion we
already do in tree-vect-patterns.c.
See, for example, the gcc.target/aarch64/vect-xorsign_exec.c testcase for the
kind of input for this.
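As a rough illustration of that kind of input (not a copy of the testcase),
loops of the following shape are what is meant. The function names are
hypothetical, and __builtin_copysignf assumes a GCC-compatible compiler.

```c
#include <stddef.h>

/* xorsign form: multiply by +/-1.0 taken from the sign of b[i].  */
void xorsign_loop (float *restrict r, const float *restrict a,
                   const float *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    r[i] = a[i] * __builtin_copysignf (1.0f, b[i]);
}

/* Plain copysign form: magnitude of a[i], sign of b[i].  */
void copysign_loop (float *restrict r, const float *restrict a,
                    const float *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    r[i] = __builtin_copysignf (a[i], b[i]);
}
```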