[Bug target/106038] x86_64 vectorization of ALU ops using xmm registers prematurely

rguenth at gcc dot gnu.org via Gcc-bugs Tue, 21 Jun 2022 01:21:04 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106038


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2022-06-21
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
                 CC|                            |rguenth at gcc dot gnu.org
             Blocks|                            |53947
             Target|                            |x86_64-*-*

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
The vectorizer does not anticipate store-merging performing "vectorization" in
GPRs and thus the scalar cost is off (it also doesn't anticipate the different
ISA constraints wrt xmm vs gpr usage).

I wonder if we should try to follow what store-merging would do with respect
to "vector types", thus prefer "general vectors" (but explicitly via integer
types since we can't have vector types with both integer and vector modes)
when possible (for bit operations and plain copies).
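
As a rough illustration of a bit operation done on a "general vector" held in
an integer type, the sketch below writes the same 8-byte AND twice: once as
byte-wide statements, once as a single 64-bit operation in a GPR.  The struct
and function names are made up for this example and are not taken from the
bug's testcase; whether store-merging or the SLP vectorizer actually
transforms the first form depends on the GCC version and options.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte record; name and layout are for illustration only. */
struct bytes8 { uint8_t v[8]; };

/* Scalar form: eight byte-wide ANDs with adjacent stores, written as
   straight-line code (the shape basic-block SLP and store-merging look at). */
void and_bytewise(struct bytes8 *dst, const struct bytes8 *a,
                  const struct bytes8 *b)
{
    dst->v[0] = a->v[0] & b->v[0];
    dst->v[1] = a->v[1] & b->v[1];
    dst->v[2] = a->v[2] & b->v[2];
    dst->v[3] = a->v[3] & b->v[3];
    dst->v[4] = a->v[4] & b->v[4];
    dst->v[5] = a->v[5] & b->v[5];
    dst->v[6] = a->v[6] & b->v[6];
    dst->v[7] = a->v[7] & b->v[7];
}

/* Hand-written "general vector" form: the eight bytes are treated as one
   64-bit integer, so the AND is a single operation in an integer register.
   memcpy is used to avoid alignment and aliasing assumptions. */
void and_as_u64(struct bytes8 *dst, const struct bytes8 *a,
                const struct bytes8 *b)
{
    uint64_t x, y, r;
    memcpy(&x, a->v, sizeof x);
    memcpy(&y, b->v, sizeof y);
    r = x & y;
    memcpy(dst->v, &r, sizeof r);
}
```

Both functions produce identical bytes because AND acts on each bit
independently, so no information crosses byte boundaries.  The same holds for
OR, XOR and plain copies, but not for additions, where carries would.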

Scanning over an SLP instance (group) and substituting integer types for
SLP_TREE_VECTYPE might be possible.  Doing this nicely somewhere is going to
be more interesting.  In the long run, vectorizable_* should compute a set of
{ vector-type, cost } pairs from the set of input operand vector-type[, cost]
pair sets.  Not having "generic" vectors as vector types and relying on
vector lowering to expand them would be an incremental support step for this
I guess.
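
The { vector-type, cost } idea can be pictured with a toy model.  Everything
in it is an assumption made for illustration: the enum, the struct, the
function names and the cost rule (sum of operand costs plus a per-type
operation cost) are hypothetical and do not correspond to GCC's internal data
structures or its actual cost model.

```c
#include <stddef.h>

/* Hypothetical candidate kinds: a real vector mode and an integer
   "general vector".  Not GCC's type representation. */
enum cand_type { CAND_V16QI, CAND_DI_INTEGER, CAND_NTYPES };

struct cand {
    enum cand_type type;
    unsigned cost;
};

/* Combine the candidate sets of two operands into the candidate set of the
   operation.  A type survives only if both operands offer it; its cost is
   the two operand costs plus op_cost[type].  Assumes each type appears at
   most once per input set, so `out` needs room for at most
   min(nlhs, nrhs) entries.  Returns the number of entries written. */
size_t combine_candidates(const struct cand *lhs, size_t nlhs,
                          const struct cand *rhs, size_t nrhs,
                          const unsigned op_cost[CAND_NTYPES],
                          struct cand *out)
{
    size_t n = 0;
    for (size_t i = 0; i < nlhs; i++)
        for (size_t j = 0; j < nrhs; j++)
            if (lhs[i].type == rhs[j].type) {
                out[n].type = lhs[i].type;
                out[n].cost = lhs[i].cost + rhs[j].cost
                              + op_cost[lhs[i].type];
                n++;
            }
    return n;
}

/* Return the cheapest candidate, or NULL if the set is empty. */
const struct cand *cheapest(const struct cand *c, size_t n)
{
    const struct cand *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (best == NULL || c[i].cost < best->cost)
            best = &c[i];
    return best;
}
```

With made-up costs where the integer candidate is cheaper on both operands,
`cheapest` selects the integer type for the operation, which is the kind of
choice the comment has in mind for bit operations and plain copies.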

"backwards STV" could of course also work on the target side.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
