https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811

--- Comment #10 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Actually, vectorization hurts on both compilers, and a bit more with clang.
It seems that all important loops are hand-vectorized, and since register
pressure is a problem, vectorizing other loops causes enough collateral
damage to register allocation to regress performance.

I believe the core of the problem (or at least one of them) is simply the way
we compile loops popping data from a std::vector-based stack. See PR109849.
We keep updating the stack data structure in the innermost loop because, in
the not too common case, reallocation needs to be done, and that is done by
offlined code.
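
For illustration, here is a minimal sketch of the kind of loop meant above. It
is not taken from the benchmark in this PR; the function name drain_units and
the splitting workload are hypothetical. The point is only the shape: the
innermost loop pops from a std::vector used as a stack and occasionally pushes,
and push_back may reallocate in the uncommon case.

```cpp
#include <vector>

// Hypothetical worklist loop: pop a value and either account for it or
// split it in two and push both halves back onto the stack.
// Assumes all initial values are non-negative.
long drain_units(std::vector<int> stack)
{
    long total = 0;
    while (!stack.empty()) {
        int v = stack.back();
        stack.pop_back();               // hot path: adjusts the end pointer
        if (v > 1) {
            // push_back may need to reallocate; that uncommon path is
            // typically handled by out-of-line code in the library.
            stack.push_back(v / 2);
            stack.push_back(v - v / 2);
        } else {
            total += v;                 // v is 0 or 1 here
        }
    }
    return total;
}
```

Since every split preserves the sum, drain_units returns the sum of the
initial values. What matters for this PR is how the pop/push bookkeeping in
the loop body gets compiled.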
