https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107916
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
            Summary|PPC VSX code generation for |vector_size(32) is
                   |OpenZFS                     |inefficient for VSX on
                   |                            |powerpc64
          Component|target                      |middle-end
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Reduced testcase:
```
#include <stdint.h>
typedef uint32_t u32x4 __attribute__ ((vector_size (16)));
typedef uint32_t u32x8 __attribute__ ((vector_size (32)));
typedef uint64_t u64x4 __attribute__ ((vector_size (32)));
#pragma GCC push_options
#if defined(__x86_64__)
#ifdef __clang_major__
#pragma clang attribute push(__attribute__((target("avx2"))), \
apply_to = function)
#else
#pragma GCC target ("avx2")
#endif
#elif defined(__powerpc64__)
#ifdef __clang_major__
#pragma clang attribute push(__attribute__((target("vsx,block-ops-unaligned-vsx,power8-vector"))), \
apply_to = function)
#else
#pragma GCC target ("vsx,block-ops-unaligned-vsx,power8-vector,power9-vector")
#endif
#endif
void f(int n, u32x8 *a, u32x8 *b)
{
  u32x8 c = {0};
  for(int i = 0; i < n; i++)
    c+=*a;
  *b += c;
}
#ifdef __clang_major__
#if defined(__x86_64__) || defined(__powerpc64__)
#pragma clang attribute pop
#endif
#else
#pragma GCC pop_options
#endif
```
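Basically, what is going wrong is that c is being pushed to the stack. c is a
32-byte vector and VSX registers are only 16 bytes wide, so it cannot live in a
single register. I had expected c's PHI node to be split during vector
lowering, so that each half could stay in a register across the loop.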
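
To illustrate the expected result at the source level, here is a rough sketch of f with the accumulator split by hand into two 16-byte halves. This is only an illustration of the shape, not what the lowering pass emits: the names f_split and check_split_matches are made up for this example, and memcpy is used for the half-vector loads and stores purely to keep the sketch free of aliasing questions. The target pragmas from the testcase are left out so it builds anywhere GCC vector extensions are available.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t u32x4 __attribute__ ((vector_size (16)));
typedef uint32_t u32x8 __attribute__ ((vector_size (32)));

/* Same shape as the reduced testcase: one 32-byte accumulator. */
void f(int n, u32x8 *a, u32x8 *b)
{
  u32x8 c = {0};
  for (int i = 0; i < n; i++)
    c += *a;
  *b += c;
}

/* Hypothetical hand-split variant: two 16-byte accumulators, each of
   which fits in a single 128-bit vector register. */
void f_split(int n, u32x8 *a, u32x8 *b)
{
  u32x4 lo = {0}, hi = {0};
  for (int i = 0; i < n; i++)
    {
      u32x4 alo, ahi;
      /* Load the low and high 16 bytes of *a separately. */
      memcpy(&alo, (const char *)a, sizeof alo);
      memcpy(&ahi, (const char *)a + sizeof alo, sizeof ahi);
      lo += alo;
      hi += ahi;
    }
  u32x4 blo, bhi;
  memcpy(&blo, (const char *)b, sizeof blo);
  memcpy(&bhi, (const char *)b + sizeof blo, sizeof bhi);
  blo += lo;
  bhi += hi;
  /* Store both halves back into *b. */
  memcpy((char *)b, &blo, sizeof blo);
  memcpy((char *)b + sizeof blo, &bhi, sizeof bhi);
}

/* Self-check: both variants must leave the same bytes in *b. */
void check_split_matches(void)
{
  u32x8 a = {1, 2, 3, 4, 5, 6, 7, 8};
  u32x8 b1 = {10, 20, 30, 40, 50, 60, 70, 80};
  u32x8 b2 = b1;
  f(3, &a, &b1);
  f_split(3, &a, &b2);
  assert(memcmp(&b1, &b2, sizeof b1) == 0);
}
```

Comparing the assembly of f and f_split at -O2 for powerpc64 should show whether the stack traffic comes only from the unsplit 32-byte PHI.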