https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56253
Agner Fog <agner at agner dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |agner at agner dot org
--- Comment #8 from Agner Fog <agner at agner dot org> ---
The same problem applies to other kinds of optimizations, such as algebraic
reductions and constant propagation.
The method of using operators such as * and + is not portable to other
compilers, and it does not work for integer vectors whose element size is
anything other than 64 bits. (I know that there is no integer FMA on Intel
CPUs, but I am also talking about other optimizations.)
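To illustrate the element-size limitation, here is a minimal sketch using the GCC/Clang vector extensions. The typedef names v2di and v4si and the two helper functions are made up for this example. v2di is assumed to mirror the way GCC's headers define __m128i, as a vector of two long long elements. The + operator then adds 64-bit lanes, so a carry crosses the 32-bit boundary unless the operands are first cast to a type with 32-bit elements:

```c
/* Hypothetical typedefs for illustration; v2di is assumed to mirror
   GCC's definition of __m128i (two 64-bit elements). */
typedef long long v2di __attribute__((vector_size(16)));
typedef int       v4si __attribute__((vector_size(16)));

/* '+' applied directly: the addition is done on 64-bit lanes. */
static v2di add_as_64bit_lanes(v2di a, v2di b)
{
    return a + b;
}

/* Same bits, reinterpreted as four 32-bit lanes before adding. */
static v2di add_as_32bit_lanes(v2di a, v2di b)
{
    return (v2di)((v4si)a + (v4si)b);
}
```

With a = {0xFFFFFFFF, 0} and b = {1, 0}, the first function carries into bit 32 and the second wraps each 32-bit lane to zero. The casts are a GCC/Clang extension, which is the portability problem mentioned above.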
Here are some other examples of optimizations I would like gcc to do:
#include <x86intrin.h>
void dummy2(__m128 a, __m128 b);
void dummyi2(__m128i a, __m128i b);
void commutative(__m128 a, __m128 b) {
    // expect reduce a+b = b+a. This is the only reduction that actually works!
    dummy2(_mm_add_ps(a,b), _mm_add_ps(b,a));
}
void associative(__m128i a, __m128i b, __m128i c) {
    // expect reduce (a+b)+c = a+(b+c)
    dummyi2(_mm_add_epi32(_mm_add_epi32(a,b),c),
            _mm_add_epi32(a,_mm_add_epi32(b,c)));
}
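void distributive(__m128i a, __m128i b, __m128i c) {
    // expect reduce a*b+a*c = a*(b+c)
    // _mm_mullo_epi32 (SSE4.1) is the element-wise 32-bit multiply;
    // _mm_mul_epi32 is a widening multiply of the even elements only.
    dummyi2(_mm_add_epi32(_mm_mullo_epi32(a,b),_mm_mullo_epi32(a,c)),
            _mm_mullo_epi32(a,_mm_add_epi32(b,c)));
}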
void constant_propagation() {
    // expect c and d to be stored as precalculated constants
    __m128i a = _mm_setr_epi32(1,2,3,4);
    __m128i b = _mm_set1_epi32(5);
    __m128i c = _mm_add_epi32(a,b);
    __m128i d = _mm_mullo_epi32(a,b);   // element-wise multiply (SSE4.1)
    dummyi2(c,d);
}
Of course, the same applies to 256-bit and 512-bit vectors.
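
The identities that the 128-bit examples rely on can be spot-checked at run time. The following is only a sketch: the helper names same_bits, associative_holds and folded_sum_matches are made up for this illustration, and it uses SSE2 intrinsics only, so it assumes an x86 target with SSE2 enabled. It shows that the values are the same, not that any particular gcc version performs the reduction:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Returns 1 if all 128 bits of x and y are equal (hypothetical helper). */
static int same_bits(__m128i x, __m128i y)
{
    return _mm_movemask_epi8(_mm_cmpeq_epi32(x, y)) == 0xFFFF;
}

/* (a+b)+c == a+(b+c); _mm_add_epi32 wraps modulo 2^32 in each lane,
   so the identity also holds when a lane overflows. */
static int associative_holds(__m128i a, __m128i b, __m128i c)
{
    __m128i left  = _mm_add_epi32(_mm_add_epi32(a, b), c);
    __m128i right = _mm_add_epi32(a, _mm_add_epi32(b, c));
    return same_bits(left, right);
}

/* The sum from the constant propagation example could be precalculated
   as the constant vector {6, 7, 8, 9}. */
static int folded_sum_matches(void)
{
    __m128i a = _mm_setr_epi32(1, 2, 3, 4);
    __m128i b = _mm_set1_epi32(5);
    return same_bits(_mm_add_epi32(a, b), _mm_setr_epi32(6, 7, 8, 9));
}
```

Both functions should return 1 for any inputs, which is why a compiler is free to pass a single computed value for both arguments of dummyi2 in the associative example.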