‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, December 6, 2019 6:38 AM, Richard Biener <richard.guent...@gmail.com> wrote:

> On Fri, Dec 6, 2019 at 12:15 PM Jakub Jelinek ja...@redhat.com wrote:
>
> > On Fri, Dec 06, 2019 at 11:48:03AM +0100, Richard Biener wrote:
> >
> > > So I used
> > > void sincos(double x, double *sin, double *cos);
> > > _Complex double __attribute__((simd("notinbranch")))
> > > __builtin_cexpi (double);
> >
> > While Intel-ABI-Vector-Function-2015-v0.9.8.pdf talks about complex numbers,
> > the reason we punt:
> > unsupported return type ‘complex double’ for simd
> > etc. is that we really don't support VECTOR_TYPE with COMPLEX_TYPE element
> > type, I guess the vectorizer doesn't do anything with that either unless
> > some earlier optimization was able to scalarize the complex halves.
> > In theory we could represent the vector counterparts of complex types
> > as just vectors of double width with element type of COMPLEX_TYPE element
> > type, have a look at what exactly ICC does to find out if the vector
> > ordering is real0 complex0 real1 complex1 ... or
> > real0 real1 real2 ... complex0 complex1 complex2 ...
> > and tweak everything that needs to cope.
>
> I hope real0 complex0, ...
>
> Anyway, the first step is to support vectorizing code where parts of it are
> already vectors:
>
> typedef double v2df __attribute__((vector_size(16)));
> #define N 1024
> v2df a[N];
> double b[N];
> double c[N];
> void foo()
> {
>   for (int i = 0; i < N; ++i)
>     {
>       v2df tem = a[i];
>       b[i] = tem[0];
>       c[i] = tem[1];
>     }
> }
>
> that can be "re-vectorized" for AVX for example. If you substitute
> _Complex double for the vector type we only handle it during
> vectorization because forwprop combines the load and the
> __real/imag which helps.

Are we certain the change we want is to support _Complex double so that cexpi is auto-vectorized? Looking at the resulting executable of the code with sincos in the loop, the only function called is sincos. Not __builtin_cexpi or any variant of cexpi. File gcc/builtins.c expands calls to __builtin_cexpi to sincos!
What is gained by the compiler going through the transformations sincos -> __builtin_cexpi -> sincos?

Bert.
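
For reference, the two candidate vector orderings discussed in the quoted text can be sketched in plain C. This is only an illustration of the layouts, not of anything GCC or ICC actually does: `pack_interleaved` and `pack_split` are hypothetical helper names made up for this sketch. It relies on the C99 guarantee that a complex type has the same representation as an array of two elements of the corresponding real type, real part first.

```c
#include <complex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: interleaved ordering, out = re0 im0 re1 im1 ...
   This is just the in-memory layout of an array of _Complex double,
   so a plain byte copy is enough. out must hold 2*n doubles. */
static void pack_interleaved(const double complex *z, size_t n, double *out)
{
    memcpy(out, z, n * sizeof *z);
}

/* Hypothetical helper: split ordering, out = re0 re1 ... im0 im1 ...
   Each element is copied into a two-element array first, which avoids
   depending on creal/cimag or on the GNU __real__/__imag__ operators.
   out must hold 2*n doubles. */
static void pack_split(const double complex *z, size_t n, double *out)
{
    for (size_t i = 0; i < n; ++i) {
        double parts[2];
        memcpy(parts, &z[i], sizeof parts);
        out[i] = parts[0];      /* real part */
        out[n + i] = parts[1];  /* imaginary part */
    }
}
```

With two elements 1+2i and 3+4i, the interleaved form holds 1 2 3 4 and the split form holds 1 3 2 4. The interleaved form needs no shuffling when loading from an existing `_Complex double` array, which is presumably why it is the hoped-for answer above.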