[Bug target/63351] Optimization: contract broadcast intrinsics when AVX512 is enabled

kyukhin at gcc dot gnu.org Wed, 24 Sep 2014 23:32:31 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63351


--- Comment #3 from Kirill Yukhin <kyukhin at gcc dot gnu.org> ---
Hello,
For AVX-512F (zmm registers):
We have a patch which enables this kind of optimization, based
on the combiner machinery: a new subst which allows
`broadcasted' versions of patterns.
The combiner can then combine (load-bcst + actual insn)
into (actual insn w/ bcst-ed mem-op).
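
To illustrate the equivalence the combiner relies on, here is a minimal
scalar sketch in plain C. It is not part of the patch: the names LANES,
mul_separate_bcst, mul_embedded_bcst and forms_agree are hypothetical,
and the loops only model what the two instruction sequences compute.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 16  /* a zmm register holds 16 single-precision floats */

/* Form 1 (load-bcst + actual insn): the scalar is first replicated
   into a full temporary vector, which is then multiplied lane by lane. */
static void mul_separate_bcst(float *dst, const float *src,
                              const float *scalar_p)
{
    float bcst[LANES];
    for (size_t i = 0; i < LANES; i++)
        bcst[i] = *scalar_p;
    for (size_t i = 0; i < LANES; i++)
        dst[i] = src[i] * bcst[i];
}

/* Form 2 (actual insn w/ bcst-ed mem-op): the multiply reads the scalar
   memory operand directly and applies it to every lane, so no
   temporary vector is needed. */
static void mul_embedded_bcst(float *dst, const float *src,
                              const float *scalar_p)
{
    for (size_t i = 0; i < LANES; i++)
        dst[i] = src[i] * *scalar_p;
}

/* Returns 1 when both forms produce the same value in every lane. */
static int forms_agree(const float *src, float scalar)
{
    float r1[LANES], r2[LANES];
    mul_separate_bcst(r1, src, &scalar);
    mul_embedded_bcst(r2, src, &scalar);
    for (size_t i = 0; i < LANES; i++)
        if (r1[i] != r2[i])
            return 0;
    return 1;
}
```

Both forms compute the same lanes; the second one simply has no
separate broadcast step, which is what the {1to16} memory operand
expresses at the instruction level.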

This patch generates embedded broadcasts for cases such as this one:
+/* { dg-options "-O3 -mavx512f" } */
+/* { dg-final { scan-assembler-times "vmulps\[ \\t\]+\[^\n\]*.*1to16.*%zmm\[0-9\]\[\\n\]" 1 } } */
+
+#define N 16
+
+float f1 (float *c1_p, float *c2_p)
+{
+
+  float a[N];
+  float b[N];
+  float c[N];
+  float c1 = *c1_p;
+  float c2 = *c2_p;
+  int i;
+
+  for (i = 0; i < N; i++)
+  {
+    a[i] = c1;
+    b[i] = c2;
+  }
+
+  for (i = 0; i < N; i++)
+  {
+    c[i] = a[i] * b[i];
+  }
+
+  return c[(int)(c1 + c2) % N];
+}

The patch has almost no impact on SPEC2006 (one of the reasons
is that the combiner does not work across basic blocks).

For AVX-512VL ([xy]mm registers):
The same optimization should also be applicable once
all the new patterns have reached the trunk.
