When I fixed various tests in
<http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01662.html> for failures
with --with-arch=bdver3, I missed that a so-configured compiler still
defaults to -mtune=generic. If you override that as well with
--with-cpu=bdver3, further failures appear, and this patch fixes some
of them.

Most of these changes add -mno-prefer-avx128 to AVX tests that do not
expect a -mprefer-avx128 default. In addition, some tests have
-mtune=generic added where the behavior tested depends on a tuning
parameter that I identified: X86_TUNE_EXT_80387_CONSTANTS or
X86_TUNE_SSE_LOAD0_BY_PXOR.
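
As a rough illustration of the first kind of change, here is a minimal
sketch of a test in the style of the avx2-*-3.c files. It is a
hypothetical file, not one touched by this patch: the function name
add_ints, the array names and the exact scan pattern are assumptions
made for the example. The point is that a scan for %ymm registers only
holds if the compiler is not defaulting to 128-bit vectors, so the
option is stated explicitly in dg-options.

```c
/* Hypothetical test, not part of this patch.  */
/* { dg-do compile } */
/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize" } */
/* { dg-final { scan-assembler "vpaddd\[ \\t\]+\[^\n\]*%ymm\[0-9\]" } } */

#define N 256

int a[N], b[N], c[N];

/* Element-wise add.  The scan above assumes the vectorizer uses
   256-bit ymm registers for this loop, which depends on 128-bit
   vectors not being preferred by the default tuning.  */
void
add_ints (void)
{
  int i;

  for (i = 0; i < N; i++)
    c[i] = a[i] + b[i];
}
```

Without -mno-prefer-avx128, a compiler configured with
--with-cpu=bdver3 may vectorize such a loop with %xmm registers
instead, and the scan would then fail even though the generated code
is correct.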

Tested x86_64-linux-gnu. OK to commit?

There are other failures this patch does not resolve in a
--with-arch=bdver3 --with-cpu=bdver3 configuration. Some of these are
AVX tests whose failures are not resolved by adding -mno-prefer-avx128
(and so this patch does not add -mno-prefer-avx128 to those tests);
others may be cases where -mtune=generic is appropriate, but where I
haven't identified the specific tuning parameter that would show the
tuning-dependent code generation differences to be correct, and hence
that a -mtune= option should be used. The remaining failures are:

FAIL: gcc.target/i386/avx2-vpand-1.c scan-assembler vpand[ \\t]+[^\n]*%ymm[0-9]
FAIL: gcc.target/i386/avx2-vpand-3.c scan-assembler-times vpand[ \\t]+[^\n]*%ymm[0-9] 1
FAIL: gcc.target/i386/avx2-vpandn-1.c scan-assembler vpandn[ \\t]+[^\n]*%ymm[0-9]
FAIL: gcc.target/i386/avx2-vpor-1.c scan-assembler vpor[ \\t]+[^\n]*%ymm[0-9]
FAIL: gcc.target/i386/avx2-vpxor-1.c scan-assembler vpxor[ \\t]+[^\n]*%ymm[0-9]
FAIL: gcc.target/i386/avx256-unaligned-load-2.c scan-assembler (sse2_loaddqu|vmovdqu[^\n\r]*movv16qi_internal)
FAIL: gcc.target/i386/avx256-unaligned-load-2.c scan-assembler vinsert.128
FAIL: gcc.target/i386/avx512f-vec-init.c scan-assembler-times vmovdqa64[ \\t]+%zmm 2
FAIL: gcc.target/i386/avx512f-vmovdqu32-1.c scan-assembler-times vmovdqu[36][24][ \\t]+[^\n]*\\)[^\n]*%zmm[0-9][^{] 1
FAIL: gcc.target/i386/avx512f-vmovupd-1.c scan-assembler-times vmovupd[ \\t]+[^\n]*\\)[^\n]*%zmm[0-9][^{] 1
FAIL: gcc.target/i386/avx512f-vpandd-1.c scan-assembler-times vpandd[ \\t]+[^\n]*%zmm[0-9][^{] 4
FAIL: gcc.target/i386/avx512f-vpandd-1.c scan-assembler-times vpandd[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
FAIL: gcc.target/i386/avx512f-vpandd-1.c scan-assembler-times vpandd[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
FAIL: gcc.target/i386/avx512f-vpandnd-1.c scan-assembler-times vpandnd[ \\t]+[^\n]*%zmm[0-9][^{] 4
FAIL: gcc.target/i386/avx512f-vpandnd-1.c scan-assembler-times vpandnd[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
FAIL: gcc.target/i386/avx512f-vpandnd-1.c scan-assembler-times vpandnd[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
FAIL: gcc.target/i386/avx512f-vpandnq-1.c scan-assembler-times vpandnq[ \\t]+[^\n]*%zmm[0-9][^{] 3
FAIL: gcc.target/i386/avx512f-vpandnq-1.c scan-assembler-times vpandnq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
FAIL: gcc.target/i386/avx512f-vpandnq-1.c scan-assembler-times vpandnq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
FAIL: gcc.target/i386/avx512f-vpandq-1.c scan-assembler-times vpandq[ \\t]+[^\n]*%zmm[0-9][^{] 3
FAIL: gcc.target/i386/avx512f-vpandq-1.c scan-assembler-times vpandq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
FAIL: gcc.target/i386/avx512f-vpandq-1.c scan-assembler-times vpandq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
FAIL: gcc.target/i386/avx512f-vpord-1.c scan-assembler-times vpord[ \\t]+[^\n]*%zmm[0-9][^{] 4
FAIL: gcc.target/i386/avx512f-vpord-1.c scan-assembler-times vpord[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
FAIL: gcc.target/i386/avx512f-vpord-1.c scan-assembler-times vpord[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
FAIL: gcc.target/i386/avx512f-vporq-1.c scan-assembler-times vporq[ \\t]+[^\n]*%zmm[0-9][^{] 3
FAIL: gcc.target/i386/avx512f-vporq-1.c scan-assembler-times vporq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
FAIL: gcc.target/i386/avx512f-vporq-1.c scan-assembler-times vporq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
FAIL: gcc.target/i386/avx512f-vpxord-1.c scan-assembler-times vpxord[ \\t]+[^\n]*%zmm[0-9][^{] 4
FAIL: gcc.target/i386/avx512f-vpxord-1.c scan-assembler-times vpxord[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
FAIL: gcc.target/i386/avx512f-vpxord-1.c scan-assembler-times vpxord[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
FAIL: gcc.target/i386/avx512f-vpxorq-1.c scan-assembler-times vpxorq[ \\t]+[^\n]*%zmm[0-9][^{] 3
FAIL: gcc.target/i386/avx512f-vpxorq-1.c scan-assembler-times vpxorq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
FAIL: gcc.target/i386/avx512f-vpxorq-1.c scan-assembler-times vpxorq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
FAIL: gcc.target/i386/pr49002-1.c scan-assembler vmovapd[\t ]*[^,]*,[\t ]*%xmm
FAIL: gcc.target/i386/pr53712.c scan-assembler-times movdqu 1
FAIL: gcc.target/i386/pr53907.c scan-assembler movdqa
FAIL: gcc.target/i386/pr59539-1.c scan-assembler-times vmovdqu 1
FAIL: gcc.target/i386/pr59539-2.c scan-assembler-times vmovdqu 1

2014-04-01  Joseph Myers  <[email protected]>

	* gcc.target/i386/387-3.c, gcc.target/i386/387-4.c,
	gcc.target/i386/pr30970.c: Use -mtune=generic.
	* gcc.target/i386/avx2-vpaddb-3.c,
	gcc.target/i386/avx2-vpaddd-3.c, gcc.target/i386/avx2-vpaddq-3.c,
	gcc.target/i386/avx2-vpaddw-3.c, gcc.target/i386/avx2-vpmulld-3.c,
	gcc.target/i386/avx2-vpmullw-3.c, gcc.target/i386/avx2-vpsrad-3.c,
	gcc.target/i386/avx2-vpsraw-3.c, gcc.target/i386/avx2-vpsrld-3.c,
	gcc.target/i386/avx2-vpsrlw-3.c, gcc.target/i386/avx2-vpsubb-3.c,
	gcc.target/i386/avx2-vpsubd-3.c, gcc.target/i386/avx2-vpsubq-3.c,
	gcc.target/i386/avx2-vpsubw-3.c,
	gcc.target/i386/avx256-unaligned-load-1.c,
	gcc.target/i386/avx256-unaligned-load-4.c,
	gcc.target/i386/avx256-unaligned-store-1.c,
	gcc.target/i386/avx256-unaligned-store-2.c,
	gcc.target/i386/avx256-unaligned-store-4.c: Use
	-mno-prefer-avx128.

Index: gcc/testsuite/gcc.target/i386/avx2-vpmulld-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpmulld-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpmulld-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -fno-common" } */
+/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -mno-prefer-avx128 -fno-common" } */
#define N 1024
Index: gcc/testsuite/gcc.target/i386/avx2-vpmullw-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpmullw-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpmullw-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/387-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/387-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/387-3.c (working copy)
@@ -1,6 +1,6 @@
/* Verify that 387 mathematical constants are recognized. */
/* { dg-do compile } */
-/* { dg-options "-O2 -mfpmath=387 -mfancy-math-387" } */
+/* { dg-options "-O2 -mfpmath=387 -mfancy-math-387 -mtune=generic" } */
/* { dg-final { scan-assembler "fldpi" } } */
/* { dg-require-effective-target large_long_double } */
Index: gcc/testsuite/gcc.target/i386/avx256-unaligned-store-2.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-2.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-2.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -mno-prefer-avx128" } */
#define N 1024
Index: gcc/testsuite/gcc.target/i386/avx256-unaligned-load-1.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-load-1.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-load-1.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-load" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-load -mno-prefer-avx128" } */
#define N 1024
Index: gcc/testsuite/gcc.target/i386/pr30970.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr30970.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/pr30970.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile }
-/* { dg-options "-msse2 -O2 -ftree-vectorize" } */
+/* { dg-options "-msse2 -O2 -ftree-vectorize -mtune=generic" } */
#define N 256
int b[N];
Index: gcc/testsuite/gcc.target/i386/387-4.c
===================================================================
--- gcc/testsuite/gcc.target/i386/387-4.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/387-4.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -mfancy-math-387" } */
+/* { dg-options "-O2 -mfancy-math-387 -mtune=generic" } */
/* { dg-final { scan-assembler "fldpi" } } */
/* { dg-require-effective-target large_long_double } */
Index: gcc/testsuite/gcc.target/i386/avx2-vpsubq-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpsubq-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpsubq-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx2-vpsubb-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpsubb-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpsubb-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx2-vpaddq-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpaddq-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpaddq-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx2-vpaddb-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpaddb-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpaddb-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx2-vpsubd-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpsubd-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpsubd-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx2-vpsrld-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpsrld-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpsrld-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx2-vpaddd-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpaddd-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpaddd-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx2-vpsrad-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpsrad-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpsrad-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -fno-common" } */
+/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -mno-prefer-avx128 -fno-common" } */
#define N 1024
Index: gcc/testsuite/gcc.target/i386/avx2-vpsubw-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpsubw-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpsubw-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx2-vpsrlw-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpsrlw-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpsrlw-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx2-vpaddw-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpaddw-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpaddw-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx2-vpsraw-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx2-vpsraw-3.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx2-vpsraw-3.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
/* { dg-require-effective-target avx2 } */
Index: gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c
===================================================================
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c (revision 208989)
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c (working copy)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -fno-common" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -mno-prefer-avx128 -fno-common" } */
#define N 1024
--
Joseph S. Myers
[email protected]