[PATCH][simplify-rtx]: Fix incorrect folding of shift and AND [PR117012]

Tamar Christina Mon, 14 Oct 2024 03:52:47 -0700

Hi All,

The optimization added in r15-1047-g7876cde25cbd2f is using the wrong
operation to check for uniform constant vectors.


The author intended to check that all the lanes in the vector are the same and
so used CONST_VECTOR_DUPLICATE_P.  However, this only checks that the vector
is created from a pattern duplication; it doesn't say how many pattern
alternatives make up the duplication.  Normally one would need to check this
separately or use const_vec_duplicate_p.
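
As a rough illustration of the difference, here is a simplified, self-contained
model of the two checks.  It is not GCC's actual data structure: the struct,
its field names and the toy_* helpers are hypothetical, and it assumes the mask
from the new test is encoded as eight single-element patterns.

```c
#include <stdbool.h>

/* Toy stand-in for a compressed constant-vector encoding.  The field
   names are hypothetical; they loosely mirror the number of patterns
   and the number of encoded elements per pattern.  */
struct toy_const_vector
{
  unsigned npatterns;          /* number of interleaved patterns */
  unsigned nelts_per_pattern;  /* encoded elements per pattern */
  const int *encoded;          /* npatterns * nelts_per_pattern values */
};

/* Models CONST_VECTOR_DUPLICATE_P: every pattern is one repeated value,
   but there may be several patterns.  */
static bool
toy_duplicate_flag (const struct toy_const_vector *v)
{
  return v->nelts_per_pattern == 1;
}

/* Models const_vec_duplicate_p: a single repeated value, so every lane
   is equal.  */
static bool
toy_all_lanes_equal (const struct toy_const_vector *v)
{
  return v->npatterns == 1 && v->nelts_per_pattern == 1;
}

/* Mask from the new test: { 1, 1, 0, 0, 0, 0, 0, 0 } repeated twice
   (assumed encoding: eight patterns of one element each).  */
static const int mask_elts[] = { 1, 1, 0, 0, 0, 0, 0, 0 };
static const struct toy_const_vector test_mask = { 8, 1, mask_elts };

/* A genuinely uniform mask: the value 1 in every lane.  */
static const int ones_elts[] = { 1 };
static const struct toy_const_vector uniform_mask = { 1, 1, ones_elts };
```

In this model both masks pass toy_duplicate_flag, but only uniform_mask passes
toy_all_lanes_equal.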

Without this check the optimization incorrectly triggers on non-uniform
vectors, such as the mask in the new test.
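
To show the effect on a mask like the one in the new test, the sketch below
compares the two results lane by lane.  It assumes the fold in question
replaces an arithmetic right shift followed by an AND with a single logical
right shift, which is only valid when every lane of the mask has the matching
value (1 for an 8-bit lane shifted by 7).  The helper names are hypothetical.

```c
#include <stdint.h>

#define LANES 16

/* Mask from the new test.  */
static const uint8_t pr_mask[LANES]
  = { 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0 };

/* What the source asks for: arithmetic shift by 7, then AND with the
   mask.  The arithmetic shift yields 0x00 or 0xff depending on the
   sign bit of the lane.  */
static uint8_t
lane_expected (uint8_t a, uint8_t m)
{
  uint8_t sign_fill = (a & 0x80) ? 0xff : 0x00;
  return sign_fill & m;
}

/* What the assumed fold produces: a logical shift by 7, i.e. just the
   sign bit, with the mask dropped.  */
static uint8_t
lane_folded (uint8_t a)
{
  return a >> 7;
}

/* Count lanes where the folded form disagrees with the original.  */
static int
count_mismatches (const uint8_t a[LANES], const uint8_t m[LANES])
{
  int bad = 0;
  for (int i = 0; i < LANES; i++)
    if (lane_expected (a[i], m[i]) != lane_folded (a[i]))
      bad++;
  return bad;
}
```

With every input lane set to 0x80, count_mismatches reports 12 differing lanes
for pr_mask and none for a mask that is 1 in every lane.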

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        PR rtl-optimization/117012
        * simplify-rtx.cc (simplify_context::simplify_binary_operation_1): Use
        const_vec_duplicate_p instead of CONST_VECTOR_DUPLICATE_P.

gcc/testsuite/ChangeLog:

        PR rtl-optimization/117012
        * gcc.target/aarch64/pr117012.c: New test.

---
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index e8e60404ef62b891a68bc68645c4c349a1b12a7c..c304baa3c3ab6ada95b85961f34532966428e337 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -4084,10 +4084,10 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
       if (VECTOR_MODE_P (mode) && GET_CODE (op0) == ASHIFTRT
          && (CONST_INT_P (XEXP (op0, 1))
              || (GET_CODE (XEXP (op0, 1)) == CONST_VECTOR
-                 && CONST_VECTOR_DUPLICATE_P (XEXP (op0, 1))
+                 && const_vec_duplicate_p (XEXP (op0, 1))
                  && CONST_INT_P (XVECEXP (XEXP (op0, 1), 0, 0))))
          && GET_CODE (op1) == CONST_VECTOR
-         && CONST_VECTOR_DUPLICATE_P (op1)
+         && const_vec_duplicate_p (op1)
          && CONST_INT_P (XVECEXP (op1, 0, 0)))
        {
          unsigned HOST_WIDE_INT shift_count
diff --git a/gcc/testsuite/gcc.target/aarch64/pr117012.c b/gcc/testsuite/gcc.target/aarch64/pr117012.c
new file mode 100644
index 0000000000000000000000000000000000000000..537c0fa566c6c930ebad3013e3adea9e9fdd1a23
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr117012.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#define vector16 __attribute__((vector_size(16)))
+
+vector16 unsigned char
+g (vector16 unsigned char a)
+{
+  vector16 signed char b = (vector16 signed char)a;
+  b = b >> 7;
+  vector16 unsigned char c = (vector16 unsigned char)b;
+  vector16 unsigned char d = { 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0 };
+  return c & d;
+}
+
+/* { dg-final { scan-assembler-times {and\tv[0-9]+\.16b, v[0-9]+\.16b, v[0-9]+\.16b} 1 } } */




