On September 1, 2019 12:19:51 PM GMT+02:00, Jakub Jelinek <ja...@redhat.com> wrote:
>Hi!
>
>The following testcase ICEs, because for SSE4.1 only VEC_COND_EXPRs with
>EQ_EXPR/NE_EXPR are supported and the vectorizer generates such a
>VEC_COND_EXPR, but later on the condition is folded into a VECTOR_CST and
>the VEC_COND_EXPR expansion code expands non-comparison conditions as
>LT_EXPR against a zero vector.
>
>I think the only problematic case is when the equality comparison is
>folded into a constant; at that point, if both other VEC_COND_EXPR
>arguments are constant, we could in theory fold it (but can't really rely
>on it during expansion anyway), but if they aren't constant, just the
>condition is, there is nothing to fold it into anyway.  The patch verifies
>that LT_EXPR against zero will behave the same as NE_EXPR by punting if
>there are non-canonical elements (> 0), otherwise just tries to expand it
>as NE_EXPR if LT_EXPR didn't work.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok. 

Thanks, 
Richard. 

>2019-09-01  Jakub Jelinek  <ja...@redhat.com>
>
>       PR middle-end/91623
>       * optabs.c (expand_vec_cond_expr): If op0 is a VECTOR_CST and only
>       EQ_EXPR/NE_EXPR is supported, verify that op0 only contains
>       zeros or negative elements and use NE_EXPR instead of LT_EXPR against
>       zero vector.
>
>       * gcc.target/i386/pr91623.c: New test.
>
>--- gcc/optabs.c.jj    2019-08-27 12:26:37.392912813 +0200
>+++ gcc/optabs.c       2019-08-31 19:49:32.831430056 +0200
>@@ -5868,6 +5868,25 @@ expand_vec_cond_expr (tree vec_cond_type
>   icode = get_vcond_icode (mode, cmp_op_mode, unsignedp);
>   if (icode == CODE_FOR_nothing)
>     {
>+      if (tcode == LT_EXPR
>+        && op0a == op0
>+        && TREE_CODE (op0) == VECTOR_CST)
>+      {
>+        /* A VEC_COND_EXPR condition could be folded from EQ_EXPR/NE_EXPR
>+           into a constant when only get_vcond_eq_icode is supported.
>+           Verify < 0 and != 0 behave the same and change it to NE_EXPR.  */
>+        unsigned HOST_WIDE_INT nelts;
>+        if (!VECTOR_CST_NELTS (op0).is_constant (&nelts))
>+          {
>+            if (VECTOR_CST_STEPPED_P (op0))
>+              return 0;
>+            nelts = vector_cst_encoded_nelts (op0);
>+          }
>+        for (unsigned int i = 0; i < nelts; ++i)
>+          if (tree_int_cst_sgn (vector_cst_elt (op0, i)) == 1)
>+            return 0;
>+        tcode = NE_EXPR;
>+      }
>       if (tcode == EQ_EXPR || tcode == NE_EXPR)
>       icode = get_vcond_eq_icode (mode, cmp_op_mode);
>       if (icode == CODE_FOR_nothing)
>--- gcc/testsuite/gcc.target/i386/pr91623.c.jj 2019-08-31 19:55:02.470674149 +0200
>+++ gcc/testsuite/gcc.target/i386/pr91623.c    2019-08-31 19:54:39.186010098 +0200
>@@ -0,0 +1,32 @@
>+/* PR middle-end/91623 */
>+/* { dg-do compile } */
>+/* { dg-options "-O3 -msse4.1 -mno-sse4.2" } */
>+
>+typedef long long V __attribute__((__vector_size__(16)));
>+V e, h;
>+int d;
>+const int i;
>+
>+void foo (void);
>+
>+void
>+bar (int k, int l)
>+{
>+  if (d && 0 <= k - 1 && l)
>+    foo ();
>+}
>+
>+void
>+baz (void)
>+{
>+  V n = (V) { 1 };
>+  V g = (V) {};
>+  V o = g;
>+  for (int f = 0; f < i; ++f)
>+    {
>+      V a = o == n;
>+      h = a;
>+      bar (f, i);
>+      o = e;
>+    }
>+}
>
>       Jakub
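
For readers following the reasoning behind the check, here is a minimal standalone sketch in plain C of why "< 0" and "!= 0" select the same lanes once positive elements are ruled out. It is not GCC code: the names mask_has_no_positive_elt, select_lt_zero and select_ne_zero are hypothetical, and ordinary int64_t arrays stand in for the VECTOR_CST condition and the other two VEC_COND_EXPR operands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the patch's loop: return true only if no
   element of the constant mask is positive, i.e. every element is zero
   or negative.  */
static bool
mask_has_no_positive_elt (const int64_t *mask, size_t nelts)
{
  for (size_t i = 0; i < nelts; ++i)
    if (mask[i] > 0)
      return false;
  return true;
}

/* Element-wise select using "mask < 0" as the condition, which is how a
   non-comparison condition is treated during expansion.  */
static void
select_lt_zero (const int64_t *mask, const int64_t *a, const int64_t *b,
                int64_t *out, size_t nelts)
{
  assert (mask && a && b && out);
  for (size_t i = 0; i < nelts; ++i)
    out[i] = mask[i] < 0 ? a[i] : b[i];
}

/* Element-wise select using "mask != 0" as the condition, the form the
   patch switches to when only equality comparisons are available.  */
static void
select_ne_zero (const int64_t *mask, const int64_t *a, const int64_t *b,
                int64_t *out, size_t nelts)
{
  assert (mask && a && b && out);
  for (size_t i = 0; i < nelts; ++i)
    out[i] = mask[i] != 0 ? a[i] : b[i];
}
```

With a mask such as { -1, 0, 0, -1 } both selects pick the same lanes, so rewriting the comparison is safe. With a positive element, for example { 1, 0, 0, 0 }, "< 0" picks the second operand for lane 0 while "!= 0" picks the first, which corresponds to the case where the patch returns 0 instead of rewriting.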
