Re: [PATCH rs6000]Fix PR92132

Kewen.Lin Tue, 05 Nov 2019 00:36:14 -0800

Hi Segher,

Thanks for the comments!


On 2019/11/2 7:17 AM, Segher Boessenkool wrote:
> On Tue, Oct 29, 2019 at 01:16:53PM +0800, Kewen.Lin wrote:
>>      (vcond_mask_<mode><mode>): New expand.
> 
> Say for which mode please?  Like
>       (vcond_mask_<mode><mode> for VEC_I and VEC_I): New expand.
> 

Fixed as below.

>>      (vcond_mask_<mode><VEC_int>): Likewise.
> 
> "for VEC_I and VEC_F", here, but the actual names in the pattern are for
> vector modes of same-size integer elements.  Maybe it is clear enough like
> this, dunno.

Changed to "for VEC_F", with the description "New expand for float vector
modes and same-size integer vector modes."

> 
>>      (vector_{ungt,unge,unlt,unle}<mode>): Likewise.
> 
> Never use wildcards (or shell expansions) in the "what changed" part of a
> changelog, because people try to search for that.

Thanks for the explanation, fixed. 

> 
>>  ;; 128-bit one's complement
>> -(define_insn_and_split "*one_cmpl<mode>3_internal"
>> +(define_insn_and_split "one_cmpl<mode>3_internal"
> 
> Instead, rename it to "one_cmpl<mode>3" and delete the define_expand that
> serves no function?

Renamed.  Sorry, which define_expand do you mean here?  I assumed it was an
existing one_cmpl<mode>3, but I couldn't find one.

> 
>> +(define_code_iterator fpcmpun [ungt unge unlt unle])
> 
> Why these four?  Should there be more?  Should this be added to some
> existing iterator?

For floating-point vector comparisons, rs6000 currently supports eq, gt, ge,
*ltgt, *unordered, *ordered and *uneq (* marks an unnamed pattern).  lt, le and
ne can be derived from gt, ge and eq, which leaves these four unsupported.
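
As an illustration of how lt, le and ne fall out of gt, ge and eq, here is a
minimal lane-wise sketch in plain C.  It is not the rs6000 code: LANES, the
cmp_* helpers and check_derived are hypothetical names, and the all-ones/zero
mask convention is assumed to mirror what the vector compares produce.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

/* Stand-ins for the named gt/ge/eq patterns: each mask lane becomes
   all-ones when the comparison holds and zero otherwise.  */
static void
cmp_gt (int32_t *m, const float *a, const float *b)
{
  for (int i = 0; i < LANES; i++)
    m[i] = a[i] > b[i] ? -1 : 0;
}

static void
cmp_ge (int32_t *m, const float *a, const float *b)
{
  for (int i = 0; i < LANES; i++)
    m[i] = a[i] >= b[i] ? -1 : 0;
}

static void
cmp_eq (int32_t *m, const float *a, const float *b)
{
  for (int i = 0; i < LANES; i++)
    m[i] = a[i] == b[i] ? -1 : 0;
}

/* lt and le only swap the operands of gt and ge.  */
static void
cmp_lt (int32_t *m, const float *a, const float *b)
{
  cmp_gt (m, b, a);
}

static void
cmp_le (int32_t *m, const float *a, const float *b)
{
  cmp_ge (m, b, a);
}

/* ne is the one's complement of eq.  */
static void
cmp_ne (int32_t *m, const float *a, const float *b)
{
  cmp_eq (m, a, b);
  for (int i = 0; i < LANES; i++)
    m[i] = ~m[i];
}

/* Compare the derived masks with the plain C operators, lane by lane.  */
static void
check_derived (void)
{
  const float a[LANES] = {1.0f, 2.0f, 3.0f, 4.0f};
  const float b[LANES] = {2.0f, 2.0f, 1.0f, 5.0f};
  int32_t lt[LANES], le[LANES], ne[LANES];

  cmp_lt (lt, a, b);
  cmp_le (le, a, b);
  cmp_ne (ne, a, b);
  for (int i = 0; i < LANES; i++)
    {
      assert (lt[i] == (a[i] < b[i] ? -1 : 0));
      assert (le[i] == (a[i] <= b[i] ? -1 : 0));
      assert (ne[i] == (a[i] != b[i] ? -1 : 0));
    }
}
```

The operand swap stays valid with NaN inputs, because gt and ge are false for
unordered operands just as lt and le are.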

I originally wanted to merge them into the existing unordered or uneq
patterns, but found it hard to share those patterns.  For example, uneq
looks like:

  [(set (match_dup 3)
        (gt:VEC_F (match_dup 1)
                  (match_dup 2)))
   (set (match_dup 4)
        (gt:VEC_F (match_dup 2)
                  (match_dup 1)))
   (set (match_dup 0)
        (and:VEC_F (not:VEC_F (match_dup 3))
                   (not:VEC_F (match_dup 4))))]

While ungt looks like:

  [(set (match_dup 3)
        (ge:VEC_F (match_dup 1)
                  (match_dup 2)))
   (set (match_dup 4)
        (ge:VEC_F (match_dup 2)
                  (match_dup 1)))
   (set (match_dup 3)
        (and:VEC_F (not:VEC_F (match_dup 3))
                   (not:VEC_F (match_dup 4))))
   (set (match_dup 4)
        (gt:VEC_F (match_dup 1)
                  (match_dup 2)))
   (set (match_dup 3)
        (ior:VEC_F (match_dup 3)
                   (match_dup 4)))]
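
A one-lane scalar model of the "unordered or greater" computation may help
when checking such a sequence.  This is only a sketch: ungt_lane and
check_ungt are hypothetical names, and the standard islessequal macro serves
as the reference.  The unordered mask must be true only when neither a >= b
nor b >= a holds, that is, the AND of the two negated results.

```c
#include <assert.h>
#include <math.h>

/* One-lane model of UNGT: true if a > b or either operand is NaN.  */
static int
ungt_lane (double a, double b)
{
  int ge_ab = a >= b;            /* false when either operand is NaN */
  int ge_ba = b >= a;            /* likewise */
  int unord = !ge_ab & !ge_ba;   /* both false only for unordered operands */
  int gt_ab = a > b;             /* ordinary comparison */
  return unord | gt_ab;
}

/* Cross-check against !islessequal for ordered and unordered pairs.  */
static void
check_ungt (void)
{
  const double v[] = {1.0, 10.0, 20.0, NAN};

  for (int i = 0; i < 4; i++)
    for (int j = 0; j < 4; j++)
      assert (ungt_lane (v[i], v[j]) == !islessequal (v[i], v[j]));
}
```

Using the OR of the two negations instead would yield "a != b or unordered",
which is also true for ordered a < b.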
  
> 
> It's not all comparisons including unordered, there are uneq, unordered
> itself, and ne as well.

Right, it doesn't cover all of them; it is just a list of the comparison
operators that are still missing support.

> 
>> +;; Same mode for condition true/false values and predicate operand.
>> +(define_expand "vcond_mask_<mode><mode>"
>> +  [(match_operand:VEC_I 0 "vint_operand")
>> +   (match_operand:VEC_I 1 "vint_operand")
>> +   (match_operand:VEC_I 2 "vint_operand")
>> +   (match_operand:VEC_I 3 "vint_operand")]
>> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>> +{
>> +  emit_insn (gen_vector_select_<mode> (operands[0], operands[2], operands[1],
>> +                              operands[3]));
>> +  DONE;
>> +})
> 
> So is this exactly the same as vsel/xxsel?

Yes, it expands into an if_then_else with an ne against zero, which can match
their patterns.
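
For reference, the selection can be modelled per lane as a bitwise select.
The sketch below is an assumption-laden scalar model, not the machine
description: select_lane and check_select are hypothetical names, and the
mask is assumed to be all-ones or zero per lane as produced by a vector
compare.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise select on one 32-bit lane: take the bits of ON_TRUE where MASK
   is set and the bits of ON_FALSE elsewhere.  */
static uint32_t
select_lane (uint32_t on_true, uint32_t on_false, uint32_t mask)
{
  return (on_true & mask) | (on_false & ~mask);
}

/* With an all-ones or all-zero mask the whole lane comes from one side.  */
static void
check_select (void)
{
  assert (select_lane (0xAAAAAAAAu, 0x55555555u, 0xFFFFFFFFu) == 0xAAAAAAAAu);
  assert (select_lane (0xAAAAAAAAu, 0x55555555u, 0x00000000u) == 0x55555555u);
}
```

Note that the expander passes operands[2] before operands[1] to
gen_vector_select_<mode>, so the "true" value of vcond_mask ends up in the
position that is chosen when the mask is nonzero.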

> 
>> +;; For signed integer vectors comparison.
>> +(define_expand "vec_cmp<mode><mode>"
> 
>> +    case GEU:
>> +      emit_insn (
>> +    gen_vector_nltu<mode> (operands[0], operands[2], operands[3], tmp));
>> +      break;
>> +    case GTU:
>> +      emit_insn (gen_vector_gtu<mode> (operands[0], operands[2], operands[3]));
>> +      break;
>> +    case LEU:
>> +      emit_insn (
>> +    gen_vector_ngtu<mode> (operands[0], operands[2], operands[3], tmp));
>> +      break;
>> +    case LTU:
>> +      emit_insn (gen_vector_gtu<mode> (operands[0], operands[3], operands[2]));
>> +      break;
> 
> You shouldn't allow those for signed comparisons, that will only hide
> problems.

OK, moved into vec_cmpu*.

> 
> You can do all the rest with some iterator / code attribute?  Or two cases,
> one for the codes that need ops 2 and 3 swapped, one for the rest?
> 

Sorry, I tried to use code attributes here but failed.  I think the reason is
that the pattern name doesn't contain <code>.  I can only get the code from
operand 1, so I have to use a switch/case?  I can change it by adding one more
define_expand, but is that what we want?  It looks like we would still need
the "case"s.

define_expand "vec_cmp<mode><mode>"
...
{...
enum rtx_code code = GET_CODE (operands[1]);
switch (code)
  case GT:
  ... gen_vec_cmp<mode><mode>gt
  ...
}

define_expand "vec_cmp<mode><mode><code>"
  ... gen_vector_<code_name><mode>
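
One way to picture the "swap operands 2 and 3 for some codes" idea is a small
table that reduces every signed integer comparison to eq or gt plus two flags.
This is only a scalar sketch in plain C with hypothetical names (cmp_code,
cmp_plan, plan_for, eval_cmp, check_plans); it holds for integers only, since
complementing a floating-point comparison is not valid when NaNs are present.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the rtx codes the signed expander handles.  */
enum cmp_code { CMP_EQ, CMP_NE, CMP_GT, CMP_GE, CMP_LT, CMP_LE };

struct cmp_plan
{
  enum cmp_code base;  /* CMP_EQ or CMP_GT */
  bool swap;           /* exchange the two compared operands */
  bool invert;         /* complement the result */
};

/* Reduce each comparison code to a base compare plus swap/invert flags.  */
static struct cmp_plan
plan_for (enum cmp_code code)
{
  switch (code)
    {
    case CMP_EQ: return (struct cmp_plan) {CMP_EQ, false, false};
    case CMP_NE: return (struct cmp_plan) {CMP_EQ, false, true};
    case CMP_GT: return (struct cmp_plan) {CMP_GT, false, false};
    case CMP_LT: return (struct cmp_plan) {CMP_GT, true, false};
    case CMP_LE: return (struct cmp_plan) {CMP_GT, false, true};
    case CMP_GE: return (struct cmp_plan) {CMP_GT, true, true};
    }
  return (struct cmp_plan) {CMP_EQ, false, false};
}

/* Evaluate CODE on scalars by following the plan.  */
static bool
eval_cmp (enum cmp_code code, int a, int b)
{
  struct cmp_plan p = plan_for (code);
  if (p.swap)
    {
      int t = a;
      a = b;
      b = t;
    }
  bool r = p.base == CMP_EQ ? a == b : a > b;
  return p.invert ? !r : r;
}

/* Exhaustively compare against the C operators on a small range.  */
static void
check_plans (void)
{
  for (int a = -2; a <= 2; a++)
    for (int b = -2; b <= 2; b++)
      {
        assert (eval_cmp (CMP_EQ, a, b) == (a == b));
        assert (eval_cmp (CMP_NE, a, b) == (a != b));
        assert (eval_cmp (CMP_GT, a, b) == (a > b));
        assert (eval_cmp (CMP_GE, a, b) == (a >= b));
        assert (eval_cmp (CMP_LT, a, b) == (a < b));
        assert (eval_cmp (CMP_LE, a, b) == (a <= b));
      }
}
```

The table still has one entry per code, so it does not remove the per-code
dispatch; it only separates the swap and invert decisions from the emission.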


>> +;; For unsigned integer vectors comparison.
>> +(define_expand "vec_cmpu<mode><mode>"
>> +  [(set (match_operand:VEC_I 0 "vint_operand")
>> +    (match_operator 1 "comparison_operator"
>> +      [(match_operand:VEC_I 2 "vint_operand")
>> +       (match_operand:VEC_I 3 "vint_operand")]))]
>> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>> +{
>> +  emit_insn (gen_vec_cmp<mode><mode> (operands[0], operands[1],
>> +                                  operands[2], operands[3]));
>> +  DONE;
>> +})
> 
> unsigned_comparison_operator?

Good point, fixed.

> 
> Why *are* there separate vec_cmp and vec_cmpu patterns, in the first place?
> 

If I understood the question correctly, you are asking why there isn't a
single pattern for both?  I noticed that some vectorization-related standard
pattern names (SPNs) have separate signed and unsigned variants.  I guess that
is because signedness matters for some vector instructions and a platform may
support only some of them, so the sign allows finer-grained queries and
checks?
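
A small scalar example of why signedness matters for the ordering
comparisons: the same 64-bit bit pattern compares differently depending on
how it is interpreted.  The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The same comparison under signed and unsigned interpretation.  */
static int
gt_signed (int64_t a, int64_t b)
{
  return a > b;
}

static int
gt_unsigned (uint64_t a, uint64_t b)
{
  return a > b;
}

/* All-ones is -1 when signed but the largest value when unsigned.  */
static void
check_signedness (void)
{
  int64_t x = -1;

  assert (gt_signed (x, 1) == 0);
  assert (gt_unsigned ((uint64_t) x, 1) == 1);
}
```

Equality is unaffected by the interpretation, which fits eq and ne being
grouped with the signed codes in signed_or_equality_comparison_operator.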

The updated patch, which addresses the comments above, is attached.

BR,
Kewen

------------------
gcc/ChangeLog

2019-11-05  Kewen Lin  <li...@gcc.gnu.org>

        PR target/92132
        * config/rs6000/rs6000.md (one_cmpl<mode>3_internal): Rename to
        one_cmpl<mode>3 and expose.
        * config/rs6000/predicates.md
        (signed_or_equality_comparison_operator): New predicate.
        * config/rs6000/vector.md (fpcmpun_gtelte): New code_iterator.
        (vcond_mask_<mode><mode> for VEC_I and VEC_I): New expand.
        (vec_cmp<mode><mode> for VEC_I and VEC_I): Likewise.
        (vec_cmpu<mode><mode> for VEC_I and VEC_I): Likewise.
        (vcond_mask_<mode><VEC_int> for VEC_F): New expand for float
        vector modes and same-size integer vector modes.
        (vec_cmp<mode><VEC_int> for VEC_F): Likewise.
        (vector_<code><mode> for fpcmpun_gtelte): New expand.
        (vector_uneq<mode>): Expose name.
        (vector_ltgt<mode>): Likewise.
        (vector_unordered<mode>): Likewise.
        (vector_ordered<mode>): Likewise.

gcc/testsuite/ChangeLog

2019-11-05  Kewen Lin  <li...@gcc.gnu.org>

        PR target/92132
        * gcc.target/powerpc/pr92132-fp-1.c: New test.
        * gcc.target/powerpc/pr92132-fp-2.c: New test.
        * gcc.target/powerpc/pr92132-int-1.c: New test.
        * gcc.target/powerpc/pr92132-int-2.c: New test.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 345d9c3..5665174 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1140,6 +1140,11 @@
 (define_predicate "signed_comparison_operator"
   (match_code "lt,gt,le,ge"))
 
+;; Return 1 if OP is a signed comparison or an equality operator.
+(define_predicate "signed_or_equality_comparison_operator"
+  (ior (match_operand 0 "equality_operator")
+       (match_operand 0 "signed_comparison_operator")))
+
 ;; Return 1 if OP is a comparison operation that is valid for an SCC insn --
 ;; it must be a positive comparison.
 (define_predicate "scc_comparison_operator"
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index d0cca1e..e3429d7 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -6800,7 +6800,7 @@
         (const_string "16")))])
 
 ;; 128-bit one's complement
-(define_insn_and_split "*one_cmpl<mode>3_internal"
+(define_insn_and_split "one_cmpl<mode>3"
   [(set (match_operand:BOOL_128 0 "vlogical_operand" "=<BOOL_REGS_OUTPUT>")
        (not:BOOL_128
          (match_operand:BOOL_128 1 "vlogical_operand" "<BOOL_REGS_UNARY>")))]
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 886cbad..7111b43 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -107,6 +107,8 @@
                                 (smin "smin")
                                 (smax "smax")])
 
+(define_code_iterator fpcmpun_gtelte [ungt unge unlt unle])
+
 
 ;; Vector move instructions.  Little-endian VSX loads and stores require
 ;; special handling to circumvent "element endianness."
@@ -493,6 +495,217 @@
     FAIL;
 })
 
+;; To support vector condition vectorization, define vcond_mask and vec_cmp.
+
+;; Same mode for condition true/false values and predicate operand.
+(define_expand "vcond_mask_<mode><mode>"
+  [(match_operand:VEC_I 0 "vint_operand")
+   (match_operand:VEC_I 1 "vint_operand")
+   (match_operand:VEC_I 2 "vint_operand")
+   (match_operand:VEC_I 3 "vint_operand")]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+  emit_insn (gen_vector_select_<mode> (operands[0], operands[2], operands[1],
+                                 operands[3]));
+  DONE;
+})
+
+;; Condition true/false values are float but predicate operand is of
+;; type integer vector with same element size.
+(define_expand "vcond_mask_<mode><VEC_int>"
+  [(match_operand:VEC_F 0 "vfloat_operand")
+   (match_operand:VEC_F 1 "vfloat_operand")
+   (match_operand:VEC_F 2 "vfloat_operand")
+   (match_operand:<VEC_INT> 3 "vint_operand")]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+  emit_insn (gen_vector_select_<mode> (operands[0], operands[2], operands[1],
+                                 gen_lowpart (<MODE>mode, operands[3])));
+  DONE;
+})
+
+;; For signed integer vectors comparison.
+(define_expand "vec_cmp<mode><mode>"
+  [(set (match_operand:VEC_I 0 "vint_operand")
+       (match_operator 1 "signed_or_equality_comparison_operator"
+         [(match_operand:VEC_I 2 "vint_operand")
+          (match_operand:VEC_I 3 "vint_operand")]))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+  enum rtx_code code = GET_CODE (operands[1]);
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  switch (code)
+    {
+    case NE:
+      emit_insn (gen_vector_eq<mode> (operands[0], operands[2], operands[3]));
+      emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
+      break;
+    case EQ:
+      emit_insn (gen_vector_eq<mode> (operands[0], operands[2], operands[3]));
+      break;
+    case GE:
+      emit_insn (gen_vector_nlt<mode> (operands[0], operands[2], operands[3],
+                                      tmp));
+      break;
+    case GT:
+      emit_insn (gen_vector_gt<mode> (operands[0], operands[2], operands[3]));
+      break;
+    case LE:
+      emit_insn (gen_vector_ngt<mode> (operands[0], operands[2], operands[3],
+                                      tmp));
+      break;
+    case LT:
+      emit_insn (gen_vector_gt<mode> (operands[0], operands[3], operands[2]));
+      break;
+    default:
+      gcc_unreachable ();
+      break;
+    }
+  DONE;
+})
+
+;; For unsigned integer vectors comparison.
+(define_expand "vec_cmpu<mode><mode>"
+  [(set (match_operand:VEC_I 0 "vint_operand")
+       (match_operator 1 "unsigned_comparison_operator"
+         [(match_operand:VEC_I 2 "vint_operand")
+          (match_operand:VEC_I 3 "vint_operand")]))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+  enum rtx_code code = GET_CODE (operands[1]);
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  switch (code)
+    {
+    case GEU:
+      emit_insn (gen_vector_nltu<mode> (operands[0], operands[2], operands[3],
+                                       tmp));
+      break;
+    case GTU:
+      emit_insn (gen_vector_gtu<mode> (operands[0], operands[2], operands[3]));
+      break;
+    case LEU:
+      emit_insn (gen_vector_ngtu<mode> (operands[0], operands[2], operands[3],
+                                       tmp));
+      break;
+    case LTU:
+      emit_insn (gen_vector_gtu<mode> (operands[0], operands[3], operands[2]));
+      break;
+    default:
+      gcc_unreachable ();
+      break;
+    }
+  DONE;
+})
+
+;; For float point vectors comparison.
+(define_expand "vec_cmp<mode><VEC_int>"
+  [(set (match_operand:<VEC_INT> 0 "vint_operand")
+        (match_operator 1 "comparison_operator"
+           [(match_operand:VEC_F 2 "vfloat_operand")
+           (match_operand:VEC_F 3 "vfloat_operand")]))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+  enum rtx_code code = GET_CODE (operands[1]);
+  rtx res = gen_reg_rtx (<MODE>mode);
+  switch (code)
+    {
+    case NE:
+      emit_insn (gen_vector_eq<mode> (res, operands[2], operands[3]));
+      emit_insn (gen_one_cmpl<mode>3 (res, res));
+      break;
+    case EQ:
+      emit_insn (gen_vector_eq<mode> (res, operands[2], operands[3]));
+      break;
+    case GE:
+      emit_insn (gen_vector_ge<mode> (res, operands[2], operands[3]));
+      break;
+    case GT:
+      emit_insn (gen_vector_gt<mode> (res, operands[2], operands[3]));
+      break;
+    case LE:
+      emit_insn (gen_vector_ge<mode> (res, operands[3], operands[2]));
+      break;
+    case LT:
+      emit_insn (gen_vector_gt<mode> (res, operands[3], operands[2]));
+      break;
+    case LTGT:
+      emit_insn (gen_vector_ltgt<mode> (res, operands[2], operands[3]));
+      break;
+    case UNORDERED:
+      emit_insn (gen_vector_unordered<mode> (res, operands[2], operands[3]));
+      break;
+    case ORDERED:
+      emit_insn (gen_vector_ordered<mode> (res, operands[2], operands[3]));
+      break;
+    case UNEQ:
+      emit_insn (gen_vector_uneq<mode> (res, operands[2], operands[3]));
+      break;
+    case UNGE:
+      emit_insn (gen_vector_unge<mode> (res, operands[2], operands[3]));
+      break;
+    case UNGT:
+      emit_insn (gen_vector_ungt<mode> (res, operands[2], operands[3]));
+      break;
+    case UNLE:
+      emit_insn (gen_vector_unle<mode> (res, operands[2], operands[3]));
+      break;
+    case UNLT:
+      emit_insn (gen_vector_unlt<mode> (res, operands[2], operands[3]));
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  emit_insn (gen_move_insn (operands[0], gen_lowpart (<VEC_INT>mode, res)));
+  DONE;
+})
+
+;; For below vector_UN<cc><mode>:
+;; op3 = (op1 >= op2)  # false if either operand is NaN
+;; op4 = (op2 >= op1)  # false if either operand is NaN
+;; op3 = !(op3 | op4)  # isNaN (op1) | isNaN (op2)
+;; op4 = op1 <cc> op2  # normal cmp
+;; op0 = op3 | op4     # UNORDERED result | normal cmp result
+
+(define_expand "vector_<code><mode>"
+  [(set (match_operand:VEC_F 0 "vfloat_operand")
+    (fpcmpun_gtelte:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
+            (match_operand:VEC_F 2 "vfloat_operand")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+  rtx op3 = gen_reg_rtx (<MODE>mode);
+  rtx op4 = gen_reg_rtx (<MODE>mode);
+
+  /* Refer to vector_unordered.  */
+  emit_insn (gen_vector_ge<mode> (op3, operands[1], operands[2]));
+  emit_insn (gen_vector_ge<mode> (op4, operands[2], operands[1]));
+  emit_insn (gen_ior<mode>3 (op3, op3, op4));
+  emit_insn (gen_one_cmpl<mode>3 (op3, op3));
+
+  switch (<CODE>)
+    {
+    case UNLT:
+      std::swap (operands[1], operands[2]);
+    /* Fall through.  */
+    case UNGT:
+      emit_insn (gen_vector_gt<mode> (op4, operands[1], operands[2]));
+      break;
+    case UNLE:
+      std::swap (operands[1], operands[2]);
+    /* Fall through.  */
+    case UNGE:
+      emit_insn (gen_vector_ge<mode> (op4, operands[1], operands[2]));
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  emit_insn (gen_ior<mode>3 (operands[0], op3, op4));
+  DONE;
+})
+
+
 (define_expand "vector_eq<mode>"
   [(set (match_operand:VEC_C 0 "vlogical_operand")
        (eq:VEC_C (match_operand:VEC_C 1 "vlogical_operand")
@@ -575,7 +788,7 @@
   operands[3] = gen_reg_rtx_and_attrs (operands[0]);
 })
 
-(define_insn_and_split "*vector_uneq<mode>"
+(define_insn_and_split "vector_uneq<mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
        (uneq:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
                    (match_operand:VEC_F 2 "vfloat_operand")))]
@@ -596,7 +809,7 @@
   operands[4] = gen_reg_rtx (<MODE>mode);
 })
 
-(define_insn_and_split "*vector_ltgt<mode>"
+(define_insn_and_split "vector_ltgt<mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
        (ltgt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
                    (match_operand:VEC_F 2 "vfloat_operand")))]
@@ -617,7 +830,7 @@
   operands[4] = gen_reg_rtx (<MODE>mode);
 })
 
-(define_insn_and_split "*vector_ordered<mode>"
+(define_insn_and_split "vector_ordered<mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
        (ordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
                       (match_operand:VEC_F 2 "vfloat_operand")))]
@@ -638,7 +851,7 @@
   operands[4] = gen_reg_rtx (<MODE>mode);
 })
 
-(define_insn_and_split "*vector_unordered<mode>"
+(define_insn_and_split "vector_unordered<mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
        (unordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
                         (match_operand:VEC_F 2 "vfloat_operand")))]
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92132-fp-1.c b/gcc/testsuite/gcc.target/powerpc/pr92132-fp-1.c
new file mode 100644
index 0000000..1023e8c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr92132-fp-1.c
@@ -0,0 +1,297 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -ftree-vectorize -mvsx -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* To test condition reduction vectorization, where comparison operands are of
+   double type and condition true/false values are integer type.  Cover all
+   float point comparison codes.  */
+
+#include <math.h>
+
+extern void
+abort (void) __attribute__ ((noreturn));
+
+#define N 27
+#define FP_TYPE double
+
+#define LTGT(a, b) (__builtin_islessgreater ((a), (b)))
+#define UNORD(a, b) (__builtin_isunordered ((a), (b)))
+#define ORD(a, b) (!__builtin_isunordered ((a), (b)))
+#define UNEQ(a, b) (!__builtin_islessgreater ((a), (b)))
+#define UNGT(a, b) (!__builtin_islessequal ((a), (b)))
+#define UNGE(a, b) (!__builtin_isless ((a), (b)))
+#define UNLT(a, b) (!__builtin_isgreaterequal ((a), (b)))
+#define UNLE(a, b) (!__builtin_isgreater ((a), (b)))
+
+__attribute__ ((noinline)) int
+test_eq (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] == min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ne (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] != min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_gt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] > min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ge (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] >= min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_lt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] < min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_le (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] <= min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ltgt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (LTGT (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ord (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (ORD (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_unord (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNORD (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_uneq (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNEQ (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ungt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNGT (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_unge (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNGE (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_unlt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNLT (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_unle (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNLE (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+int
+main (void)
+{
+  int ret = 0;
+
+  FP_TYPE a1[N] = {11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 1,  2,  3, 4,
+                  5,  6,  7,  8,  9,  10, 21, 22, 23, 24, 25, 26, 27};
+
+  FP_TYPE a2[N] = {11, 12, 13, 14, 15, 16, 17, 18, 19, 20,  1,  2,  3, 4,
+                  5,  6,  7,  8,  9,  10, 21, 22, 23, NAN, 25, 26, 27};
+
+  FP_TYPE a3[N] = {21, 22, 23, 24, 25, 26, 27, 28, 29, 10, 11, 12, 13, 14,
+                  15, 16, 17, 18, 19, 20, 1,  2,  3,  4,  5,  6,  7};
+
+  FP_TYPE a4[N] = {21, 22, 23, 24, 25, 26, 27, 28, 29, 10, 11,  12, 13, 14,
+                  15, 16, 17, 18, 19, 20, 1,  2,  3,  4,  NAN, 6,  7};
+
+  FP_TYPE a5[N] = {21, 22, 23, 24, 25, 26, 27, 28, 29, 10, 11,  12, 13, 14,
+                  15, 16, 17, 18, 19, 20, 1,  2,  3,  4,  NAN, 10, 10};
+
+  ret = test_eq (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ne (a1, 10);
+  if (ret != 26)
+    abort ();
+
+  ret = test_gt (a3, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ge (a3, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_lt (a1, 10);
+  if (ret != 18)
+    abort ();
+
+  ret = test_le (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ltgt (a3, 10);
+  if (ret != 26)
+    abort ();
+
+  ret = test_ltgt (a5, 10);
+  if (ret != 23)
+    abort ();
+
+  ret = test_unord (a5, 10);
+  if (ret != 24)
+    abort ();
+
+  ret = test_ord (a5, 10);
+  if (ret != 26)
+    abort ();
+
+  ret = test_uneq (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_uneq (a4, 10);
+  if (ret != 24)
+    abort ();
+
+  ret = test_ungt (a3, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ungt (a4, 10);
+  if (ret != 24)
+    abort ();
+
+  ret = test_unge (a3, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_unge (a4, 10);
+  if (ret != 24)
+    abort ();
+
+  ret = test_unlt (a1, 10);
+  if (ret != 18)
+    abort ();
+
+  ret = test_unlt (a2, 10);
+  if (ret != 23)
+    abort ();
+
+  ret = test_unle (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_unle (a2, 10);
+  if (ret != 23)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 14 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92132-fp-2.c b/gcc/testsuite/gcc.target/powerpc/pr92132-fp-2.c
new file mode 100644
index 0000000..db7b9ad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr92132-fp-2.c
@@ -0,0 +1,297 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vmx_hw } */
+/* { dg-options "-O2 -ftree-vectorize -maltivec -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* To test condition reduction vectorization, where comparison operands are of
+   float type and condition true/false values are integer type.  Cover all
+   float point comparison codes.  */
+
+#include <math.h>
+
+extern void
+abort (void) __attribute__ ((noreturn));
+
+#define N 27
+#define FP_TYPE float
+
+#define LTGT(a, b) (__builtin_islessgreater ((a), (b)))
+#define UNORD(a, b) (__builtin_isunordered ((a), (b)))
+#define ORD(a, b) (!__builtin_isunordered ((a), (b)))
+#define UNEQ(a, b) (!__builtin_islessgreater ((a), (b)))
+#define UNGT(a, b) (!__builtin_islessequal ((a), (b)))
+#define UNGE(a, b) (!__builtin_isless ((a), (b)))
+#define UNLT(a, b) (!__builtin_isgreaterequal ((a), (b)))
+#define UNLE(a, b) (!__builtin_isgreater ((a), (b)))
+
+__attribute__ ((noinline)) int
+test_eq (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] == min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ne (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] != min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_gt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] > min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ge (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] >= min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_lt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] < min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_le (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] <= min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ltgt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (LTGT (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ord (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (ORD (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_unord (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNORD (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_uneq (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNEQ (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ungt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNGT (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_unge (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNGE (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_unlt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNLT (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_unle (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (UNLE (a[i], min_v))
+      last = i;
+
+  return last;
+}
+
+int
+main (void)
+{
+  int ret = 0;
+
+  FP_TYPE a1[N] = {11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 1,  2,  3, 4,
+                  5,  6,  7,  8,  9,  10, 21, 22, 23, 24, 25, 26, 27};
+
+  FP_TYPE a2[N] = {11, 12, 13, 14, 15, 16, 17, 18, 19, 20,  1,  2,  3, 4,
+                  5,  6,  7,  8,  9,  10, 21, 22, 23, NAN, 25, 26, 27};
+
+  FP_TYPE a3[N] = {21, 22, 23, 24, 25, 26, 27, 28, 29, 10, 11, 12, 13, 14,
+                  15, 16, 17, 18, 19, 20, 1,  2,  3,  4,  5,  6,  7};
+
+  FP_TYPE a4[N] = {21, 22, 23, 24, 25, 26, 27, 28, 29, 10, 11,  12, 13, 14,
+                  15, 16, 17, 18, 19, 20, 1,  2,  3,  4,  NAN, 6,  7};
+
+  FP_TYPE a5[N] = {21, 22, 23, 24, 25, 26, 27, 28, 29, 10, 11,  12, 13, 14,
+                  15, 16, 17, 18, 19, 20, 1,  2,  3,  4,  NAN, 10, 10};
+
+  ret = test_eq (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ne (a1, 10);
+  if (ret != 26)
+    abort ();
+
+  ret = test_gt (a3, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ge (a3, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_lt (a1, 10);
+  if (ret != 18)
+    abort ();
+
+  ret = test_le (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ltgt (a3, 10);
+  if (ret != 26)
+    abort ();
+
+  ret = test_ltgt (a5, 10);
+  if (ret != 23)
+    abort ();
+
+  ret = test_unord (a5, 10);
+  if (ret != 24)
+    abort ();
+
+  ret = test_ord (a5, 10);
+  if (ret != 26)
+    abort ();
+
+  ret = test_uneq (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_uneq (a4, 10);
+  if (ret != 24)
+    abort ();
+
+  ret = test_ungt (a3, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ungt (a4, 10);
+  if (ret != 24)
+    abort ();
+
+  ret = test_unge (a3, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_unge (a4, 10);
+  if (ret != 24)
+    abort ();
+
+  ret = test_unlt (a1, 10);
+  if (ret != 18)
+    abort ();
+
+  ret = test_unlt (a2, 10);
+  if (ret != 23)
+    abort ();
+
+  ret = test_unle (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_unle (a2, 10);
+  if (ret != 23)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 14 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92132-int-1.c b/gcc/testsuite/gcc.target/powerpc/pr92132-int-1.c
new file mode 100644
index 0000000..a786811
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr92132-int-1.c
@@ -0,0 +1,126 @@
+/* { dg-do run } */
+/* { dg-require-effective-target p8vector_hw } */
+/* { dg-options "-O2 -ftree-vectorize -mdejagnu-cpu=power8 -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* To test condition reduction vectorization, where comparison operands are of
+   signed long long type and condition true/false values are integer type.  */
+
+#include <math.h>
+
+extern void
+abort (void) __attribute__ ((noreturn));
+
+#define N 27
+#define FP_TYPE signed long long
+
+__attribute__ ((noinline)) int
+test_eq (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] == min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ne (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] != min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_gt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] > min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ge (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] >= min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_lt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] < min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_le (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] <= min_v)
+      last = i;
+
+  return last;
+}
+
+int
+main (void)
+{
+  int ret = 0;
+
+  FP_TYPE a1[N] = {11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 1,  2,  3, 4,
+                  5,  6,  7,  8,  9,  10, 21, 22, 23, 24, 25, 26, 27};
+
+  FP_TYPE a2[N] = {21, 22, 23, 24, 25, 26, 27, 28, 29, 10, 11, 12, 13, 14,
+                  15, 16, 17, 18, 19, 20, 1,  2,  3,  4,  5,  6,  7};
+
+  ret = test_eq (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ne (a1, 10);
+  if (ret != 26)
+    abort ();
+
+  ret = test_gt (a2, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ge (a2, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_lt (a1, 10);
+  if (ret != 18)
+    abort ();
+
+  ret = test_le (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 6 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92132-int-2.c b/gcc/testsuite/gcc.target/powerpc/pr92132-int-2.c
new file mode 100644
index 0000000..dd3c030
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr92132-int-2.c
@@ -0,0 +1,126 @@
+/* { dg-do run } */
+/* { dg-require-effective-target p8vector_hw } */
+/* { dg-options "-O2 -ftree-vectorize -mdejagnu-cpu=power8 -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* To test condition reduction vectorization, where comparison operands are of
+   unsigned long long type and condition true/false values are integer type.  */
+
+#include <math.h>
+
+extern void
+abort (void) __attribute__ ((noreturn));
+
+#define N 27
+#define FP_TYPE unsigned long long
+
+__attribute__ ((noinline)) int
+test_eq (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] == min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ne (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] != min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_gt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] > min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_ge (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] >= min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_lt (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] < min_v)
+      last = i;
+
+  return last;
+}
+
+__attribute__ ((noinline)) int
+test_le (FP_TYPE *a, FP_TYPE min_v)
+{
+  int last = 0;
+
+  for (int i = 0; i < N; i++)
+    if (a[i] <= min_v)
+      last = i;
+
+  return last;
+}
+
+int
+main (void)
+{
+  int ret = 0;
+
+  FP_TYPE a1[N] = {11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 1,  2,  3, 4,
+                  5,  6,  7,  8,  9,  10, 21, 22, 23, 24, 25, 26, 27};
+
+  FP_TYPE a2[N] = {21, 22, 23, 24, 25, 26, 27, 28, 29, 10, 11, 12, 13, 14,
+                  15, 16, 17, 18, 19, 20, 1,  2,  3,  4,  5,  6,  7};
+
+  ret = test_eq (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ne (a1, 10);
+  if (ret != 26)
+    abort ();
+
+  ret = test_gt (a2, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_ge (a2, 10);
+  if (ret != 19)
+    abort ();
+
+  ret = test_lt (a1, 10);
+  if (ret != 18)
+    abort ();
+
+  ret = test_le (a1, 10);
+  if (ret != 19)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 6 "vect" } } */
