On Fri, 2017-10-13 at 11:36 +0200, Richard Biener wrote:
> On Thu, Oct 12, 2017 at 10:03 PM, Will Schmidt
> <will_schm...@vnet.ibm.com> wrote:
> > Hi,
> >
> > Add support for gimple folding of vec_cmp_{eq,ge,gt,le,ne} for
> > the integer data types.
> >
> > This adds a handful of entries to the switch statement in
> > builtin_function_type for those builtins having unsigned arguments.
> >
> > Three entries are added to vsx.md to enable vcmpne[bhw] instruction,
> > where we would otherwise generate a vcmpeq + vnor.
> >
> > This patch requires the previously posted "allow integer return type
> > from vector compares" patch.
> >
> > A handful of existing tests required updates to their specified
> > optimization levels to continue to generate the desired code.
> > builtins-3-p9.c in particular has been updated to reflect improved
> > code gen with the higher specified optimization level.  Testcase
> > coverage is otherwise handled by the already-in-tree
> > gcc.target/powerpc/fold-vec-cmp-*.c tests.
> >
> > Tested OK on P6 and newer.  OK for trunk?
> >
> > Thanks,
> > -Will
> >
> > [gcc]
> >
> > 2017-10-12  Will Schmidt  <will_schm...@vnet.ibm.com>
> >
> >         * config/rs6000/rs6000.c: (rs6000_gimple_fold_builtin) Add support
> >         for folding of vector compares.  (builtin_function_type) Add
> >         compare builtins to the list of functions having unsigned
> >         arguments.
> >         * config/rs6000/vsx.md: Add vcmpne{b,h,w} instructions.
> >
> > [testsuite]
> >
> > 2017-10-12  Will Schmidt  <will_schm...@vnet.ibm.com>
> >
> >         * gcc.target/powerpc/builtins-3-p9.c: Add -O1, update
> >         expected codegen checks.
> >         * gcc.target/powerpc/vec-cmp-sel.c: Mark vars as volatile.
> >         * gcc.target/powerpc/vsu/vec-cmpne-0.c: Add -O1.
> >         * gcc.target/powerpc/vsu/vec-cmpne-1.c: Add -O1.
> >         * gcc.target/powerpc/vsu/vec-cmpne-2.c: Add -O1.
> >         * gcc.target/powerpc/vsu/vec-cmpne-3.c: Add -O1.
> >         * gcc.target/powerpc/vsu/vec-cmpne-4.c: Add -O1.
> >         * gcc.target/powerpc/vsu/vec-cmpne-5.c: Add -O1.
> >         * gcc.target/powerpc/vsu/vec-cmpne-6.c: Add -O1.
> >
> > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> > index 12ddd97..7e73239 100644
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -16605,17 +16605,93 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> >                                        build_int_cst (arg2_type, 0)),
> >                                        arg0);
> >         gimple_set_location (g, loc);
> >         gsi_replace (gsi, g, true);
> >         return true;
> >        }
> > +    /* Vector compares (integer); EQ, NE, GE, GT, LE. */
> > +    case ALTIVEC_BUILTIN_VCMPEQUB:
> > +    case ALTIVEC_BUILTIN_VCMPEQUH:
> > +    case ALTIVEC_BUILTIN_VCMPEQUW:
> > +    case P8V_BUILTIN_VCMPEQUD:
> > +      {
> > +       arg0 = gimple_call_arg (stmt, 0);
> > +       arg1 = gimple_call_arg (stmt, 1);
> > +       lhs = gimple_call_lhs (stmt);
> > +       gimple *g = gimple_build_assign (lhs, EQ_EXPR, arg0, arg1);
>
> As said elsewhere this needs to become either
>
>   tree ctype = build_same_sized_truth_vector_type (TREE_TYPE (lhs));
>   gimple_build_assign (make_ssa_name (ctype), EQ_EXPR, arg0, arg1)
>   gimple_build_assign (lhs, VIEW_CONVERT_EXPR, lhs above);
>
> (eventually the VCE can be elided - try) or
>
>   gimple_build_assign (lhs, VEC_COND_EXPR,
>                        fold_build2 (EQ_EXPR, ctype, arg0, arg1),
>                        vector-with-trues, vector-with-falses);
>
> depending on what your target can expand.
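
For readers following along, the lane-level behaviour that both suggested forms have to produce can be shown outside the compiler. This is only a sketch and is not GCC-internal code or part of the patch: it assumes the GNU C vector extensions (GCC or Clang), and the names v4si, cmpeq_direct, cmpeq_select and forms_agree are made up for illustration. It also does not model the separate truth-vector type or the VIEW_CONVERT_EXPR; it only shows the all-ones/zero lane mask that a vec_cmpeq-style result carries, once via a direct compare and once via a per-lane select in the spirit of the VEC_COND_EXPR form.

```c
#include <assert.h>

/* Hypothetical illustration, not GCC-internal code: four 32-bit lanes,
   declared with the GNU C vector extension.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Direct compare.  With the GNU vector extensions, == on integer
   vectors yields -1 (all bits set) in equal lanes and 0 elsewhere.  */
static v4si
cmpeq_direct (v4si a, v4si b)
{
  return a == b;
}

/* Scalar model of the select form: per lane, pick the "true" value
   (all ones) or the "false" value (zero) based on the comparison.  */
static v4si
cmpeq_select (v4si a, v4si b)
{
  v4si r = { 0, 0, 0, 0 };
  for (int i = 0; i < 4; i++)
    r[i] = (a[i] == b[i]) ? -1 : 0;
  return r;
}

/* Return 1 when both forms give the same mask in every lane.  */
static int
forms_agree (v4si a, v4si b)
{
  v4si x = cmpeq_direct (a, b);
  v4si y = cmpeq_select (a, b);
  for (int i = 0; i < 4; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}
```

Calling forms_agree on a pair of vectors (for example inside an assert) checks that the two formulations produce the same integer mask, which is the property the folded GIMPLE needs to preserve whichever form the target can expand.
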

Alright, I'll work with this some more and see what I come up with.
Thanks for the review and feedback. :-)

-Will