On Wed, Nov 11, 2015 at 7:54 PM, Alan Hayward <alan.hayw...@arm.com> wrote: > > > On 11/11/2015 13:25, "Richard Biener" <richard.guent...@gmail.com> wrote: > >>On Wed, Nov 11, 2015 at 1:22 PM, Alan Hayward <alan.hayw...@arm.com> >>wrote: >>> Hi, >>> I hoped to post this in time for Monday’s cut off date, but >>>circumstances >>> delayed me until today. Hoping if possible this patch will still be able >>> to go in. >>> >>> >>> This patch builds upon the change for PR65947, and reduces the amount of >>> code produced in a vectorized condition reduction where operand 2 of the >>> COND_EXPR is an assignment of a increasing integer induction variable >>>that >>> won't wrap. >>> >>> >>> For example (assuming all types are ints), this is a match: >>> >>> last = 5; >>> for (i = 0; i < N; i++) >>> if (a[i] < min_v) >>> last = i; >>> >>> Whereas, this is not because the result is based off a memory access: >>> last = 5; >>> for (i = 0; i < N; i++) >>> if (a[i] < min_v) >>> last = a[i]; >>> >>> In the integer induction variable case we can just use a MAX reduction >>>and >>> skip all the code I added in my vectorized condition reduction patch - >>>the >>> additional induction variables in vectorizable_reduction () and the >>> additional checks in vect_create_epilog_for_reduction (). From the patch >>> diff only, it's not immediately obvious that those parts will be skipped >>> as there is no code changes in those areas. >>> >>> The initial value of the induction variable is force set to zero, as any >>> other value could effect the result of the induction. At the end of the >>> loop, if the result is zero, then we restore the original initial value. >> >>+static bool >>+is_integer_induction (gimple *stmt, struct loop *loop) >> >>is_nonwrapping_integer_induction? >> >>+ tree lhs_max = TYPE_MAX_VALUE (TREE_TYPE (gimple_phi_result (stmt))); >> >>don't use TYPE_MAX_VALUE. >> >>+ /* Check that the induction increments. */ >>+ if (tree_int_cst_compare (step, size_zero_node) <= 0) >>+ return false; >> >>tree_int_cst_sgn (step) == -1 >> >>+ /* Check that the max size of the loop will not wrap. */ >>+ >>+ if (! max_loop_iterations (loop, &ni)) >>+ return false; >>+ /* Convert backedges to iterations. */ >>+ ni += 1; >> >>just use max_stmt_executions (loop, &ni) which properly checks for >>overflow >>of the +1. >> >>+ max_loop_value = wi::add (wi::to_widest (base), >>+ wi::mul (wi::to_widest (step), ni)); >>+ >>+ if (wi::gtu_p (max_loop_value, wi::to_widest (lhs_max))) >>+ return false; >> >>you miss a check for the wi::add / wi::mul to overflow. You can use >>extra args to determine this. >> >>Instead of TYPE_MAX_VALUE use wi::max_value (precision, sign). >> >>I wonder if you want to skip all the overflow checks for >>TYPE_OVERFLOW_UNDEFINED >>IV types? >> > > Ok with all the above. > > Tried using max_value () but this gave me a wide_int instead of a > widest_int. > Instead I've replaced with min_precision and GET_MODE_BITSIZE. > > Added an extra check for when the type is TYPE_OVERFLOW_UNDEFINED.
+ /* Set the loop-entry arg of the reduction-phi. */ + + if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) + == INTEGER_INDUC_COND_REDUCTION) extra vertical space + tree zero = build_int_cst ( TREE_TYPE (vec_init_def_type), 0); + tree zero_vec = build_vector_from_val (vec_init_def_type, zero); + build_zero_cst (vec_init_def_type); + else + { + add_phi_arg (as_a <gphi *> (phi), vec_init_def, loop_preheader_edge (loop), UNKNOWN_LOCATION); + } no {}s around single stmts + tree comparez = build2 (EQ_EXPR, boolean_type_node, new_temp, zero); please no l33t speech + tmp = build3 (COND_EXPR, scalar_type, comparez, initial_def, + new_temp); + + epilog_stmt = gimple_build_assign (new_scalar_dest, tmp); + new_temp = make_ssa_name (new_scalar_dest, epilog_stmt); + gimple_assign_set_lhs (epilog_stmt, new_temp); epilog_stmt = gimple_build_assign (make_ssa_name (new_scalar_dest), COND_EXPR, compare, initial_def, new_temp); + /* Check that the max size of the loop will not wrap. */ + + if (TYPE_OVERFLOW_UNDEFINED (lhs_type)) + { + return (GET_MODE_BITSIZE (TYPE_MODE (lhs_type)) + >= GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (base)))); this mode check will always be true as lhs_type and base are from the same PHI node. + return (wi::min_precision (max_loop_value, TYPE_SIGN (lhs_type)) + <= GET_MODE_BITSIZE (TYPE_MODE (lhs_type))); please use TYPE_PRECISION (lhs_type) instead. Ok with those changes. Thanks, Richard. > > > Alan. >