On Wed, Sep 23, 2015 at 5:51 PM, Alan Hayward <alan.hayw...@arm.com> wrote: > > > On 18/09/2015 14:53, "Alan Hayward" <alan.hayw...@arm.com> wrote: > >> >> >>On 18/09/2015 14:26, "Alan Lawrence" <alan.lawre...@arm.com> wrote: >> >>>On 18/09/15 13:17, Richard Biener wrote: >>>> >>>> Ok, I see. >>>> >>>> That this case is already vectorized is because it implements MAX_EXPR, >>>> modifying it slightly to >>>> >>>> int foo (int *a) >>>> { >>>> int val = 0; >>>> for (int i = 0; i < 1024; ++i) >>>> if (a[i] > val) >>>> val = a[i] + 1; >>>> return val; >>>> } >>>> >>>> makes it no longer handled by current code. >>>> >>> >>>Yes. I believe the idea for the patch is to handle arbitrary expressions >>>like >>> >>>int foo (int *a) >>>{ >>> int val = 0; >>> for (int i = 0; i < 1024; ++i) >>> if (some_expression (i)) >>> val = another_expression (i); >>> return val; >>>} >> >>Yes, that’s correct. Hopefully my new test cases should cover everything. >> > > Attached is a new version of the patch containing all the changes > requested by Richard.
+ /* Compare the max index vector to the vector of found indexes to find + the postion of the max value. This will result in either a single + match or all of the values. */ + tree vec_compare = make_ssa_name (index_vec_type_signed); + gimple vec_compare_stmt = gimple_build_assign (vec_compare, EQ_EXPR, + induction_index, + max_index_vec); I'm not sure all targets can handle this. If I deciper the code correctly then we do mask = induction_index == max_index_vec; vec_and = mask & vec_data; plus some casts. So this is basically vec_and = induction_index == max_index_vec ? vec_data : {0, 0, ... }; without the need to relate the induction index vector type to the data vector type. I believe this is also the form all targets support. I am missing a comment before all this code-generation that shows the transform result with the variable names used in the code-gen. I have a hard time connecting things here. + tree matched_data_reduc_cast = build1 (VIEW_CONVERT_EXPR, scalar_type, + matched_data_reduc); + epilog_stmt = gimple_build_assign (new_scalar_dest, + matched_data_reduc_cast); + new_temp = make_ssa_name (new_scalar_dest, epilog_stmt); + gimple_assign_set_lhs (epilog_stmt, new_temp); this will leave the stmt unsimplified. scalar sign-changes should use NOP_EXPR, not VIEW_CONVERT_EXPR. The easiest fix is to use fold_convert instead. Also just do like before - first make_ssa_name and then directly use it in the gimple_build_assign. The patch is somewhat hard to parse with all the indentation changes. A context diff would be much easier to read in those contexts. + if (v_reduc_type == COND_REDUCTION) + { + widest_int ni; + + if (! max_loop_iterations (loop, &ni)) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "loop count not known, cannot create cond " + "reduction.\n"); ugh. That's bad. + /* The additional index will be the same type as the condition. Check + that the loop can fit into this less one (because we'll use up the + zero slot for when there are no matches). */ + tree max_index = TYPE_MAX_VALUE (cr_index_scalar_type); + if (wi::geu_p (ni, wi::to_widest (max_index))) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "loop size is greater than data size.\n"); + return false; Likewise. @@ -5327,6 +5540,8 @@ vectorizable_reduction (gimple stmt, gimple_stmt_iterator *gsi, if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, vect_location, "transform reduction.\n"); + STMT_VINFO_TYPE (stmt_info) = reduc_vec_info_type; + /* FORNOW: Multiple types are not supported for condition. */ if (code == COND_EXPR) this change looks odd (or wrong). The type should be _only_ set/changed during analysis. + + /* For cond reductions we need to add an additional conditional based on + the loop index. */ + if (v_reduc_type == COND_REDUCTION) + { + int nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out); + int k; ... + STMT_VINFO_VECTYPE (index_vec_info) = cr_index_vector_type; + set_vinfo_for_stmt (index_condition, index_vec_info); + + /* Update the phi with the vec cond. */ + add_phi_arg (new_phi, cond_name, loop_latch_edge (loop), + UNKNOWN_LOCATION); same as before - I am missing a comment that shows the generated code and connects the local vars used. + tree ccompare_name = make_ssa_name (TREE_TYPE (ccompare)); + gimple ccompare_stmt = gimple_build_assign (ccompare_name, ccompare); + gsi_insert_before (&vec_stmt_gsi, ccompare_stmt, GSI_SAME_STMT); + gimple_assign_set_rhs1 (*vec_stmt, ccompare_name); hum - are you sure this works with ncopies > 1? Will it use the correct vec_stmt? I still dislike the v_reduc_type plastered and passed everywhere. Can you explore adding the reduction kind to stmt_info? Thanks, Richard. > > Thanks, > Alan. > >