On Thu, Sep 6, 2012 at 8:06 AM, Marc Glisse <marc.gli...@inria.fr> wrote:
> On Wed, 5 Sep 2012, Gabriel Dos Reis wrote:
>
>> On Wed, Sep 5, 2012 at 5:09 PM, Iyer, Balaji V <balaji.v.i...@intel.com>
>> wrote:
>>>
>>> Let's say we have two for loops like this:
>>>
>>> int my_func (int x, int y);
>>>
>>> For (ii = 0; ii < 10000; ii++)
>>> X[ii] = my_func (Y[ii], Z[ii]);
>
>
> I assume X, Y and Z are __restrict pointers (or something the compiler can
> detect doesn't alias).
>
>
>> 2. Considering this example, won't you get the same behaviour
>> if my_func was declared with "pure" attribute? If not, why?
>
>
> AFAIU, my_func is defined in a separate library and because of the attribute
> on the definition, it will actually export overloads:
> int myfunc(int,int);
> v2si myfunc(v2si,v2si);
> v4si myfunc(v4si,v4si);
> etc (where does it stop? seems problematic if the library is compiled for
> sse4 and I then compile and link an avx program)
>
> (hopefully with implementations more clever than breaking the vectors into
> pieces and calling the basic myfunc on each)
>
> The attribute on the declaration then lets gcc's vectorizer know it can call
> those overloads.

And as the overload definitions are not guaranteed to be generated by GCC,
you need to specify the ABI and mangling of those overloads.

+static tree
+handle_vector_attribute (tree *node, tree name ATTRIBUTE_UNUSED,
+                         tree args ATTRIBUTE_UNUSED,
+                         int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  tree opt_list;
+  VEC(tree,gc) *opt_vec = NULL;
+  opt_vec = make_tree_vector ();
+  VEC_safe_push (tree, gc, opt_vec, build_string (2, "O3"));
+  opt_list = build_tree_list_vec (opt_vec);
+  release_tree_vector (opt_vec);
+  handle_optimize_attribute (node, get_identifier ("optimize"), opt_list,
+                             flags, no_add_attrs);

Please no - do not use "optimize" attributes from inside the implementation.
What happens if the user also specifies an optimize attribute? The above
also doesn't make sense to me, so please elaborate on why you want to enable
-O3 for a function marked with the vector attribute.

This all sounds awfully like a worse way to do the multi-versioning stuff
that is still pending review.

+  if (flag_enable_cilkplus
+      && gimple_code (stmt) == GIMPLE_CALL
+      && is_elem_fn (gimple_call_fndecl (stmt)))
+    {
+      parm_type = find_elem_fn_parm_type (stmt, op, &step_size);
+      if (parm_type == TYPE_UNIFORM || parm_type == TYPE_LINEAR)
+        dt = vect_external_def;

The middle-end should not care whether CILK+ is enabled or not. Otherwise
this will not work with LTO. Please use generic infrastructure for the
implementation or enhance generic infrastructure. If the vectorizer should
be able to vectorize non-inlined functions then there should be an IPA pass
analyzing functions for whether they can be "elemental" (propagating this
alongside the callgraph). Then you either decide up-front whether to
"clone" those functions for various vector sizes, or, IMHO better, make sure
to ship the function bodies to all LTRANS units that make use of them (much
like how we handle inlines) and make the vectorizer emit the clones.

All in all, this seems unrelated to the CILK+ work (even if you make use of
this from within CILK+).

Richard.

> With suitable pure/const attribute you could unroll the loop a bit and
> reorder the calls to myfunc, but without myfunc's body, you couldn't do as
> much.
>
> Note that this is my guess from reading the example and completely ignoring
> the patch, it could be miles from the truth, and it needs better explanation
> (the doc patch is coming later in the series IIRC).
>
> --
> Marc Glisse
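
For illustration only, and not taken from the patch: the overload scheme
guessed at in the quoted text can be sketched by hand with GCC's generic
vector extension. Everything below is an assumption made for the example.
The body of my_func is a placeholder, the names my_func_v4si and
apply_my_func are invented, and a real implementation would need exactly
the ABI and mangling agreement discussed above instead of a hand-picked
name. The loop shows what a vectorizer could emit if it knew a 4-wide
variant existed: vector calls for full groups of four, scalar calls for
the remainder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GCC/Clang generic vector extension: four ints in one 16-byte vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Scalar version.  The body is a made-up placeholder.  */
int
my_func (int x, int y)
{
  return x + x + y;
}

/* Hypothetical 4-wide variant: same computation applied lane by lane.
   The name is invented here; nothing in the thread fixes the mangling.  */
v4si
my_func_v4si (v4si x, v4si y)
{
  return x + x + y;
}

/* Hand-written stand-in for the vectorized loop.  The arrays are assumed
   not to alias, as in the quoted discussion.  */
void
apply_my_func (int *restrict x, const int *restrict y,
               const int *restrict z, size_t n)
{
  size_t i = 0;

  assert (x != NULL && y != NULL && z != NULL);

  /* Full groups of four elements go through the vector variant.
     memcpy avoids any alignment assumption on the arrays.  */
  for (; i + 4 <= n; i += 4)
    {
      v4si vy, vz, vx;
      memcpy (&vy, y + i, sizeof vy);
      memcpy (&vz, z + i, sizeof vz);
      vx = my_func_v4si (vy, vz);
      memcpy (x + i, &vx, sizeof vx);
    }

  /* Remaining elements fall back to the scalar version.  */
  for (; i < n; i++)
    x[i] = my_func (y[i], z[i]);
}
```

With n = 10 this makes two vector calls and two scalar calls. Note that
the caller has to pick one vector width at compile time, which is where
the "where does it stop?" question about sse4 versus avx comes from.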