On Tue, 30 Jan 2018, Richard Biener wrote: > > This patch tries to deal with the "easy" part of a function ABI, > the return value location, in vectorization costing. The testcase > shows that if we vectorize the returned value but the function > doesn't return in memory or in a vector register but as in this > case in an integer register pair (reg:TI ax) (bah, ABI details > exposed late? why's this not a parallel?) we end up spilling > badly. > > The idea is to account for such spilling so if vectorization > benefits outweight the spilling we'll vectorize anyway. > > I think the particular testcase could be fixed in the subreg > pass basically undoing the vectorization but I realize that > generally this is a too hard problem and avoiding vectorization > is better. Still this patch is somewhat fragile in that it > depends on us "seeing" that the stored to decl is returned > (see cfun_returns). > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. > > I'd like to hear opinions on my use of hard_function_value > and also from other target maintainers. I'm not sure we > have sufficient testsuite coverage of _profitable_ vectorization > of a return value. Feel free to add to this for your > target. > > Ok for trunk?
Ok, looks like hard_function_value doesn't like being called with type <record_type 0x7ffff65009d8 optional asm_written needs-constructing type_1 type_5 type_6 BLK size <integer_cst 0x7ffff64ee558 constant 8008> we then ICE: #0 fancy_abort ( file=0x2094ef0 "/space/rguenther/src/svn/early-lto-debug/gcc/machmode.h", line=292, function=0x2095534 <opt_mode<scalar_int_mode>::require() const::__FUNCTION__> "require") at /space/rguenther/src/svn/early-lto-debug/gcc/diagnostic.c:1500 #1 0x0000000000c34790 in opt_mode<scalar_int_mode>::require ( this=0x7fffffffcf20) at /space/rguenther/src/svn/early-lto-debug/gcc/machmode.h:292 #2 0x0000000000db271a in hard_function_value (valtype=0x7ffff65009d8, func=0x7ffff64fff00, fntype=0x0, outgoing=1) at /space/rguenther/src/svn/early-lto-debug/gcc/explow.c:2212 2201 /* int_size_in_bytes can return -1. We don't need a check here 2202 since the value of bytes will then be large enough that no 2203 mode will match anyway. */ 2204 2205 FOR_EACH_MODE_IN_CLASS (tmpmode, MODE_INT) 2206 { (gdb) l 2207 /* Have we found a large enough mode? */ 2208 if (GET_MODE_SIZE (tmpmode.require ()) >= bytes) 2209 break; 2210 } 2211 2212 PUT_MODE (val, tmpmode.require ()); which means we didn't find an integer mode of the wanted size. The backend simply returned (reg:BLK 0 ax). I suppose I have to guard hard_function_value with sth ... Richard. > Thanks, > Richard. > > 2018-01-30 Richard Biener <rguent...@suse.de> > > PR tree-optimization/84101 > * tree-vect-stmts.c: Include explow.h for hard_function_value. > (cfun_returns): New helper. > (vect_model_store_cost): When vectorizing a store to a decl > we return and the function ABI returns via a non-vector > register account for the possible spilling that will happen. > > * gcc.target/i386/pr84101.c: New testcase. > > Index: gcc/tree-vect-stmts.c > =================================================================== > --- gcc/tree-vect-stmts.c (revision 257139) > +++ gcc/tree-vect-stmts.c (working copy) > @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. > #include "tree-cfg.h" > #include "tree-ssa-loop-manip.h" > #include "cfgloop.h" > +#include "explow.h" > #include "tree-ssa-loop.h" > #include "tree-scalar-evolution.h" > #include "tree-vectorizer.h" > @@ -893,6 +894,22 @@ vect_model_promotion_demotion_cost (stmt > "prologue_cost = %d .\n", inside_cost, prologue_cost); > } > > +/* Returns true if the current function returns DECL. */ > + > +static bool > +cfun_returns (tree decl) > +{ > + edge_iterator ei; > + edge e; > + FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) > + { > + greturn *ret = safe_dyn_cast <greturn *> (last_stmt (e->src)); > + if (ret && gimple_return_retval (ret) == decl) > + return true; > + } > + return false; > +} > + > /* Function vect_model_store_cost > > Models cost for stores. In the case of grouped accesses, one access > @@ -971,6 +988,36 @@ vect_model_store_cost (stmt_vec_info stm > vec_to_scalar, stmt_info, 0, vect_body); > } > > + /* When vectorizing an SLP store into the function result assign > + a penalty if the function returns in a non-vector register. > + In this case we assume we'll end up with having to spill the > + vector result and do element loads. */ > + if (slp_node) > + { > + tree base = get_base_address (dr->ref); > + if (base > + && (TREE_CODE (base) == RESULT_DECL > + || (DECL_P (base) && cfun_returns (base)))) > + { > + rtx reg = hard_function_value (TREE_TYPE (base), cfun->decl, 0, 1); > + if ((REG_P (reg) && ! VECTOR_MODE_P (GET_MODE (reg))) > + || GET_CODE (reg) == PARALLEL > + || GET_CODE (reg) == CONCAT) > + { > + /* Spill. */ > + inside_cost += record_stmt_cost (body_cost_vec, > + ncopies, vector_store, > + stmt_info, 0, vect_body); > + /* Element loads. */ > + unsigned int assumed_nunits = vect_nunits_for_cost (vectype); > + inside_cost += record_stmt_cost (body_cost_vec, > + ncopies * assumed_nunits, > + scalar_load, > + stmt_info, 0, vect_body); > + } > + } > + } > + > if (dump_enabled_p ()) > dump_printf_loc (MSG_NOTE, vect_location, > "vect_model_store_cost: inside_cost = %d, " > Index: gcc/testsuite/gcc.target/i386/pr84101.c > =================================================================== > --- gcc/testsuite/gcc.target/i386/pr84101.c (nonexistent) > +++ gcc/testsuite/gcc.target/i386/pr84101.c (working copy) > @@ -0,0 +1,21 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O3 -fdump-tree-slp2-details" } */ > + > +typedef struct uint64_pair uint64_pair_t ; > +struct uint64_pair > +{ > + unsigned long w0 ; > + unsigned long w1 ; > +} ; > + > +uint64_pair_t pair(int num) > +{ > + uint64_pair_t p ; > + > + p.w0 = num << 1 ; > + p.w1 = num >> 1 ; > + > + return p ; > +} > + > +/* { dg-final { scan-tree-dump-not "basic block vectorized" "slp2" } } */ > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)