On Mon, Jan 13, 2014 at 5:37 AM, Richard Biener <rguent...@suse.de> wrote:
> On Wed, 27 Nov 2013, Jakub Jelinek wrote:
>
>> On Wed, Nov 27, 2013 at 10:53:56AM +0100, Richard Biener wrote:
>> > Hmm. I'm still thinking that we should handle this during the regular
>> > transform step.
>>
>> I wonder if it can't be done instead just in vectorizable_load,
>> if LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo) and the load is
>> invariant, just emit the (broadcasted) load not inside of the loop, but on
>> the loop preheader edge.
>
> So this implements this suggestion, XFAILing the no longer handled cases.
> For example we get
>
>   _94 = *b_8(D);
>   vect_cst_.18_95 = {_94, _94, _94, _94};
>   _99 = prolog_loop_adjusted_niters.9_132 * 4;
>   vectp_a.22_98 = a_6(D) + _99;
>   ivtmp.43_77 = (unsigned long) vectp_a.22_98;
>
>   <bb 13>:
>   # ivtmp.41_67 = PHI <ivtmp.41_70(3), 0(12)>
>   # ivtmp.43_71 = PHI <ivtmp.43_69(3), ivtmp.43_77(12)>
>   vect__10.19_97 = vect_cst_.18_95 + { 1, 1, 1, 1 };
>   _76 = (void *) ivtmp.43_71;
>   MEM[base: _76, offset: 0B] = vect__10.19_97;
>
> ...
>
> instead of having hoisted *b_8 + 1 as scalar computation. Not sure
> why LIM doesn't hoist the vector variant later.
>
>   vect__10.19_97 = vect_cst_.18_95 + vect_cst_.20_96;
>   invariant up to level 1, cost 1.
>
> ah, the cost thing. Should be "improved" to see that hoisting
> reduces the number of live SSA names in the loop.
>
> Eventually lower_vector_ssa could optimize vector to scalar
> code again ... (ick).
>
> Bootstrap / regtest running on x86_64.
>
> Comments?
>
> Thanks,
> Richard.
>
> 2014-01-13 Richard Biener <rguent...@suse.de>
>
>         PR tree-optimization/58921
>         PR tree-optimization/59006
>         * tree-vect-loop-manip.c (vect_loop_versioning): Remove code
>         hoisting invariant stmts.
>         * tree-vect-stmts.c (vectorizable_load): Insert the splat of
>         invariant loads on the preheader edge if possible.
>
This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59841

H.J.
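
For anyone following along, here is a rough source-level C sketch of the idea in the quoted patch: when the loop is versioned for alias, the invariant load of *b can be done once before the loop (the "preheader") instead of on every iteration. This is only an illustration, not what GCC emits. The function names add_scalar and add_versioned are made up, and the byte-range overlap test is a simplified stand-in for the vectorizer's real runtime alias check; it assumes a flat address space where converting pointers to uintptr_t gives comparable addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Original loop: *b is invariant only if no store to a[i] can alias it,
   so without further information it has to be reloaded every iteration. */
void add_scalar(int *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = *b + 1;
}

/* Hypothetical hand-written analogue of the alias-versioned loop. */
void add_versioned(int *a, const int *b, size_t n)
{
    /* Simplified runtime alias check (assumption: flat address space). */
    uintptr_t lo = (uintptr_t)a;
    uintptr_t hi = (uintptr_t)(a + n);
    uintptr_t p  = (uintptr_t)b;

    if (p + sizeof(int) <= lo || p >= hi) {
        /* No overlap: load *b once before the loop, as on the
           preheader edge, and reuse the value in the loop body. */
        int splat = *b;
        for (size_t i = 0; i < n; i++)
            a[i] = splat + 1;
    } else {
        /* Possible overlap: keep the original loop unchanged. */
        for (size_t i = 0; i < n; i++)
            a[i] = *b + 1;
    }
}
```

In the dump quoted above, the hoisted load corresponds to _94 = *b_8(D) and the splat vect_cst_.18_95, while the + { 1, 1, 1, 1 } still sits inside the loop. The sketch keeps the + 1 in the loop for the same reason.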