On Thu, Nov 21, 2013 at 11:37 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Thu, Nov 21, 2013 at 07:43:35AM +1000, Richard Henderson wrote:
>> On 11/20/2013 07:44 PM, Jakub Jelinek wrote:
>> > On Wed, Nov 20, 2013 at 10:31:38AM +0100, Richard Biener wrote:
>> >> Aww ;) Nice improvement. Generally when I see this I always wonder
>> >> whether we want to do this kind of stuff pre RTL expansion.
>> >> 1st to not rely on being able to TER, 2nd to finally eventually
>> >> get rid of TER.
>> >>
>> >> These patches are unfortunately a step backward for #2.
>> >>
>> >> As of the patch, do we have a way to query whether the target
>> >> can efficiently broadcast? If so this IMHO belongs in generic
>> >
>> > We don't. Perhaps if we'd add optab for vec_dup<mode> and mentioned
>> > clearly in the documentation that it should be used only if it is
>> > reasonably
>> > efficient. But still, even with optab, it would probably better to do it
>> > in the veclower* passes than in the vectorizer itself.
>>
>> I think we can assume that broadcast is relatively efficient, whether or not
>> vec_dup is present. I'd lean to making the transformation generic to start
>> with, so that you don't need extra handling in the i386 backend.
>
> Ok, here is a generic veclower implementation without looking at any optabs,
> so far only handles PLUS_EXPR, what operation other than MULT_EXPR would
> make sense here? Though, handling MULT_EXPR also would complicate the code
> slightly (it would need to handle say:
>   _2 = _1(D) + 1;
>   _3 = _2 + 2;
>   _4 = _3 * 2;
>   _5 = _4 * 3;
>   _6 = { _3, _4, _5, _4 };
> where we could start thinking first the operation is PLUS_EXPR, but it
> actually is MULT_EXPR with _3 as base). Also, for MULT_EXPR, supposedly
> we could handle some values to be constant 0, like in:
>   _2 = _1(D) * 5;
>   _3 = _2 * 2;
>   _4 = _1(D) * 10;
>   _5 = { _3, 0, _4, _2, _1(D), 0, _4, _2 };
>
> Bootstrap/regtest pending, ok at least for this for the start and can be
> improved later on?
>
> 2013-11-21  Jakub Jelinek  <ja...@redhat.com>
>
>         * tree-vect-generic.c (optimize_vector_constructor): New function.
>         (expand_vector_operations_1): Call it.
>
>         * gcc.dg/vect/vect-124.c: New test.
>
> --- gcc/tree-vect-generic.c.jj  2013-11-19 21:56:40.000000000 +0100
> +++ gcc/tree-vect-generic.c     2013-11-21 11:17:55.146118161 +0100
> @@ -988,6 +988,84 @@ expand_vector_operation (gimple_stmt_ite
>                                     gimple_assign_rhs1 (assign),
>                                     gimple_assign_rhs2 (assign), code);

This patch caused PR 59273 [1] on alpha-linux-gnu.

[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59273

Uros.
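
For readers following the quoted discussion, the PLUS_EXPR case can be illustrated at the source level. As I read the thread, the idea is that a vector constructor whose elements are all "common base plus a constant" may be rewritten as a broadcast of the base followed by one vector addition of constants. The following is only a sketch using GCC's generic vector extensions, not code from the patch; the names v4si, build_scalar, build_broadcast and same_v4si are hypothetical and chosen for illustration.

```c
#include <stdint.h>

/* Four 32-bit ints in one 16-byte vector (GCC/Clang vector extension).  */
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Before: every element is derived from the base X by scalar additions,
   and the vector is then assembled element by element.  */
v4si
build_scalar (int32_t x)
{
  int32_t a = x + 1;
  int32_t b = a + 2;	/* x + 3 */
  int32_t c = b + 3;	/* x + 6 */
  v4si v = { x, a, b, c };
  return v;
}

/* After: broadcast the base into all lanes and add a constant vector
   holding each element's accumulated offset from the base.  */
v4si
build_broadcast (int32_t x)
{
  v4si base = { x, x, x, x };
  v4si offs = { 0, 1, 3, 6 };
  return base + offs;
}

/* Illustration-only helper: lane-by-lane equality of two vectors.  */
int
same_v4si (v4si p, v4si q)
{
  for (int i = 0; i < 4; i++)
    if (p[i] != q[i])
      return 0;
  return 1;
}
```

For inputs that do not overflow, build_scalar (x) and build_broadcast (x) yield the same vector. Whether the second form is actually cheaper depends on how efficiently the target can broadcast, which is the point debated in the quoted messages.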