On Thu, Nov 21, 2013 at 11:37 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Thu, Nov 21, 2013 at 07:43:35AM +1000, Richard Henderson wrote:
>> On 11/20/2013 07:44 PM, Jakub Jelinek wrote:
>> > On Wed, Nov 20, 2013 at 10:31:38AM +0100, Richard Biener wrote:
>> >> Aww ;)  Nice improvement.  Generally when I see this I always wonder
>> >> whether we want to do this kind of stuff pre RTL expansion.
>> >> 1st to not rely on being able to TER, 2nd to finally eventually
>> >> get rid of TER.
>> >>
>> >> These patches are unfortunately a step backward for #2.
>> >>
>> >> As of the patch, do we have a way to query whether the target
>> >> can efficiently broadcast?  If so this IMHO belongs in generic
>> >
>> > We don't.  Perhaps if we'd add optab for vec_dup<mode> and mentioned
>> > clearly in the documentation that it should be used only if it is
>> > reasonably efficient.  But still, even with optab, it would probably be
>> > better to do it in the veclower* passes than in the vectorizer itself.
>>
>> I think we can assume that broadcast is relatively efficient, whether or not
>> vec_dup is present.  I'd lean to making the transformation generic to start
>> with, so that you don't need extra handling in the i386 backend.
>
> Ok, here is a generic veclower implementation without looking at any optabs,
> so far only handles PLUS_EXPR, what operation other than MULT_EXPR would
> make sense here?  Though, handling MULT_EXPR also would complicate the code
> slightly (it would need to handle say:
>   _2 = _1(D) + 1;
>   _3 = _2 + 2;
>   _4 = _3 * 2;
>   _5 = _4 * 3;
>   _6 = { _3, _4, _5, _4 };
> where we could start thinking first the operation is PLUS_EXPR, but it
> actually is MULT_EXPR with _3 as base).  Also, for MULT_EXPR, supposedly
> we could handle some values to be constant 0, like in:
>   _2 = _1(D) * 5;
>   _3 = _2 * 2;
>   _4 = _1(D) * 10;
>   _5 = { _3, 0, _4, _2, _1(D), 0, _4, _2 };
>
> Bootstrap/regtest pending, ok at least for this for the start and can be
> improved later on?
>
> 2013-11-21  Jakub Jelinek  <ja...@redhat.com>
>
>         * tree-vect-generic.c (optimize_vector_constructor): New function.
>         (expand_vector_operations_1): Call it.
>
>         * gcc.dg/vect/vect-124.c: New test.
>
> --- gcc/tree-vect-generic.c.jj  2013-11-19 21:56:40.000000000 +0100
> +++ gcc/tree-vect-generic.c     2013-11-21 11:17:55.146118161 +0100
> @@ -988,6 +988,84 @@ expand_vector_operation (gimple_stmt_ite
>                                     gimple_assign_rhs1 (assign),
>                                     gimple_assign_rhs2 (assign), code);

This patch caused PR 59273 [1] on alpha-linux-gnu.

[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59273

Uros.
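
For readers following the quoted discussion, here is a small hand-written
illustration of the PLUS_EXPR case: a vector constructor whose lanes are all
"base + constant" can instead be built as a broadcast of the base plus a
constant vector.  This is only a sketch under assumptions, not the code from
the patch: it uses the GCC vector extension, and the names v4si,
build_elementwise, build_broadcast and check_equivalent are made up for the
example.

```c
#include <assert.h>

/* GCC vector extension: four 32-bit ints in one 16-byte vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Element-wise form: every lane is computed as a scalar from the same
   base X and then collected by a vector constructor.  */
static v4si
build_elementwise (int x)
{
  int a = x + 1;
  int b = a + 2;		/* x + 3 */
  int c = b + 4;		/* x + 7 */
  v4si v = { x, a, b, c };
  return v;
}

/* Hand-written equivalent: broadcast the base, then do one vector
   addition with the accumulated constants.  */
static v4si
build_broadcast (int x)
{
  v4si base = { x, x, x, x };
  v4si offs = { 0, 1, 3, 7 };
  return base + offs;
}

/* Check that both forms agree lane by lane.  */
static void
check_equivalent (int x)
{
  v4si p = build_elementwise (x);
  v4si q = build_broadcast (x);
  for (int i = 0; i < 4; i++)
    assert (p[i] == q[i]);
}
```

Whether the second form is actually cheaper depends on how efficiently the
target can broadcast, which is the question raised earlier in the thread.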
