On Wed, 4 Dec 2013, Vidya Praveen wrote:
> Hi Richi,
>
> Apologies for the late response. I was on vacation.
>
> On Mon, Oct 14, 2013 at 09:04:58AM +0100, Richard Biener wrote:
> > > void
> > > foo (int *__restrict__ a,
> > >      int *__restrict__ b,
> > >      int c)
> > > {
> > >   int i;
> > >
> > >   for (i = 0; i < 8; i++)
> > >     a[i] = b[i] * c;
> > > }
> >
> > Both cases can be handled by patterns that match
> >
> >   (mul:VXSI (reg:VXSI)
> >             (vec_duplicate:VXSI (reg:SI)))
>
> How do I arrive at this pattern in the first place? Assuming a vec_init with
> uniform values is expanded as vec_duplicate, there will still be two
> expressions.
>
> That is,
>
> (set reg:VXSI (vec_duplicate:VXSI (reg:SI)))
> (set reg:VXSI (mul:VXSI (reg:VXSI) (reg:VXSI)))

Yes, but then combine comes along and creates

  (set reg:VXSI (mul:VXSI (reg:VXSI) (vec_duplicate:VXSI (reg:SI))))

which matches one of your define_insn[_and_split]s.

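For illustration only, here is a minimal scalar C model of what that
fusion means. LANES, mul_two_insns, mul_fused and check_forms_agree are
made-up names, and four lanes is an assumption (VXSI leaves the width
open). The two-set sequence materialises the broadcast vector first; the
combined form uses the scalar directly. Both should give the same lanes.

```c
#include <assert.h>

#define LANES 4  /* assumed lane count; VXSI leaves the width open */

/* Two-insn form: first a vec_duplicate of c, then a full vector multiply. */
static void mul_two_insns(int dst[LANES], const int v[LANES], int c)
{
  int dup[LANES];
  int i;

  for (i = 0; i < LANES; i++)
    dup[i] = c;                 /* set dup = vec_duplicate (c) */
  for (i = 0; i < LANES; i++)
    dst[i] = v[i] * dup[i];     /* set dst = mul (v, dup) */
}

/* Combined form: the duplicate is folded into the multiply. */
static void mul_fused(int dst[LANES], const int v[LANES], int c)
{
  int i;

  for (i = 0; i < LANES; i++)
    dst[i] = v[i] * c;          /* set dst = mul (v, vec_duplicate (c)) */
}

/* Asserts that both forms agree on every lane. */
static void check_forms_agree(const int v[LANES], int c)
{
  int a[LANES], b[LANES];
  int i;

  mul_two_insns(a, v, c);
  mul_fused(b, v, c);
  for (i = 0; i < LANES; i++)
    assert(a[i] == b[i]);
}
```

This only models the arithmetic; whether combine actually forms the
fused insn depends on the patterns the backend provides.
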
> > You'd then "consume" the vec_duplicate and implement it as
> > load scalar into element zero of the vector and use index mult
> > with index zero.
>
> If I understand this correctly, you are suggesting to leave the scalar
> load from memory as it is but treat the
>
>   (mul:VXSI (reg:VXSI) (vec_duplicate:VXSI (reg:SI)))
>
> as
>
>   load reg:VXSI[0], reg:SI
>   mul  reg:VXSI, reg:VXSI, reg:VXSI[0]  // by reusing the destination
>                                         // register, perhaps
>
> either by generating instructions directly or by using define_split. Am I
> right?

Possibly. Or allow memory as operand 2 for your pattern (so, not
reg:SI but mem:SI). Combine should be happy with that, too.

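As a rough sketch of the "element zero" variant with the scalar coming
from memory (again a plain C model, not target code; LANES, load_lane0,
mul_by_lane and mul_scalar_from_mem are hypothetical names): only lane 0
of a scratch vector is written, and the multiply reads that one lane for
every element.

```c
#include <assert.h>

#define LANES 4  /* assumed lane count */

/* Models "load reg:VXSI[0], mem:SI": only lane 0 is written, the other
   lanes keep whatever they held before. */
static void load_lane0(int vec[LANES], const int *mem)
{
  vec[0] = *mem;
}

/* Models a by-element multiply: every lane of a is multiplied by
   lane idx of b. */
static void mul_by_lane(int dst[LANES], const int a[LANES],
                        const int b[LANES], int idx)
{
  int s;
  int i;

  assert(idx >= 0 && idx < LANES);
  s = b[idx];  /* read the lane first, so dst may be the same array as b */
  for (i = 0; i < LANES; i++)
    dst[i] = a[i] * s;
}

/* dst = v * (*c_mem), without ever building the full broadcast. */
static void mul_scalar_from_mem(int dst[LANES], const int v[LANES],
                                const int *c_mem)
{
  int tmp[LANES] = { 0, -1, -1, -1 };  /* junk in lanes 1..3 on purpose */

  load_lane0(tmp, c_mem);
  mul_by_lane(dst, v, tmp, 0);
}
```

The junk in lanes 1..3 is there to show that the result depends only on
lane 0, which is why no full vec_duplicate is needed in this form.
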
> If I'm right, then my concern is that it may be possible to simplify this
> further by loading directly into an indexed vector register from memory, but
> it's too late at this point for such a simplification to be possible.
>
> Please let me know what I am not understanding.

Not sure. Did you try it?

Richard.