Ping?

thanks,
Cong


On Mon, Nov 11, 2013 at 11:25 AM, Cong Hou <co...@google.com> wrote:
> Hi James
>
> Sorry for the late reply.
>
>
> On Fri, Nov 8, 2013 at 2:55 AM, James Greenhalgh
> <james.greenha...@arm.com> wrote:
>>> On Tue, Nov 5, 2013 at 9:58 AM, Cong Hou <co...@google.com> wrote:
>>> > Thank you for your detailed explanation.
>>> >
>>> > Once GCC detects a reduction operation, it will automatically
>>> > accumulate all elements in the vector after the loop. In the loop the
>>> > reduction variable is always a vector whose elements are reductions of
>>> > corresponding values from other vectors. Therefore in your case the
>>> > only instruction you need to generate is:
>>> >
>>> >     VABAL   ops[3], ops[1], ops[2]
>>> >
>>> > It is OK to accumulate all the elements into a single element of the
>>> > vector inside the loop (if one instruction can do this), but you have
>>> > to make sure the other elements in the vector remain zero so that the
>>> > final result is correct.
>>> >
>>> > If the documentation is confusing, check the entry for udot_prod
>>> > (just above usad in md.texi), as its behavior is very similar to that
>>> > of usad. I actually copied the text from there and made some changes.
>>> > As those two instruction patterns are both for vectorization, their
>>> > behavior should not be difficult to explain.
>>> >
>>> > If you have more questions or think that the documentation is still
>>> > inadequate, please let me know.
>>
>> Hi Cong,
>>
>> Thanks for your reply.
>>
>> I've looked at Dorit's original patch adding WIDEN_SUM_EXPR and
>> DOT_PROD_EXPR and I see that the same ambiguity exists for
>> DOT_PROD_EXPR. Can you please add a note in your tree.def
>> that SAD_EXPR, like DOT_PROD_EXPR, can be expanded as either:
>>
>>   tmp = WIDEN_MINUS_EXPR (arg1, arg2)
>>   tmp2 = ABS_EXPR (tmp)
>>   arg3 = PLUS_EXPR (tmp2, arg3)
>>
>> or:
>>
>>   tmp = WIDEN_MINUS_EXPR (arg1, arg2)
>>   tmp2 = ABS_EXPR (tmp)
>>   arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
>>
>> Where WIDEN_MINUS_EXPR is a signed MINUS_EXPR, returning
>> a value of the same (widened) type as arg3.
>>
>
>
> I have added it, although we currently don't have WIDEN_MINUS_EXPR (I
> mentioned it in tree.def).
>
>
>> Also, while looking for the history of DOT_PROD_EXPR I spotted this
>> patch:
>>
>>   [autovect] [patch] detect mult-hi and sad patterns
>>   http://gcc.gnu.org/ml/gcc-patches/2005-10/msg01394.html
>>
>> I wonder what the reason was for that patch to be dropped?
>>
>
> It has been 8 years. I have no idea why this patch was never accepted.
> There was not even a reply in that thread. But I believe it is very
> important to recognize the SAD pattern. ARM also provides instructions
> for it.
>
>
> Thank you for your comment again!
>
>
> thanks,
> Cong
>
>
>
>> Thanks,
>> James
>>
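
For readers following the thread, here is a minimal C sketch of the
reduction behaviour discussed above: inside the loop each vector lane
keeps its own partial sum of absolute differences, and the lanes are
added together only once, after the loop. It is a plain-C model, not
GCC's implementation and not the usad pattern itself. LANES and the
function names sad_scalar and sad_lanes are hypothetical names chosen
for illustration, and the lane version assumes n is a multiple of LANES.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define LANES 4 /* hypothetical vector width, for illustration only */

/* Scalar reference: a sum-of-absolute-differences loop. */
uint32_t sad_scalar(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (uint32_t)abs((int)a[i] - (int)b[i]);
    return sum;
}

/* Lane-wise model: each lane accumulates independently in the loop.
   Assumes n is a multiple of LANES. */
uint32_t sad_lanes(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t acc[LANES] = {0};

    for (size_t i = 0; i < n; i += LANES) {
        for (size_t l = 0; l < LANES; l++) {
            int tmp = (int)a[i + l] - (int)b[i + l]; /* widened signed minus */
            acc[l] += (uint32_t)abs(tmp);            /* abs, then accumulate */
        }
    }

    /* Final reduction across lanes, done once after the loop. */
    uint32_t sum = 0;
    for (size_t l = 0; l < LANES; l++)
        sum += acc[l];
    return sum;
}
```

Both functions should return the same value for the same inputs, which
is why the in-loop instruction only needs to update the lanes and does
not need to produce the final scalar itself.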
