Re: [PATCH, ARM] Support NEON's VABD with combine pass

2011-09-12 Thread Ramana Radhakrishnan
On 12 September 2011 17:11, Dmitry Melnik wrote:
>
>> Interesting but I would be a bit defensive and make sure that this
>> matches only if -ffast-math in the FP case. You are sort of relying on
>> the fact that vsub wouldn't be generated without ffast-math but I'd
>> rather be defensive about it

Re: [PATCH, ARM] Support NEON's VABD with combine pass

2011-09-12 Thread Dmitry Melnik
Interesting but I would be a bit defensive and make sure that this matches only if -ffast-math in the FP case. You are sort of relying on the fact that vsub wouldn't be generated without ffast-math but I'd rather be defensive about it. (This is in case it's not clear in the non-intrinsics case)

Re: [PATCH, ARM] Support NEON's VABD with combine pass

2011-08-05 Thread Joseph S. Myers
On Fri, 5 Aug 2011, Ramana Radhakrishnan wrote:

> I've had a couple of conversations about what the intrinsics
> behaviour should in such cases with folks. Should we try to match vabs
> (vsub) even for intrinsics and generate a vabd or desist from doing
> this and generate only what was asked f

Re: [PATCH, ARM] Support NEON's VABD with combine pass

2011-08-04 Thread Ramana Radhakrishnan
On 29 July 2011 10:58, Dmitry Melnik wrote:

> This patch adds two define_insn patterns for NEON vabd instruction to make
> combine pass recognize expressions matching (vabs (vsub ...)) patterns as
> vabd.

Interesting but I would be a bit defensive and make sure that this matches only if -ffast-ma

[PATCH, ARM] Support NEON's VABD with combine pass

2011-07-29 Thread Dmitry Melnik
This patch adds two define_insn patterns for NEON vabd instruction to make combine pass recognize expressions matching (vabs (vsub ...)) patterns as vabd. This patch reduces code size of x264 binary from 649143 to 648343 (800 bytes, or 0.12%) and increases its performance on average by 2.5% on
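
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of C source whose vectorized form contains an absolute value applied to a subtraction, i.e. the (vabs (vsub ...)) shape the patch description refers to. The function names are hypothetical and are not taken from the patch or from x264. Whether GCC actually emits vabd for these loops is an assumption that depends on the compiler version and flags (for example NEON being enabled and the loop being vectorized); per the review comments in this thread, the floating-point variant would be expected to match only under -ffast-math.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <math.h>

/* Hypothetical helper: element-wise |a[i] - b[i]| for 32-bit integers.
   Each iteration is an abs() applied to a subtraction, which is the
   scalar analogue of (vabs (vsub ...)). */
void abs_diff_s32(int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = abs(a[i] - b[i]);
}

/* Hypothetical helper: the same computation for single-precision floats.
   This is the "FP case" discussed in the review. */
void abs_diff_f32(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = fabsf(a[i] - b[i]);
}
```

Inspecting the assembly generated with and without the patch (for example with gcc -S) is one way to check whether a separate subtract and absolute-value pair has been replaced by a single vabd.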
