I have not looked at execute_cse_reciprocals_1() yet.

However, with the recip-patch applied:

double recipd (double a, double b)
{
  return a/b;
}

translates to

recipd:
  frecpe  d2, d1
  frecps  d3, d2, d1
  fmul  d2, d2, d3
  frecps  d3, d2, d1
  fmul  d2, d2, d3
  frecps  d1, d2, d1
  fmul  d2, d2, d1
  fmul  d0, d2, d0
  ret

float recipf (float a, float b)
{
  return a/b;
}

translates to

recipf:
  frecpe  s2, s1
  frecps  s3, s2, s1
  fmul  s2, s2, s3
  frecps  s1, s2, s1
  fmul  s2, s2, s1
  fmul  s0, s2, s0
  ret

So it seems that it also works for a generic division.

Best regards,
Benedikt

> On 24 Jun 2015, at 22:39, Evandro Menezes <e.mene...@samsung.com> wrote:
> 
> Philipp,
> 
> I think that execute_cse_reciprocals_1() applies only when the denominator is 
> known at compile-time, otherwise the division stays.  It doesn't seem to know 
> whether the target supports the approximate reciprocal or not.
> 
> Cheers,
> 
> --
> Evandro Menezes                              Austin, TX
> 
> 
>> -----Original Message-----
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On
>> Behalf Of Dr. Philipp Tomsich
>> Sent: Wednesday, June 24, 2015 15:08
>> To: Evandro Menezes
>> Cc: Benedikt Huber; gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
>> estimation in -ffast-math
>> 
>> Evandro,
>> 
>> Shouldn't ‘execute_cse_reciprocals_1’ take care of this, once the reciprocal-
>> division is implemented?
>> Do you think there’s additional work needed to catch all cases/opportunities?
>> 
>> Best,
>> Philipp.
>> 
>>> On 24 Jun 2015, at 20:19, Evandro Menezes <e.mene...@samsung.com> wrote:
>>> 
>>> Benedikt,
>>> 
>>> Are you developing the reciprocal approximation just for 1/x proper or for
>>> any division, as in x/y = x * 1/y?
>>> 
>>> Thank you,
>>> 
>>> --
>>> Evandro Menezes                              Austin, TX
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Benedikt Huber [mailto:benedikt.hu...@theobroma-systems.com]
>>>> Sent: Wednesday, June 24, 2015 12:11
>>>> To: Dr. Philipp Tomsich
>>>> Cc: Evandro Menezes; gcc-patches@gcc.gnu.org
>>>> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root
>>>> (rsqrt) estimation in -ffast-math
>>>> 
>>>> Evandro,
>>>> 
>>>> Yes, we also have the 1/x approximation.
>>>> However we do not have the test cases yet, and it also would need
>>>> some clean up.
>>>> I am going to provide a patch for that soon (say next week).
>>>> Also, for this optimization we have *not* yet found a benchmark with
>>>> significant improvements.
>>>> 
>>>> Best Regards,
>>>> Benedikt
>>>> 
>>>> 
>>>>> On 24 Jun 2015, at 18:52, Dr. Philipp Tomsich
>>>>> <philipp.tomsich@theobroma-systems.com> wrote:
>>>>> 
>>>>> Evandro,
>>>>> 
>>>>> We’ve seen a 28% speed-up on gromacs in SPECfp for the (scalar)
>>>>> reciprocal sqrt.
>>>>> 
>>>>> Also, the “reciprocal divide” patches are floating around in various
>>>>> of our git trees, but aren’t ready for public consumption yet… I’ll
>>>>> leave Benedikt to comment on potential timelines for getting that
>>>>> pushed out.
>>>>> 
>>>>> Best,
>>>>> Philipp.
>>>>> 
>>>>>> On 24 Jun 2015, at 18:42, Evandro Menezes <e.mene...@samsung.com> wrote:
>>>>>> 
>>>>>> Benedikt,
>>>>>> 
>>>>>> You beat me to it! :-)  Do you have the implementation for dividing
>>>>>> using the Newton series as well?
>>>>>> 
>>>>>> I'm not sure that the series is always beneficial for all data types
>>>>>> and on all processors.  It would be useful to allow each AArch64 processor
>>>>>> to enable this or not depending on the data type.  BTW, do you have
>>>>>> some tests showing the speed up?
>>>>>> 
>>>>>> Thank you,
>>>>>> 
>>>>>> --
>>>>>> Evandro Menezes                              Austin, TX
>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: gcc-patches-ow...@gcc.gnu.org
>>>>>>> [mailto:gcc-patches-ow...@gcc.gnu.org] On
>>>>>>> Behalf Of Benedikt Huber
>>>>>>> Sent: Thursday, June 18, 2015 7:04
>>>>>>> To: gcc-patches@gcc.gnu.org
>>>>>>> Cc: benedikt.hu...@theobroma-systems.com;
>>>>>>> philipp.tomsich@theobroma-systems.com
>>>>>>> Subject: [PATCH] [aarch64] Implemented reciprocal square root
>>>>>>> (rsqrt) estimation in -ffast-math
>>>>>>> 
>>>>>>> AArch64 offers the instructions frsqrte and frsqrts, for rsqrt
>>>>>>> estimation and a Newton-Raphson step, respectively.
>>>>>>> There are ARMv8 implementations where this is faster than using
>>>>>>> fdiv and fsqrt.
>>>>>>> It runs three steps for double and two steps for float to achieve
>>>>>>> the needed precision.
>>>>>>> 
>>>>>>> There is one caveat and open question.
>>>>>>> Since -ffast-math enables flush to zero, intermediate values
>>>>>>> between approximation steps will be flushed to zero if they are
>>>>>>> denormal.
>>>>>>> E.g., this happens in the case of rsqrt (DBL_MAX) and rsqrtf (FLT_MAX).
>>>>>>> The test cases pass, but it is unclear to me whether this is
>>>>>>> expected behavior with -ffast-math.
>>>>>>> 
>>>>>>> The patch applies to commit:
>>>>>>> svn+ssh://gcc.gnu.org/svn/gcc/trunk@224470
>>>>>>> 
>>>>>>> Please consider including this patch.
>>>>>>> Thank you and best regards,
>>>>>>> Benedikt Huber
>>>>>>> 
>>>>>>> Benedikt Huber (1):
>>>>>>> 2015-06-15  Benedikt Huber  <benedikt.hu...@theobroma-systems.com>
>>>>>>> 
>>>>>>> gcc/ChangeLog                            |   9 +++
>>>>>>> gcc/config/aarch64/aarch64-builtins.c    |  60 ++++++++++++++++
>>>>>>> gcc/config/aarch64/aarch64-protos.h      |   2 +
>>>>>>> gcc/config/aarch64/aarch64-simd.md       |  27 ++++++++
>>>>>>> gcc/config/aarch64/aarch64.c             |  63 +++++++++++++++++
>>>>>>> gcc/config/aarch64/aarch64.md            |   3 +
>>>>>>> gcc/testsuite/gcc.target/aarch64/rsqrt.c | 113 +++++++++++++++++++++++++++++++
>>>>>>> 7 files changed, 277 insertions(+)
>>>>>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c
>>>>>>> 
>>>>>>> --
>>>>>>> 1.9.1
>>>>> 
>>> 
>>> 
> 
