Re: [PATCH][GCC] Optimize unnecessary casts out of binary math operations.

Barnaby Wilks Fri, 06 Sep 2019 06:46:43 -0700

On 9/6/19 12:53 PM, Richard Biener wrote:
> On Fri, 6 Sep 2019, Richard Biener wrote:
> 
>> On Thu, 5 Sep 2019, Barnaby Wilks wrote:
>>
>>> Hello,
>>>
>>> This patch changes a match.pd expression so that binary math expressions
>>> are no longer evaluated in the precision of their widest operand type
>>> when -funsafe-math-optimizations is enabled.
>>>
>>> This patch has been extracted from
>>> https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00072.html
>>> based on upstream comments.
>>>
>>> For example take the function:
>>>
>>>     float f (float x, float y) {
>>>       double z = 1.0 / x;
>>>       return z * y;
>>>     }
>>>
>>> Without this patch this would generate the following (with -Ofast):
>>>
>>>     _1 = (double) x_4(D);
>>>     z_5 = 1.0e+0 / _1;
>>>     _2 = (double) y_6(D);
>>>     _3 = _2 * z_5;
>>>     _7 = (float) _3;
>>>     return _7;
>>>
>>> This contains 3 unnecessary casts. Since the operands and the result are
>>> all single precision, the calculation itself can be done entirely in
>>> single precision, instead of converting the operands to double,
>>> doing the calculation and then converting the result back to
>>> single precision again.
>>>
>>> With this patch (and -Ofast) the following GIMPLE is generated:
>>>
>>>     _5 = 1.0e+0 / x_1(D);
>>>     _3 = y_2(D) * _5;
>>>     return _3;
>>>
>>> The benefits can then be seen in the generated code:
>>>
>>> Without this patch
>>>
>>>     f:
>>>       fcvt  d1, s1
>>>       fcvt  d0, s0
>>>       fdiv  d0, d1, d0
>>>       fcvt  s0, d0
>>>       ret
>>>
>>>
>>> With this patch
>>>
>>>     f:
>>>       fdiv s0, s1, s0
>>>       ret
>>>
>>> Added tests to verify that no unnecessary casts are kept.
>>>
>>> Bootstrapped and regtested on aarch64 and x86_64 with no regressions.
>>>
>>> I don't have write access, so if OK for trunk then can someone commit on
>>> my behalf?
>>
>> OK.  I'll commit it after a round of testing.
> 
> So on x86_64 I see the testcase FAILing:
> 
> FAIL: gcc.dg/fold-binary-math-casts.c scan-tree-dump-not optimized "\\\\(double\\\\)"
> FAIL: gcc.dg/fold-binary-math-casts.c scan-tree-dump-not optimized "\\\\(float\\\\)"
> 
> It's
> 
> float
> k (float a)
> {
>    return 1.0 / sqrt (a);
> }
> 
> where we don't narrow the sqrt call itself (leftover from the
> other pattern?)


Hi,

Ah yeah, that was just left over from the other patch. I've attached the
updated patch.

I changed sqrt to sqrtf in the test so that the casts no longer come from
the sqrt call but from the divide instead (which this patch should
optimize out).

Regards,
Barney

> Richard.
> 
>> Thanks,
>> Richard.
>>
>>> Regards
>>> Barney
>>>
>>> gcc/ChangeLog:
>>>
>>> 2019-09-05  Barnaby Wilks  <barnaby.wi...@arm.com>
>>>
>>>     * match.pd: Add flag_unsafe_math_optimizations check
>>>     before deciding on the widest type in a binary math operation.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2019-09-05  Barnaby Wilks  <barnaby.wi...@arm.com>
>>>
>>>     * gcc.dg/fold-binary-math-casts.c: New test.
>>
>

diff --git a/gcc/match.pd b/gcc/match.pd
index 1d13543a6159dc94ce1ff1112c0bfc6b0d588638..5b2d95dfa9d8feef7e7248c0364909fc061da3ab 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5040,10 +5040,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
              && newtype == type
              && types_match (newtype, type))
            (op (convert:newtype @1) (convert:newtype @2))
-           (with { if (TYPE_PRECISION (ty1) > TYPE_PRECISION (newtype))
+           (with
+             {
+               if (!flag_unsafe_math_optimizations)
+                 {
+                   if (TYPE_PRECISION (ty1) > TYPE_PRECISION (newtype))
                      newtype = ty1;
+
                    if (TYPE_PRECISION (ty2) > TYPE_PRECISION (newtype))
-                     newtype = ty2; }
+                     newtype = ty2;
+                 }
+             }
+
               /* Sometimes this transformation is safe (cannot
                  change results through affecting double rounding
                  cases) and sometimes it is not.  If NEWTYPE is
diff --git a/gcc/testsuite/gcc.dg/fold-binary-math-casts.c b/gcc/testsuite/gcc.dg/fold-binary-math-casts.c
new file mode 100644
index 0000000000000000000000000000000000000000..53c247fa14360c9e5719b432aa213f899caa2d25
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-binary-math-casts.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-optimized" } */
+
+#include <math.h>
+
+float
+f (float x, float y)
+{
+  double z = 1.0 / x;
+  return z * y;
+}
+
+float
+g (float x, float y)
+{
+  double a = 1.0 / x;
+  double b = 1.0 / y;
+  long double k = x*x*x*x*x*x;
+
+  return a + b - k;
+}
+
+float
+h (float x)
+{
+  double a = x * 2.0;
+  double b = a / 3.5f;
+  return a + b;
+}
+
+float
+i (float y, float z)
+{
+  return pow (y, 2.0) / (double) (y + z);
+}
+
+float
+j (float x, float y)
+{
+  double t = 4.0 * x;
+  double z = t + y;
+  return z;
+}
+
+float
+k (float a)
+{
+  return 1.0 / sqrtf (a);
+}
+
+float
+l (float a)
+{
+  return (double) a * (a / 2.0);
+}
+
+/* { dg-final { scan-tree-dump-not "\\(double\\)" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "\\(float\\)" "optimized" } } */
