On 9/6/19 12:53 PM, Richard Biener wrote:
> On Fri, 6 Sep 2019, Richard Biener wrote:
>
>> On Thu, 5 Sep 2019, Barnaby Wilks wrote:
>>
>>> Hello,
>>>
>>> This patch changes a match.pd expression so that binary math expressions
>>> will not be done in the precision of its widest type if
>>> -funsafe-math-optimizations is enabled.
>>>
>>> This patch has been extracted from
>>> https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00072.html
>>> based on upstream comments.
>>>
>>> For example take the function:
>>>
>>> float f (float x, float y) {
>>>   double z = 1.0 / x;
>>>   return z * y;
>>> }
>>>
>>> Without this patch this would generate the following (with -Ofast):
>>>
>>> _1 = (double) x_4(D);
>>> z_5 = 1.0e+0 / _1;
>>> _2 = (double) y_6(D);
>>> _3 = _2 * z_5;
>>> _7 = (float) _3;
>>> return _7;
>>>
>>> This contains 3 unnecessary casts - since all outputs of the expression
>>> are in single precision, the calculation itself can be done entirely in
>>> single precision - instead of converting the operands to double,
>>> doing the calculation and then converting the result back to
>>> single precision again.
>>>
>>> With this patch (and -Ofast) the following GIMPLE is generated:
>>>
>>> _5 = 1.0e+0 / x_1(D);
>>> _3 = y_2(D) * _5;
>>> return _3;
>>>
>>> The benefits can then be seen in the generated code:
>>>
>>> Without this patch
>>>
>>> f:
>>>         fcvt    d1, s1
>>>         fcvt    d0, s0
>>>         fdiv    d0, d1, d0
>>>         fcvt    s0, d0
>>>         ret
>>>
>>>
>>> With this patch
>>>
>>> f:
>>>         fdiv    s0, s1, s0
>>>         ret
>>>
>>> Added tests to verify that no unnecessary casts are kept.
>>>
>>> Bootstrapped and regtested on aarch64 and x86_64 with no regressions.
>>>
>>> I don't have write access, so if OK for trunk then can someone commit on
>>> my behalf?
>>
>> OK. I'll commit it after a round of testing.
>
> So on x86_64 I see the testcase FAILing:
>
> FAIL: gcc.dg/fold-binary-math-casts.c scan-tree-dump-not optimized "\\\\(double\\\\)"
> FAIL: gcc.dg/fold-binary-math-casts.c scan-tree-dump-not optimized "\\\\(float\\\\)"
>
> It's
>
> float
> k (float a)
> {
>   return 1.0 / sqrt (a);
> }
>
> where we don't narrow the sqrt call itself (leftover from the
> other pattern?)
Hi,
Ah yeah, that was left over from the other patch. I've attached the
updated patch.
I changed sqrt to sqrtf so that the casts come from the divide (which
this patch should optimize away) rather than from the sqrt call.
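
As a small illustration of why the divide still involves a cast once the
call is sqrtf: the constant 1.0 has type double, so C converts the float
operand to double and the division itself has type double. The sketch
below only reports the size of each expression's type; the helper names
are made up for this example and are not part of the patch.

```c
#include <stddef.h>

/* Hypothetical helpers, for illustration only (not part of the patch).  */

/* 1.0 is a double constant, so the float operand is converted to double
   and the division has type double.  sizeof does not evaluate its
   operand; it only inspects the type.  */
size_t
div_by_double_constant_size (float s)
{
  return sizeof (1.0 / s);
}

/* 1.0f is a float constant, so the division has type float.  */
size_t
div_by_float_constant_size (float s)
{
  return sizeof (1.0f / s);
}
```

The first helper returns sizeof (double) and the second sizeof (float),
which is the source-level reason the (double) and (float) conversions show
up around the divide in k before narrowing.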
Regards,
Barney
> Richard.
>
>> Thanks,
>> Richard.
>>
>>> Regards
>>> Barney
>>>
>>> gcc/ChangeLog:
>>>
>>> 2019-09-05 Barnaby Wilks <[email protected]>
>>>
>>> * match.pd: Add flag_unsafe_math_optimizations check
>>> before deciding on the widest type in a binary math operation.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2019-09-05 Barnaby Wilks <[email protected]>
>>>
>>> * gcc.dg/fold-binary-math-casts.c: New test.
>>
>
diff --git a/gcc/match.pd b/gcc/match.pd
index 1d13543a6159dc94ce1ff1112c0bfc6b0d588638..5b2d95dfa9d8feef7e7248c0364909fc061da3ab 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5040,10 +5040,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& newtype == type
&& types_match (newtype, type))
(op (convert:newtype @1) (convert:newtype @2))
- (with { if (TYPE_PRECISION (ty1) > TYPE_PRECISION (newtype))
+ (with
+ {
+ if (!flag_unsafe_math_optimizations)
+ {
+ if (TYPE_PRECISION (ty1) > TYPE_PRECISION (newtype))
newtype = ty1;
+
if (TYPE_PRECISION (ty2) > TYPE_PRECISION (newtype))
- newtype = ty2; }
+ newtype = ty2;
+ }
+ }
+
/* Sometimes this transformation is safe (cannot
change results through affecting double rounding
cases) and sometimes it is not. If NEWTYPE is
diff --git a/gcc/testsuite/gcc.dg/fold-binary-math-casts.c b/gcc/testsuite/gcc.dg/fold-binary-math-casts.c
new file mode 100644
index 0000000000000000000000000000000000000000..53c247fa14360c9e5719b432aa213f899caa2d25
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-binary-math-casts.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-optimized" } */
+
+#include <math.h>
+
+float
+f (float x, float y)
+{
+ double z = 1.0 / x;
+ return z * y;
+}
+
+float
+g (float x, float y)
+{
+ double a = 1.0 / x;
+ double b = 1.0 / y;
+ long double k = x*x*x*x*x*x;
+
+ return a + b - k;
+}
+
+float
+h (float x)
+{
+ double a = x * 2.0;
+ double b = a / 3.5f;
+ return a + b;
+}
+
+float
+i (float y, float z)
+{
+ return pow (y, 2.0) / (double) (y + z);
+}
+
+float
+j (float x, float y)
+{
+ double t = 4.0 * x;
+ double z = t + y;
+ return z;
+}
+
+float
+k (float a)
+{
+ return 1.0 / sqrtf (a);
+}
+
+float
+l (float a)
+{
+ return (double) a * (a / 2.0);
+}
+
+/* { dg-final { scan-tree-dump-not "\\(double\\)" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "\\(float\\)" "optimized" } } */
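
For reference, a rough source-level picture of what the narrowing does to
f from the test. This is only a sketch: f_wide and f_narrow are
hypothetical names, and the second function is an approximation of the
narrowed GIMPLE written by hand, not compiler output.

```c
/* Hypothetical helpers for illustration; they are not part of the patch.  */

/* f as written in the test: the intermediate value is a double, so both
   operands are widened and the result is converted back to float.  */
float
f_wide (float x, float y)
{
  double z = 1.0 / x;
  return z * y;
}

/* Roughly what the narrowed form corresponds to at source level:
   every operation stays in float and no conversions are needed.  */
float
f_narrow (float x, float y)
{
  float z = 1.0f / x;
  return z * y;
}
```

For inputs where every intermediate is exactly representable (for example
x = 2, y = 3) both give the same value. In general the two may round
differently, since f_wide keeps extra precision in the intermediate, which
is why the narrowing is only done under -funsafe-math-optimizations
(implied by -Ofast).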