On 9/6/19 12:53 PM, Richard Biener wrote:
> On Fri, 6 Sep 2019, Richard Biener wrote:
>
>> On Thu, 5 Sep 2019, Barnaby Wilks wrote:
>>
>>> Hello,
>>>
>>> This patch changes a match.pd expression so that binary math expressions
>>> will not be done in the precision of its widest type if
>>> -funsafe-math-optimizations is enabled.
>>>
>>> This patch has been extracted from
>>> https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00072.html
>>> based on upstream comments.
>>>
>>> For example take the function:
>>>
>>> float f (float x, float y) {
>>>   double z = 1.0 / x;
>>>   return z * y;
>>> }
>>>
>>> Without this patch this would generate the following (with -Ofast):
>>>
>>>   _1 = (double) x_4(D);
>>>   z_5 = 1.0e+0 / _1;
>>>   _2 = (double) y_6(D);
>>>   _3 = _2 * z_5;
>>>   _7 = (float) _3;
>>>   return _7;
>>>
>>> This contains 3 unnecessary casts - since all outputs of the expression
>>> are in single precision, the calculation itself can be done entirely in
>>> single precision - instead of converting the operands to double,
>>> doing the calculation and then converting the result back to
>>> single precision again.
>>>
>>> With this patch (and -Ofast) the following GIMPLE is generated:
>>>
>>>   _5 = 1.0e+0 / x_1(D);
>>>   _3 = y_2(D) * _5;
>>>   return _3;
>>>
>>> The benefits can then be seen in the generated code:
>>>
>>> Without this patch
>>>
>>> f:
>>>         fcvt    d1, s1
>>>         fcvt    d0, s0
>>>         fdiv    d0, d1, d0
>>>         fcvt    s0, d0
>>>         ret
>>>
>>>
>>> With this patch
>>>
>>> f:
>>>         fdiv    s0, s1, s0
>>>         ret
>>>
>>> Added tests to verify that no unnecessary casts are kept.
>>>
>>> Bootstrapped and regtested on aarch64 and x86_64 with no regressions.
>>>
>>> I don't have write access, so if OK for trunk then can someone commit on
>>> my behalf?
>>
>> OK.  I'll commit it after a round of testing.
>
> So on x86_64 I see the testcase FAILing:
>
> FAIL: gcc.dg/fold-binary-math-casts.c scan-tree-dump-not optimized
> "\\\\(double\\\\)"
> FAIL: gcc.dg/fold-binary-math-casts.c scan-tree-dump-not optimized
> "\\\\(float\\\\)"
>
> It's
>
> float
> k (float a)
> {
>   return 1.0 / sqrt (a);
> }
>
> where we don't narrow the sqrt call itself (leftover from the
> other pattern?)

Hi,

Ah yeah, that was just leftover from the other patch. I've attached the
updated patch.
I changed sqrt to sqrtf so that the casts no longer come from the sqrt
call but from the divide instead (which this patch should optimize out).

Regards,
Barney

> Richard.
>
>> Thanks,
>> Richard.
>>
>>> Regards
>>> Barney
>>>
>>> gcc/ChangeLog:
>>>
>>> 2019-09-05  Barnaby Wilks  <barnaby.wi...@arm.com>
>>>
>>>     * match.pd: Add flag_unsafe_math_optimizations check
>>>     before deciding on the widest type in a binary math operation.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2019-09-05  Barnaby Wilks  <barnaby.wi...@arm.com>
>>>
>>>     * gcc.dg/fold-binary-math-casts.c: New test.
>>
>

diff --git a/gcc/match.pd b/gcc/match.pd
index 1d13543a6159dc94ce1ff1112c0bfc6b0d588638..5b2d95dfa9d8feef7e7248c0364909fc061da3ab 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5040,10 +5040,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
             && newtype == type
             && types_match (newtype, type))
           (op (convert:newtype @1) (convert:newtype @2))
-          (with { if (TYPE_PRECISION (ty1) > TYPE_PRECISION (newtype))
+          (with
+           {
+             if (!flag_unsafe_math_optimizations)
+               {
+                 if (TYPE_PRECISION (ty1) > TYPE_PRECISION (newtype))
                     newtype = ty1;
+
                   if (TYPE_PRECISION (ty2) > TYPE_PRECISION (newtype))
-                    newtype = ty2; }
+                    newtype = ty2;
+               }
+           }
+
              /* Sometimes this transformation is safe (cannot
                 change results through affecting double rounding
                 cases) and sometimes it is not.  If NEWTYPE is
diff --git a/gcc/testsuite/gcc.dg/fold-binary-math-casts.c b/gcc/testsuite/gcc.dg/fold-binary-math-casts.c
new file mode 100644
index 0000000000000000000000000000000000000000..53c247fa14360c9e5719b432aa213f899caa2d25
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-binary-math-casts.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-optimized" } */
+
+#include <math.h>
+
+float
+f (float x, float y)
+{
+  double z = 1.0 / x;
+  return z * y;
+}
+
+float
+g (float x, float y)
+{
+  double a = 1.0 / x;
+  double b = 1.0 / y;
+  long double k = x*x*x*x*x*x;
+
+  return a + b - k;
+}
+
+float
+h (float x)
+{
+  double a = x * 2.0;
+  double b = a / 3.5f;
+  return a + b;
+}
+
+float
+i (float y, float z)
+{
+  return pow (y, 2.0) / (double) (y + z);
+}
+
+float
+j (float x, float y)
+{
+  double t = 4.0 * x;
+  double z = t + y;
+  return z;
+}
+
+float
+k (float a)
+{
+  return 1.0 / sqrtf (a);
+}
+
+float
+l (float a)
+{
+  return (double) a * (a / 2.0);
+}
+
+/* { dg-final { scan-tree-dump-not "\\(double\\)" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "\\(float\\)" "optimized" } } */
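
As a rough illustration of why the narrowing is only done under
-funsafe-math-optimizations, the sketch below (not part of the patch)
evaluates f () from the test both ways: once keeping the intermediate in
double, as the source is written, and once rounding every step to float,
as the narrowed GIMPLE does. The names f_wide, f_narrow and
max_relative_difference are made up for this example, the sample grid is
arbitrary, and the volatile qualifiers are only there to keep the compiler
from simplifying the two versions into each other. It assumes a build
without -Ofast or -ffast-math.

```c
#include <assert.h>
#include <stddef.h>

/* f () as written in the test: the intermediate z stays in double and
   only the final result is rounded to float.  */
float
f_wide (float x, float y)
{
  volatile double z = 1.0 / x;
  volatile double r = z * y;
  return (float) r;
}

/* Hypothetical hand-narrowed version, mirroring the GIMPLE shown for the
   patched compiler: every step is rounded to float.  */
float
f_narrow (float x, float y)
{
  volatile float z = 1.0f / x;
  volatile float r = z * y;
  return r;
}

/* Largest relative difference between the two versions over an arbitrary
   steps x steps grid of positive inputs (illustrative sampling only).  */
double
max_relative_difference (size_t steps)
{
  assert (steps > 0);
  double worst = 0.0;
  for (size_t i = 0; i < steps; i++)
    for (size_t j = 0; j < steps; j++)
      {
        float x = 1.0f + 0.37f * (float) i;
        float y = 1.0f + 0.53f * (float) j;
        double wide = f_wide (x, y);
        double narrow = f_narrow (x, y);
        /* Both values are positive here, so no fabs () is needed.  */
        double diff = wide > narrow ? wide - narrow : narrow - wide;
        double rel = diff / wide;
        if (rel > worst)
          worst = rel;
      }
  return worst;
}
```

Calling max_relative_difference (50) from a small main () and printing the
result shows how far apart the two versions get on this grid. Any
difference should be confined to the last bit or so of the float result,
which is the kind of change -funsafe-math-optimizations is allowed to make
and strict IEEE mode is not.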