Hi,

The attached patch adds a new switch, -fftz-math, which makes certain optimizations treat "flush to zero" (FTZ) handling of denormal inputs and outputs as behavior required for semantic correctness rather than as an optimization hint.

The need for this arose from HSAIL (BRIG). With HSAIL, flush-to-zero handling is required (not merely allowed) when an HSAIL instruction is marked with the 'ftz' modifier, as all HSA Base profile instructions are.

The patch is not complete and likely misses many optimizations. However, it is a starting point that fixes a few cases exposed by the HSAIL conformance suite. We plan to extend it as new cases come up.

OK for trunk?

BR,
Pekka

Index: gcc/common.opt
===================================================================
--- gcc/common.opt	(revision 251026)
+++ gcc/common.opt	(working copy)
@@ -2281,6 +2281,11 @@
 Common Report Var(flag_single_precision_constant) Optimization
 Convert floating point constants to single precision constants.
 
+fftz-math
+Common Report Var(flag_ftz_math) Optimization
+Optimizations handle floating-point operations as they must flush
+subnormal floating-point values to zero.
+
 fsplit-ivs-in-unroller
 Common Report Var(flag_split_ivs_in_unroller) Init(1) Optimization
 Split lifetimes of induction variables when loops are unrolled.
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 251026)
+++ gcc/doc/invoke.texi	(working copy)
@@ -9458,6 +9458,17 @@
 This option is experimental and does not currently guarantee to
 disable all GCC optimizations that affect signaling NaN behavior.
 
+@item -fftz-math
+@opindex ftz-math
+This option is experimental. With this flag on GCC treats
+floating-point operations (except abs, class, copysign and neg) as
+they must flush subnormal input operands and results to zero
+(FTZ). The FTZ rules are derived from HSA Programmers Reference Manual
+for the base profile. This alters optimizations that would break the
+rules, for example X * 1 -> X simplification. The option assumes the
+target supports FTZ in hardware and has it enabled - either by default
+or set by the user.
+
 @item -fno-fp-int-builtin-inexact
 @opindex fno-fp-int-builtin-inexact
 Do not allow the built-in functions @code{ceil}, @code{floor},
Index: gcc/fold-const-call.c
===================================================================
--- gcc/fold-const-call.c	(revision 251026)
+++ gcc/fold-const-call.c	(working copy)
@@ -697,7 +697,7 @@
               && do_mpfr_arg1 (result, mpfr_y1, arg, format));
 
     CASE_CFN_FLOOR:
-      if (!REAL_VALUE_ISNAN (*arg) || !flag_errno_math)
+      if ((!REAL_VALUE_ISNAN (*arg) || !flag_errno_math) && !flag_ftz_math)
         {
           real_floor (result, format, arg);
           return true;
@@ -705,7 +705,7 @@
       return false;
 
     CASE_CFN_CEIL:
-      if (!REAL_VALUE_ISNAN (*arg) || !flag_errno_math)
+      if ((!REAL_VALUE_ISNAN (*arg) || !flag_errno_math) && !flag_ftz_math)
         {
           real_ceil (result, format, arg);
           return true;
Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 251026)
+++ gcc/match.pd	(working copy)
@@ -143,6 +143,7 @@
 (simplify
  (mult @0 real_onep)
  (if (!HONOR_SNANS (type)
+      && !flag_ftz_math
       && (!HONOR_SIGNED_ZEROS (type)
           || !COMPLEX_FLOAT_TYPE_P (type)))
   (non_lvalue @0)))
@@ -151,6 +152,7 @@
 (simplify
  (mult @0 real_minus_onep)
  (if (!HONOR_SNANS (type)
+      && !flag_ftz_math
       && (!HONOR_SIGNED_ZEROS (type)
           || !COMPLEX_FLOAT_TYPE_P (type)))
   (negate @0)))
@@ -332,13 +334,13 @@
 /* In IEEE floating point, x/1 is not equivalent to x for snans.  */
 (simplify
  (rdiv @0 real_onep)
- (if (!HONOR_SNANS (type))
+ (if (!HONOR_SNANS (type) && !flag_ftz_math)
   (non_lvalue @0)))
 
 /* In IEEE floating point, x/-1 is not equivalent to -x for snans.  */
 (simplify
  (rdiv @0 real_minus_onep)
- (if (!HONOR_SNANS (type))
+ (if (!HONOR_SNANS (type) && !flag_ftz_math)
   (negate @0)))
 
 (if (flag_reciprocal_math)
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c	(revision 251026)
+++ gcc/simplify-rtx.c	(working copy)
@@ -2565,8 +2565,10 @@
         return op1;
 
       /* In IEEE floating point, x*1 is not equivalent to x for
-         signalling NaNs.  */
+         signalling NaNs.
+         For -fftz-math, x*1 is not equivalent to x for subnormals.  */
       if (!HONOR_SNANS (mode)
+          && (FLOAT_MODE_P (mode) && !flag_ftz_math)
           && trueop1 == CONST1_RTX (mode))
         return op0;
 
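
For illustration only, here is a minimal sketch in C (not part of the patch) of why the X * 1 -> X simplification is unsafe when FTZ is required. It models FTZ in software rather than relying on any hardware mode. The helper names ftz, ftz_mul and folded_mul_by_one are hypothetical and exist only for this example. The model assumes that subnormal inputs and results are both flushed to a zero of the same sign, which is my reading of the invoke.texi text above and not a statement of the exact HSA rules.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper (not part of the patch): flush a subnormal value
   to a zero of the same sign; leave every other value unchanged.  */
static float
ftz (float x)
{
  if (fpclassify (x) == FP_SUBNORMAL)
    return x < 0.0f ? -0.0f : 0.0f;
  return x;
}

/* Software model of an FTZ multiply: flush both inputs, multiply,
   then flush the result.  */
static float
ftz_mul (float a, float b)
{
  return ftz (ftz (a) * ftz (b));
}

/* What remains after a compiler folds x * 1 to x: the value passes
   through untouched, so a subnormal input is never flushed.  */
static float
folded_mul_by_one (float x)
{
  return x;
}
```

Under this model, ftz_mul (FLT_MIN / 4.0f, 1.0f) yields zero, while folded_mul_by_one (FLT_MIN / 4.0f) returns the subnormal input unchanged. That mismatch is what the !flag_ftz_math guards in match.pd and simplify-rtx.c are meant to prevent. Normal values such as 2.0f give the same result either way.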