https://gcc.gnu.org/bugzilla/show_bug.cgi?id=22326

--- Comment #9 from luoxhu at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #6)
> (In reply to luoxhu from comment #4)
> > float foo(float f, float x, float y) {
> > return (fabs(f)*x+y);
> > }
> > 
> > the input of fabs is float type, so use fabsf is enough here, drafted a
> > patch to avoid double promotion when generating gimple if fabs could be
> > replaced by fabsf as argument[0] is float type.
> 
> what about adding something to match.pd for:
> ABS<(float_convert)f> into (float_convert)ABS<f>
> This is only valid prompting and not reducing the precision.

Thanks. This is already implemented in fold-const.c, though without match.pd
and without really using fabsf.  The front end always converts the arguments
of fabs to double first.  There are three kinds of cases for this issue:

1) "return fabs(x);"
tree
fold_unary_loc (location_t loc, enum tree_code code, tree type, tree op0)
{
...
    case ABS_EXPR:
      /* Convert fabs((double)float) into (double)fabsf(float).  */
      if (TREE_CODE (arg0) == NOP_EXPR
          && TREE_CODE (type) == REAL_TYPE)
        {
          tree targ0 = strip_float_extensions (arg0);
          if (targ0 != arg0)
            return fold_convert_loc (loc, type,
                                     fold_build1_loc (loc, ABS_EXPR,
                                                  TREE_TYPE (targ0),
                                                  targ0));
        }
      return NULL_TREE;
...
}

This code converts "(float)fabs((double)x)" into
"(float)(double)(float)fabs(x)"; match.pd can then remove the useless
conversions.
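As a quick sanity check on why this fold is safe, here is a minimal C sketch.
The helper names are hypothetical and not part of GCC; the sketch only assumes
an IEEE float/double target. Widening float to double is exact and fabs only
clears the sign bit, so the two orders give the same double for any non-NaN
input.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: returns 1 when fabs((double)x) and (double)fabsf(x)
   compare equal.  NaN inputs compare unequal by definition, so they are
   outside the scope of this check.  */
int abs_commutes_with_widening(float x)
{
    double widened_first = fabs((double) x);
    double abs_first = (double) fabsf(x);
    return widened_first == abs_first;
}

/* Hypothetical self-check over a few sample values.  */
void check_abs_widening(void)
{
    assert(abs_commutes_with_widening(-1.5f));
    assert(abs_commutes_with_widening(-0.0f));
    assert(abs_commutes_with_widening(3.25f));
    assert(abs_commutes_with_widening(-0.1f));
}
```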

2) "return fabs(x)*y;"

The front end first generates "(float) (fabs((double) x) * (double) y)".
fold-const.c:fold_unary_loc then converts fabs((double)float) into
(double)fabsf(float), giving "(float)((double)fabs(x) * (double)y)".  Finally,
match.pd converts (outertype)((innertype0)a+(innertype1)b) into
((newtype)a+(newtype)b), which removes the double conversion.
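To illustrate why narrowing a single multiplication is harmless, here is a
small C sketch (hypothetical helper names, assuming IEEE float/double and the
default round-to-nearest mode). The product of two 24-bit significands fits in
the 53-bit significand of double, so it is exact there, and rounding it once to
float matches what a float multiply produces.

```c
#include <assert.h>

/* Product as the front end first writes it: widen, multiply, narrow.  */
float mul_via_double(float x, float y)
{
    return (float) ((double) x * (double) y);
}

/* Product computed in float.  The volatile store forces the result to be
   rounded to float even on targets that evaluate in a wider format.  */
float mul_in_float(float x, float y)
{
    volatile float product = x * y;
    return product;
}

/* Hypothetical self-check: both forms agree on a few sample inputs,
   including one whose exact product is a rounding tie in float.  */
void check_mul_narrowing(void)
{
    float tie = 0x1.001p+0f;  /* 1 + 2^-12 */
    assert(mul_via_double(tie, tie) == mul_in_float(tie, tie));
    assert(mul_via_double(0.1f, 0.3f) == mul_in_float(0.1f, 0.3f));
    assert(mul_via_double(-1.5f, 7.0f) == mul_in_float(-1.5f, 7.0f));
}
```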

3) "return fabs(x)*y + z;"

The front end produces: (float) ((fabs((double) x) * (double) y) + (double) z)

So what we need here is to match the MUL and ADD in match.pd as follows.  Any
comments?

+(simplify
+ (convert (plus (mult (convert@3 (abs @0)) (convert@4 @1)) (convert@5 @2)))
+ (if (flag_unsafe_math_optimizations
+      && types_match (type, float_type_node)
+      && types_match (TREE_TYPE (@0), float_type_node)
+      && types_match (TREE_TYPE (@1), float_type_node)
+      && types_match (TREE_TYPE (@2), float_type_node)
+      && element_precision (TREE_TYPE (@3)) > element_precision (TREE_TYPE (@0))
+      && element_precision (TREE_TYPE (@4)) > element_precision (TREE_TYPE (@1))
+      && element_precision (TREE_TYPE (@5)) > element_precision (TREE_TYPE (@2))
+      && ! HONOR_NANS (type)
+      && ! HONOR_INFINITIES (type))
+  (plus (mult (abs @0) @1) @2)))
+

1) and 2) won't generate a double conversion; only 3) still has an frsp in
fast-math mode, and the pattern above could remove it.
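To show why case 3 needs a fast-math style guard at all, here is a minimal C
sketch (hypothetical helper names, assuming IEEE float/double and the default
round-to-nearest mode). Unlike a lone multiply, the multiply-add can give a
different float result once the intermediate product is rounded to float, so
dropping the double conversion is not value-preserving in general.

```c
#include <assert.h>

/* Case 3 as the front end first writes it: widen, multiply, add, narrow.  */
float muladd_via_double(float x, float y, float z)
{
    return (float) ((double) x * (double) y + (double) z);
}

/* The same expression in float.  The volatile store rounds the product to
   float and keeps the compiler from contracting it into a fused
   multiply-add.  */
float muladd_in_float(float x, float y, float z)
{
    volatile float product = x * y;
    return product + z;
}

/* Hypothetical self-check with x = 1 + 2^-12.  The exact square is
   1 + 2^-11 + 2^-24, which is a tie in float and rounds to 1 + 2^-11,
   so the float path loses the 2^-24 term that the double path keeps.  */
void check_muladd_differs(void)
{
    float x = 0x1.001p+0f;
    assert(muladd_via_double(x, x, -1.0f) == 0x1.0008p-11f);
    assert(muladd_in_float(x, x, -1.0f) == 0x1p-11f);
}
```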

PS: convert_to_real_1 does not seem closely related here.  It converts
(float)sqrt((double)x), where x is float, into sqrtf(x), but given the
recursive calls to convert_to_real_1 and the build_call_expr with a new
mathfn_built_in, I suppose it would be a bit complicated to move that to
match.pd?

The optimization should only apply in fast-math mode.  Is
flag_unsafe_math_optimizations enough to guard it?
