On Tue, Jul 17, 2018 at 3:46 PM Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote: > > Hi Richard, > > On 17/07/18 14:27, Richard Biener wrote: > > On Tue, Jul 17, 2018 at 2:35 PM Kyrill Tkachov > > <kyrylo.tkac...@foss.arm.com> wrote: > >> Hi all, > >> > >> This is my first Fortran patch, so apologies if I'm missing something. > >> The current expansion of the min and max intrinsics explicitly expands > >> the comparisons between each argument to calculate the global min/max. > >> Some targets, like aarch64, have instructions that can calculate the > >> min/max > >> of two real (floating-point) numbers with the proper NaN-handling semantics > >> (if both inputs are NaN, return Nan. If one is NaN, return the other) and > >> those > >> are the semantics provided by the __builtin_fmin/max family of functions > >> that expand > >> to these instructions. > >> > >> This patch makes the frontend emit __builtin_fmin/max directly to compare > >> each > >> pair of numbers when the numbers are floating-point, and use > >> MIN_EXPR/MAX_EXPR otherwise > >> (integral types and -ffast-math) which should hopefully be easier to > >> recognise in the > > What is Fortrans requirement on min/max intrinsics? Doesn't it only > > require things that > > are guaranteed by MIN/MAX_EXPR anyways? The only restriction here is > > The current implementation expands to: > mvar = a1; > if (a2 .op. mvar || isnan (mvar)) > mvar = a2; > if (a3 .op. mvar || isnan (mvar)) > mvar = a3; > ... > return mvar; > > That is, if one of the operands is a NaN it will return the other argument. > If both (all) are NaNs, it will return NaN. This is the same as the semantics > of fmin/max > as far as I can tell. > > > /* Minimum and maximum values. When used with floating point, if both > > operands are zeros, or if either operand is NaN, then it is unspecified > > which of the two operands is returned as the result. */ > > > > which means MIN/MAX_EXPR are not strictly IEEE compliant with signed > > zeros or NaNs. > > Thus the correct test would be !HONOR_SIGNED_ZEROS && !HONOR_NANS if singed > > zeros are significant. > > True, MIN/MAX_EXPR would not be appropriate in that condition. I guarded > their use > on !HONOR_NANS (type) only. I'll update it to !HONOR_SIGNED_ZEROS (type) && > !HONOR_NANS (type). > > > > > > I'm not sure if using fmin/max calls when we cannot use MIN/MAX_EXPR > > is a good idea, > > this may both generate bigger code and be slower. > > The patch will generate fmin/fmax calls (or the fminf,fminl variants) when > mathfn_built_in advertises > them as available (does that mean they'll have a fast inline implementation?)
This doesn't mean anything given you make them available with your patch ;) So I expect it may cause issues for !c99_runtime targets (and long double at least). > If the above doesn't hold and we can't use either MIN/MAX_EXPR of fmin/fmax > then the patch falls back > to the existing expansion. As said I would not use fmin/fmax calls here at all. > FWIW, this patch does improve performance on 521.wrf from SPEC2017 on aarch64. You said that, yes. Even without -ffast-math? Richard. > Thanks, > Kyrill > > > > > Richard. > > > >> midend and optimise. The previous approach of generating the open-coded > >> version of that > >> is used when we don't have an appropriate __builtin_fmin/max available. > >> For example, for a configuration of x86_64-unknown-linux-gnu that I tested > >> there was no > >> 128-bit __built_fminl available. > >> > >> With this patch I'm seeing more than 7000 FMINNM/FMAXNM instructions being > >> generated at -O3 > >> on aarch64 for 521.wrf from fprate SPEC2017 where none before were > >> generated > >> (we were generating explicit comparisons and NaN checks). This gave a 2.4% > >> improvement > >> in performance on a Cortex-A72. > >> > >> Bootstrapped and tested on aarch64-none-linux-gnu and > >> x86_64-unknown-linux-gnu. > >> > >> Ok for trunk? > >> Thanks, > >> Kyrill > >> > >> 2018-07-17 Kyrylo Tkachov <kyrylo.tkac...@arm.com> > >> > >> * f95-lang.c (gfc_init_builtin_functions): Define __builtin_fmin, > >> __builtin_fminf, __builtin_fminl, __builtin_fmax, __builtin_fmaxf, > >> __builtin_fmaxl. > >> * trans-intrinsic.c: Include builtins.h. > >> (gfc_conv_intrinsic_minmax): Emit __builtin_fmin/max or MIN/MAX_EXPR > >> functions to calculate the min/max. > >> > >> 2018-07-17 Kyrylo Tkachov <kyrylo.tkac...@arm.com> > >> > >> * gfortran.dg/max_fmaxf.f90: New test. > >> * gfortran.dg/min_fminf.f90: Likewise. > >> * gfortran.dg/minmax_integer.f90: Likewise. > >> * gfortran.dg/max_fmaxl_aarch64.f90: Likewise. > >> * gfortran.dg/min_fminl_aarch64.f90: Likewise. >