On Tue, Jul 17, 2018 at 2:35 PM Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote:
>
> Hi all,
>
> This is my first Fortran patch, so apologies if I'm missing something.
> The current expansion of the min and max intrinsics explicitly expands
> the comparisons between each argument to calculate the global min/max.
> Some targets, like aarch64, have instructions that can calculate the min/max
> of two real (floating-point) numbers with the proper NaN-handling semantics
> (if both inputs are NaN, return NaN. If one is NaN, return the other) and
> those are the semantics provided by the __builtin_fmin/max family of
> functions that expand to these instructions.
>
> This patch makes the frontend emit __builtin_fmin/max directly to compare each
> pair of numbers when the numbers are floating-point, and use
> MIN_EXPR/MAX_EXPR otherwise (integral types and -ffast-math) which should
> hopefully be easier to recognise in the
What is Fortran's requirement on min/max intrinsics?  Doesn't it only
require things that are guaranteed by MIN/MAX_EXPR anyways?  The only
restriction here is

/* Minimum and maximum values.  When used with floating point, if both
   operands are zeros, or if either operand is NaN, then it is unspecified
   which of the two operands is returned as the result.  */

which means MIN/MAX_EXPR are not strictly IEEE compliant with signed
zeros or NaNs.  Thus the correct test would be
!HONOR_SIGNED_ZEROS && !HONOR_NANS if signed zeros are significant.

I'm not sure if using fmin/max calls when we cannot use MIN/MAX_EXPR
is a good idea; this may both generate bigger code and be slower.

Richard.

> midend and optimise. The previous approach of generating the open-coded
> version of that is used when we don't have an appropriate
> __builtin_fmin/max available.
> For example, for a configuration of x86_64-unknown-linux-gnu that I tested
> there was no 128-bit __builtin_fminl available.
>
> With this patch I'm seeing more than 7000 FMINNM/FMAXNM instructions being
> generated at -O3 on aarch64 for 521.wrf from fprate SPEC2017 where none
> before were generated (we were generating explicit comparisons and NaN
> checks). This gave a 2.4% improvement in performance on a Cortex-A72.
>
> Bootstrapped and tested on aarch64-none-linux-gnu and
> x86_64-unknown-linux-gnu.
>
> Ok for trunk?
> Thanks,
> Kyrill
>
> 2018-07-17 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
>
> * f95-lang.c (gfc_init_builtin_functions): Define __builtin_fmin,
> __builtin_fminf, __builtin_fminl, __builtin_fmax, __builtin_fmaxf,
> __builtin_fmaxl.
> * trans-intrinsic.c: Include builtins.h.
> (gfc_conv_intrinsic_minmax): Emit __builtin_fmin/max or MIN/MAX_EXPR
> functions to calculate the min/max.
>
> 2018-07-17 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
>
> * gfortran.dg/max_fmaxf.f90: New test.
> * gfortran.dg/min_fminf.f90: Likewise.
> * gfortran.dg/minmax_integer.f90: Likewise.
> * gfortran.dg/max_fmaxl_aarch64.f90: Likewise.
> * gfortran.dg/min_fminl_aarch64.f90: Likewise.