On Wed, Jul 18, 2018 at 5:03 PM, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com > wrote:
> > On 18/07/18 14:26, Thomas König wrote: > >> Hi Kyrlll, >> >> Am 18.07.2018 um 13:17 schrieb Kyrill Tkachov < >>> kyrylo.tkac...@foss.arm.com>: >>> >>> Thomas, Janne, would this relaxation of NaN handling be acceptable given >>> the benefits >>> mentioned above? If so, what would be the recommended adjustment to the >>> nan_1.f90 test? >>> >> I would be a bit careful about changing behavior in such a major way. >> What would the results with NaN and infinity then be, with or without >> optimization? Would the results be consistent with min(nan,num) vs >> min(num,nan)? Would they be consistent with the new IEEE standard? >> >> In general, I think that min(nan,num) should be nan and that our current >> behavior is not the best. >> >> Does anybody have dats points on how this is handled by other compilers? >> >> Oh, and if anything is changed, then compile and runtime behavior should >> always be the same. >> > > Thanks, that makes it clearer what behaviour is accceptable. > > So this v3 patch follows Richard Sandiford's suggested approach of > emitting IFN_FMIN/FMAX > when dealing with floating-point values and NaN handling is important and > the target > supports the IFN_FMIN/FMAX. Otherwise the current explicit comparison > sequence is emitted. > For integer types and -ffast-math floating-point it will emit MIN/MAX_EXPR. > > With this patch the nan_1.f90 behaviour is preserved on all targets, we > get the optimal > sequence on aarch64 and on x86_64 we avoid the function call, with no > changes in code generation. > > This gives the performance improvement on 521.wrf on aarch64 and leaves it > unchanged on x86_64. > > I'm hoping this addresses all the concerns raised in this thread: > * The NaN-handling behaviour is unchanged on all platforms. > * The fast inline sequence is emitted where it is available. > * No calls to library fmin*/fmax* are emitted where there were none. > * MIN/MAX_EXPR sequence are emitted where possible. > > Is this acceptable? > So if I understand it correctly, the "internal fn" thing is a mechanism that allows to check whether the target supports expanding a builtin inline or whether it requires a call to an external library function? If so, then yes, Ok, thanks for the patch! -- Janne Blomqvist