Re: [PATCH][Fortran] Use MIN/MAX_EXPR for intrinsics or __builtin_fmin/max when appropriate

Kyrill Tkachov Tue, 17 Jul 2018 09:17:02 -0700

Hi Thomas,

On 17/07/18 16:36, Thomas Koenig wrote:

Hi Kyrill,

The current implementation expands to:
     mvar = a1;
     if (a2 .op. mvar || isnan (mvar))
       mvar = a2;
     if (a3 .op. mvar || isnan (mvar))
       mvar = a3;
     ...
     return mvar;

That is, if one of the operands is a NaN it will return the other argument.
If both (all) are NaNs, it will return NaN. This is the same as the
semantics of fmin/max as far as I can tell.


I've looked at the F2008 standard, and, interestingly enough, the
requirements on MIN and MAX do not mention NaNs at all. 13.7.106
has, for MAX,

Result Value. The value of the result is that of the largest argument.

plus some stuff about character variables (not relevant here). Similar
for MIN.

Also, the section on IEEE_ARITHMETIC (14.9) does not mention
comparisons; also, "Complete conformance with IEC 60559:1989 is not
required", what is required is the correct support for +,-, and *,
plus support for / if IEEE_SUPPORT_DIVIDE is covered.


Thanks for checking this.

So, the Fortran standard does not impose many requirements. I do think
that a patch such as yours should not change the current behavior unless
we know what it does and do think it is a good idea.  Hmm...

Having said that, I think we pretty much cover all the corner cases
in nan_1.f90, so if that test passes without regression, then that
aspect should be fine.


Looking at the test it looks like there is a de facto expected behaviour.
For example it contains:
if (max(2.d0, nan) /= 2.d0) STOP 9

So it definitely expects comparison with NaN to return the non-NaN result,
which is the behaviour that my patch preserves.

On integral arguments, or when we don't care about NaNs (-Ofast and such),
we'll be using the MIN/MAX_EXPR, which doesn't specify what's returned on a
NaN argument, thus allowing for more aggressive optimisations.
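
To see why leaving the NaN result unspecified matters, consider a plain
comparison-based maximum with no NaN handling. This is only a sketch of one
possible lowering, not a statement about what GCC actually emits for MAX_EXPR;
naive_max and check_naive_max are hypothetical names used for illustration.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: a single comparison and select, no NaN check. */
double naive_max(double a, double b)
{
    return a > b ? a : b;
}

/* Shows that the result depends on which operand holds the NaN. */
void check_naive_max(void)
{
    /* 2.0 > NaN is false, so the second operand (NaN) is returned. */
    assert(isnan(naive_max(2.0, NAN)));
    /* NaN > 2.0 is also false, so the second operand (2.0) is returned. */
    assert(naive_max(NAN, 2.0) == 2.0);
}
```

With this form the answer depends on operand order, which a check such as
max(2.d0, nan) /= 2.d0 would not tolerate; that is why the NaN-aware path is
kept whenever NaNs have to be honoured.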

Question: You have found an advantage on aarch64. Do you have
access to other architectures to see if there is also a speed
advantage, or maybe a disadvantage?


Because the expansion now emits straight-line code rather than conditionals
and branches, it should be easier to optimise in general, so I'd expect this
to be an improvement overall.
That said, I have benchmarked it on SPEC2017 on aarch64.

If you have any benchmarks of interest that you (or somebody else) can run
on a target that you care about, I would be very grateful for any results.

Thanks,
Kyrill

Regards

    Thomas

Re: [PATCH][Fortran] Use MIN/MAX_EXPR for intrinsics or __builtin_fmin/max when appropriate
