https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121570
--- Comment #4 from kargls at comcast dot net ---
>
>         movq    %rbx, %rdi
>         movq    %r14, %rsi
>         callq   __for_ieee_next_after_k4_@PLT
>         movss   %xmm0, 12(%rsp)
>         ucomiss 16(%rsp), %xmm0
>         jne     .LBB0_1
>         jp      .LBB0_1

We don't know what Intel is doing within __for_ieee_next_after_k4_.
I've quoted the Fortran standard about requirements:

1) On entry to a procedure, save current exceptions
2) Quiet all exceptions
3) Execute procedure
4) Restore exceptions from entry into function
5) Update exceptions that may have occurred during execution

Those requirements force

> whereas gfortran does
>
>         movq    %rbx, %rdi
>         movss   %xmm0, 8(%rsp)
>         call    _gfortran_ieee_procedure_entry

this call ...

>         movss   8(%rsp), %xmm0
>         pxor    %xmm1, %xmm1
>         movss   %xmm1, 12(%rsp)
>         call    nextafterf
>         movq    %rbx, %rdi
>         movss   %xmm0, 20(%rsp)
>         movss   %xmm0, 8(%rsp)
>         call    _gfortran_ieee_procedure_exit

and this call.  But, see below ...

>         movss   8(%rsp), %xmm0
>         ucomiss 12(%rsp), %xmm0
>
> I cannot look at what ifx's __for_ieee_next_after_k4_ does, but
> a separate, more optimized implementation for ieee_next_after might
> be faster also for gfortran.

The only thing that one might be able to do is inline the _entry
and _exit procedures to avoid function call overhead.  Intel has the
luxury that it deals with only Intel/AMD CPUs.  gfortran has seven
different config files: fpu-387.h, fpu-aix.h, fpu-generic.h,
fpu-glibc.h, fpu-sysv.h, fpu-aarch64.h, and fpu-macppc.h.

> For example, it could check its argument
> if the operation will raise an exception, and branch in that event
> (which could be marked as unlikely, and after a few iterations, would
> be marked as unlikely to be taken by the CPU).
>
> Confirmed as an enhancement request.

... here.  If it can be assumed that ieee_next_after, which is mapped
to nextafter (on x86_64-*-freebsd), is already IEEE-754 compliant, then
the calls to _entry and _exit are redundant, so gfortran need not emit
them.  That is,

   subroutine foo(x, y)
      use ieee_arithmetic
      real x
      x = ieee_next_after(x, 10.)
   end subroutine foo

would be translated to

   __attribute__((fn spec (". w w ")))
   void foo (real(kind=4) & restrict x, real(kind=4) & restrict y)
   {
     c_char fpstate.0[33];

     try
       {
         // Needed for 1 and 2 above on entry into foo
         _gfortran_ieee_procedure_entry ((void *) &fpstate.0);
         {
           // This is 3 above, i.e., execution of procedure
           *x = __builtin_nextafterf (*x, 1.0e+1);
         }
       }
     finally
       {
         // Needed for 4 and 5 above on exit from foo
         _gfortran_ieee_procedure_exit ((void *) &fpstate.0);
       }
   }