https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121570

            Bug ID: 121570
           Summary: Very high cost of ieee_next_after function, gfortran
                    optimization failure?
           Product: gcc
           Version: 14.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: b.j.braams at cwi dot nl
  Target Milestone: ---

To clarify the subject heading, the appended code contains three timed loops
that each iterate over close to 2^32 values. The first loop contains only very
simple arithmetic, the second loop invokes the fortran 'nearest' intrinsic and
the third loop invokes ieee_next_after. Using gfortran 14.2.1 with the -O5
option the execution times for the three loops are 4.0 seconds, 12 seconds and
920 seconds respectively on my Intel i7-1165G7 processor.

I think that Andi Kleen of the gcc-bugs mailing list may add some comments
here. He identified that under the hood ieee_next_after invokes routines to
save the fpu state before the actual computation and to restore it afterwards,
and this is where the bulk of the time is spent. He comments further that these
invocations of _gfortran_ieee_procedure_{entry,exit} may perhaps be optimized
out of the loop.

(I note that with 'ifx -O5' the ieee_next_after loop takes only 17 seconds on
my system.)

Test program follows.

program main
!..use and access
  use iso_fortran_env, only : int32, real32, real64
  use ieee_arithmetic
  implicit none
!..data
  integer (kind=int32) :: i32, j32
  real (kind=real32) :: t32, r32
  real (kind=real64) :: tim0, tim1
!..executable part
  j32 = huge(0_int32)
  i32 = -j32-1
  r32 = 0
  call cpu_time (tim0)
! 2^32 iterations of a simple arithmetic statement
  do while (i32.ne.j32)
    r32 = -r32+i32
    i32 = i32+1
  end do
  call cpu_time (tim1)
  write (*,'(a25,1pg9.2,1pg16.9)') &
    '2^32 simple arithmetic:', tim1-tim0, r32
  t32 = huge(0.0_real32)
  r32 = -t32
  call cpu_time (tim0)
! close to 2^32 iterations of fortran nearest
  do while (r32.ne.t32)
    r32 = nearest(r32,1.0_real32)
  end do
  call cpu_time (tim1)
  write (*,'(a25,1pg9.2,1pg16.9)') &
    '2^32 intrinsic nearest:', tim1-tim0, r32
  t32 = huge(0.0_real32)
  r32 = -t32
  call cpu_time (tim0)
! close to 2^32 iterations of ieee_next_after
  do while (r32.ne.t32)
    r32 = ieee_next_after(r32,t32)
  end do
  call cpu_time (tim1)
  write (*,'(a25,1pg9.2,1pg16.9)') &
    '2^32 ieee_next_after:', tim1-tim0, r32
  stop
end program main
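
For quicker reproduction, the comparison between 'nearest' and ieee_next_after can be scaled down. What follows is only a sketch and not part of the original test: the program name, the step count n = 2^20 and the starting value 1.0 are choices made for illustration. It takes the same number of upward steps with each routine, prints the two timings, and stops with an error if the results differ.

```fortran
program next_after_small
!..use and access
  use iso_fortran_env, only : int32, real32, real64
  use ieee_arithmetic, only : ieee_next_after
  implicit none
!..data
  ! n is a hypothetical reduced step count, not taken from the report
  integer (kind=int32), parameter :: n = 2**20
  integer (kind=int32) :: i
  real (kind=real32) :: a, b, t32
  real (kind=real64) :: tim0, tim1
!..executable part
  t32 = huge(0.0_real32)
! n upward steps from 1.0 with the nearest intrinsic
  a = 1.0_real32
  call cpu_time (tim0)
  do i = 1, n
    a = nearest(a,1.0_real32)
  end do
  call cpu_time (tim1)
  write (*,'(a25,1pg9.2,1pg16.9)') &
    'intrinsic nearest:', tim1-tim0, a
! the same n upward steps with ieee_next_after
  b = 1.0_real32
  call cpu_time (tim0)
  do i = 1, n
    b = ieee_next_after(b,t32)
  end do
  call cpu_time (tim1)
  write (*,'(a25,1pg9.2,1pg16.9)') &
    'ieee_next_after:', tim1-tim0, b
! both loops should arrive at the same value
  if (a /= b) error stop 'nearest and ieee_next_after disagree'
end program next_after_small
```

With so few iterations the absolute timings are small, so only the ratio between the two lines is of interest. Raising n brings the run closer to the full test above.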