------- Comment #5 from jakub at gcc dot gnu dot org  2009-07-14 13:31 -------
Created an attachment (id=18193)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18193&action=view)
gcc45-pr40643.patch

And here is a patch which, when needed for performance (when NaNs must be
honored or when a mask is used), uses two loops instead of one.  The first
loop stops as soon as it has set pos for the first time; the second loop can
then use just the val < limit comparison.

E.g. on
subroutine bar(a, n)
  real, allocatable :: a(:)
  integer :: n
end subroutine
program main
  implicit none
  interface bar
    subroutine bar(a, n)
      real, allocatable :: a(:)
      integer :: n
    end subroutine
  end interface
  integer :: n
  real, allocatable :: a(:)
  integer :: i

  allocate (a(100))
  call random_number(a)
  do i=1, 10000000
    n = minloc(a, dim=1)
    call bar(a, n)
  end do
end program main

this patch makes a very noticeable difference both with -O3 -ffast-math
-funroll-loops and with -O3 -funroll-loops.
If you think this is the way we should go, I can also change the inline
minval/maxval version and attempt to change the library routines as well.
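
To make the two-loop idea above concrete, here is a minimal C sketch of the
control flow.  It is not the code from the attached patch: the function name
minloc_two_loops, the optional mask argument, the 1-based result, and the
choice to return 0 when no element qualifies are all assumptions made for
illustration.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not from the patch): returns the 1-based position of
   the first minimum of a[0..n-1], skipping NaNs and, if mask is non-NULL,
   elements whose mask entry is false.  Returns 0 if nothing qualifies
   (an assumption of this sketch).  */
size_t minloc_two_loops(const float *a, size_t n, const bool *mask)
{
    float limit = INFINITY;
    size_t pos = 0;
    size_t i;

    /* First loop: runs only until pos is set for the first time.
       "<=" accepts +Inf but still rejects NaN, since any comparison
       involving NaN is false.  */
    for (i = 0; i < n; i++) {
        if (mask && !mask[i])
            continue;
        if (a[i] <= limit) {
            limit = a[i];
            pos = i + 1;
            i++;            /* resume after this element in the second loop */
            break;
        }
    }

    /* Second loop: limit now holds a real candidate, so a plain
       "val < limit" comparison is enough and keeps the first minimum.  */
    for (; i < n; i++) {
        if (mask && !mask[i])
            continue;
        if (a[i] < limit) {
            limit = a[i];
            pos = i + 1;
        }
    }
    return pos;
}
```

For example, under these assumptions the array {NaN, 5, NaN, 4} gives
position 4, and {Inf, Inf} gives position 1.  The second case is why the
first loop in this sketch uses <= rather than <.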


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40643
