[Bug fortran/48636] Enable more inlining with -O2 and higher

burnus at gcc dot gnu.org Wed, 20 Apr 2011 05:29:57 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48636


--- Comment #7 from Tobias Burnus <burnus at gcc dot gnu.org> 2011-04-20 12:29:02 UTC ---
(In reply to comment #6)
> > Here is some sample code (extreme, I admit) which profits a lot from
> > inlining:
> > 
> > - Strides are known to be one when inlining (a common case, but you can
> >   never be sure if the user doesn't call a(1:5:2))

First, you do not have any issue with strides if the dummy argument is either
allocatable, has the contiguous attribute, or is an explicit-shape or
assumed-size array.

For inlining, I see only one place where information loss happens: if a
simply-contiguous array is passed as actual argument to an assumed-shape dummy.
Then the Fortran front end knows that the stride of the actual argument is 1,
but the callee needs to assume an arbitrary stride. The middle end will
continue to do so, as the "simply contiguous" information is lost, even though
it would be profitable for inlining.
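
To make the case concrete, here is a minimal sketch (the program and the
names scale_any and scale_contig are made up for illustration, they are not
from the PR). The first call passes a simply-contiguous actual to an
assumed-shape dummy, which is exactly where the stride-1 knowledge gets lost;
the second call shows why the callee cannot assume it; the last one uses the
CONTIGUOUS attribute, where no such issue exists.

```fortran
program stride_demo
  implicit none
  real :: x(10)
  integer :: i

  x = [(real(i), i = 1, 10)]

  call scale_any(x)         ! simply contiguous actual: stride 1 is known here
  call scale_any(x(1:5:2))  ! strided section: same callee, stride is 2
  call scale_contig(x)      ! CONTIGUOUS dummy: callee may assume unit stride

  print *, x

contains

  ! Assumed-shape dummy: compiled stand-alone, this has to work for any
  ! stride, because the caller may pass an array section.
  subroutine scale_any(a)
    real, intent(inout) :: a(:)
    a = 2.0 * a
  end subroutine scale_any

  ! Assumed-shape dummy with the CONTIGUOUS attribute: unit stride is
  ! guaranteed inside the procedure.
  subroutine scale_contig(a)
    real, intent(inout), contiguous :: a(:)
    a = 2.0 * a
  end subroutine scale_contig

end program stride_demo
```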


> Not strictly related to inlining, but in the new descriptor we'll have a field
> specifying whether the array is simply contiguous

I am not sure we will indeed have one; initially I thought one should, but I am
no longer convinced that it is the right approach. My impression is now that
setting and updating the flag all the time is more expensive than doing an
is_contiguous() check once. The TR descriptor also does not have such a flag;
thus one needs to handle such arrays, if they come from C, with extra care.
(Unless one requires the C side to call a function which could set this flag.
I think one does not need to do so.)


By the way, the latest version of the TR draft is linked at
http://j3-fortran.org/pipermail/interop-tr/2011-April/000582.html


> so it might make sense to
> generate two loops for each loop over the array in the source, one for the
> contiguous case where it can be vectorized etc. and another loop for the
> general case.

Maybe. Definitely not for -Os. Best would be if the middle end were able to
automatically generate a stride-free version when it thinks that it is
profitable. The FE could also do it, if one had a way to tell the ME that it
may drop the stride-free version if it thinks that this is more profitable.
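
Written by hand at the source level, such a two-version loop could look
roughly like the sketch below. This only illustrates the idea and is not
what the FE or ME generates; the module and procedure names are invented,
and it uses the Fortran 2008 IS_CONTIGUOUS intrinsic as the one-time check,
so it needs a compiler which implements that intrinsic.

```fortran
module versioned_loop
  implicit none
contains

  ! Dispatcher: one contiguity check, then pick a version of the loop.
  subroutine add_one(a)
    real, intent(inout) :: a(:)
    if (is_contiguous(a)) then
      call add_one_contig(a)    ! stride-free version
    else
      call add_one_strided(a)   ! general version
    end if
  end subroutine add_one

  ! Stride-free version: the CONTIGUOUS attribute guarantees unit stride.
  subroutine add_one_contig(a)
    real, intent(inout), contiguous :: a(:)
    integer :: i
    do i = 1, size(a)
      a(i) = a(i) + 1.0
    end do
  end subroutine add_one_contig

  ! General version: has to honour whatever stride the descriptor carries.
  subroutine add_one_strided(a)
    real, intent(inout) :: a(:)
    integer :: i
    do i = 1, size(a)
      a(i) = a(i) + 1.0
    end do
  end subroutine add_one_strided

end module versioned_loop

program versioned_loop_demo
  use versioned_loop
  implicit none
  real :: x(10)

  x = 0.0
  call add_one(x)          ! takes the stride-free version
  call add_one(x(1:10:3))  ! takes the general version
  print *, x
end program versioned_loop_demo
```

The two loop bodies are textually identical; the point is only that the
first one is compiled with the unit-stride guarantee. It also shows the
code-size cost, which is why this is not attractive for -Os.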


> As we're planning to use the TR 29113 descriptor as the native one, this has
> some implications for the procedure call interface as well. See
> http://gcc.gnu.org/ml/fortran/2011-03/msg00215.html

Regarding:
"For a descriptor of an assumed-shape array, the value of the
lower-bound member of each element of the dim member of the descriptor
shall be zero."

That's actually also not that different from the current situation: In Fortran,
the lower bound of assumed-shape arrays is also always the same: It is 1. Which
makes sense as one can then do the following w/o worrying about the lbound:
  subroutine bar(a)
    real :: a(:)
    do i = 1, ubound(a, dim=1)
      a(i) = ...
    end do
  end subroutine bar

For explicit-shape/assumed-size arrays one does not have a descriptor and for
deferred-shape arrays (allocatables, pointers) the TR keeps the lbound - which
is the same as currently in Fortran.
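
A small sketch of the current Fortran behaviour described above (program and
procedure names are made up for illustration): the same allocatable array is
seen with lower bound 1 through an assumed-shape dummy, but keeps its own
lower bound through an allocatable (deferred-shape) dummy.

```fortran
program lbound_demo
  implicit none
  real, allocatable :: x(:)

  allocate (x(-2:2))
  x = 0.0

  call show_assumed_shape(x)  ! bounds seen by the callee: 1 and 5
  call show_allocatable(x)    ! bounds seen by the callee: -2 and 2

contains

  ! Assumed-shape dummy: the lower bound is 1 unless declared otherwise,
  ! regardless of the bounds of the actual argument.
  subroutine show_assumed_shape(a)
    real, intent(in) :: a(:)
    print *, lbound(a, dim=1), ubound(a, dim=1)
  end subroutine show_assumed_shape

  ! Allocatable (deferred-shape) dummy: the bounds of the actual argument
  ! are passed through unchanged.
  subroutine show_allocatable(a)
    real, allocatable, intent(in) :: a(:)
    print *, lbound(a, dim=1), ubound(a, dim=1)
  end subroutine show_allocatable

end program lbound_demo
```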

> This will reduce the procedure call overhead substantially, at the cost
> of some extra work in the caller in the case of non-default lower bounds.

Which is actually nothing new ... That's the reason that one often creates a
new descriptor for procedure calls.
