Re: GSoC(run-time argument checking project)

Tobias Burnus Tue, 29 Mar 2022 14:47:52 -0700

Hi Γιωργος,

On 29.03.22 22:26, Γιωργος Μελλιος via Fortran wrote:

I am looking forward to applying for GCC so I was checking the project
ideas list. I got interested in the Fortran - run-time argument checking
project and I would like to learn some more information about it in order
to start doing some research on the specific field so that I will be more
productive if I get selected.


This feature relates more to older Fortran code, as modern Fortran code
tends to use modules. With modules, one writes procedures (subroutines or
functions) like:

! MODERN CODE - USING MODULES

module myMod
  implicit none
contains
  subroutine mySub(n,y,z)
    integer :: n
    real :: y(10)
    character(len=n) :: z(:,:)
  end subroutine
end module

And then, to use it, one just writes:

  use myMod
  ...
  call mySub(m, var, array)

By 'use'ing the module, the compiler knows the data types and can use
the proper ABI (here: all variables are passed by reference; 'y' is
a contiguous stream of the actual data, whereas 'z' uses a wrapper
("array descriptor", "dope vector") that contains additional data
such as the array bounds).
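
To make the difference between the two passing conventions a bit more
concrete, here is a small sketch in C. The struct 'simple_desc' and both
function names are made up for illustration; this is NOT gfortran's real
descriptor layout, which carries more fields (rank, element type, one set of
bounds per dimension, ...).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified one-dimensional descriptor (illustration only;
   not the layout gfortran actually uses). */
struct simple_desc {
    float    *base;    /* address of the first element */
    ptrdiff_t lbound;  /* lower bound of the dimension */
    ptrdiff_t ubound;  /* upper bound of the dimension */
    ptrdiff_t stride;  /* distance between elements, counted in elements */
};

/* Like 'real :: y(10)': the callee only receives an address; the extent
   comes from the callee's own declaration, not from the caller. */
float sum_explicit(const float *y)
{
    assert(y != NULL);
    float s = 0.0f;
    for (int i = 0; i < 10; i++)
        s += y[i];
    return s;
}

/* Like 'real :: z(:)': bounds and stride travel together with the data. */
float sum_assumed_shape(const struct simple_desc *z)
{
    assert(z != NULL && z->base != NULL);
    float s = 0.0f;
    ptrdiff_t n = z->ubound - z->lbound + 1;
    for (ptrdiff_t i = 0; i < n; i++)
        s += z->base[i * z->stride];
    return s;
}
```

If the caller hands over a bare address where the callee expects the
descriptor (or the other way round), the callee misinterprets the memory,
which is exactly the kind of mismatch the run-time check is meant to catch.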

 * * *

OLD WAY:

subroutine mySub2(n, x)
  integer :: n
  real :: x(n)
end

subroutine another_sub()
  real :: x(2)
  x = [1., 2.]
  call mySub2(size(x), x)
end

Even if you put this into the same file, in terms of the Fortran language,
the compiler does not know anything about 'mySub2' inside 'another_sub'
except that it is a subroutine (because of the 'call mySub2') - it does not
know the number of arguments or the data types or how to pass the data.
By usage, it can deduce 2 arguments and it uses the standard argument passing
known from Fortran 66 (i.e. pass by reference, pass arrays as stream of
data).

If the two subroutines are in different files, the Fortran semantics and
what the compiler knows are the same. But of course, if both are in the
same file, the compiler _can_ see the other subroutine and do checks
between what is known locally and how the subroutine looks in reality.
(GCC/gfortran does such checks if possible. There is room for improvement
but it already detects a lot.)

With -fcheck=interface or some option like that, the compiler should add
checks that there are indeed 2 arguments, that the called procedure is indeed
a subroutine (and not a function), that the first argument is a scalar
and the second one an array.
Going beyond, it could also check whether the array size is >= the first
argument. (But the size might not always be known to the caller.)

 * * *

If certain features are used, the compiler must know the interface of
the procedure. One way is by 'use'ing a module as above, but, alternatively,
an INTERFACE block can be used.

The INTERFACE block is required if the arguments are passed in a non-standard
way, e.g. by VALUE instead of by reference, or (as above) not as a byte stream
but wrapped in an array descriptor. 'var(:)' is an assumed-shape array (with
array descriptor); by contrast, 'var(n)' is an explicit-shape array, for which
a pointer to the first element is passed, such that there is just the stream
of bytes with the values.

'mySub2' above is an example where the interface is not needed; it would
only be helpful to find argument mismatches.
In the example below, the assumed-shape array and the VALUE attribute mean
that an interface is required:

subroutine mySub3(n, x)
  integer, value :: n
  real :: x(:)
end

subroutine a_third_sub()
  real :: x(2)
  interface
    subroutine mySub3(n, x)
      integer :: n   ! VALUE attribute is missing here
      real :: x(:)
    end
  end interface
  call mySub3(123, x)
end

When writing an INTERFACE block, it can easily happen that one misses
some property – like above where VALUE is missing in the INTERFACE block.

Or one forgets to write the INTERFACE block although it is required due to,
e.g., the VALUE attribute.

 * * *

Regarding the implementation: The idea is to have one or two global
variables, which are pointers.

Then, before the actual

  call mySub3(123, x)

the following is done (pseudo code in Fortran syntax):

  var_callee => mySub3  ! called function

  data%version = 1
  data%filename = "...."
  data%line_num = ...
  data%num_args = 2  ! property from the interface block (if available),
                     ! otherwise from usage.
  data%arg[1]%type = integer  ! likewise
  data%arg[1]%by_value = .false.
  data%arg[2]%type = real
  data%arg[2]%array_type = assumed_shape
  data%arg[2]%array_size = size(x)
  var_args => data

  call mySub3(123, x)


And inside mySub3:

subroutine mySub3 (...)
  if (var_callee == mySub3) then
    data2%version = 1
    data2%num_args = 2
    data2%arg[1]%type = integer
    data2%arg[1]%by_value = .true.
    data2%arg[2]%type = real
    data2%arg[2]%array_type = assumed_shape
    call gfortran_argcheck (data2, var_args, "mySub3")
  endif
  ...
end


Thus: One stores a bunch of information about the actual arguments
in a variable + saves it. In the callee, there is a check that
the data is indeed for that procedure (to permit compiling only
a subset of the files with this instrumentation) – and if it is,
the arguments are compared.
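
The handshake itself can be sketched in C as follows. All names
('call_info', 'pending_callee', 'pending_args', 'my_sub3', the file name
"caller.f90") are hypothetical, and for brevity both arguments are passed by
reference. The globals are thread-local so that concurrent calls do not
overwrite each other's record. Clearing the record after use is my own
assumption, not part of the scheme above; it avoids that a later call from
uninstrumented code picks up stale data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record describing one call site (illustrative fields). */
struct call_info {
    int         version;
    const char *filename;
    int         line_num;
    int         num_args;
};

typedef void (*proc_ptr)(int *, float *);

/* Per-thread globals written by the caller and read by the callee. */
static _Thread_local proc_ptr                pending_callee;
static _Thread_local const struct call_info *pending_args;

int checks_run;  /* counts how often the callee found a matching record */

void my_sub3(int *n, float *x)
{
    (void)n;
    (void)x;
    /* Only trust the record if it was written for this very procedure. */
    if (pending_callee == my_sub3 && pending_args != NULL)
        checks_run++;  /* the comparison routine would be called here */

    /* Assumption: consume the record so that it cannot be reused. */
    pending_callee = NULL;
    pending_args = NULL;
}

/* Caller compiled with the instrumentation: fills in the record first. */
void instrumented_caller(void)
{
    static const struct call_info info = { 1, "caller.f90", 42, 2 };
    int n = 123;
    float x[2] = { 1.0f, 2.0f };

    pending_callee = my_sub3;
    pending_args = &info;
    my_sub3(&n, x);
}

/* Caller compiled without the instrumentation: no record, no check. */
void uninstrumented_caller(void)
{
    int n = 123;
    float x[2] = { 1.0f, 2.0f };
    my_sub3(&n, x);
}
```

Calling 'instrumented_caller' triggers the check in 'my_sub3', while
'uninstrumented_caller' silently skips it, which is what permits mixing
instrumented and uninstrumented files.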

My impression is that it then makes sense to outsource this checking
into a library function. In this example, that could be:

  if (caller.arg[i].by_value != callee.arg[i].by_value)
    error ("%s:%d: Mismatch in VALUE attribute for argument %d in call to %s",
           caller.filename, caller.line_num, i, proc_name);
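
A slightly fuller, self-contained sketch of such a comparison routine in C
might look as follows. The type and function names ('proc_info', 'arg_info',
'argcheck_compare') and the limit MAX_ARGS are invented for this example and
are not an existing libgfortran interface. Skipping the comparison when the
version fields differ is also just one possible policy.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_ARGS 8  /* arbitrary limit for this sketch */

enum arg_type { ARG_INTEGER, ARG_REAL };

struct arg_info {
    enum arg_type type;
    int           by_value;  /* nonzero if passed with the VALUE attribute */
};

struct proc_info {
    int             version;
    const char     *filename;
    int             line_num;
    int             num_args;
    struct arg_info arg[MAX_ARGS];
};

/* Compare what the caller assumed with what the callee declares.
   Prints one diagnostic per mismatch and returns the mismatch count. */
int argcheck_compare(const struct proc_info *caller,
                     const struct proc_info *callee,
                     const char *proc_name)
{
    assert(caller != NULL && callee != NULL && proc_name != NULL);
    int errors = 0;

    /* Assumed policy: records of different versions are not compared. */
    if (caller->version != callee->version)
        return 0;

    if (caller->num_args != callee->num_args) {
        fprintf(stderr, "%s:%d: Call to %s passes %d argument(s), "
                "but %d are expected\n", caller->filename, caller->line_num,
                proc_name, caller->num_args, callee->num_args);
        return 1;
    }

    for (int i = 0; i < caller->num_args && i < MAX_ARGS; i++) {
        if (caller->arg[i].type != callee->arg[i].type) {
            fprintf(stderr, "%s:%d: Type mismatch for argument %d "
                    "in call to %s\n", caller->filename, caller->line_num,
                    i + 1, proc_name);
            errors++;
        }
        if (caller->arg[i].by_value != callee->arg[i].by_value) {
            fprintf(stderr, "%s:%d: Mismatch in VALUE attribute for "
                    "argument %d in call to %s\n", caller->filename,
                    caller->line_num, i + 1, proc_name);
            errors++;
        }
    }
    return errors;
}
```

For the mySub3 case, where the caller's record says "by reference" for the
first argument and the callee's says "by value", this reports exactly one
mismatch.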

I think you get the idea. Thus, the work is to generate the code for the
arguments before the call + at the beginning of a procedure + call a comparison
function. That is all work done in the compiler itself.

And then there is the diagnostic in the library, which does the actual
checking and writes some nice words about it.

In terms of the compiler, the data structure has to be created on the fly.
You have two choices: build it in the Fortran AST (abstract syntax tree,
gfc_expr, gfc_symbol) or in the representation used by C/C++ and the middle
end ("tree").

Side remark: of course a thread-private variable is needed to support
concurrency.

 * * *

I think the first step is to get some basic checking done (e.g. function vs.
subroutine + number of arguments) and then to extend it to check for more
complicated things. (Hence also the version field - to permit adding more
changes in later releases.)

Fortran standards: https://gcc.gnu.org/wiki/GFortranStandards
Something you surely need as reference when working on it, but if you do not
know much of Fortran, some Fortran tutorial will help more.

 * * *

I hope it helps to give you a rough idea – if you need more, ask.
(In particular, without knowing your background, it is difficult to
link to the best-suited references.)

Cheers,

Tobias

