Hi Γιωργος,

On 29.03.22 22:26, Γιωργος Μελλιος via Fortran wrote:
> I am looking forward to applying for GCC so I was checking the project ideas list. I got interested in the Fortran - run-time argument checking project and I would like to learn some more information about it in order to start doing some research on the specific field so that I will be more productive if I get selected.

This feature relates more to older Fortran code, as modern Fortran code tends to use modules. With modules, one writes procedures (subroutines or functions) like:

    ! MODERN CODE - USING MODULES
    module myMod
      implicit none
    contains
      subroutine mySub(n,y,z)
        integer :: n
        real :: y(10)
        character(len=n) :: z(:,:)
      end subroutine
    end module

And then when using it, just doing:

    use myMod
    ...
    call mySub(m, var, array)

By 'use'ing the module, the compiler knows the data types and can use the proper ABI (here: all variables are passed by reference; 'y' is a contiguous stream of the actual data, whereas 'z' uses some wrapper ("array descriptor", "dope vector") which contains additional data, like the array bounds).

* * *

OLD WAY:

    subroutine mySub2(n, x)
      integer :: n
      real :: x(n)
    end

    subroutine another_sub()
      real :: x(2)
      x = [1., 2.]
      call mySub2(size(x), x)
    end

Even if you put this into the same file, in terms of the Fortran language, the compiler does not know anything about 'mySub2' inside 'another_sub' except that it is a subroutine (because of the 'call mySub2'). It does not know the number of arguments, the data types, or how to pass the data. By usage, it can deduce 2 arguments, and it uses the standard argument passing known from Fortran 66 (i.e. pass by reference, pass arrays as a stream of data).

If the two subroutines are in different files, the Fortran semantics and what the compiler knows are the same. But of course, if both are in the same file, the compiler _can_ see the other subroutine and do checks between what is known locally and how the subroutine looks in reality. (GCC/gfortran does such checks if possible. There is room for improvement but it already detects a lot.)

With -fcheck=interface or some option like that, the compiler should add checks that there are indeed 2 arguments, that the called procedure is indeed a subroutine (and not a function), and that the first argument is a scalar and the second one an array.
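
For readers more at home in C, the two argument-passing conventions mentioned above can be sketched roughly as follows. This is only an illustration: `toy_descriptor`, `sum_explicit` and `sum_assumed_shape` are made-up names, and the descriptor is heavily simplified compared to the one gfortran really uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an array descriptor ("dope vector").
   The real gfortran descriptor has more fields and a different layout. */
typedef struct {
    const float *base;    /* address of the first element */
    size_t       extent;  /* number of elements           */
} toy_descriptor;

/* Old-style convention: the scalar is passed by reference and the array
   as a bare pointer to its first element. The callee has to trust n. */
static float sum_explicit(const int *n, const float *x)
{
    float s = 0.0f;
    for (int i = 0; i < *n; i++)
        s += x[i];
    return s;
}

/* Descriptor convention: the extent travels together with the data. */
static float sum_assumed_shape(const toy_descriptor *x)
{
    assert(x != NULL && x->base != NULL);
    float s = 0.0f;
    for (size_t i = 0; i < x->extent; i++)
        s += x->base[i];
    return s;
}
```

If the caller hands over a bare pointer while the callee expects a descriptor (or the other way round), nothing at the machine level notices; the callee just misreads memory. That is the kind of mistake the run-time check is meant to report.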

Going beyond, it could also check whether the array size is >= the first argument. (But the size might not always be known to the caller.)

* * *

If certain features are used, the compiler must know the interface of the procedure. One way is by 'use'ing a module as above but, alternatively, an INTERFACE block can be used. The INTERFACE block is required if the arguments are passed in a non-standard way, e.g. by VALUE instead of by reference or (as above) not as a byte stream but wrapped in an array descriptor. ('var(:)' is an assumed-shape array, passed with an array descriptor; by contrast, 'var(n)' is an explicit-shape array, which passes a pointer to the first element such that there is just the stream of bytes with the values.)

'mySub2' above is an example where the interface is not needed and would only be helpful to find argument mismatches. In the example below, the assumed-shape array and the VALUE attribute mean that an interface is required:

    subroutine mySub3(n, x)
      integer, value :: n
      real :: x(:)
    end

    subroutine a_third_sub()
      real :: x(2)
      interface
        subroutine mySub3(n, x)
          integer :: n
          real :: x(:)
        end
      end interface
      call mySub3(123, x)
    end

When writing an INTERFACE block, it can easily happen that one misses some property, like above where VALUE is missing in the INTERFACE block. Or one forgets to write the INTERFACE block although it is required due to, e.g., the VALUE attribute.

* * *

Regarding the implementation: The idea is to have one or two global variables which are pointers. When then doing

    call mySub3(123, x)

the following is done before the call (pseudo code in Fortran syntax):

    var_callee => mySub3   ! called function
    data%version = 1
    data%filename = "...."
    data%line_num = ...
    data%num_args = 2   ! property from the interface block (if available), otherwise from usage.
    data%arg[1]%type = integer   ! likewise
    data%arg[1]%by_value = .false.
    data%arg[2]%type = real
    data%arg[2]%array_type = assumed_shape
    data%arg[2]%array_size = size(x)
    var_args => data
    call mySub3(123, x)

And inside mySub3:

    subroutine mySub3 (...)
      if (var_callee == mySub3) then
        data2%version = 1
        data2%num_args = 2
        data2%arg[1]%type = integer
        data2%arg[1]%by_value = .true.
        data2%arg[2]%type = real
        data2%arg[2]%array_type = assumed_shape
        call gfortran_argcheck (data2, var_args, "mySub3")
      endif
      ...
    end

Thus: One stores a bunch of information about the actual arguments in a variable and saves it. In the callee, there is a check that the data is indeed for that procedure (to permit compiling only a subset of the files with this instrumentation), and if it is, the arguments are compared.

My impression is that it then makes sense to outsource this checking into a library function. In this example, that could be:

    if (caller.arg[i].by_value != callee.arg[i].by_value)
      error ("%s:%d: Mismatch in VALUE attribute for argument %d in call to %s",
             caller.filename, caller.line_num, i, proc_name);

I think you get the idea.

Thus, the work is to generate the code for the arguments before the call and at the beginning of a procedure, plus a call to a comparison function. That is all work done in the compiler itself. And then there is the diagnostic in the library, which does the actual checking and writes some nice words about it.

In terms of the compiler, the data structure has to be created on the fly. You have two choices: in the Fortran AST (abstract syntax tree, gfc_expr, gfc_symbol) or in the one used by C/C++ and the middle end ("tree").

Side remark: of course a thread-private variable is needed to support concurrency.

* * *

I think the first step is to get some basic checking done (e.g. function vs. subroutine + number of arguments) and then to extend it to check for more complicated things. (Hence also the version field, to permit adding more checks in later releases.)
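
To make the library side a bit more tangible, here is a minimal C sketch of how the hand-over and the comparison could look. None of this exists in libgfortran; all names (`call_info`, `argcheck_before_call`, `argcheck_on_entry`, `argcheck_compare`), the fixed argument limit and the decision to silently skip the check when the version fields differ are assumptions for illustration. A plain `const void *` stands in for the procedure address, and `_Thread_local` (C11) is used for the thread-private pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

enum { ARGCHECK_MAX_ARGS = 8 };            /* arbitrary limit for this sketch */

typedef enum { ARG_INTEGER, ARG_REAL } arg_type;

typedef struct {
    arg_type type;
    bool     by_value;                     /* VALUE attribute present? */
} arg_info;

typedef struct {
    int         version;
    const char *filename;                  /* only used on the caller side */
    int         line_num;
    bool        is_subroutine;
    int         num_args;
    arg_info    arg[ARGCHECK_MAX_ARGS];
} call_info;

/* One slot per thread, so concurrent calls do not overwrite each other. */
static _Thread_local const void      *var_callee;
static _Thread_local const call_info *var_args;

/* Compare both views of the call; returns the number of mismatches found. */
static int argcheck_compare(const call_info *caller, const call_info *callee,
                            const char *proc_name)
{
    int mismatches = 0;

    assert(caller != NULL && callee != NULL && proc_name != NULL);
    if (caller->version != callee->version)
        return 0;                          /* layouts may differ: skip */

    if (caller->is_subroutine != callee->is_subroutine) {
        fprintf(stderr, "%s:%d: %s is called as a %s but is a %s\n",
                caller->filename, caller->line_num, proc_name,
                caller->is_subroutine ? "subroutine" : "function",
                callee->is_subroutine ? "subroutine" : "function");
        mismatches++;
    }
    if (caller->num_args != callee->num_args) {
        fprintf(stderr, "%s:%d: %d arguments passed to %s, which has %d\n",
                caller->filename, caller->line_num, caller->num_args,
                proc_name, callee->num_args);
        mismatches++;
    }

    /* Only compare the arguments that exist on both sides. */
    int n = caller->num_args < callee->num_args ? caller->num_args
                                                : callee->num_args;
    if (n > ARGCHECK_MAX_ARGS)
        n = ARGCHECK_MAX_ARGS;
    for (int i = 0; i < n; i++) {
        if (caller->arg[i].type != callee->arg[i].type) {
            fprintf(stderr, "%s:%d: Type mismatch for argument %d in call to %s\n",
                    caller->filename, caller->line_num, i + 1, proc_name);
            mismatches++;
        }
        if (caller->arg[i].by_value != callee->arg[i].by_value) {
            fprintf(stderr, "%s:%d: Mismatch in VALUE attribute for argument %d"
                    " in call to %s\n",
                    caller->filename, caller->line_num, i + 1, proc_name);
            mismatches++;
        }
    }
    return mismatches;
}

/* Emitted before the call: remember whom we call and what we think of it. */
static void argcheck_before_call(const void *callee_id, const call_info *caller)
{
    var_callee = callee_id;
    var_args   = caller;
}

/* Emitted at procedure entry: check only if the data is meant for us. */
static int argcheck_on_entry(const void *self_id, const call_info *callee,
                             const char *proc_name)
{
    if (var_callee != self_id || var_args == NULL)
        return 0;                          /* caller was not instrumented */
    const call_info *caller = var_args;
    var_callee = NULL;                     /* consume the data exactly once */
    var_args   = NULL;
    return argcheck_compare(caller, callee, proc_name);
}
```

For the mySub3 example, the caller side would fill a `call_info` with `by_value = false` for the first argument and the callee side one with `by_value = true`, so `argcheck_on_entry` would report one mismatch. Returning a count instead of aborting is just a choice made here to keep the sketch easy to test; a real implementation would more likely stop with a run-time error.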

Fortran standards: https://gcc.gnu.org/wiki/GFortranStandards

That is something you will surely need as a reference when working on it, but if you do not know much Fortran, some Fortran tutorial will help more.

* * *

I hope it helps to give you a rough idea; if you need more, ask. (In particular, without knowing your background, it is difficult to link to the best-suited references.)

Cheers,

Tobias

-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company (Gesellschaft mit beschränkter Haftung); Managing directors: Thomas Heurung, Frank Thürauf; Registered office: München; Commercial register: Registergericht München, HRB 106955