Hi,

I’ve spent many hours going down a rabbit hole, namely that we emit incorrect 
types for C interoperable procedure parameters of type CHARACTER(C_CHAR), 
VALUE. The full details can be found here: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103828 I will try here to be as 
brief as I can.

This bug manifests on the new target aarche64-apple-darwin, and there only if a 
CHARACTER(C_CHAR), VALUE argument is the 11th (or greater) argument to the 
function. Then, we would pass it as a 32-bit value, instead of an 8-bit value 
(as should a char). The front-end emits the wrong type information about the 
argument, but it turns out for all other targets this simply does not matter, 
due to their ABI.

Interoperable CHARACTER(C_CHAR), VALUE arguments are weird, because deep down 
they’re not actually character types (in the Fortran sense of the word), i.e. 
arrays, but scalar C char (i.e., integers!). So they’ve always been kind-of 
second class citizens in the front-end. A function was introduced in 2007 that 
basically patches their type on the fly: gfc_conv_scalar_char_value(). The 
problem is that function is simply wrong. It says (from comments):

      /* Modify the tree type for scalar character dummy arguments of bind(c)
        procedures if they are passed by value.  The tree type for them will
        be promoted to INTEGER_TYPE for the middle end, which appears to be
        what C would do with characters passed by-value.  The value attribute
         implies the dummy is a scalar.  */

The error is a misunderstanding of the C standard. char (and integer types 
smaller than int) are promoted to int for argument passing in unprototyped 
(old-school K&R) functions. In prototyped functions, they are most definitely 
not promoted. Therefore, the function is wrong.

More importantly, we can actually emit the right type much earlier in the 
front-end, and we already have a place to do so, in gfc_sym_type(). We already 
handle the result variables, but by-value arguments should have the same type, 
so the fix is easy.

In fact, once we emit the right type in gfc_sym_type(), we actually never need 
to fix it up in gfc_conv_scalar_char_value(). Removing the hack, I decided to 
leave it as a series of assertions, because we have seen in the past that this 
case is difficult to catch, and can easily regress due to changes elsewhere in 
the front-end.

Then, once the hack is removed, it becomes clear that the two callers of 
gfc_conv_scalar_char_value() expected different things from it: it’s called 
from generate_local_decl() to hack the sym->backend_decl; and it’s called from 
gfc_conv_procedure_call() for its true purpose, as a real conv_* function. For 
the first case, we can inline the checks (which have replaced the hack) into 
the caller in generate_local_decl(). Once only the second use case remains, we 
can make the function local to that file (make static, remove gfc_ prefix).


The patch comes with a couple of extra testcases, exercising the passing of 
char by value, and as integer, and their interoperability.
It was regtested on x86_64-pc-gnu-linux, on aarch64-apple-darwin (because its 
ABI is picky).

OK to commit?

FX

Attachment: 0001-Fortran-Emit-correct-types-for-CHARACTER-C_CHAR-VALU.patch
Description: Binary data

Reply via email to