[Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-17 Thread Tobias Burnus

Short version:

This fixes potential and real bugs related to 'len=:' character variables
as for the length/byte size an old/saved expression is used instead of
the current value. - That's fine but not for allocatable/pointer with 'len=:'.


Main part of the patch: Strip the SAVE_EXPR from the size expression:

  if (len && deferred && TREE_CODE (TYPE_SIZE (type)) == SAVE_EXPR)
{
  gcc_assert (TREE_CODE (TYPE_SIZE_UNIT (type)) == SAVE_EXPR);
  TYPE_SIZE (type) = TREE_OPERAND (TYPE_SIZE (type), 0);
  TYPE_SIZE_UNIT (type) = TREE_OPERAND (TYPE_SIZE_UNIT (type), 0);
}


OK for mainline?

* * *

Long version:

BACKGROUND:


(A) VLA / EXPLICIT-SIZE ARRAYS + LEN= STRINGS


C knows something like VLA (variable length arrays), likewise Fortran
knows explicit size array and character length where the length/size
depends on an variable set before the current scoping unit. Examples:

void f(int N)
{
  int vla[N*5];
}

subroutine foo(n)
  integer :: n
  integer :: array(n*5)
  integer :: my_len
  ...
  my_len = 5
  block
character(len=my_len, kind=4) :: str

my_len = 99
print *, len(str)  ! still shows 5 - not 99
  end block
end


In all cases, the size / length is not known at compile time but it won't 
change.
Thus, expressions like (pseudo code)
   byte_size = n * 5 * sizeof(integer)
can be saved and re-used and do not have to be re-calculated every time the
   TYPE_SIZE or TYPE_UNIT_SIZE
is used.

In particular, the 'my_len' example shows that just using the current value of
'my_len' would be wrong as it can be overridden.


* * *


(B) DEFERRED-LENGTH STRINGS ('character(len=:), pointer/allocatable')


But with deferred-length strings, such as

  character(len=:), pointer :: pstr(:)
  ...
  allocate(character(len=2) :: pstr(5))
  ...
  !$omp target enter data map(alloc: pstr(2:5))


this leads to code like:

  integer(kind=8) .pstr;
  struct array01_character(kind=1) pstr;

  D.4302 = (sizetype) NON_LVALUE_EXPR <.pstr>;
  pstr.dtype = {.elem_len=(unsigned long) .pstr, .rank=1, .type=6};
...
.pstr = 2;  // during allocation/pointer assignment
...
  parm.1.data = pstr.data + (sizetype) ((~pstr.dim[0].lbound * D.4287)
* (integer(kind=8)) SAVE_EXPR );

And here D.4302 is the pre-calculated value instead of the current value,
which can be either 0 or some random value.


Such code happens when using code like:
elemsz = TYPE_SIZE_UNIT (gfc_get_element_type (type));

Of course, there are various ways to avoid this – like obtaining somehow the
string length directly - either from the expression or from the type such as
  TYPE_DOMAIN (type)
but it can easily go wrong.

 * * *

IDEAL SOLUTION:

I think from the middle-end perspective, we should do:
  build_range_type (type, 0, NULL_TREE)
leaving the upper bound unspecified – which should also help with
type-is-the-same middle-end analysis.


PRACTICAL SOLUTION:

But as code like TYPE_SIZE_UNIT is very widely used - and we currently lack
a place to store the tree decl for the length, I propose the following as 
discussed
with Jakub yesterday:

We just remove SAVE_EXPR after generating the type.

Side note: In some cases, the type is already constructed with len = NULL; I 
have not
checked when. In that case, using TYPE_SIZE will fail at compile time.
(That remains unchanged by this patch.)

 * * *

OK for mainline?

Tobias

 * * *

PS: I have no real opinion whether we want to have any backports, thoughts?


PPS: I don't have any real example I want to add as most cases have been 
work-around
fixed in the meanwhile. If you want to test it, the following fails. I intent 
to add
an extended tests as part of a larger follow-up patch which fixes more OpenMP 
issues:

  character(len=:), pointer :: pstr(:)
  allocate(character(len=2) :: pstr(5))
  !$omp target enter data map(alloc: pstr(2:5))
end


Compile with -fopenmp -fdump-tree-original (or a later dump).


BEFORE the patch:

  integer(kind=8) .pstr;
  ...
  D.4291 = (sizetype) NON_LVALUE_EXPR <.pstr>;
  pstr.dtype = {.elem_len=(unsigned long) .pstr, .rank=1, .type=6};
  ...
.pstr = 2;
  ...
pstr.data = __builtin_malloc (10);
  ...
  parm.1.data = pstr.data + (sizetype) (((2 - pstr.dim[0].lbound) * D.4287)
* (integer(kind=8)) SAVE_EXPR );

AFTER the patch:
  .
  parm.1.data = pstr.data + (sizetype) (((2 - pstr.dim[0].lbound) * D.4287)
* NON_LVALUE_EXPR <.pstr>);
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
Fortran: Avoid SAVE_EXPR for deferred-len char types

Using TYPE_SIZE/TYPE_SIZE_UNIT with deferred-length character variables,
i.e. 'character(len=:), allocatable/pointer' used a SAVE_EXPR, i.e. the
value on entry to the scope instead of the lates

Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-17 Thread Steve Kargl via Fortran
On Fri, Feb 17, 2023 at 12:13:52PM +0100, Tobias Burnus wrote:
> Short version:
> 
> This fixes potential and real bugs related to 'len=:' character variables
> as for the length/byte size an old/saved expression is used instead of
> the current value. - That's fine but not for allocatable/pointer with 'len=:'.
> 
> 
> Main part of the patch: Strip the SAVE_EXPR from the size expression:
> 
>   if (len && deferred && TREE_CODE (TYPE_SIZE (type)) == SAVE_EXPR)
> {
>   gcc_assert (TREE_CODE (TYPE_SIZE_UNIT (type)) == SAVE_EXPR);
>   TYPE_SIZE (type) = TREE_OPERAND (TYPE_SIZE (type), 0);
>   TYPE_SIZE_UNIT (type) = TREE_OPERAND (TYPE_SIZE_UNIT (type), 0);
> }
> 
> 
> OK for mainline?

Short version: no.

> 
> * * *
> 
> Long version:
> 
> BACKGROUND:
> 
> 
> (A) VLA / EXPLICIT-SIZE ARRAYS + LEN= STRINGS
> 
> 
> C knows something like VLA (variable length arrays), likewise Fortran
> knows explicit size array and character length where the length/size
> depends on an variable set before the current scoping unit. Examples:
> 
> void f(int N)
> {
>   int vla[N*5];
> }
> 
> subroutine foo(n)
>   integer :: n
>   integer :: array(n*5)
>   integer :: my_len
>   ...
>   my_len = 5
>   block
> character(len=my_len, kind=4) :: str
> 
> my_len = 99
> print *, len(str)  ! still shows 5 - not 99
>   end block
> end

Are you sure about the above comment?  At the time 
that str is declared, it is given a kind type parameter
of len=5 and kind=4.  After changing my_len to 99
the kind type parameter of str does not change.


8.3 Automatic data objects

If a type parameter in a declaration-type-spec or in a char-length
in an entity-decl for a local variable of a subprogram or BLOCK
construct is defined by an expression that is not a constant
expression, the type parameter value is established on entry to a
procedure defined by the subprogram, or on execution of the BLOCK
statement, and is not affected by any redefinition or undefinition
of the variables in the expression during execution of the
procedure or BLOCK construct.

-- 
steve