Re: [Patch, fortran] PR113363 - ICE on ASSOCIATE and unlimited polymorphic function
Hi Harald, Please find attached my resubmission for pr113363. The changes are as follows: (i) The chunk in gfc_conv_procedure_call is new. This was the source of one of the memory leaks; (ii) The incorporation of the _len field in trans_class_assignment was done for the pr84006 patch; (iii) The source of all the invalid memory accesses and so on was down to the use of realloc. I tried all sorts of workarounds such as testing the vptrs and the sizes but only free followed by malloc worked. I have no idea at all why this is the case; and (iv) I took account of your remarks about the chunk in trans-array.cc by removing it and that the chunk in trans-stmt.cc would leak frontend memory. OK for mainline (and -14 branch after a few-weeks)? Regards Paul Fortran: Fix wrong code in unlimited polymorphic assignment [PR113363] 2024-05-12 Paul Thomas gcc/fortran PR fortran/113363 * trans-array.cc (gfc_array_init_size): Use the expr3 dtype so that the correct element size is used. * trans-expr.cc (gfc_conv_procedure_call): Remove restriction that ss and ss->loop be present for the finalization of class array function results. (trans_class_assignment): Use free and malloc, rather than realloc, for character expressions assigned to unlimited poly entities. * trans-stmt.cc (gfc_trans_allocate): Build a correct rhs for the assignment of an unlimited polymorphic 'source'. gcc/testsuite/ PR fortran/113363 * gfortran.dg/pr113363.f90: New test. > > The first chunk in trans-array.cc ensures that the array dtype is set to > > the source dtype. The second chunk ensures that the lhs _len field does > not > > default to zero and so is specific to dynamic types of character. > > > > Why the two gfc_copy_ref? valgrind pointed my to the tail > of gfc_copy_ref which already has: > >dest->next = gfc_copy_ref (src->next); > > so this looks redundant and leaks frontend memory? > > *** > > Playing with the testcase, I find several invalid writes with > valgrind, or a heap buffer overflow with -fsanitize=address . > > > diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 7ec33fb1598..c5b56f4e273 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -5957,6 +5957,11 @@ gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset, tmp = gfc_conv_descriptor_dtype (descriptor); gfc_add_modify (pblock, tmp, gfc_get_dtype_rank_type (rank, type)); } + else if (expr3_desc && GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (expr3_desc))) +{ + tmp = gfc_conv_descriptor_dtype (descriptor); + gfc_add_modify (pblock, tmp, gfc_conv_descriptor_dtype (expr3_desc)); +} else { tmp = gfc_conv_descriptor_dtype (descriptor); diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index 4590aa6edb4..e315e2d3370 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -8245,8 +8245,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym, call the finalization function of the temporary. Note that the nullification of allocatable components needed by the result is done in gfc_trans_assignment_1. */ - if (expr && ((gfc_is_class_array_function (expr) - && se->ss && se->ss->loop) + if (expr && (gfc_is_class_array_function (expr) || gfc_is_alloc_class_scalar_function (expr)) && se->expr && GFC_CLASS_TYPE_P (TREE_TYPE (se->expr)) && expr->must_finalize) @@ -12028,18 +12027,25 @@ trans_class_assignment (stmtblock_t *block, gfc_expr *lhs, gfc_expr *rhs, /* Reallocate if dynamic types are different. */ gfc_init_block (&re_alloc); - tmp = fold_convert (pvoid_type_node, class_han); - re = build_call_expr_loc (input_location, -builtin_decl_explicit (BUILT_IN_REALLOC), 2, -tmp, size); - re = fold_build2_loc (input_location, MODIFY_EXPR, TREE_TYPE (tmp), tmp, - re); - tmp = fold_build2_loc (input_location, NE_EXPR, - logical_type_node, rhs_vptr, old_vptr); - re = fold_build3_loc (input_location, COND_EXPR, void_type_node, - tmp, re, build_empty_stmt (input_location)); - gfc_add_expr_to_block (&re_alloc, re); - + if (UNLIMITED_POLY (lhs) && rhs->ts.type == BT_CHARACTER) + { + gfc_add_expr_to_block (&re_alloc, gfc_call_free (class_han)); + gfc_allocate_using_malloc (&re_alloc, class_han, size, NULL_TREE); + } + else + { + tmp = fold_convert (pvoid_type_node, class_han); + re = build_call_expr_loc (input_location, +builtin_decl_explicit (BUILT_IN_REALLOC), +2, tmp, size); + re = fold_build2_loc (input_location, MODIFY_EXPR, TREE_TYPE (tmp), +tmp, re); + tmp = fold_build2_loc (input_location, NE_EXPR, + logical_type_node, rhs_vptr, old_vptr); + re = fold_build3_loc (input_location, COND_EXPR, void_type_node, +tmp, re, build_empty_stmt (input_location)); + gfc_add_expr_to_block (&re_alloc, re); + } tree realloc_expr = lhs->ts.type == BT_CLASS ? gfc_finish_block
[PATCH] fortran: Assume there is no cyclic reference with submodule symbols [PR99798]
Hello, Here is my final patch to fix the ICE of PR99798. It's maybe overly verbose with comments, but the memory management is hopefully clarified. I tested this with a full fortran regression test on x86_64-linux and a manual check with valgrind on the testcase. OK for master? -- 8< -- This prevents a premature release of memory with procedure symbols from submodules, causing random compiler crashes. The problem is a fragile detection of cyclic references, which can match with procedures host-associated from a module in submodules, in cases where it shouldn't. The formal namespace is released, and with it the dummy arguments symbols of the procedure. But there is no cyclic reference, so the procedure symbol itself is not released and remains, with pointers to its dummy arguments now dangling. The fix adds a condition to avoid the case, and refactors to a new predicate by the way. Part of the original condition is also removed, for lack of a reason to keep it. PR fortran/99798 gcc/fortran/ChangeLog: * symbol.cc (gfc_release_symbol): Move the condition guarding the handling cyclic references... (cyclic_reference_break_needed): ... here as a new predicate. Remove superfluous parts. Add a condition preventing any premature release with submodule symbols. gcc/testsuite/ChangeLog: * gfortran.dg/submodule_33.f08: New test. --- gcc/fortran/symbol.cc | 54 +- gcc/testsuite/gfortran.dg/submodule_33.f08 | 20 2 files changed, 72 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/submodule_33.f08 diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc index 8f7deac1d1e..0a1646def67 100644 --- a/gcc/fortran/symbol.cc +++ b/gcc/fortran/symbol.cc @@ -3179,6 +3179,57 @@ gfc_free_symbol (gfc_symbol *&sym) } +/* Returns true if the symbol SYM has, through its FORMAL_NS field, a reference + to itself which should be eliminated for the symbol memory to be released + via normal reference counting. + + The implementation is crucial as it controls the proper release of symbols, + especially (contained) procedure symbols, which can represent a lot of memory + through the namespace of their body. + + We try to avoid freeing too much memory (causing dangling pointers), to not + leak too much (wasting memory), and to avoid expensive walks of the symbol + tree (which would be the correct way to check for a cycle). */ + +bool +cyclic_reference_break_needed (gfc_symbol *sym) +{ + /* Normal symbols don't reference themselves. */ + if (sym->formal_ns == nullptr) +return false; + + /* Procedures at the root of the file do have a self reference, but they don't + have a reference in a parent namespace preventing the release of the + procedure namespace, so they can use the normal reference counting. */ + if (sym->formal_ns == sym->ns) +return false; + + /* If sym->refs == 1, we can use normal reference counting. If sym->refs > 2, + the symbol won't be freed anyway, with or without cyclic reference. */ + if (sym->refs != 2) +return false; + + /* Procedure symbols host-associated from a module in submodules are special, + because the namespace of the procedure block in the submodule is different + from the FORMAL_NS namespace generated by host-association. So there are + two different namespaces representing the same procedure namespace. As + FORMAL_NS comes from host-association, which only imports symbols visible + from the outside (dummy arguments basically), we can assume there is no + self reference through FORMAL_NS in that case. */ + if (sym->attr.host_assoc && sym->attr.used_in_submodule) +return false; + + /* We can assume that contained procedures have cyclic references, because + the symbol of the procedure itself is accessible in the procedure body + namespace. So we assume that symbols with a formal namespace different + from the declaration namespace and two references, one of which is about + to be removed, are procedures with just the self reference left. At this + point, the symbol SYM matches that pattern, so we return true here to + permit the release of SYM. */ + return true; +} + + /* Decrease the reference counter and free memory when we reach zero. Returns true if the symbol has been freed, false otherwise. */ @@ -3188,8 +3239,7 @@ gfc_release_symbol (gfc_symbol *&sym) if (sym == NULL) return false; - if (sym->formal_ns != NULL && sym->refs == 2 && sym->formal_ns != sym->ns - && (!sym->attr.entry || !sym->module)) + if (cyclic_reference_break_needed (sym)) { /* As formal_ns contains a reference to sym, delete formal_ns just before the deletion of sym. */ diff --git a/gcc/testsuite/gfortran.dg/submodule_33.f08 b/gcc/testsuite/gfortran.dg/submodule_33.f08 new file mode 100644 index 000..b61d750def1 -
Re: [PATCH] fortran: Assume there is no cyclic reference with submodule symbols [PR99798]
Hi Mikael, That is an ingenious solution. Given the complexity, I think that the comments are well warranted. OK for master and, I would suggest, 14-branch after a few weeks. Thanks! Paul On Sun, 12 May 2024 at 14:16, Mikael Morin wrote: > Hello, > > Here is my final patch to fix the ICE of PR99798. > It's maybe overly verbose with comments, but the memory management is > hopefully clarified. > I tested this with a full fortran regression test on x86_64-linux and a > manual check with valgrind on the testcase. > OK for master? > > -- 8< -- > > This prevents a premature release of memory with procedure symbols from > submodules, causing random compiler crashes. > > The problem is a fragile detection of cyclic references, which can match > with procedures host-associated from a module in submodules, in cases > where it > shouldn't. The formal namespace is released, and with it the dummy > arguments > symbols of the procedure. But there is no cyclic reference, so the > procedure > symbol itself is not released and remains, with pointers to its dummy > arguments > now dangling. > > The fix adds a condition to avoid the case, and refactors to a new > predicate > by the way. Part of the original condition is also removed, for lack of a > reason to keep it. > > PR fortran/99798 > > gcc/fortran/ChangeLog: > > * symbol.cc (gfc_release_symbol): Move the condition guarding > the handling cyclic references... > (cyclic_reference_break_needed): ... here as a new predicate. > Remove superfluous parts. Add a condition preventing any premature > release with submodule symbols. > > gcc/testsuite/ChangeLog: > > * gfortran.dg/submodule_33.f08: New test. > --- > gcc/fortran/symbol.cc | 54 +- > gcc/testsuite/gfortran.dg/submodule_33.f08 | 20 > 2 files changed, 72 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gfortran.dg/submodule_33.f08 > > diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc > index 8f7deac1d1e..0a1646def67 100644 > --- a/gcc/fortran/symbol.cc > +++ b/gcc/fortran/symbol.cc > @@ -3179,6 +3179,57 @@ gfc_free_symbol (gfc_symbol *&sym) > } > > > +/* Returns true if the symbol SYM has, through its FORMAL_NS field, a > reference > + to itself which should be eliminated for the symbol memory to be > released > + via normal reference counting. > + > + The implementation is crucial as it controls the proper release of > symbols, > + especially (contained) procedure symbols, which can represent a lot of > memory > + through the namespace of their body. > + > + We try to avoid freeing too much memory (causing dangling pointers), > to not > + leak too much (wasting memory), and to avoid expensive walks of the > symbol > + tree (which would be the correct way to check for a cycle). */ > + > +bool > +cyclic_reference_break_needed (gfc_symbol *sym) > +{ > + /* Normal symbols don't reference themselves. */ > + if (sym->formal_ns == nullptr) > +return false; > + > + /* Procedures at the root of the file do have a self reference, but > they don't > + have a reference in a parent namespace preventing the release of the > + procedure namespace, so they can use the normal reference counting. > */ > + if (sym->formal_ns == sym->ns) > +return false; > + > + /* If sym->refs == 1, we can use normal reference counting. If > sym->refs > 2, > + the symbol won't be freed anyway, with or without cyclic reference. > */ > + if (sym->refs != 2) > +return false; > + > + /* Procedure symbols host-associated from a module in submodules are > special, > + because the namespace of the procedure block in the submodule is > different > + from the FORMAL_NS namespace generated by host-association. So > there are > + two different namespaces representing the same procedure namespace. > As > + FORMAL_NS comes from host-association, which only imports symbols > visible > + from the outside (dummy arguments basically), we can assume there is > no > + self reference through FORMAL_NS in that case. */ > + if (sym->attr.host_assoc && sym->attr.used_in_submodule) > +return false; > + > + /* We can assume that contained procedures have cyclic references, > because > + the symbol of the procedure itself is accessible in the procedure > body > + namespace. So we assume that symbols with a formal namespace > different > + from the declaration namespace and two references, one of which is > about > + to be removed, are procedures with just the self reference left. At > this > + point, the symbol SYM matches that pattern, so we return true here to > + permit the release of SYM. */ > + return true; > +} > + > + > /* Decrease the reference counter and free memory when we reach zero. > Returns true if the symbol has been freed, false otherwise. */ > > @@ -3188,8 +3239,7 @@ gfc_release_symbol (gfc_symbol *&sy
[PATCH, committed] Fortran: fix frontend memleak
Dear all, the attached obvious patch fixes a frontend memleak that was introduced recently, and which shows up when checking for inquiry references. I came across it when working on pr115039. Committed after regtesting as r15-391-g13b6ac4ebd04f0. Thanks, Harald From 13b6ac4ebd04f0703d92828c9268b0b216890b0d Mon Sep 17 00:00:00 2001 From: Harald Anlauf Date: Sun, 12 May 2024 21:48:03 +0200 Subject: [PATCH] Fortran: fix frontend memleak gcc/fortran/ChangeLog: * primary.cc (gfc_match_varspec): Replace 'ref' argument to is_inquiry_ref() by NULL when the result is not needed to avoid a memleak. --- gcc/fortran/primary.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/fortran/primary.cc b/gcc/fortran/primary.cc index 606e84432be..8e7833769a8 100644 --- a/gcc/fortran/primary.cc +++ b/gcc/fortran/primary.cc @@ -2250,7 +2250,7 @@ gfc_match_varspec (gfc_expr *primary, int equiv_flag, bool sub_flag, can be found. If this was an inquiry reference with the same name as a derived component and the associate-name type is not derived or class, this is fixed up in 'gfc_fixup_inferred_type_refs'. */ - if (mm == MATCH_YES && is_inquiry_ref (name, &tmp) + if (mm == MATCH_YES && is_inquiry_ref (name, NULL) && !(sym->ts.type == BT_UNKNOWN && gfc_find_derived_types (sym, gfc_current_ns, name))) inquiry = true; -- 2.35.3
Re: [Patch, fortran] PR113363 - ICE on ASSOCIATE and unlimited polymorphic function
Hi Paul, this looks all good now, and is OK for mainline as well as backporting! *** While playing with the testcase, I found 3 remaining smaller issues that are pre-existing, so they should not delay your present work. To make it clear: these are not regressions. When "maliciously" perturbing the testcase by adding parentheses in the right places, I see the following: Replacing associate (var => foo ()) ! OK after r14-9489-g3fd46d859cda10 by associate (var => (foo ())) gives an ICE here with 14-branch and 15-mainline. Similarly replacing allocate (y, source = x(1)) ! Gave zero length here by allocate (y, source = (x(1))) Furthermore, replacing allocate(x, source = foo ()) by allocate(x, source = (foo ())) gives a runtime segfault with both 14-branch and 15-mainline. So this is something for another day... Thanks for the patch! Harald Am 12.05.24 um 13:27 schrieb Paul Richard Thomas: Hi Harald, Please find attached my resubmission for pr113363. The changes are as follows: (i) The chunk in gfc_conv_procedure_call is new. This was the source of one of the memory leaks; (ii) The incorporation of the _len field in trans_class_assignment was done for the pr84006 patch; (iii) The source of all the invalid memory accesses and so on was down to the use of realloc. I tried all sorts of workarounds such as testing the vptrs and the sizes but only free followed by malloc worked. I have no idea at all why this is the case; and (iv) I took account of your remarks about the chunk in trans-array.cc by removing it and that the chunk in trans-stmt.cc would leak frontend memory. OK for mainline (and -14 branch after a few-weeks)? Regards Paul Fortran: Fix wrong code in unlimited polymorphic assignment [PR113363] 2024-05-12 Paul Thomas gcc/fortran PR fortran/113363 * trans-array.cc (gfc_array_init_size): Use the expr3 dtype so that the correct element size is used. * trans-expr.cc (gfc_conv_procedure_call): Remove restriction that ss and ss->loop be present for the finalization of class array function results. (trans_class_assignment): Use free and malloc, rather than realloc, for character expressions assigned to unlimited poly entities. * trans-stmt.cc (gfc_trans_allocate): Build a correct rhs for the assignment of an unlimited polymorphic 'source'. gcc/testsuite/ PR fortran/113363 * gfortran.dg/pr113363.f90: New test. The first chunk in trans-array.cc ensures that the array dtype is set to the source dtype. The second chunk ensures that the lhs _len field does not default to zero and so is specific to dynamic types of character. Why the two gfc_copy_ref? valgrind pointed my to the tail of gfc_copy_ref which already has: dest->next = gfc_copy_ref (src->next); so this looks redundant and leaks frontend memory? *** Playing with the testcase, I find several invalid writes with valgrind, or a heap buffer overflow with -fsanitize=address .