On Thu, 13 Feb 2025, Jakub Jelinek wrote:
> Hi!
>
> The following testcase ICEs, because we have multiple levels of
> DECL_VALUE_EXPR VAR_DECLs:
> character(kind=1) id_string[1:.id_string] [value-expr: *id_string.55];
> character(kind=1)[1:.id_string] * id_string.55 [value-expr:
> FRAME.107.id_string.55];
> integer(kind=8) .id_string [value-expr: FRAME.107..id_string];
> id_string is the user variable mentioned in BLOCK_VARS, it has
> DECL_VALUE_EXPR because it is a VLA, id_string.55 is a temporary created by
> gimplify_vla_decl as the address that points to the start of the VLA, what
> is normally used in the IL to access it. But as this artificial var is then
> used inside of a nested function, tree-nested.cc adds DECL_VALUE_EXPR to it
> too and moves the actual value into the FRAME.107 object's member.
> Now, remove_unused_locals removes id_string.55 (and various other VAR_DECLs)
> from cfun->local_decls, simply because it is not mentioned in the IL at all
> (neither is id_string itself, but that is kept in BLOCK_VARS as it has
> DECL_VALUE_EXPR). So, after this point, id_string.55 tree isn't referenced
> from
> anywhere but id_string's DECL_VALUE_EXPR. Next GC collection is triggered,
> and we are unlucky enough that in the value_expr_for_decl hash table
> (underlying hash map for DECL_VALUE_EXPR) the id_string.55 entry comes
> before the id_string entry. id_string is ggc_marked_p because it is
> referenced from BLOCK_VARS, but id_string.55 is not, as we don't mark
> DECL_VALUE_EXPR anywhere but by gt_cleare_cache on value_expr_for_decl.
> But gt_cleare_cache does two things, it calls clear_slots on entries
> where the key is not ggc_marked_p (so the id_string.55 mapping to
> FRAME.107.id_string.55 is lost and DECL_VALUE_EXPR (id_string.55) becomes
> NULL) but then later we see id_string entry, which is ggc_marked_p, so mark
> the whole hash table entry, which sets ggc_set_mark on id_string.55. But
> at this point its DECL_VALUE_EXPR is lost.
> Later during dwarf2out.cc we want to emit DW_AT_location for id_string, see
> it has DECL_VALUE_EXPR, so emit it as indirection of id_string.55 for which
> we again lookup DECL_VALUE_EXPR as it has DECL_HAS_VALUE_EXPR_P, but as it
> is NULL, we ICE, instead of finding it is a subobject of FRAME.107 for which
> we can find its stack location.
>
> Now, as can be seen in the PR, I've tried to tweak tree-ssa-live.cc so that
> it would keep id_string.55 in cfun->local_decls; that prohibits it from
> the DECL_VALUE_EXPR of it being GC until expansion, but then we shrink and
> free cfun->local_decls completely and so GC at that point still can throw
> it away.
>
> The following patch adds an extension to the GTY ((cache)) option, before
> calling the gt_cleare_cache on some hash table by specifying
> GTY ((cache ("somefn"))) it calls somefn on that hash table as well.
> And this extra hook can do any additional ggc_set_mark needed so that
> gt_cleare_cache preserves everything that is actually needed and throws
> away the rest.
>
> In order to make it just 2 pass rather than up to n passes - (if we had
> say
> id1 -> something, id2 -> x(id1), id3 -> x(id2), id4 -> x(id3), id5 -> x(id4)
> in the value_expr_for_decl hash table in that order (where idN are VAR_DECLs
> with DECL_HAS_VALUE_EXPR_P, id5 is the only one mentioned from outside and
> idN -> X stands for idN having DECL_VALUE_EXPR X, something for some
> arbitrary tree and x(idN) for some arbitrary tree which mentions idN
> variable) and in each pass just marked the to part of entries with
> ggc_marked_p base.from we'd need to repeat until we don't mark anything)
> the patch calls walk_tree on DECL_VALUE_EXPR of the marked trees and if it
> finds yet unmarked tree, it marks it and walks its DECL_VALUE_EXPR as well
> the same way.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
So what this basically does is ensure we mark DECL_VALUE_EXPR when
VAR is marked which isn't done when marking a tree node.
That you special-case the hashtable walker is a workaround for
us not being able to say
struct GTY((mark_extra_stuff)) tree_decl_with_vis {
on 'tree' (or specifically the structs for a VAR_DECL). And that we
rely on gengtype producing the 'tree' marker. So we rely on the
hashtable keeping referenced trees live.
OK.
Thanks,
Richard.
> 2025-02-13 Jakub Jelinek <[email protected]>
>
> PR debug/118790
> * gengtype.cc (write_roots): Remove cache variable, instead break from
> the loop on match and test o for NULL. If the cache option has
> non-empty string argument, call the specified function with v->name
> as argument before calling gt_cleare_cache on it.
> * tree.cc (gt_value_expr_mark_2, gt_value_expr_mark_1,
> gt_value_expr_mark): New functions.
> (value_expr_for_decl): Use GTY ((cache ("gt_value_expr_mark"))) rather
> than just GTY ((cache)).
> * doc/gty.texi (cache): Document optional argument of cache option.
>
> * gfortran.dg/gomp/pr118790.f90: New test.
>
> --- gcc/gengtype.cc.jj 2025-01-02 11:23:02.613710956 +0100
> +++ gcc/gengtype.cc 2025-02-12 17:15:08.560424329 +0100
> @@ -4656,13 +4656,12 @@ write_roots (pair_p variables, bool emit
> outf_p f = get_output_file_with_visibility (CONST_CAST (input_file*,
> v->line.file));
> struct flist *fli;
> - bool cache = false;
> options_p o;
>
> for (o = v->opt; o; o = o->next)
> if (strcmp (o->name, "cache") == 0)
> - cache = true;
> - if (!cache)
> + break;
> + if (!o)
> continue;
>
> for (fli = flp; fli; fli = fli->next)
> @@ -4677,6 +4676,8 @@ write_roots (pair_p variables, bool emit
> oprintf (f, " ()\n{\n");
> }
>
> + if (o->kind == OPTION_STRING && o->info.string && o->info.string[0])
> + oprintf (f, " %s (%s);\n", o->info.string, v->name);
> oprintf (f, " gt_cleare_cache (%s);\n", v->name);
> }
>
> --- gcc/tree.cc.jj 2025-01-20 10:26:42.216048422 +0100
> +++ gcc/tree.cc 2025-02-12 18:05:44.368264018 +0100
> @@ -211,13 +211,61 @@ struct cl_option_hasher : ggc_cache_ptr_
>
> static GTY ((cache)) hash_table<cl_option_hasher> *cl_option_hash_table;
>
> +/* Callback called through walk_tree_1 to discover DECL_HAS_VALUE_EXPR_P
> + VAR_DECLs which weren't marked yet, in that case marks them and
> + walks their DECL_VALUE_EXPR expressions. */
> +
> +static tree
> +gt_value_expr_mark_2 (tree *tp, int *, void *data)
> +{
> + tree t = *tp;
> + if (VAR_P (t) && DECL_HAS_VALUE_EXPR_P (t) && !ggc_set_mark (t))
> + {
> + tree dve = DECL_VALUE_EXPR (t);
> + walk_tree_1 (&dve, gt_value_expr_mark_2, data,
> + (hash_set<tree> *) data, NULL);
> + }
> + return NULL_TREE;
> +}
> +
> +/* Callback called through traverse_noresize on the
> + value_expr_for_decl hash table. */
> +
> +int
> +gt_value_expr_mark_1 (tree_decl_map **e, hash_set<tree> *pset)
> +{
> + if (ggc_marked_p ((*e)->base.from))
> + walk_tree_1 (&(*e)->to, gt_value_expr_mark_2, pset, pset, NULL);
> + return 1;
> +}
> +
> +/* The value_expr_for_decl hash table can have mappings for trees
> + which are only referenced from mappings of other trees in the
> + same table, see PR118790. Without this routine, gt_cleare_cache
> + could clear hash table slot of a tree which isn't marked but
> + will be marked when processing later hash table slot of another
> + tree which is marked. This function marks with the above
> + helpers marks all the not yet marked DECL_HAS_VALUE_EXPR_P
> + VAR_DECLs mentioned in DECL_VALUE_EXPR expressions of marked
> + trees and in that case also recurses on their DECL_VALUE_EXPR. */
> +
> +void
> +gt_value_expr_mark (hash_table<tree_decl_map_cache_hasher> *h)
> +{
> + if (!h)
> + return;
> +
> + hash_set<tree> pset;
> + h->traverse_noresize<hash_set<tree> *, gt_value_expr_mark_1> (&pset);
> +}
> +
> /* General tree->tree mapping structure for use in hash tables. */
>
>
> static GTY ((cache))
> hash_table<tree_decl_map_cache_hasher> *debug_expr_for_decl;
>
> -static GTY ((cache))
> +static GTY ((cache ("gt_value_expr_mark")))
> hash_table<tree_decl_map_cache_hasher> *value_expr_for_decl;
>
> static GTY ((cache))
> --- gcc/doc/gty.texi.jj 2025-01-02 11:47:30.411219590 +0100
> +++ gcc/doc/gty.texi 2025-02-12 18:11:53.048139879 +0100
> @@ -313,6 +313,11 @@ Note that caches should generally use @c
> @code{cache} is only preferable if the value is impractical to
> recompute from the key when needed.
>
> +The @code{cache} option can have an optional argument, name of the function
> +which should be called before @samp{gt_cleare_cache}. This can be useful
> +if the hash table needs to be traversed and mark some pointers before
> +@samp{gt_cleare_cache} could clear slots in it.
> +
> @findex deletable
> @item deletable
>
> --- gcc/testsuite/gfortran.dg/gomp/pr118790.f90.jj 2025-02-12
> 18:19:14.799998043 +0100
> +++ gcc/testsuite/gfortran.dg/gomp/pr118790.f90 2025-02-12
> 18:18:43.606431737 +0100
> @@ -0,0 +1,182 @@
> +! PR debug/118790
> +! { dg-do compile }
> +! { dg-options "-O2 -g -fopenmp --param ggc-min-expand=0 --param
> ggc-min-heapsize=0" }
> +
> +module ec_args_mod
> + private
> + public :: ec_argc, ec_argv, ec_args
> + interface
> + function ec_argc() bind(c,name="ec_argc") result(argc)
> + end function
> + end interface
> +contains
> + function ec_argv(iarg) result(argv)
> + use, intrinsic :: iso_c_binding
> + character(len=:), allocatable :: argv
> + type(c_ptr), pointer :: argv_cptrs(:)
> + argv = to_string (argv_cptrs(iarg+1), 1024)
> + end function
> + subroutine ec_args()
> + use, intrinsic :: iso_c_binding
> + integer :: argc
> + type(c_ptr) :: argv(512)
> + if (ec_argc() == 0) then
> + call read_command_line(argc,argv)
> + end if
> + end subroutine
> + function to_string(cptr,maxlen) result(string)
> + use, intrinsic :: iso_c_binding
> + character(len=:), allocatable :: string
> + type(c_ptr) :: cptr
> + character(kind=c_char,len=1), pointer :: s(:)
> + call c_f_pointer (cptr, s, (/maxlen/))
> + do
> + if (s(i) == c_null_char) exit
> + i = i + 1
> + end do
> + nchars = i - 1
> + allocate (character(len=(nchars)) :: string)
> + do i=1,nchars
> + string(i:i) = s(i)
> + end do
> + end function
> + subroutine read_command_line(argc,argv)
> + use, intrinsic :: iso_c_binding
> + integer, parameter :: cmd_max_len = 1024 * 512
> + integer(c_int) :: argc
> + type(c_ptr) :: argv(:)
> + character(kind=c_char,len=1), save, target :: args(cmd_max_len)
> + character(kind=c_char,len=cmd_max_len), save, target :: cmd
> + character(kind=c_char,len=cmd_max_len) :: arg
> + integer(c_int) :: iarg, arglen, pos, ich, argpos
> + do ich=1,len(cmd)
> + if (cmd(ich:ich) == " ") then
> + cmd(ich:ich) = c_null_char
> + end if
> + end do
> + do iarg=1,argc
> + do ich=1,arglen
> + args(pos) = arg(ich:ich)
> + end do
> + args(pos) = c_null_char; pos = pos+1
> + argv(iarg+1) = c_loc(args(argpos))
> + end do
> + end subroutine
> +end module
> +module mpl_mpif
> + integer mpi_status_size
> +end module mpl_mpif
> +subroutine ec_meminfo(ku,cdstring,kcomm,kbarr,kiotask,kcall)
> + use mpl_mpif
> + interface
> + subroutine ec_pmon(energy,power)
> + end subroutine ec_pmon
> + end interface
> + character(len=*), intent(in) :: cdstring
> + integer :: ii,jj,i,j,k,myproc,nproc,len,error,nodenum,jid
> + integer :: tasksmall,nodehuge,memfree,cached,nfree
> + integer :: nnuma
> + integer,dimension(:),allocatable,save :: smallpage,hugepage
> + integer :: n18
> + integer,dimension(:,:),allocatable,save :: node, bucket
> + character(len=256) :: clstr
> + character(len=20) :: nodename,lastnode,clmaxnode
> + character(len=160) ::line
> + character(len=5+1+len(cdstring)) :: id_string
> + integer :: irecv_status(mpi_status_size)
> + logical :: llnocomm, llnohdr
> + logical, save :: llfirst_time = .true.
> + type ranknode_t
> + integer :: nodenum
> + integer :: pid
> + integer :: rank_world
> + integer :: rank
> + integer :: iorank
> + integer :: nodemaster
> + integer, allocatable :: coreids(:)
> + character(len=len(clstr)) :: str
> + end type
> + type (ranknode_t), allocatable, save :: rn(:)
> + integer, allocatable :: coreids(:)
> + character(len=64) :: clpfx
> + if (llfirst_time .and. .not. llnocomm) then
> + allocate(coreids(0:maxth-1))
> + coreids(:) = -1
> +!$omp parallel num_threads(maxth) shared(coreids) private(i,myth,icoreid)
> + do i=1,maxth
> + icoreid = ec_coreid()
> + myth = omp_get_thread_num()
> + coreids(myth) = icoreid
> + end do
> +!$omp end parallel
> + if (myproc == 0) then
> + call slash_proc
> + allocate(rn(0:nproc-1))
> + do i=0,nproc-1
> + rn(i)%nodenum = -1
> + if (i > 0) then
> + call mpi_recv(lastnode, len(lastnode), mpi_byte, i, itag, kcomm,
> irecv_status, error)
> + call check_error("from
> mpi_recv(lastnode)","/tmp/fiat/src/fiat/util/ec_meminfo.f90",258)
> + call mpi_comm_rank(mpi_comm_world,k,error)
> + rn(i)%rank = 0
> + rn(i)%str = cdstring
> + rn(i)%pid = ec_getpid()
> + end if
> + rn(i)%rank_world = k
> + rn(i)%iorank = iorank
> + rn(i)%nodemaster = 0
> + call check_error("from
> mpi_send(iam_nodemaster)","/tmp/fiat/src/fiat/util/ec_meminfo.f90",305)
> + end do
> + call mpi_send(clstr,len(clstr),mpi_byte,0,itag+5,kcomm,error)
> + call mpi_send(clstr,maxth,mpi_integer4,0,itag+6,kcomm,error)
> + call
> mpi_recv(lastnode,1,mpi_integer4,0,itag+7,kcomm,irecv_status,error)
> + end if
> + end if
> +contains
> + subroutine slash_proc
> + read(line(idx+iclkeylen-1:),*,err=99,end=98) node(:,0)
> +98 continue
> + do k=1,maxnuma-1
> + read(line(idx+iclkeylen-1:),*,err=99) node(0:n18-1,k)
> + end do
> +99 continue
> + close(502)
> + smallpage(:) = 0
> + do k=0,nnuma-1
> + do j=0,n18-1
> + end do
> + smallpage(k) = sum(bucket(0:8,k))/onemega
> + hugepage(k) = sum(bucket(9:n18-1,k))/onemega
> + end do
> + open(file="/proc/meminfo",unit=502,status="old",action="read",err=977)
> + do i=1,10
> + read(502,'(a)',err=988,end=988) line
> + if(line(1:7) == "memfree") then
> + read(line(9:80),*) memfree
> + else if(line(1:6) == "cached") then
> + read(line(8:80),*) cached
> + end if
> + end do
> +988 continue
> + close(502)
> +977 continue
> + memfree=memfree/1024
> + end subroutine slash_proc
> + subroutine prt_data(kun,knodenum,cdlastnode,kcall)
> + character(len=4096) :: clbuf
> + write(clbuf(ilen+1:),'(2x,2i8,3x,2f6.1,1x,i9,1x,i6,1x,a7,1x,a)')
> trim(id_string)
> + end subroutine prt_data
> + subroutine check_error(clwhat,srcfile,srcline)
> + character(len=*), intent(in) :: clwhat, srcfile
> + integer, intent(in) :: srcline
> + if (error /= 0) then
> + write(0,'(a,i0,1x,a,1x,"(",a,":",i0,")")') &
> + & clpfx(1:ipfxlen)//"## ec_meminfo error code
> =",error,clwhat,srcfile,srcline
> + call mpi_abort(kcomm,-1,error)
> + end if
> + end subroutine check_error
> + subroutine rnsort(kun)
> + do i=0,nproc-1
> + end do
> + end subroutine rnsort
> +end subroutine ec_meminfo
>
> Jakub
>
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)