On Thu, 13 Feb 2025, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs, because we have multiple levels of
> DECL_VALUE_EXPR VAR_DECLs:
>   character(kind=1) id_string[1:.id_string] [value-expr: *id_string.55];
>   character(kind=1)[1:.id_string] * id_string.55 [value-expr: 
> FRAME.107.id_string.55];
>   integer(kind=8) .id_string [value-expr: FRAME.107..id_string];
> id_string is the user variable mentioned in BLOCK_VARS, it has
> DECL_VALUE_EXPR because it is a VLA, id_string.55 is a temporary created by
> gimplify_vla_decl as the address that points to the start of the VLA, what
> is normally used in the IL to access it.  But as this artificial var is then
> used inside of a nested function, tree-nested.cc adds DECL_VALUE_EXPR to it
> too and moves the actual value into the FRAME.107 object's member.
> Now, remove_unused_locals removes id_string.55 (and various other VAR_DECLs)
> from cfun->local_decls, simply because it is not mentioned in the IL at all
> (neither is id_string itself, but that is kept in BLOCK_VARS as it has
> DECL_VALUE_EXPR).  So, after this point, id_string.55 tree isn't referenced 
> from
> anywhere but id_string's DECL_VALUE_EXPR.  Next GC collection is triggered,
> and we are unlucky enough that in the value_expr_for_decl hash table
> (underlying hash map for DECL_VALUE_EXPR) the id_string.55 entry comes
> before the id_string entry.  id_string is ggc_marked_p because it is
> referenced from BLOCK_VARS, but id_string.55 is not, as we don't mark
> DECL_VALUE_EXPR anywhere but by gt_cleare_cache on value_expr_for_decl.
> But gt_cleare_cache does two things, it calls clear_slots on entries
> where the key is not ggc_marked_p (so the id_string.55 mapping to
> FRAME.107.id_string.55 is lost and DECL_VALUE_EXPR (id_string.55) becomes
> NULL) but then later we see id_string entry, which is ggc_marked_p, so mark
> the whole hash table entry, which sets ggc_set_mark on id_string.55.  But
> at this point its DECL_VALUE_EXPR is lost.
> Later during dwarf2out.cc we want to emit DW_AT_location for id_string, see
> it has DECL_VALUE_EXPR, so emit it as indirection of id_string.55 for which
> we again lookup DECL_VALUE_EXPR as it has DECL_HAS_VALUE_EXPR_P, but as it
> is NULL, we ICE, instead of finding it is a subobject of FRAME.107 for which
> we can find its stack location.
> 
> Now, as can be seen in the PR, I've tried to tweak tree-ssa-live.cc so that
> it would keep id_string.55 in cfun->local_decls; that prohibits it from
> the DECL_VALUE_EXPR of it being GC until expansion, but then we shrink and
> free cfun->local_decls completely and so GC at that point still can throw
> it away.
> 
> The following patch adds an extension to the GTY ((cache)) option, before
> calling the gt_cleare_cache on some hash table by specifying
> GTY ((cache ("somefn"))) it calls somefn on that hash table as well.
> And this extra hook can do any additional ggc_set_mark needed so that
> gt_cleare_cache preserves everything that is actually needed and throws
> away the rest.
> 
> In order to make it just 2 pass rather than up to n passes - (if we had
> say
> id1 -> something, id2 -> x(id1), id3 -> x(id2), id4 -> x(id3), id5 -> x(id4)
> in the value_expr_for_decl hash table in that order (where idN are VAR_DECLs
> with DECL_HAS_VALUE_EXPR_P, id5 is the only one mentioned from outside and
> idN -> X stands for idN having DECL_VALUE_EXPR X, something for some
> arbitrary tree and x(idN) for some arbitrary tree which mentions idN
> variable) and in each pass just marked the to part of entries with
> ggc_marked_p base.from we'd need to repeat until we don't mark anything)
> the patch calls walk_tree on DECL_VALUE_EXPR of the marked trees and if it
> finds yet unmarked tree, it marks it and walks its DECL_VALUE_EXPR as well
> the same way.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

So what this basically does is ensure we mark DECL_VALUE_EXPR when
VAR is marked which isn't done when marking a tree node.

That you special-case the hashtable walker is a workaround for
us not being able to say

struct GTY((mark_extra_stuff)) tree_decl_with_vis {

on 'tree' (or specifically the structs for a VAR_DECL).  And that we
rely on gengtype producing the 'tree' marker.  So we rely on the
hashtable keeping referenced trees live.

OK.

Thanks,
Richard.

> 2025-02-13  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR debug/118790
>       * gengtype.cc (write_roots): Remove cache variable, instead break from
>       the loop on match and test o for NULL.  If the cache option has
>       non-empty string argument, call the specified function with v->name
>       as argument before calling gt_cleare_cache on it.
>       * tree.cc (gt_value_expr_mark_2, gt_value_expr_mark_1,
>       gt_value_expr_mark): New functions.
>       (value_expr_for_decl): Use GTY ((cache ("gt_value_expr_mark"))) rather
>       than just GTY ((cache)).
>       * doc/gty.texi (cache): Document optional argument of cache option.
> 
>       * gfortran.dg/gomp/pr118790.f90: New test.
> 
> --- gcc/gengtype.cc.jj        2025-01-02 11:23:02.613710956 +0100
> +++ gcc/gengtype.cc   2025-02-12 17:15:08.560424329 +0100
> @@ -4656,13 +4656,12 @@ write_roots (pair_p variables, bool emit
>        outf_p f = get_output_file_with_visibility (CONST_CAST (input_file*,
>                                                             v->line.file));
>        struct flist *fli;
> -      bool cache = false;
>        options_p o;
>  
>        for (o = v->opt; o; o = o->next)
>       if (strcmp (o->name, "cache") == 0)
> -       cache = true;
> -       if (!cache)
> +       break;
> +       if (!o)
>       continue;
>  
>        for (fli = flp; fli; fli = fli->next)
> @@ -4677,6 +4676,8 @@ write_roots (pair_p variables, bool emit
>         oprintf (f, " ()\n{\n");
>       }
>  
> +      if (o->kind == OPTION_STRING && o->info.string && o->info.string[0])
> +     oprintf (f, "  %s (%s);\n", o->info.string, v->name);
>        oprintf (f, "  gt_cleare_cache (%s);\n", v->name);
>      }
>  
> --- gcc/tree.cc.jj    2025-01-20 10:26:42.216048422 +0100
> +++ gcc/tree.cc       2025-02-12 18:05:44.368264018 +0100
> @@ -211,13 +211,61 @@ struct cl_option_hasher : ggc_cache_ptr_
>  
>  static GTY ((cache)) hash_table<cl_option_hasher> *cl_option_hash_table;
>  
> +/* Callback called through walk_tree_1 to discover DECL_HAS_VALUE_EXPR_P
> +   VAR_DECLs which weren't marked yet, in that case marks them and
> +   walks their DECL_VALUE_EXPR expressions.  */
> +
> +static tree
> +gt_value_expr_mark_2 (tree *tp, int *, void *data)
> +{
> +  tree t = *tp;
> +  if (VAR_P (t) && DECL_HAS_VALUE_EXPR_P (t) && !ggc_set_mark (t))
> +    {
> +      tree dve = DECL_VALUE_EXPR (t);
> +      walk_tree_1 (&dve, gt_value_expr_mark_2, data,
> +                (hash_set<tree> *) data, NULL);
> +    }
> +  return NULL_TREE;
> +}
> +
> +/* Callback called through traverse_noresize on the
> +   value_expr_for_decl hash table.  */
> +
> +int
> +gt_value_expr_mark_1 (tree_decl_map **e, hash_set<tree> *pset)
> +{
> +  if (ggc_marked_p ((*e)->base.from))
> +    walk_tree_1 (&(*e)->to, gt_value_expr_mark_2, pset, pset, NULL);
> +  return 1;
> +}
> +
> +/* The value_expr_for_decl hash table can have mappings for trees
> +   which are only referenced from mappings of other trees in the
> +   same table, see PR118790.  Without this routine, gt_cleare_cache
> +   could clear hash table slot of a tree which isn't marked but
> +   will be marked when processing later hash table slot of another
> +   tree which is marked.  This function marks with the above
> +   helpers marks all the not yet marked DECL_HAS_VALUE_EXPR_P
> +   VAR_DECLs mentioned in DECL_VALUE_EXPR expressions of marked
> +   trees and in that case also recurses on their DECL_VALUE_EXPR.  */
> +
> +void
> +gt_value_expr_mark (hash_table<tree_decl_map_cache_hasher> *h)
> +{
> +  if (!h)
> +    return;
> +
> +  hash_set<tree> pset;
> +  h->traverse_noresize<hash_set<tree> *, gt_value_expr_mark_1> (&pset);
> +}
> +
>  /* General tree->tree mapping  structure for use in hash tables.  */
>  
>  
>  static GTY ((cache))
>       hash_table<tree_decl_map_cache_hasher> *debug_expr_for_decl;
>  
> -static GTY ((cache))
> +static GTY ((cache ("gt_value_expr_mark")))
>       hash_table<tree_decl_map_cache_hasher> *value_expr_for_decl;
>  
>  static GTY ((cache))
> --- gcc/doc/gty.texi.jj       2025-01-02 11:47:30.411219590 +0100
> +++ gcc/doc/gty.texi  2025-02-12 18:11:53.048139879 +0100
> @@ -313,6 +313,11 @@ Note that caches should generally use @c
>  @code{cache} is only preferable if the value is impractical to
>  recompute from the key when needed.
>  
> +The @code{cache} option can have an optional argument, name of the function
> +which should be called before @samp{gt_cleare_cache}.  This can be useful
> +if the hash table needs to be traversed and mark some pointers before
> +@samp{gt_cleare_cache} could clear slots in it.
> +
>  @findex deletable
>  @item deletable
>  
> --- gcc/testsuite/gfortran.dg/gomp/pr118790.f90.jj    2025-02-12 
> 18:19:14.799998043 +0100
> +++ gcc/testsuite/gfortran.dg/gomp/pr118790.f90       2025-02-12 
> 18:18:43.606431737 +0100
> @@ -0,0 +1,182 @@
> +! PR debug/118790
> +! { dg-do compile }
> +! { dg-options "-O2 -g -fopenmp --param ggc-min-expand=0 --param 
> ggc-min-heapsize=0" }
> +
> +module ec_args_mod
> +  private
> +  public :: ec_argc, ec_argv, ec_args
> +  interface
> +    function ec_argc() bind(c,name="ec_argc") result(argc)
> +    end function
> +  end interface
> +contains
> +  function ec_argv(iarg) result(argv)
> +    use, intrinsic :: iso_c_binding
> +    character(len=:), allocatable :: argv
> +    type(c_ptr), pointer :: argv_cptrs(:)
> +    argv = to_string (argv_cptrs(iarg+1), 1024)
> +  end function
> +  subroutine ec_args()
> +    use, intrinsic :: iso_c_binding
> +    integer :: argc
> +    type(c_ptr) :: argv(512)
> +    if (ec_argc() == 0) then
> +      call read_command_line(argc,argv)
> +    end if
> +  end subroutine
> +  function to_string(cptr,maxlen) result(string)
> +    use, intrinsic :: iso_c_binding
> +    character(len=:), allocatable :: string
> +    type(c_ptr) :: cptr
> +    character(kind=c_char,len=1), pointer :: s(:)
> +    call c_f_pointer (cptr, s, (/maxlen/))
> +    do
> +      if (s(i) == c_null_char) exit
> +      i = i + 1
> +    end do
> +    nchars = i - 1
> +    allocate (character(len=(nchars)) :: string)
> +    do i=1,nchars
> +      string(i:i) = s(i)
> +    end do
> +  end function
> +  subroutine read_command_line(argc,argv)
> +    use, intrinsic :: iso_c_binding
> +    integer, parameter :: cmd_max_len = 1024 * 512
> +    integer(c_int) :: argc
> +    type(c_ptr) :: argv(:)
> +    character(kind=c_char,len=1), save, target :: args(cmd_max_len)
> +    character(kind=c_char,len=cmd_max_len), save, target :: cmd
> +    character(kind=c_char,len=cmd_max_len) :: arg
> +    integer(c_int) :: iarg, arglen, pos, ich, argpos
> +    do ich=1,len(cmd)
> +      if (cmd(ich:ich) == " ") then
> +        cmd(ich:ich) = c_null_char
> +      end if
> +    end do
> +    do iarg=1,argc
> +      do ich=1,arglen
> +        args(pos) = arg(ich:ich)
> +      end do
> +      args(pos) = c_null_char;  pos = pos+1
> +      argv(iarg+1) = c_loc(args(argpos))
> +    end do
> +  end subroutine
> +end module
> +module mpl_mpif
> +  integer mpi_status_size
> +end module mpl_mpif
> +subroutine ec_meminfo(ku,cdstring,kcomm,kbarr,kiotask,kcall)
> +  use mpl_mpif
> +  interface
> +    subroutine ec_pmon(energy,power)
> +    end subroutine ec_pmon
> +  end interface
> +  character(len=*), intent(in) :: cdstring
> +  integer :: ii,jj,i,j,k,myproc,nproc,len,error,nodenum,jid
> +  integer :: tasksmall,nodehuge,memfree,cached,nfree
> +  integer :: nnuma
> +  integer,dimension(:),allocatable,save :: smallpage,hugepage
> +  integer :: n18
> +  integer,dimension(:,:),allocatable,save :: node, bucket
> +  character(len=256) :: clstr
> +  character(len=20)  :: nodename,lastnode,clmaxnode
> +  character(len=160) ::line
> +  character(len=5+1+len(cdstring)) :: id_string
> +  integer :: irecv_status(mpi_status_size)
> +  logical :: llnocomm, llnohdr
> +  logical, save :: llfirst_time = .true.
> +  type ranknode_t
> +    integer :: nodenum
> +    integer :: pid
> +    integer :: rank_world
> +    integer :: rank
> +    integer :: iorank
> +    integer :: nodemaster
> +    integer, allocatable :: coreids(:)
> +    character(len=len(clstr)) :: str
> +  end type
> +  type (ranknode_t), allocatable, save :: rn(:)
> +  integer, allocatable :: coreids(:)
> +  character(len=64) :: clpfx
> +  if (llfirst_time .and. .not. llnocomm) then
> +    allocate(coreids(0:maxth-1))
> +    coreids(:) = -1
> +!$omp parallel num_threads(maxth) shared(coreids) private(i,myth,icoreid)
> +    do i=1,maxth
> +      icoreid = ec_coreid()
> +      myth = omp_get_thread_num()
> +      coreids(myth) = icoreid
> +    end do
> +!$omp end parallel
> +    if (myproc == 0) then
> +      call slash_proc
> +      allocate(rn(0:nproc-1))
> +      do i=0,nproc-1
> +        rn(i)%nodenum = -1
> +        if (i > 0) then
> +          call mpi_recv(lastnode, len(lastnode), mpi_byte, i, itag, kcomm, 
> irecv_status, error)
> +          call check_error("from 
> mpi_recv(lastnode)","/tmp/fiat/src/fiat/util/ec_meminfo.f90",258)
> +          call mpi_comm_rank(mpi_comm_world,k,error)
> +          rn(i)%rank = 0
> +          rn(i)%str = cdstring
> +          rn(i)%pid = ec_getpid()
> +        end if
> +        rn(i)%rank_world = k
> +        rn(i)%iorank = iorank
> +        rn(i)%nodemaster = 0
> +        call check_error("from 
> mpi_send(iam_nodemaster)","/tmp/fiat/src/fiat/util/ec_meminfo.f90",305)
> +      end do
> +      call mpi_send(clstr,len(clstr),mpi_byte,0,itag+5,kcomm,error)
> +      call mpi_send(clstr,maxth,mpi_integer4,0,itag+6,kcomm,error)
> +      call 
> mpi_recv(lastnode,1,mpi_integer4,0,itag+7,kcomm,irecv_status,error)
> +    end if
> +  end if
> +contains
> +  subroutine slash_proc
> +    read(line(idx+iclkeylen-1:),*,err=99,end=98) node(:,0)
> +98  continue
> +    do k=1,maxnuma-1
> +      read(line(idx+iclkeylen-1:),*,err=99) node(0:n18-1,k)
> +    end do
> +99  continue
> +    close(502)
> +    smallpage(:) = 0
> +    do k=0,nnuma-1
> +      do j=0,n18-1
> +      end do
> +      smallpage(k) = sum(bucket(0:8,k))/onemega
> +      hugepage(k) = sum(bucket(9:n18-1,k))/onemega
> +    end do
> +    open(file="/proc/meminfo",unit=502,status="old",action="read",err=977)
> +    do i=1,10
> +      read(502,'(a)',err=988,end=988) line
> +      if(line(1:7) == "memfree") then
> +        read(line(9:80),*) memfree
> +      else if(line(1:6) == "cached") then
> +        read(line(8:80),*) cached
> +      end if
> +    end do
> +988 continue
> +    close(502)
> +977 continue
> +    memfree=memfree/1024
> +  end subroutine slash_proc
> +  subroutine prt_data(kun,knodenum,cdlastnode,kcall)
> +    character(len=4096) :: clbuf
> +    write(clbuf(ilen+1:),'(2x,2i8,3x,2f6.1,1x,i9,1x,i6,1x,a7,1x,a)') 
> trim(id_string)
> +  end subroutine prt_data
> +  subroutine check_error(clwhat,srcfile,srcline)
> +    character(len=*), intent(in) :: clwhat, srcfile
> +    integer, intent(in) :: srcline
> +    if (error /= 0) then
> +      write(0,'(a,i0,1x,a,1x,"(",a,":",i0,")")') &
> +            & clpfx(1:ipfxlen)//"## ec_meminfo error code 
> =",error,clwhat,srcfile,srcline
> +      call mpi_abort(kcomm,-1,error)
> +    end if
> +  end subroutine check_error
> +  subroutine rnsort(kun)
> +    do i=0,nproc-1
> +    end do
> +  end subroutine rnsort
> +end subroutine ec_meminfo
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to