For the largest bucket (size==80), the size reduction is 20%. Not bad. David
On Wed, Sep 12, 2012 at 1:44 PM, Dehao Chen <de...@google.com> wrote: > Attached is the memory consumption report for a very large source > file. Looks like this patch actually reduced the memory consumption by > 2%. > > Dehao > > On Thu, Sep 13, 2012 at 1:18 AM, Xinliang David Li <davi...@google.com> wrote: >> On Wed, Sep 12, 2012 at 10:05 AM, Dehao Chen <de...@google.com> wrote: >>> There are two parts that needs memory management: >>> >>> 1. The BLOCK structure. This is managed by GC. I originally thought >>> that removing blocks from tree.gsbase would paralyze GC. This turned >>> out not to be a concern because DECL_INITIAL will still mark those >>> used tree nodes. This patch may decrease the memory consumption by >>> removing blocks from tree/gimple. However, as it makes more blocks >>> become used, they also increase the memory consumption. >> >> You mean when you also make the location table GC root. >> >> Can you share the mem-stats information for the large program with and >> without your patch? >> >> thanks, >> >> David >> >>> 2. The data structure in libcpp that maintains the hashtable for the >>> location->block mapping. This is relatively minor because for the >>> largest source I've seen, it only maintains less than 100K entries in >>> the array (less than 1M total memory consumption). However, as it is a >>> global data structure, it may make LTO unhappy. Honza is helping >>> testing the memory consumption on LTO (but we first need to make this >>> patch work for LTO). If the LTO result turns out ok, we probably don't >>> want to put these under GC because: 1. it'll make things much more >>> complicated. 2. using self managed memory is more efficient (as this >>> is frequently used in many passes). 3. not using GC actually saves >>> memory because even though the block is in the map, it can still be >>> GCed as soon as it's not reachable from DECL_INITIAL. >>> >>> I've tested this on some very large C++ files (each one takes more >>> than 10s to build), the memory consumption does not see noticeable >>> increase/decrease. >>> >>> Thanks, >>> Dehao >>> >>> On Wed, Sep 12, 2012 at 9:39 AM, Xinliang David Li <davi...@google.com> >>> wrote: >>>> On Wed, Sep 12, 2012 at 2:13 AM, Richard Guenther >>>> <richard.guent...@gmail.com> wrote: >>>>> On Wed, Sep 12, 2012 at 7:06 AM, Dehao Chen <de...@google.com> wrote: >>>>>> Now I think we are facing a more complex problem. The data structure >>>>>> we use to store the location_adhoc_data are file-static in linemap.c >>>>>> in libcpp. These data structures are not guarded by GTY(()). >>>>>> Meanwhile, as we have removed the block data structure from >>>>>> gimple.gsbase as well as tree.exp (encoding them into an location_t). >>>>>> This could cause block being GCed and the LOCATION_BLOCK becoming >>>>>> dangling pointers. >>>>> >>>>> Uh. Note that it is quite important that we are able to garbage-collect >>>>> unused >>>>> BLOCKs, this is the whole point of removing unused BLOCK scopes in >>>>> remove_unused_locals. So this indeed becomes much more complicated ... >>>>> What would be desired is that the garbage collector can NULL an entry in >>>>> the mapping table when it is not referenced in any other way (that other >>>>> reference would be the BLOCK tree as stored in a FUNCTION_DECLs >>>>> DECL_INITIAL). >>>> >>>> It would be nice to GC those unused BLOCKS. I wonder how many BLOCKS >>>> are created for a large C++ program. This patch saves memory by >>>> shrinking tree size, is it a net win or loss without GC those BLOCKS? >>>> >>>> thanks, >>>> >>>> David >>>> >>>> >>>>> >>>>>> I tried to manipulate GTY to make it recognize the LOCATION_BLOCK from >>>>>> gimple.gsbase.location. However, neigher nested_ptr nor mark_hook can >>>>>> help me. >>>>>> >>>>>> Another approach would be guard the location_adhoc_data and related >>>>>> data structures in GTY(()). However, this is non-trivial because tree >>>>>> is not visible in libcpp. At the same time, my implementation heavily >>>>>> relies on hashtable to make the code efficient, thus it's quite tricky >>>>>> to make "param_is" and "use_params" work. >>>>>> >>>>>> The final approach, which I'll try tomorrow, would be move all my >>>>>> implementation from libcpp to gcc, and guard them with GTY(()). I >>>>>> still haven't thought of any potential problem of this approach. Any >>>>>> comments? >>>>> >>>>> I think moving the mapping to GC in a lazy manner as I described above >>>>> would be the way to go. For hashtables GC already supports if_marked, >>>>> not sure if similar support is available for arrays/vecs. >>>>> >>>>> Richard. >>>>> >>>>>> Thanks, >>>>>> Dehao >>>>>> >>>>>> On Tue, Sep 11, 2012 at 9:00 AM, Dehao Chen <de...@google.com> wrote: >>>>>>> I saw comments in tree-streamer-out.c: >>>>>>> >>>>>>> /* Do not stream BLOCK_SOURCE_LOCATION. We cannot handle debug >>>>>>> information >>>>>>> for early inlining so drop it on the floor instead of ICEing in >>>>>>> dwarf2out.c. */ >>>>>>> streamer_write_chain (ob, BLOCK_VARS (expr), ref_p); >>>>>>> >>>>>>> However, what the code is doing seemed contradictory with the comment. >>>>>>> Or am I missing something? >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Sep 11, 2012 at 8:32 AM, Michael Matz <m...@suse.de> wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> On Tue, 11 Sep 2012, Dehao Chen wrote: >>>>>>>> >>>>>>>>> Looks like we have two choices: >>>>>>>>> >>>>>>>>> 1. Stream out block info, and use LTO_SET_PREVAIL for TREE_CHAIN(t) >>>>>>>> >>>>>>>> This will actually not work correctly in some cases. The problem is, >>>>>>>> if >>>>>>>> the prevailing decl is already part of another chain (say in another >>>>>>>> block_var list) you would break the current chain. Hence block vars >>>>>>>> need >>>>>>>> special handling in the lto streamer (another reason why tree_chain is >>>>>>>> not >>>>>>>> the most clever think to use for this chain). This problem area needs >>>>>>>> to >>>>>>>> be solved somehow if block info is to be preserved correctly. >>>>>>>> >>>>>>>>> 2. Don't stream out block info for LTO, and still call LTO_NO_PREVAIL >>>>>>>>> (TREE_CHAIN (t)). >>>>>>>> >>>>>>>> That's also a large hammer as it basically will mean no debug info >>>>>>>> after >>>>>>>> LTO :-/ Sigh, at this point I have no good solution that doesn't >>>>>>>> involve >>>>>>>> quite some work, perhaps your hack is good enough for the time being, >>>>>>>> though I hate it :) >>>>>>> >>>>>>> I got it. Then I'll keep the patch as it is (remove the >>>>>>> LTO_NO_PREVAIL), and work with Honza to resolve the issue he had, and >>>>>>> then we should be good to check in? >>>>>>> >>>>>>> Thanks, >>>>>>> Dehao >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Ciao, >>>>>>>> Michael.