> > Index: lto-section-in.c
> > ===================================================================
> > --- lto-section-in.c	(revision 202047)
> > +++ lto-section-in.c	(working copy)
> > @@ -414,6 +414,41 @@ lto_get_function_in_decl_state (struct l
> >    return slot? ((struct lto_in_decl_state*) *slot) : NULL;
> >  }
> >
> > +/* Free decl_states.  */
> > +
> > +void
> > +lto_free_function_in_decl_state (struct lto_in_decl_state *state)
> > +{
> > +  int i;
> > +  for (i = 0; i < LTO_N_DECL_STREAMS; i++)
> > +    ggc_free (state->streams[i].trees);
>
> We likely also waste a lot of memory by means of GC overhead here.
> Moving this array out of GC space would be better (we know the
> exact size upfront) - should be doable with making lto_tree_ref_table
> GTY((user)) and a custom marker.
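
A minimal sketch of the idea in the quoted comment above, as stand-alone C rather than GCC code. It assumes the per-stream arrays can be allocated with malloc because their exact size is known upfront. Every name here (N_DECL_STREAMS, tree_ptr, tree_ref_table, in_decl_state, alloc_stream, walk_state, free_state) is a hypothetical stand-in, and the mark callback only illustrates where a user-provided marker would visit each reference; it is not the GTY((user)) interface itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for this sketch only; not GCC's definitions.  */
#define N_DECL_STREAMS 4
typedef void *tree_ptr;

struct tree_ref_table
{
  tree_ptr *trees;    /* malloc'ed, so it adds no GC bookkeeping overhead.  */
  unsigned int size;  /* Exact element count, known when streaming in.  */
};

struct in_decl_state
{
  struct tree_ref_table streams[N_DECL_STREAMS];
};

/* Allocate stream I with exactly SIZE entries.  Returns 0 on success.  */
int
alloc_stream (struct in_decl_state *state, int i, unsigned int size)
{
  assert (state != NULL && i >= 0 && i < N_DECL_STREAMS);
  state->streams[i].trees = calloc (size ? size : 1, sizeof (tree_ptr));
  if (state->streams[i].trees == NULL)
    return -1;
  state->streams[i].size = size;
  return 0;
}

/* Visit every stored reference.  A custom marker would hand each entry to
   the collector here.  Returns the number of entries visited.  */
size_t
walk_state (const struct in_decl_state *state, void (*mark) (tree_ptr))
{
  size_t visited = 0;
  for (int i = 0; i < N_DECL_STREAMS; i++)
    for (unsigned int j = 0; j < state->streams[i].size; j++)
      {
        if (mark != NULL)
          mark (state->streams[i].trees[j]);
        visited++;
      }
  return visited;
}

/* Hand the arrays back to malloc right away, without waiting for a
   collection to notice that they are dead.  */
void
free_state (struct in_decl_state *state)
{
  for (int i = 0; i < N_DECL_STREAMS; i++)
    {
      free (state->streams[i].trees);
      state->streams[i].trees = NULL;
      state->streams[i].size = 0;
    }
}
```

The point of the layout is that free_state releases the memory at a known moment, while the trees the entries point to stay under the collector's control through the marker walk.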

Possibly, yes.  It should be a way to get this memory back for malloc
without having to convince ggc to do something useful ;)

I will be happy to play with this incrementally.  (It makes me a bit
worried about maintenance costs when user markers start to do smart
things everywhere, but this one is an important enough special case
for sure.)
>
> Stray changes?

Yes, sorry, I was running both tests on one machine.  Removed from the
patch now.
>
> Just leave the ggc_collect ()s there, they are free.

The first ggc_collect after type reading executes the garbage collector
and the other ggc_collects do nothing (since we never allocate enough
ggc memory to trigger them).

Currently we trigger ggc_collect just after tree streaming.  It spends
time walking memory, frees almost nothing, produces debug output so I
know how much memory we need, and sets the bounds high enough that
ggc_collect won't happen again.  All in all it is pointless for the
overall memory use of WPA.

I want the first run to happen only after things are merged and dead
code is removed.  Then it frees about 0.5-1GB of garbage that results
from decl state streams disappearing and some trees actually being
rendered dead.  This way ggc allocations of the IPA optimizers can use
the freed memory.

I can drop this change from the patch removing states now and we can
discuss these incrementally.

Honza
>
> >    timevar_pop (TV_IPA_LTO_DECL_MERGE);
> >    /* Each pass will set the appropriate timer.  */
> > @@ -3503,6 +3501,9 @@ read_cgraph_and_symbols (unsigned nfiles
> >        gcc_assert (all_file_decl_data[i]->symtab_node_encoder);
> >        lto_symtab_encoder_delete
> > (all_file_decl_data[i]->symtab_node_encoder);
> >        all_file_decl_data[i]->symtab_node_encoder = NULL;
> > +      lto_free_function_in_decl_state
> > (all_file_decl_data[i]->global_decl_state);
> > +      all_file_decl_data[i]->global_decl_state = NULL;
> > +      all_file_decl_data[i]->current_decl_state = NULL;
> >      }
> >
> >    /* Finally merge the cgraph according to the decl merging decisions.  */
> > @@ -3513,7 +3514,12 @@ read_cgraph_and_symbols (unsigned nfiles
> >        dump_symtab (cgraph_dump_file);
> >      }
> >    lto_symtab_merge_symbols ();
> > -  ggc_collect ();
> > +
> > +  /* Do not GGC collect here; streaming in should not produce garbage.
> > +     Be sure we first collect after merging symbols, setting up
> > visibilities
> > +     and removing unreachable nodes.  This will happen after whole program
> > +     visibility pass.  This should release more memory back to the system
> > +     and possibly allow us to re-use it for heap.  */
>
> But it's not wrong to collect here, no?  See above, it should be
> mostly free.
>
> Thanks,
> Richard.
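
The collection timing argued about above can be illustrated with a toy model. This is a sketch under the assumption that a collector only runs once the heap has grown by some percentage since the previous collection; it is not GCC's implementation, the numbers are made up, and all names (toy_gc, toy_collect and the struct fields) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy collector state; every name here is hypothetical.  */
struct toy_gc
{
  size_t allocated;          /* Units currently allocated.  */
  size_t allocated_last_gc;  /* Units left after the previous collection.  */
  size_t min_heapsize;       /* Floor used when computing the threshold.  */
  unsigned min_expand_pct;   /* Growth required before collecting again.  */
  unsigned collections;      /* Number of collections that really ran.  */
};

/* Collect only if the heap grew enough since the last collection.  GARBAGE
   is how much memory is dead at this point.  Returns 1 if a collection ran
   and 0 if the call was a cheap no-op.  */
int
toy_collect (struct toy_gc *gc, size_t garbage)
{
  assert (gc != NULL);

  size_t base = gc->allocated_last_gc > gc->min_heapsize
		? gc->allocated_last_gc : gc->min_heapsize;
  size_t threshold = base + base / 100 * gc->min_expand_pct;

  if (gc->allocated < threshold)
    return 0;  /* Early exit: no memory is walked, nothing is freed.  */

  if (garbage > gc->allocated)
    garbage = gc->allocated;
  gc->allocated -= garbage;  /* Pretend we walked memory and freed GARBAGE.  */
  gc->allocated_last_gc = gc->allocated;
  gc->collections++;
  return 1;
}
```

In this model, collecting right after streaming (when almost nothing is dead) raises the threshold, so a later call made after merging is skipped even though much more garbage exists by then. Deferring the first collection until after merging lets that single run reclaim the garbage instead.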