http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #96 from Jan Hubicka <hubicka at gcc dot gnu.org> 2011-05-27 21:57:27 UTC ---
The stream-in profile from oprofile has now changed quite a bit:

33258     9.6313  lto1                     htab_find_slot_with_hash
29679     8.5949  lto1                     lto_input_tree
18338     5.3106  lto1                     gt_ggc_mx_lang_tree_node
15723     4.5533  lto1                     ggc_set_mark
15109     4.3755  lto1                     inflate_fast
13883     4.0204  lto1                     ht_lookup_with_hash
12957     3.7523  lto1                     pointer_map_insert
12433     3.6005  libc-2.11.1.so           memset
8661      2.5082  lto1                     lto_input_uleb128
8584      2.4859  libc-2.11.1.so           _int_malloc
6832      1.9785  lto1                     ggc_internal_alloc_stat
6722      1.9467  lto1                     ht_lookup

We do have nice improvements in merging and streaming efficiency. Still,
burning over 10% in hashing does not seem quite reasonable.

I am not sure whether most of the htab overhead is still type merging, given
that the rest of it is off the profile.  It may be something stupid, like the
file name hash, which is queried every time the file changes in the location.
I should probably redo the callgraph profile later next week.

I do have some extra patches that reduce uleb streaming overhead and make
lto_input_tree a bit more streamlined, which might help a little. I am not sure
how much real room is left for simple optimizations in this direction and how
much we really need to look into streaming fewer trees.
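
For readers unfamiliar with the encoding behind the "uleb" streaming cost, here
is a minimal sketch of an unsigned LEB128 decoder. It is not GCC's
lto_input_uleb128; the name uleb128_decode, its signature, and its error
convention are assumptions made for illustration only. It shows why decoding
costs one loop iteration per byte: each byte carries 7 payload bits, and the
high bit says whether another byte follows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: decode one unsigned LEB128 value
   from buf[0..len).  Stores the value in *out and returns the number of
   bytes consumed, or 0 if the input is truncated or does not fit in
   64 bits.  */
size_t
uleb128_decode (const unsigned char *buf, size_t len, uint64_t *out)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t i;

  assert (buf != NULL && out != NULL);

  for (i = 0; i < len; i++)
    {
      unsigned char byte = buf[i];
      uint64_t chunk = byte & 0x7f;   /* low 7 bits are payload */

      /* Reject encodings that would overflow a 64-bit result.  */
      if (shift >= 64 || (shift == 63 && chunk > 1))
        return 0;

      result |= chunk << shift;

      /* A clear high bit marks the last byte of the value.  */
      if ((byte & 0x80) == 0)
        {
          *out = result;
          return i + 1;
        }
      shift += 7;
    }

  /* Ran out of input before seeing a terminating byte.  */
  return 0;
}
```

Values below 128 take a single byte, so a reader that special-cases the
one-byte form before entering the loop is one plausible kind of simple
optimization; whether the patches mentioned above do this is not stated here.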

 garbage collection    :  16.29 ( 6%) usr   0.02 ( 0%) sys  16.33 ( 6%) wall       0 kB ( 0%) ggc
 ipa lto decl in       :  76.15 (28%) usr   2.96 (21%) sys  79.33 (28%) wall  722892 kB (44%) ggc
 ipa lto decl out      :  83.36 (31%) usr   4.58 (32%) sys  88.37 (31%) wall       0 kB ( 0%) ggc
 ipa lto decl merge    :  14.59 ( 5%) usr   0.00 ( 0%) sys  14.64 ( 5%) wall     801 kB ( 0%) ggc
 inline heuristics     :  40.95 (15%) usr   0.19 ( 1%) sys  41.40 (14%) wall  241725 kB (15%) ggc

Memory needed is down, too, at about 4.3GB (in 64bit compilation).

GIMPLE type table: size 1048573, 570402 elements, 5098430 searches, 3158421 collisions (ratio: 0.619489)
GIMPLE type hash table: size 4194301, 1441169 elements, 44401918 searches, 37071081 collisions (ratio: 0.834898)
GIMPLE canonical type table: size 65521, 49079 elements, 896788 searches, 575628 collisions (ratio: 0.641877)
GIMPLE canonical type hash table: size 1048573, 524811 elements, 2845518 searches, 2279153 collisions (ratio: 0.800962)
[WPA] Compression: 424774798 input bytes, 1619588170 uncompressed bytes (ratio: 3.812816)
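
As a reading aid for the hash table statistics above: the printed ratios are
consistent with collisions divided by searches. The sketch below recomputes
that figure and also the load factor (elements divided by size), which the
dump does not print. Both function names are hypothetical and are not taken
from GCC or libiberty.

```c
#include <assert.h>

/* Hypothetical helper: collisions per search, matching the "ratio"
   field printed in the statistics above.  */
double
collision_ratio (unsigned long collisions, unsigned long searches)
{
  assert (searches != 0);
  return (double) collisions / (double) searches;
}

/* Hypothetical helper: fraction of table slots in use.  This value is
   not printed in the dump; it is derived from "elements" and "size".  */
double
load_factor (unsigned long elements, unsigned long size)
{
  assert (size != 0);
  return (double) elements / (double) size;
}
```

For the GIMPLE type hash table, collision_ratio (37071081, 44401918) gives
about 0.83, while load_factor (1441169, 4194301) gives about 0.34. The table
is only about one third full, so the high collision count is not explained by
fill level alone.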
