Andrew Haley writes:
 > Jakub Jelinek writes:
 > 
 >  > >  > While I still like using dl_iterate_phdr instead of
 >  > >  > __register_frame_info_bases for totally aesthetic reasons, there
 >  > >  > have been changes made to the dl_iterate_phdr interface since the
 >  > >  > gcc support was written that would allow the dl_iterate_phdr
 >  > >  > results to be cached.
 >  > > 
 >  > > That would be nice.  Also, we could fairly easily build a tree of
 >  > > nodes, one for each loaded object, then we wouldn't be doing a linear
 >  > > search through them.  We could do that lazily, so it wouldn't kick in
 >  > > 'til needed.
 >  > 
 >  > Here is a rough patch for what you can do.
 > 
 > Thanks very much.  I'm working on it.

OK, I've roughed out a very simple patch and it certainly seems to
improve things.

Here's the before:

samples  cum. samples  %        cum. %     app name          symbol name
17962    17962         25.8164  25.8164    libgcc_s.so.1     _Unwind_IteratePhdrCallback
7019     24981         10.0882  35.9046    libc-2.3.3.so     dl_iterate_phdr
6966     31947         10.0121  45.9167    libgcc_s.so.1     read_encoded_value_with_base
3756     35703          5.3984  51.3151    libgcj.so.6.0.0   GC_mark_from
3643     39346          5.2360  56.5511    libgcc_s.so.1     search_object
2032     41378          2.9205  59.4717    libgcc_s.so.1     __i686.get_pc_thunk.bx
1555     42933          2.2350  61.7066    libgcj.so.6.0.0   _Jv_MonitorExit
1413     44346          2.0309  63.7375    libgcj.so.6.0.0   _Jv_MonitorEnter
1288     45634          1.8512  65.5887    libgcj.so.6.0.0   java::util::IdentityHashMap::hash(java::lang::Object*)

And here's the after:

samples  cum. samples  %        cum. %     app name          symbol name
7020     7020          14.7674  14.7674    libgcc_s.so.1     read_encoded_value_with_base
3808     10828          8.0106  22.7780    libgcc_s.so.1     _Unwind_IteratePhdrCallback
3680     14508          7.7413  30.5194    libgcj.so.6.0.0   GC_mark_from
3463     17971          7.2849  37.8042    libgcc_s.so.1     search_object
1587     19558          3.3385  41.1427    libgcj.so.6.0.0   _Jv_MonitorExit
1577     21135          3.3174  44.4601    libc-2.3.3.so     dl_iterate_phdr
1288     22423          2.7095  47.1696    libgcj.so.6.0.0   _Jv_MonitorEnter
1230     23653          2.5875  49.7570    libgcj.so.6.0.0   java::util::IdentityHashMap::hash(java::lang::Object*)

So, the time spent unwinding was about 50% of the total runtime
before, and about 28% after.  I measured a miss rate of 0.006%
with 27 entries used.
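
For anyone following along, here is a minimal sketch of the general idea:
a small most-recently-used cache of PC ranges sitting in front of the
slow lookup.  This is not the patch itself.  All the names here
(range_cache, cache_lookup, demo_slow_lookup, CACHE_SIZE) are
hypothetical, and the slow path is a stand-in table rather than a real
dl_iterate_phdr walk.  A real cache would also have to be flushed when
objects are loaded or unloaded; I believe newer glibc exposes the
dlpi_adds and dlpi_subs counters in struct dl_phdr_info for that
purpose, but this sketch does not use them.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_SIZE 8            /* hypothetical size, not from the patch */

struct range_entry {
    uintptr_t pc_low;           /* first address covered by the object */
    uintptr_t pc_high;          /* one past the last covered address */
    const void *data;           /* e.g. the object's unwind table */
};

struct range_cache {
    struct range_entry slot[CACHE_SIZE];  /* slot[0] is most recent */
    size_t used;
    unsigned long hits, misses;
};

/* Slow path: returns 1 and fills *out if some object covers pc. */
typedef int (*slow_lookup_fn)(uintptr_t pc, struct range_entry *out);

/* Stand-in for the full walk over loaded objects (assumption). */
static int demo_slow_lookup(uintptr_t pc, struct range_entry *out)
{
    static const struct range_entry objects[] = {
        { 0x1000, 0x2000, NULL },
        { 0x4000, 0x5000, NULL },
    };
    for (size_t i = 0; i < sizeof objects / sizeof objects[0]; i++) {
        if (pc >= objects[i].pc_low && pc < objects[i].pc_high) {
            *out = objects[i];
            return 1;
        }
    }
    return 0;
}

/* Forget everything, e.g. after an object is loaded or unloaded. */
static void cache_flush(struct range_cache *c)
{
    c->used = 0;
}

static int cache_lookup(struct range_cache *c, uintptr_t pc,
                        slow_lookup_fn slow, struct range_entry *out)
{
    /* Fast path: scan the cached ranges, most recent first. */
    for (size_t i = 0; i < c->used; i++) {
        if (pc >= c->slot[i].pc_low && pc < c->slot[i].pc_high) {
            struct range_entry hit = c->slot[i];
            for (size_t j = i; j > 0; j--)   /* move the hit to the front */
                c->slot[j] = c->slot[j - 1];
            c->slot[0] = hit;
            c->hits++;
            *out = hit;
            return 1;
        }
    }

    /* Miss: fall back to the slow lookup and remember the result. */
    c->misses++;
    if (!slow(pc, out))
        return 0;
    if (c->used < CACHE_SIZE)
        c->used++;                           /* otherwise the oldest entry is dropped */
    for (size_t j = c->used - 1; j > 0; j--)
        c->slot[j] = c->slot[j - 1];
    c->slot[0] = *out;
    return 1;
}
```

Start from a zero-initialized struct range_cache and call cache_lookup
for each PC encountered during the unwind; the hits and misses counters
give the miss rate directly.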

Still, 28% is a heavy overhead.  I think it's because we're doing a
great many class lookups, each of which does a stack trace as a
security check.  I'll look at caching security contexts in libgcj.

Andrew.


