Karel Gardas <[EMAIL PROTECTED]> writes:

> I've thought that L1 and L2 DTLB misses are the most important for the
> overall performance or performance degradation, if not please correct
> me since this is my first attempt to measure and interpret such data.

The TLB only caches translations from virtual to physical
addresses. Normally data and instruction cache misses are more
important. There are a few TLB-intensive workloads too, but they tend
to use much more memory than gcc normally does.

So I think you should rather use ICACHE_MISSES and
DATA_CACHE_REFILLS_FROM_SYSTEM, which measure the "real" L2 caches.

And perhaps run a normal instruction profile (CPU_CLK_UNHALTED) in
parallel and double-check that the hot spots shown by the other events
match the real time hogs. Note that you can use up to three performance
counters at the same time.
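
A rough sketch of that cross-check in Python, assuming the per-symbol
sample counts for each event have already been exported into
dictionaries. The symbol names and counts are made up for illustration,
and the two helper functions are hypothetical, not part of any
profiling tool.

```python
from typing import Dict, List


def top_symbols(samples: Dict[str, int], n: int = 3) -> List[str]:
    """Return the n symbols with the most samples, highest first."""
    # Sort by descending count; break ties by name so the order is stable.
    ranked = sorted(samples.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]


def unconfirmed_hot_spots(event_samples: Dict[str, int],
                          clock_samples: Dict[str, int],
                          n: int = 3) -> List[str]:
    """Symbols in the event's top n that are not in the clock profile's top n."""
    hot_by_time = set(top_symbols(clock_samples, n))
    return [s for s in top_symbols(event_samples, n) if s not in hot_by_time]


# Made-up sample counts per symbol (assumption, not measured data).
clk = {"alloc_node": 900, "set_bit": 700, "hash_lookup": 650, "parse_decl": 120}
dmiss = {"alloc_node": 300, "hash_lookup": 250, "parse_decl": 200, "set_bit": 40}

# Symbols that miss the cache a lot but are not among the real time hogs.
print(unconfirmed_hot_spots(dmiss, clk))
```

A symbol that shows up here has many cache misses but little
CPU_CLK_UNHALTED time, so tuning it is unlikely to pay off much.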

-Andi
