https://bugs.kde.org/show_bug.cgi?id=386256

--- Comment #2 from Milian Wolff <m...@milianw.de> ---
Git commit a189ad4d2aa09e7afcd47987bdc75537dd22d5d3 by Milian Wolff.
Committed on 16/03/2018 at 23:51.
Pushed by mwolff into branch 'master'.

Optimize: only map trace indices for allocation infos

Previously, we used to run a binary search to map a trace index
to an allocation object on every (de)allocation. These events
occur extremely often - usually orders of magnitude more often
than the allocation info events.

Now, we only map the trace index to an allocation when a new
allocation info is parsed. This way, we don't need to run the
slow binary search on every event and can instead access the
allocation object directly through the allocation index stored
in the allocation info.
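
A minimal sketch of the idea in C++ (the type and function names
here are illustrative assumptions, not the actual heaptrack code):

#include <algorithm>
#include <cstdint>
#include <vector>

struct TraceIndex { uint32_t index = 0; };
struct AllocationIndex { uint32_t index = 0; };

struct Allocation
{
    TraceIndex traceIndex;
    // accumulated allocation costs would live here
};

struct AllocationInfo
{
    uint64_t size = 0;
    TraceIndex traceIndex;
    AllocationIndex allocationIndex; // resolved once, at parse time
};

// before: binary search on every single (de)allocation event
Allocation& findAllocation(std::vector<Allocation>& allocations, TraceIndex trace)
{
    auto it = std::lower_bound(allocations.begin(), allocations.end(), trace,
                               [](const Allocation& lhs, TraceIndex rhs) {
                                   return lhs.traceIndex.index < rhs.index;
                               });
    // insertion of a not-yet-known allocation is omitted in this sketch
    return *it;
}

// after: resolve the trace index once, when the allocation info is parsed
AllocationInfo parseAllocationInfo(std::vector<Allocation>& allocations,
                                   uint64_t size, TraceIndex trace)
{
    AllocationInfo info;
    info.size = size;
    info.traceIndex = trace;
    Allocation& allocation = findAllocation(allocations, trace);
    info.allocationIndex.index =
        static_cast<uint32_t>(&allocation - allocations.data());
    return info;
}

// every subsequent (de)allocation event referencing this info is now a
// plain O(1) indexed access instead of an O(log n) binary search
Allocation& allocationForEvent(std::vector<Allocation>& allocations,
                               const AllocationInfo& info)
{
    return allocations[info.allocationIndex.index];
}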

For a large data file (~13GB uncompressed), the results are quite
impressive: before this patch, heaptrack_print took ca. 3 minutes to
parse the zstd-compressed data. With this patch applied, we are
down to 2min 6s!

Before:

 Performance counter stats for 'heaptrack_print heaptrack.Application.19285.zst':

     178798,164042      task-clock:u (msec)       #    0,998 CPUs utilized
                 0      context-switches:u        #    0,000 K/sec
                 0      cpu-migrations:u          #    0,000 K/sec
            30.570      page-faults:u             #    0,171 K/sec
   551.902.999.436      cycles:u                  #    3,087 GHz
 1.540.185.452.300      instructions:u            #    2,79  insn per cycle
   332.833.340.539      branches:u                # 1861,503 M/sec
     1.350.342.839      branch-misses:u           #    0,41% of all branches

     179,193276255 seconds time elapsed

After:

 Performance counter stats for 'heaptrack_print heaptrack.Application.19285.zst':

     125579,754384      task-clock:u (msec)       #    0,999 CPUs utilized
                 0      context-switches:u        #    0,000 K/sec
                 0      cpu-migrations:u          #    0,000 K/sec
            33.982      page-faults:u             #    0,271 K/sec
   393.084.840.177      cycles:u                  #    3,130 GHz
 1.127.147.336.034      instructions:u            #    2,87  insn per cycle
   238.225.815.121      branches:u                # 1897,008 M/sec
       998.456.200      branch-misses:u           #    0,42% of all branches

     125,663808724 seconds time elapsed

M  +20   -10   src/analyze/accumulatedtracedata.cpp
M  +8    -2    src/analyze/accumulatedtracedata.h
M  +2    -1    src/analyze/gui/parser.cpp

https://commits.kde.org/heaptrack/a189ad4d2aa09e7afcd47987bdc75537dd22d5d3

--- Comment #3 from Milian Wolff <m...@milianw.de> ---
Git commit ef4a460cc69310618ec72cbaf284501bd19a6133 by Milian Wolff.
Committed on 16/03/2018 at 23:50.
Pushed by mwolff into branch 'master'.

Optimize AccumulatedTraceData::findAllocation

Instead of sorting the vector of Allocation objects, which are
48 bytes large, introduce a separate sorted vector of pairs of
TraceIndex and AllocationIndex, each just 4 bytes large.
Lookup and insertion in the middle of this much smaller container
is considerably faster, improving the heaptrack_print analysis
time by ~10% in one of my larger test files (from ~3 minutes down
to 2min 40s).
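
A sketch of that data-structure change; findAllocation is the function
named in the commit subject, but the body and the TraceToAllocation
name below are assumptions for illustration, not the exact heaptrack
code:

#include <algorithm>
#include <cstdint>
#include <vector>

struct TraceIndex { uint32_t index = 0; };
struct AllocationIndex { uint32_t index = 0; };

struct Allocation
{
    // trace index plus accumulated cost counters: roughly 48 bytes
};

// an 8-byte entry: cheap to binary-search and cheap to shift around
// when inserting into the middle of the sorted vector
struct TraceToAllocation
{
    TraceIndex traceIndex;           // the sort key
    AllocationIndex allocationIndex; // points into the allocations vector
};

AllocationIndex findAllocation(std::vector<TraceToAllocation>& map,
                               std::vector<Allocation>& allocations,
                               TraceIndex trace)
{
    auto it = std::lower_bound(map.begin(), map.end(), trace,
                               [](const TraceToAllocation& lhs, TraceIndex rhs) {
                                   return lhs.traceIndex.index < rhs.index;
                               });
    if (it != map.end() && it->traceIndex.index == trace.index) {
        return it->allocationIndex; // known trace, direct hit
    }
    // unknown trace: append a new Allocation (the big vector stays
    // unsorted and append-only) and insert only the small pair into
    // the sorted vector
    AllocationIndex newIndex{static_cast<uint32_t>(allocations.size())};
    allocations.push_back({});
    map.insert(it, {trace, newIndex});
    return newIndex;
}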

M  +20   -11   src/analyze/accumulatedtracedata.cpp
M  +3    -0    src/analyze/accumulatedtracedata.h
M  +5    -0    src/analyze/allocationdata.h

https://commits.kde.org/heaptrack/ef4a460cc69310618ec72cbaf284501bd19a6133
