https://bugs.kde.org/show_bug.cgi?id=354636
--- Comment #10 from Oded Arbel <o...@geek.co.il> ---
> 3. The index file is huge - about 19GB, which doesn't make a lot of sense to
> me. `balooctl indexSize` has this to say:
>
> ----8<----
> File Size: 18.75 GiB
> Used:      948.13 MiB
>
>          PostingDB:     2.93 GiB   316.627 %
>         PositionDB:    85.44 MiB     9.011 %
>           DocTerms:     1.39 GiB   149.920 %
>   DocFilenameTerms:   152.72 MiB    16.107 %
>      DocXattrTerms:     8.39 MiB     0.885 %
>             IdTree:    35.69 MiB     3.764 %
>         IdFileName:   175.18 MiB    18.476 %
>            DocTime:    92.85 MiB     9.793 %
>            DocData:    43.49 MiB     4.587 %
>  ContentIndexingDB:   448.00 KiB     0.046 %
>        FailedIdsDB:        0 B       0.000 %
>            MTimeDB:    26.48 MiB     2.793 %
> ----8<----
>
> and to that I can only say "wahhh?!?!?"

After reviewing the code at https://github.com/KDE/baloo/blob/master , I'm even
more befuddled by the above numbers:

1. "Used" is `DatabaseSize.expectedSize`.
2. The percentages are computed as 100 * "entry size" / "Used", so the 316%
   makes sense, as that entry is larger than "Used".
3. `DatabaseSize.expectedSize` is calculated (src/engine/transaction.cpp:474)
   by adding up the sizes of all of the entries listed!! So it cannot be
   smaller than any one of its parts unless one of the parts is negative -
   which it can't be, as the sizes are of type `size_t`, which - unless
   something really weird is going on in the build server - should be
   unsigned long int.

There's something about page sizes, but that isn't relevant to the above
calculation, which seems to suggest that a/(a+b) > 1 where both a and b are
non-negative integers.
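The arithmetic in points 2 and 3 can be checked against the quoted figures.
This is a minimal Python sketch, not Baloo code: the sizes are transcribed
from the quoted `balooctl indexSize` output (already rounded there), and the
names `ENTRIES_MIB`, `USED_MIB` and `percent_of_used` are made up for
illustration.

```python
# Sizes in MiB, transcribed from the quoted `balooctl indexSize` output.
# GiB values are converted with 1 GiB = 1024 MiB. The source figures are
# rounded to two decimals, so every result here is approximate.
GIB = 1024.0

USED_MIB = 948.13  # the "Used" line

ENTRIES_MIB = {
    "PostingDB": 2.93 * GIB,
    "PositionDB": 85.44,
    "DocTerms": 1.39 * GIB,
    "DocFilenameTerms": 152.72,
    "DocXattrTerms": 8.39,
    "IdTree": 35.69,
    "IdFileName": 175.18,
    "DocTime": 92.85,
    "DocData": 43.49,
    "ContentIndexingDB": 448.0 / 1024.0,  # reported as 448.00 KiB
    "FailedIdsDB": 0.0,
    "MTimeDB": 26.48,
}


def percent_of_used(size_mib, used_mib=USED_MIB):
    """Percentage as described in point 2: 100 * entry size / Used."""
    return 100.0 * size_mib / used_mib


# Point 3: "Used" is supposed to be the sum of all entries.
total_mib = sum(ENTRIES_MIB.values())
gap_mib = total_mib - USED_MIB

for name, size in ENTRIES_MIB.items():
    print(f"{name:>18}: {size:10.2f} MiB {percent_of_used(size):9.3f} %")
print(f"sum of entries: {total_mib:.2f} MiB")
print(f"reported Used:  {USED_MIB:.2f} MiB")
print(f"gap:            {gap_mib:.2f} MiB")
```

With these rounded inputs the entries add up to roughly 4.9 GiB, several
times the reported "Used" value, which is exactly the contradiction in
point 3. The gap between the two is close to 4 GiB; that might hint at a
32-bit wraparound somewhere, but that is only a guess and nothing in the code
reviewed above confirms it.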
BTW - here's the result of running the `mdb_stat` tool from lmdb-utils on the
baloo index:

----8<----
$ mdb_stat -af <path-to-index-db>
Freelist Status
  Tree depth: 2
  Branch pages: 1
  Leaf pages: 41
  Overflow pages: 5046
  Entries: 3253
  Free pages: 2566315
Status of Main DB
  Tree depth: 1
  Branch pages: 0
  Leaf pages: 1
  Overflow pages: 0
  Entries: 12
Status of docfilenameterms
  Tree depth: 4
  Branch pages: 315
  Leaf pages: 38726
  Overflow pages: 0
  Entries: 2104603
Status of docterms
  Tree depth: 4
  Branch pages: 633
  Leaf pages: 79407
  Overflow pages: 284028
  Entries: 2103699
Status of documentdatadb
  Tree depth: 3
  Branch pages: 90
  Leaf pages: 11012
  Overflow pages: 38
  Entries: 664790
Status of documenttimedb
  Tree depth: 3
  Branch pages: 187
  Leaf pages: 23555
  Overflow pages: 0
  Entries: 2111124
Status of docxatrrterms
  Tree depth: 3
  Branch pages: 21
  Leaf pages: 2040
  Overflow pages: 86
  Entries: 31253
Status of failediddb
  Tree depth: 0
  Branch pages: 0
  Leaf pages: 0
  Overflow pages: 0
  Entries: 0
Status of idfilename
  Tree depth: 4
  Branch pages: 363
  Leaf pages: 44411
  Overflow pages: 0
  Entries: 2120309
Status of idtree
  Tree depth: 3
  Branch pages: 52
  Leaf pages: 6960
  Overflow pages: 2118
  Entries: 223613
Status of indexingleveldb
  Tree depth: 3
  Branch pages: 3
  Leaf pages: 49
  Overflow pages: 0
  Entries: 5471
Status of mtimedb
  Tree depth: 3
  Branch pages: 42
  Leaf pages: 6719
  Overflow pages: 0
  Entries: 2111124
Status of positiondb
  Tree depth: 4
  Branch pages: 6657
  Leaf pages: 735531
  Overflow pages: 328761
  Entries: 42876611
Status of postingdb
  Tree depth: 4
  Branch pages: 6181
  Leaf pages: 657348
  Overflow pages: 105167
  Entries: 45851508
----8<----
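The `mdb_stat` page counts can be turned into approximate on-disk sizes by
multiplying by the LMDB page size. The sketch below assumes 4096-byte pages,
which is a common default but is not shown in the output above (`mdb_stat -e`
would print the real value). The names `PAGES`, `FREE_PAGES`, `size_bytes`
and `gib` are illustrative only.

```python
# ASSUMPTION: 4096-byte pages. The mdb_stat output above does not state the
# page size, so treat every result as an estimate.
PAGE_SIZE = 4096

# (branch, leaf, overflow) page counts copied from the mdb_stat output.
PAGES = {
    "docfilenameterms": (315, 38726, 0),
    "docterms": (633, 79407, 284028),
    "documentdatadb": (90, 11012, 38),
    "documenttimedb": (187, 23555, 0),
    "docxatrrterms": (21, 2040, 86),
    "failediddb": (0, 0, 0),
    "idfilename": (363, 44411, 0),
    "idtree": (52, 6960, 2118),
    "indexingleveldb": (3, 49, 0),
    "mtimedb": (42, 6719, 0),
    "positiondb": (6657, 735531, 328761),
    "postingdb": (6181, 657348, 105167),
}
FREE_PAGES = 2566315  # "Free pages" in the Freelist Status section


def size_bytes(branch, leaf, overflow, page_size=PAGE_SIZE):
    """Bytes occupied by one database: all of its pages times the page size."""
    return (branch + leaf + overflow) * page_size


def gib(n_bytes):
    """Convert bytes to GiB."""
    return n_bytes / 2**30


for name, counts in PAGES.items():
    print(f"{name:>18}: {gib(size_bytes(*counts)):7.3f} GiB")

in_use = sum(size_bytes(*counts) for counts in PAGES.values())
free = FREE_PAGES * PAGE_SIZE
print(f"named databases: {gib(in_use):.2f} GiB")
print(f"free pages:      {gib(free):.2f} GiB")
```

Under that assumption, postingdb and docterms come out at about 2.93 GiB and
1.39 GiB, in line with the `balooctl indexSize` figures, while positiondb
comes out at about 4.09 GiB rather than 85.44 MiB. The free pages alone would
account for roughly 9.8 GiB of the file.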