> On March 10, 2014, 10:18 a.m., Sergio Luis Martins wrote:
> > Hi, Is this the same patch I tested a few days ago? That one didn't fix
> > the problem, I still reach 1.5GB of memory usage with a 100k e-mail maildir.
>
> Dan Vrátil wrote:
> There is still the problem with Akonadi/KMime "leaking" lots and lots of
> memory. That causes the app to grow memory a lot. I think it's related to the
> malloc_trim thing you found a while ago.

It's not the precise patch, no. The main functional difference is that it lowers the commit limit from 500 to 100. There is still some fragmentation that occurs and this will be a function of the amount of data in the commit. At least here, akonadi_baloo_indexer no longer grows to 250MB+ but stays at a more reasonable ~20MB after a couple days.

I've also had it index a folder of 695 new emails (copied from one folder to another). Without the patch, it reliably hits 17MB of usage when complete and spikes regularly to a little over 19MB. With the patch it sits at under 15MB and never spikes above it. Without the patch it takes ~17 seconds (wall clock) to complete indexing; with the patch ~30 seconds.

So there is a definite trade-off, and the memory fragmentation is not completely resolved (as expected, actually) but it does limit the problem. As I noted in the description, while not perfect this prevents baloo from hitting 200MB+ on my system through normal usage.


- Aaron J.


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/116692/#review52509
-----------------------------------------------------------


On March 10, 2014, 11:12 a.m., Aaron J. Seigo wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/116692/
> -----------------------------------------------------------
>
> (Updated March 10, 2014, 11:12 a.m.)
>
>
> Review request for Akonadi and Baloo.
>
>
> Repository: baloo
>
>
> Description
> -------
>
> Baloo is using Xapian for storing processed results from data fed to it by
> akonadi; in doing so it processes all the data it is sent to index and only
> once this is complete is the data committed to the Xapian database.
> From
> http://xapian.org/docs/apidoc/html/classXapian_1_1WritableDatabase.html#acbea2163142de795024880a7123bc693
> we see: "For efficiency reasons, when performing multiple updates to a
> database it is best (indeed, almost essential) to make as many modifications
> as memory will permit in a single pass through the database. To ensure this,
> Xapian batches up modifications." This means that *all* the data to be stored
> in the Xapian database first ends up in RAM. When indexing large mailboxes
> (or any other large chunk of data) this results in a very large amount of
> memory allocation. On one test of 100k mails in a maildir folder this
> resulted in 1.5GB of RAM used. In normal daily usage with maildir I find that
> it easily balloons to several hundred megabytes within days. This makes the
> Baloo indexer unusable on systems with smaller amounts of memory (e.g. mobile
> devices, which typically have only 512MB-2GB of RAM).
>
> Making this even worse, the indexer is long-lived *and* the default glibc
> allocator is unable to return the used memory back to the OS (probably due
> to memory fragmentation, though I have not confirmed this). Use of other
> allocators shows the temporary ballooning of memory during processing, but
> once that is done the memory is released and returned back to the OS. As
> such, this is not a memory leak, but it behaves like one on systems with the
> default glibc allocator, with akonadi_baloo_indexer taking increasingly
> large amounts of memory on the system that never get returned to the OS.
> (This is actually how I noticed the problem in the first place.)
>
> The approach used to address this problem is to periodically commit data to
> the Xapian database. This happens uniformly and transparently to the
> AbstractIndexer subclasses.
> The exact behavior is controlled by the
> s_maxUncommittedItems constant which is set arbitrarily to 100: after an
> indexer hits 100 uncommitted changes, the results are committed immediately.
>
> Caveats:
>
> * This is not a guaranteed fix for the memory fragmentation issue experienced
> with glibc: it is still possible for the memory to grow slowly over time as
> each smaller commit leaves some % of un-releasable memory due to
> fragmentation. It has helped with day to day usage here, but in the "100k
> mails in a maildir structure" test memory did still balloon upwards.
>
> * It makes indexing non-atomic from akonadi's perspective: data fed to
> akonadi_baloo_indexer to be indexed may show up in chunks and even, in the
> case of a crash of the indexer, be only partially added to the database.
>
> Alternative approaches (not necessarily mutually exclusive to this patch or
> each other):
>
> * send smaller data sets from akonadi to akonadi_baloo_indexer for
> processing. This would allow akonadi_baloo_indexer to retain the atomic
> commit approach while avoiding the worst of the Xapian memory usage; it would
> not address the issue of memory fragmentation
> * restart akonadi_baloo_indexer process from time to time; this would resolve
> the fragmentation-over-time issue but not the massive memory usage due to
> atomically indexing large datasets
> * improve Xapian's chert backend (to become default in 1.4) to not fragment
> memory so much; this would not address the issue of massive memory usage due
> to atomically indexing large datasets
> * use an allocator other than glibc's; this would not address the issue of
> massive memory usage due to atomically indexing large datasets
>
>
> Diffs
> -----
>
> src/pim/agent/emailindexer.cpp 05f80cf
> src/pim/agent/abstractindexer.h 8ae6f5c
> src/pim/agent/abstractindexer.cpp fa9e96f
> src/pim/agent/akonotesindexer.h 83f36b7
> src/pim/agent/akonotesindexer.cpp ac3e66c
> src/pim/agent/contactindexer.h 49dfdeb
> src/pim/agent/contactindexer.cpp a5a6865
> src/pim/agent/emailindexer.h 9a5e5cf
>
> Diff: https://git.reviewboard.kde.org/r/116692/diff/
>
>
> Testing
> -------
>
> I have been running with the patch for a couple of days and one other person
> on irc has tested an earlier (but functionally equivalent) version. Rather
> than reaching the common 250MB+ during regular usage it now idles at ~20MB
> (up from ~7MB when first started; so some fragmentation remains as noted in
> the description, but with far better long-term results).
>
>
> Thanks,
>
> Aaron J. Seigo
>
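
For readers without the diff at hand, the periodic-commit idea from the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: FakeDatabase is a hypothetical stand-in for Xapian's writable database (it merely buffers items until commit() is called), BatchingIndexer is a hypothetical name, and only the limit of 100 uncommitted items is taken from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a Xapian writable database: changes are
// buffered in memory until commit() is called, then counted as stored.
class FakeDatabase {
public:
    void addDocument(const std::string& doc) { m_pending.push_back(doc); }

    void commit() {
        m_committed += m_pending.size();
        m_pending.clear();
        ++m_commitCount;
    }

    std::size_t pending() const { return m_pending.size(); }
    std::size_t committed() const { return m_committed; }
    std::size_t commitCount() const { return m_commitCount; }

private:
    std::vector<std::string> m_pending;
    std::size_t m_committed = 0;
    std::size_t m_commitCount = 0;
};

// Hypothetical indexer that commits as soon as the number of uncommitted
// changes reaches a fixed limit (100 in the review description).
class BatchingIndexer {
public:
    explicit BatchingIndexer(FakeDatabase& db, std::size_t maxUncommitted = 100)
        : m_db(db), m_maxUncommitted(maxUncommitted) {
        assert(maxUncommitted > 0);
    }

    // Queue one item; flush to the database once the limit is hit.
    void index(const std::string& item) {
        m_db.addDocument(item);
        if (++m_uncommitted >= m_maxUncommitted) {
            commit();
        }
    }

    // Flush whatever is still pending (e.g. at the end of a batch).
    void commit() {
        if (m_uncommitted == 0) {
            return;
        }
        m_db.commit();
        m_uncommitted = 0;
    }

private:
    FakeDatabase& m_db;
    std::size_t m_maxUncommitted;
    std::size_t m_uncommitted = 0;
};
```

With a limit of 100, feeding 250 items triggers two automatic commits and leaves 50 items pending until commit() is called explicitly. That is the trade-off noted in the caveats: peak buffered data is bounded, but the batch is no longer stored atomically.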