https://bugs.kde.org/show_bug.cgi?id=502641

--- Comment #6 from Niklāvs Koļesņikovs <89q1r1...@relay.firefox.com> ---
Okay, so the TL;DR is that Baloo should:
1) deal with excessively large indexes, since there's a long history of those
happening not because of huge filesystems but because of Baloo bugs
2) set a reasonable MemoryHigh value based on a percentage as a backstop
against OOM'ing the system (see the drop-in sketch below)
3) be more resource aware and either avoid using too much RAM (or even default
to off) when resources are constrained or, on the contrary, go all out and lock
its index into memory for best indexing performance when there's headroom for
that.
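
For point 2, a backstop could be as simple as a systemd drop-in. A minimal
sketch, assuming Baloo still runs as the kde-baloo.service user unit (the unit
name and the percentages are just illustrative assumptions on my part):

  # ~/.config/systemd/user/kde-baloo.service.d/memory.conf
  [Service]
  # Start reclaiming/throttling above 10% of physical RAM instead of a fixed
  # 512M, and hard-cap well before the whole system is in OOM territory.
  MemoryHigh=10%
  MemoryMax=25%

Percentages scale with the machine, so the same default wouldn't starve a
2 GiB box the way a fixed value sized for big systems would, and vice versa.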

Yeah, but if the expected behavior is to consume a lot more than 512M, then
I'm not sure it does anyone any good to slow Baloo to a crawl when there are
plenty of resources available.

Case in point, I have 32GiB of RAM and I honestly and literally *do not care*
about 512M here or there, because on a typical day I have about 20G free and
Baloo is more than welcome to use a few G, if it needs to. Meanwhile, on our
circular economy "TV" with 2 GiB of RAM and integrated graphics, Baloo should
perhaps either switch to a memory-conserving mode or just default to off, since
that system barely plays YouTube without dropping frames and has no memory, CPU
or I/O budget for extras such as Baloo, which I only use on my main system for
tagging family photos anyway. Of course, disabling Baloo was the first thing I
did on that system, but the point of my example is that this should probably be
the default on resource-constrained systems.
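
As a sketch of what being resource aware could look like (illustration only,
with made-up thresholds; this is not anything Baloo actually does), the total
RAM is a single syscall away on Linux:

  /* Illustrative sketch, not Baloo code: pick an indexing mode from total
   * RAM via sysinfo(2). The thresholds are made-up examples. */
  #include <stdio.h>
  #include <sys/sysinfo.h>

  int main(void)
  {
      struct sysinfo si;
      if (sysinfo(&si) != 0) {
          perror("sysinfo");
          return 1;
      }

      unsigned long long total_mib =
          (unsigned long long)si.totalram * si.mem_unit / (1024ULL * 1024ULL);

      if (total_mib < 4096)
          puts("low-memory system: default to off or a conserving mode");
      else if (total_mib < 16384)
          puts("mid-range system: index with a modest memory budget");
      else
          puts("plenty of headroom: could even lock the index for speed");

      return 0;
  }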

Regarding that Btrfs issue, it's supposed to be fixed, although earlier this
month I did catch Baloo again with an almost 1G index, stuck indexing some
random file while consuming __half__ of the RAID array's bandwidth for a
*month* (yikes, and mea culpa for ignoring the unusual drive activity for so
long). Presumably it was the same issue at heart, but I was able to "solve" it
by just banning the folders where it got stuck and purging the index, since
I'll never need Baloo to find anything in them.
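
For reference, the banning and purging in my case amounted to something along
these lines (the path is a placeholder; double-check the exact subcommands
against the balooctl help on your version):

  balooctl config add excludeFolders /path/to/the/problem/folder
  balooctl purge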

This is outside my area of expertise, but perhaps it would make sense to use an
established database that is meant for storing large data sets and is already
optimized for good performance on modern hardware? Furthermore, a file being
memory mapped does not guarantee it's actually in memory. The only real
guarantee is to lock the index into memory, but that requires either
CAP_IPC_LOCK or a large enough memlock limit (`ulimit -l`), and Baloo had
better make sure the system has enough resources to spare at all times to avoid
a silly OOM situation, because locked memory will not get paged out.
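
To illustrate the locking point, here is a rough sketch (under the simplifying
assumption of a single index file, so not Baloo's actual code) of how the
fallback logic would look:

  /* Map an index file read-only and try to lock it into RAM, falling back
   * gracefully when RLIMIT_MEMLOCK (or missing CAP_IPC_LOCK) won't allow it.
   * The file name is purely illustrative. */
  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <sys/resource.h>
  #include <sys/stat.h>
  #include <unistd.h>

  int main(void)
  {
      const char *path = "index-db";      /* hypothetical index file */
      int fd = open(path, O_RDONLY);
      if (fd < 0) { perror("open"); return 1; }

      struct stat st;
      if (fstat(fd, &st) < 0) { perror("fstat"); close(fd); return 1; }

      void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
      if (map == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

      struct rlimit rl;
      getrlimit(RLIMIT_MEMLOCK, &rl);     /* bytes; `ulimit -l` shows KiB */

      /* Only attempt the lock if the soft limit can hold the whole file;
       * CAP_IPC_LOCK would lift that limit entirely. */
      if ((unsigned long long)st.st_size <= rl.rlim_cur &&
          mlock(map, (size_t)st.st_size) == 0) {
          puts("index locked in RAM: it cannot be paged out");
          munlock(map, (size_t)st.st_size);
      } else {
          puts("plain mmap only: pages may be evicted under memory pressure");
      }

      munmap(map, (size_t)st.st_size);
      close(fd);
      return 0;
  }

The catch is exactly the one above: once the pages are locked, the kernel
cannot reclaim them, so locking only makes sense after checking that the rest
of the system has headroom left.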
