[frameworks-baloo] [Bug 434926] Crash in Baloo::IdFilenameDB::get() after Baloo::DocumentUrlDB::get

bugzilla_noreply Sun, 26 Jun 2022 15:14:35 -0700

https://bugs.kde.org/show_bug.cgi?id=434926


--- Comment #15 from tagwer...@innerjoin.org ---
> mdb_dump -a .
The rationale is that being able to dump the database gives you a necessary
(although not sufficient) test that the index is OK? Nice ;-)

> I haven't tried modifying LMDB to scan the *entire* database, continuing on
> errors, and logging *all* data inconsistencies. I think that would help gather
> more data to understand what kind of corruption is happening.
There was an effort to write a consistency checker (a "baloodb"? tool). I
remember it came with *many* *warnings*. I think it has dropped out of the
current distributions but it rather sounds like it needs a revisit :-/

I see there are bugs resurfacing mentioning MDB_BAD_TXN (Bug 406868), I wonder
if these are related...

> So yeah, long-running read transactions cause written unused data to pile up.
> And since the PDF says "No compaction or garbage collection phase is ever
> needed", I suspect Baloo's index file size will *never* decrease, even if data
> gets freed (eg. by closing a long-running read transaction, excluding folders
> from indexing, deleting files, or turning off content indexing). This is...
> suboptimal.
I see the behaviour of baloo grabbing space and not releasing it; the index
gradually increases in size with time. I'm not so worried about the disc usage
but that "rather sparse" data might be pulled into memory is not so good.

There is the option to copy/compress the database:
    mdb_copy -n -c index index.new
Sometimes this does well, sometimes just so-so...
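
As a rough sketch of how one might check what the copy gains: the function
below runs the same mdb_copy call and prints both sizes. The name
compact_index is made up for illustration, and it assumes Baloo is not writing
to the index while the copy is made (for example after "balooctl suspend").

```shell
# Sketch only, not a tested procedure for a live index.
# compact_index is a hypothetical helper name, not part of Baloo or LMDB.
compact_index() {
    src="$1"   # existing index file
    dst="$2"   # where the compacted copy is written
    # -n: the path is a single file rather than an environment directory
    # -c: compact while copying, leaving freed/unused pages out of the copy
    mdb_copy -n -c "$src" "$dst" || return 1
    # Print both sizes in bytes so the gain (if any) is visible
    echo "before: $(wc -c < "$src") bytes"
    echo "after: $(wc -c < "$dst") bytes"
}
```

Usage would be something like "compact_index index index.new", run from the
directory holding the index (assumed here to be ~/.local/share/baloo).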

> Reading https://www.openldap.org/lists/openldap-devel/201710/msg00019.html
It *may* be that this is/was the upstream responsible for Bug 389848 as
    https://bugs.openldap.org/show_bug.cgi?id=8756
is referenced.

> Seeking Audacity to offset 93474816, I see data with a periodicity of 10...
You are going too deep for me and I doubt that I'll be able to help much. Let me
try the "mdb_dump -a -n index" trick to see if I get any catches though.

It might be worth confirming you hit trouble with the database on an ext4
filesystem (and not BTRFS where I'd want to know that COW is disabled on the
directory).
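
For what it's worth, a quick way to check both points might look like the
sketch below. It assumes GNU stat and lsattr (from e2fsprogs) are installed;
check_index_dir is a made-up name, and the directory to pass is whichever one
holds the index (assumed to be ~/.local/share/baloo).

```shell
# Sketch only: report the filesystem type and whether No_COW is set.
# check_index_dir is a hypothetical helper name.
check_index_dir() {
    dir="$1"
    # GNU stat: -f queries the filesystem, %T prints its type name
    fs=$(stat -f -c %T "$dir" 2>/dev/null) || fs="unknown"
    echo "filesystem: $fs"
    # lsattr -d lists the attributes of the directory itself;
    # a 'C' in the flags field means No_COW is set
    flags=$(lsattr -d "$dir" 2>/dev/null | awk '{print $1}') || flags=""
    case "$flags" in
        *C*) echo "No_COW: set" ;;
        "")  echo "No_COW: unknown (lsattr gave no answer here)" ;;
        *)   echo "No_COW: not set" ;;
    esac
}
```

As far as I know GNU stat reports ext4 as "ext2/ext3", so that answer would
still mean an ext4 filesystem.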

-- 
You are receiving this mail because:
You are watching all bug changes.
