https://bugs.kde.org/show_bug.cgi?id=434926
--- Comment #14 from nyanpasu64 <nyanpas...@tuta.io> ---
# Identifying the bad table

I did find something interesting. Baloo creates 12 databases (lmdb's name for
tables):

- postingdb
- positiondb
- docterms
- docfilenameterms
- docxatrrterms
- idtree
- idfilename
- documenttimedb
- documentdatadb
- indexingleveldb
- failediddb
- mtimedb

As mentioned, I copied my bad Baloo index to the name "data.mdb" so mdb_dump
would find it.

I then created a modified build of mdb_dump that skips over corrupted subtrees
(logging them) instead of aborting the program (though it fails if the *first*
leaf page in the entire database/table is corrupted, which I've not
encountered). I've pushed the code to
https://codeberg.org/nyanpasu64/lmdb-debug.

- To build this repo, run `make -j(N)` to produce a statically linked
`mdb_dump` binary which ignores the system LMDB.
- To scan a single table, log all structural errors found (branch/interior
nodes with no children, or leaf nodes with children), and measure the size of
the text dump, run:
> .../lmdb/libraries/liblmdb/mdb_dump . -s (NAME) |pv -b>/dev/null
- To scan all tables together, and log each table name and all of its errors,
run:
> .../lmdb/libraries/liblmdb/mdb_dump . -a -f /dev/null

All tables aside from mtimedb have roughly 0-5 errors each. There is no clear pattern
among the contents of the bad pages; some have 16-byte headers followed by
2-byte-periodic pointers then content (like a normal LMDB page), while one of
them (0x5925000) has 10-byte-periodic data.
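For eyeballing whether a reported bad page still looks like "a normal LMDB page", here is a minimal sketch of a header check. It is not taken from the patched mdb_dump. It assumes a 64-bit little-endian build with 4096-byte pages and a 16-byte header (8-byte page number, 2-byte padding, 2-byte flags, 2-byte lower and upper bounds) followed by 2-byte slot offsets. The layout, the flag values, and the name `describe_page` are assumptions to verify against mdb.c.

```python
import struct

# Assumed header layout: pgno (8), pad (2), flags (2), lower (2), upper (2).
PAGE_HEADER = struct.Struct("<QHHHH")
P_BRANCH, P_LEAF, P_LEAF2 = 0x01, 0x02, 0x20
# Assumed flag bits; check them against mdb.c before relying on the names.
PAGE_FLAGS = {0x01: "branch", 0x02: "leaf", 0x04: "overflow",
              0x08: "meta", 0x10: "dirty", 0x20: "leaf2", 0x40: "subpage"}


def describe_page(raw, page_size=4096):
    """Decode the header of one raw page and list anything implausible."""
    pgno, _pad, flags, lower, upper = PAGE_HEADER.unpack_from(raw)
    kinds = [name for bit, name in PAGE_FLAGS.items() if flags & bit]
    slots, problems = [], []
    if not kinds:
        problems.append("no known page flag set")
    # Only ordinary branch/leaf pages carry a slot array of 2-byte offsets.
    if flags & (P_BRANCH | P_LEAF) and not flags & P_LEAF2:
        if not PAGE_HEADER.size <= lower <= upper <= page_size:
            problems.append("lower/upper out of range")
        else:
            count = (lower - PAGE_HEADER.size) // 2
            slots = list(struct.unpack_from("<%dH" % count, raw,
                                            PAGE_HEADER.size))
            # Node data is assumed to live between `upper` and the page end.
            if any(not upper <= s < page_size for s in slots):
                problems.append("slot offset outside node area")
    return {"pgno": pgno, "kinds": kinds, "slots": slots,
            "problems": problems}
```

To use it, read `page_size` bytes from data.mdb at the offset of a suspect page and pass them in. A page that fails these checks is probably not an LMDB page at all, which would fit the 10-byte-periodic case.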

The last corrupted table ("mtimedb") uses `MDB_INTEGERKEY | MDB_DUPSORT |
MDB_DUPFIXED | MDB_INTEGERDUP`, where MDB_DUPSORT triggers a radically
different internal codepath in lmdb... and this table has *182 distinct
errors*.

Oddly it appears that the mtimedb table has been *entirely* replaced by a
corrupted (older?) version of another table (docterms)! I don't know if mtimedb
or something else caused the corruption though.

# Inspecting the bad table (mtimedb)

To dump the contents of mtimedb, I ran:

> .../lmdb/libraries/liblmdb/mdb_dump . -s mtimedb -f mtimedb

The non-corrupted entries in mtimedb alternate between 8-byte entries (keys?)
and "M" followed by a mimetype interspersed with null bytes (values?). When I
run the same command on my *good* Baloo index, mdb_dump's output contains (as
expected) 4-byte keys and 8-byte values. On the corrupted index file, `mdb_dump
-s mtimedb` almost entirely matches `mdb_dump -s docterms`, except that the
initial metadata differs (duplicates, dupsort, etc.) and the endings differ.
Regular (unpatched) `mdb_dump -s mtimedb` aborts before reaching the end, but
only after writing 33.0 out of 33.1 MiB (including a bit of different data).
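The comparison between the two dumps can be reproduced with a small script. This is a minimal sketch, assuming each dump is the plain-text output of a single table: `key=value` header lines terminated by `HEADER=END`, then one hex-encoded line per key or value, then `DATA=END`. The function names `parse_dump` and `compare_dumps` are hypothetical.

```python
def parse_dump(text):
    """Split one mdb_dump text dump into (header dict, list of data lines)."""
    header, records, in_header = {}, [], True
    for line in text.splitlines():
        if in_header:
            if line == "HEADER=END":
                in_header = False
            elif "=" in line:
                key, _, value = line.partition("=")
                header[key] = value
        elif line == "DATA=END":
            break
        else:
            records.append(line.strip())
    return header, records


def compare_dumps(a_text, b_text):
    """Report header fields that differ and how many leading data lines match."""
    a_header, a_records = parse_dump(a_text)
    b_header, b_records = parse_dump(b_text)
    header_diff = {
        key: (a_header.get(key), b_header.get(key))
        for key in sorted(set(a_header) | set(b_header))
        if a_header.get(key) != b_header.get(key)
    }
    shared = 0
    for a_line, b_line in zip(a_records, b_records):
        if a_line != b_line:
            break
        shared += 1
    return header_diff, shared, len(a_records), len(b_records)
```

Calling `compare_dumps(open("mtimedb").read(), open("docterms").read())` on two single-table dumps would show which header flags differ and how far the data stays identical before the endings diverge.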

Baloo's source code matches the good index; it treats the "mtimedb" table as
mapping from quint32 to 1 or more quint64, with no "M" or mimetype in sight.
The docterms table appears to map from Document::id() to a list of
null-separated type-prefixed tags. The exact format of values lives in Baloo's
DocTermsCodec, and it *might* be better off normalized and replaced with SQL
foreign keys, unless that's too slow to read from.

# Now what?

I think a corruption bug (either on-disk, or in baloo_file or
baloo_file_extractor or balooctl, possibly caused by misidentifying pages
referenced by the *currently written* database tree as free) caused mtimedb's
root pointer to point to docterms. It's possible this was caused by
simultaneous transactions on a single thread, or passing transactions across
threads, though I haven't looked into it. It's also possible that items were
somehow added to the free list (mentioned in
https://schd.ws/hosted_files/buildstuff14/96/20141120-BuildStuff-Lightning.pdf)
despite still being referenced, and were overwritten by new contents.

We know mtimedb is pointing to the wrong table. Why does reading it report
corrupted data at the end? I think it's not a result of misinterpreting the
docterms database (non-dupsort) as dupsort, because 99% of the database is read
properly and is identical to docterms.

- Maybe it was incorrectly pointed to an *old* copy of the docterms database,
which was not itself overwritten, but 182 pages near the end have been
overwritten by pages of a different format.
- Or it's correctly pointing to what *used* to be the mtimedb root page, but
the root page was incorrectly overwritten by a docterms root page.
- Or maybe after the mtimedb pointer was corrupted to treat a non-dupsort
database as a dupsort one, mutation operations corrupted the mtimedb tree
heavily (and other databases randomly). (Note that docterms itself is
uncorrupted; would this have corrupted it too or marked it as freed?)
Semi-related: https://github.com/PowerDNS/pdns/issues/8873

Is LMDB designed to reuse the same page from multiple reachable paths (aka
currently active parents), forming a DAG rather than a tree? If not, is there a
program to verify that isn't happening?

Is it possible to check the current database and see if there are currently any
pages reachable from the root, but also present in the "free list"? If that
happened, what caused it? Multiple pages owning one page and one parent being
freed? Or one parent owning a page, freeing it, but referencing it afterwards?
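Both checks are simple once the page graph has been extracted. This is a minimal sketch over an abstract model, not a reader for real LMDB files: it assumes you already have each table's root page number, each page's child page numbers, and the set of page numbers on the free list. The name `audit_pages` and the input shapes are hypothetical.

```python
from collections import Counter, deque


def audit_pages(children, roots, free_pages):
    """Find pages with several referrers and pages both reachable and free.

    children:   dict mapping a page number to a list of child page numbers
    roots:      dict mapping a table name to its root page number
    free_pages: iterable of page numbers recorded in the free list
    """
    referrers = Counter()
    # A table's root pointer counts as a reference, so two tables sharing
    # one root page are reported too.
    for root in roots.values():
        referrers[root] += 1
    seen = set()
    queue = deque(roots.values())
    while queue:
        page = queue.popleft()
        if page in seen:
            continue  # expand each page once so every edge is counted once
        seen.add(page)
        for child in children.get(page, ()):
            referrers[child] += 1
            queue.append(child)
    shared = sorted(p for p, n in referrers.items() if n > 1)
    reachable_and_free = sorted(seen & set(free_pages))
    return shared, reachable_and_free
```

A non-empty first result means the structure is a DAG rather than a tree. A non-empty second result means a live page could be handed out and overwritten by a later write.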

(At this point, is it worth asking LMDB's author Howard Chu for help?)

## Request for data

Is anyone willing to share their own corrupted database files for me to
analyze, so I have more samples of how a database gets corrupted? Note that
they will contain possibly-sensitive file paths (and perhaps even contents).

Alternatively, can you build and run my custom lmdb, run
`.../lmdb/libraries/liblmdb/mdb_dump . -a -f /dev/null`, and report the errors
detected (this does not leak personal information)?
