Alec Matusis wrote: > A follow-up question: > > We have an environment with one database with MDB_INTEGERKEY | MDB_DUPSORT > flags. > The keys and values are never deleted or modified (append-only).
That is not what append-only means in LMDB. I already defined it in my previous email. > There's a single RW transaction on the environment, that is writing the > data. It also needs to read some prior data in order to write new data. It > is committed and then recreated after every 1000 to 3000 mdb_cursor_put() > ops. All keys and values in the mdb_cursor_put ops are about the same size. > > I am printing and plotting mdb_stat after every commit. > I have observed a sudden Free Pages jump as shown on the top graph here: > https://imgur.com/a/RZi1Vjn The ratio of Free/Used pages has similar jump on > the second graph. The number of Used pages(computed as > MDB_envinfo.me_last_pgno - free_pages or equivalently " Number of pages > used" - " Free pages" from mdb_stat -fe utility) just grown monotonously on > the third graph, as expected. > > We commit about the same data each time(same size * same number of put ops). > > What might cause this jump? Tree rebalancing when a page gets full and causes a ripple effect back up to the root of the tree. In a pure append-only workload (using the MDB_APPEND write flag) this doesn't happen. In a workload where keys are inserted in random order, it does. > We have other databases, but they do not have this behavior- the ratio of > free/used pages is more or less constant after each commit, oscillating > around 0.001. > > -----Original Message----- > From: Howard Chu [mailto:[email protected]] > Sent: Wednesday, March 10, 2021 8:12 AM > To: Alec Matusis <[email protected]>; [email protected] > Subject: Re: data.mdb grows faster with more frequent commits > > Alec Matusis wrote: >> We have an environment with no flags that contains a database with no >> flags. The database is append only, no deletions or modifications. It >> is written using a single RW transaction, in the absence of any RO >> transactions. We observe that when we commit and recreate the RW >> transaction every 2000 insertion ops, the data.mdb file size on disk is 2x > larger than when committing every 64000 insertion ops. The mdb_copy –c > utility shrinks the large 2k ops commit file to almost the same file size as > the 64k commit one. mdb_stat –e on the data.mdb shows that when we have > more commits and bigger file, we have more pages used by the same > proportion. >> >> In production we will have several large DBs (>1TB) on an NVMe card >> and we do not have the 2x space for periodic mdb_copy –c >> compactifications (and we cannot stop the writing process). We also >> need to commit every 2000 write ops, because there will be short-lived >> RO transactions that need to see the DB updates every >> 2000 writes. >> >> >> >> 1. Why is the file size on disk dependent on the commit frequency? (I >> suppose because with less frequent commits it can allocate data between > pages more efficiently)? > > LMDB does copy-on-write. Every time you start a new transaction, any page > you modify must be copied first. > If you do many operations in the same transaction, the modified pages can be > reused as-is, instead of needing to be copied again. > >> 2. What can we do to reduce data.mdb, if we must commit frequently? Can > we use any environment, transaction or db flags, or anything else? > > If it is truly, strictly append-only use, which means every newly inserted > key is greater than all existing keys, then you should use the MDB_APPEND > flag. That will cut growth by half. > >> We are on Linux 5.4.0 / ext4 fs. The DB that grows 2x faster with more >> frequent commits has bytearr key -> u32 val structure (the byterarray key > is between 31 and 36 bytes). Another DB that has a reverse u32 key -> > bytearr structure oonly grows 10% larger in the more frequent commits > regime. >> > > > -- > -- Howard Chu > CTO, Symas Corp. http://www.symas.com > Director, Highland Sun http://highlandsun.com/hyc/ > Chief Architect, OpenLDAP http://www.openldap.org/project/ > -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
