Ivan Unknown wrote:
> Hello!
> 
> I have been looking at the LMDB project code trying to learn and understand
> how a database could be implemented; however, I have been struggling to
> answer the questions below for quite some time, partially due to my limited
> knowledge of C:
> 
> - How does LMDB store keys and values in the page? I have learned that a page 
> consists of the header, slots, and key-value pairs but how does the database
> handle keys and values that are too large to fit within a page?

That is already documented in the code and Doxygen.
> 
> - I found that LMDB does not keep a fixed number of keys per page, so that
> would depend on the key-value pairs' sizes already inserted into the page.
> Is this correct?

Correct. This is a major difference from textbook Btree or B+tree
implementations, but it is essential for good storage utilization.
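
As a rough illustration of why the count varies, here is a toy model of a
slotted page. It is not LMDB's code: the function name is hypothetical, and
the 16-byte page header, 2-byte slot, 8-byte node header, and rounding of
each node to an even size are assumptions chosen to resemble LMDB's layout
on a 64-bit build. It also assumes every entry on the page is the same size.

```c
#include <stddef.h>

/* Assumed layout constants (illustrative, not taken from lmdb.h). */
#define PAGE_HEADER_SIZE 16  /* bytes reserved at the start of each page */
#define SLOT_SIZE         2  /* one offset slot per entry */
#define NODE_HEADER_SIZE  8  /* per-entry header stored before the key */

/* Hypothetical helper: how many entries with the given key and value
 * lengths fit in one page, if all entries are the same size. */
size_t entries_that_fit(size_t page_size, size_t key_len, size_t val_len)
{
    size_t node = NODE_HEADER_SIZE + key_len + val_len;
    node += node & 1;  /* round the node up to an even number of bytes */
    return (page_size - PAGE_HEADER_SIZE) / (SLOT_SIZE + node);
}
```

Under these assumptions a 4096-byte page holds 156 entries with 8-byte keys
and 8-byte values, but only 8 entries with 100-byte keys and 400-byte values.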

> Does it mean that B+tree pages (branch or leaf) could have a different number 
> of keys depending on the key size?

Yes.

> How does it affect performance or implementation of the B+tree?

No particular impact.
> 
> - Are there any limitations on the size of a key?

Yes.

> Can it be of an arbitrary length?

No.
There is work underway to remove length limits on keys in LMDB 1.0 but that 
feature isn't working yet.
> 
> - What is the difference between IS_LEAF and IS_LEAF2 flags in the page 
> header? What is the difference between these pages?

That is already documented in the code and Doxygen.
> 
> - How do overflow pages work in LMDB? From what I could understand, if a key
> or a value does not fit in the page, it will be stored in the overflow page
> (the entire page is allocated for that specific key or value). Is this
> correct?

Yes for values. Not for keys since they have a max length smaller than a page.

> What happens when the key size is several times larger than the page size,
> e.g. 1MB value with 4KB pages?

That is already documented in the code and Doxygen.
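
For a sense of scale, a large value occupies a run of contiguous overflow
pages, with a single page header at the start of the run. The sketch below
estimates the number of pages needed. The function name is hypothetical and
the 16-byte header size is an assumption (it matches a typical 64-bit
build); the arithmetic follows the OVPAGES macro in mdb.c as I understand
it, so check the source for the authoritative definition.

```c
#include <stddef.h>

/* Assumed size of the page header that precedes the value data. */
#define PAGE_HEADER_SIZE 16

/* Hypothetical helper: number of contiguous pages needed to hold one
 * page header followed by value_size bytes of data. */
size_t overflow_pages(size_t value_size, size_t page_size)
{
    return (PAGE_HEADER_SIZE - 1 + value_size) / page_size + 1;
}
```

With these assumptions, a 1MB (1048576-byte) value on 4096-byte pages needs
257 pages rather than 256, because the header pushes the data past the end
of the 256th page.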
> 
> - What is a sub-page in LMDB (F_SUBDATA)? How does it work?

That is already documented in the code and Doxygen.

> I would greatly appreciate it if someone could share links to the
> documentation that covers internals of the database, online videos, research
> papers, mailing lists, or any notes you could share to help me understand
> the above. Thank you very much!

Doxygen docs are embedded in the source code already. You can format them using 
the doxygen tool.

Other info is linked at https://www.symas.com/symas-lmdb-tech-info

> Cheers,
> Ivan
> 


-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
