[ https://issues.apache.org/jira/browse/LUCENE-9148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17097370#comment-17097370 ]
Adrien Grand commented on LUCENE-9148:
--------------------------------------

Something else that is a bit annoying is that opening a reader does O(numFields) seeks because of this organization of the file. So I started working on splitting it into multiple files.

> Move the BKD index to its own file.
> -----------------------------------
>
>                 Key: LUCENE-9148
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9148
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>
> Lucene60PointsWriter stores both inner nodes and leaf nodes in the same file,
> interleaved. For instance if you have two fields, you would have
> {{<leaf_nodes_A, inner_nodes_A, leaf_nodes_B, inner_nodes_B>}}. It's not
> ideal since leaves and inner nodes have quite different access patterns.
> Should we split this into two files? In the case when the BKD index is
> off-heap, this would also help force it into RAM with
> {{MMapDirectory#setPreload}}.
> Note that Lucene60PointsFormat already has a file that it calls "index" but
> it's really only about mapping fields to file pointers in the other file and
> not what I'm discussing here. But we could possibly store the BKD indices in
> this existing file if we want to avoid creating a new one.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
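
The O(numFields) seek cost described above can be illustrated with a small model. This is only a sketch under stated assumptions: the class `BkdLayoutSketch`, its methods, and the byte sizes are hypothetical and are not part of Lucene. It treats a "seek" as any read that does not start where the previous read ended, and compares the interleaved layout {{<leaf_nodes_A, inner_nodes_A, leaf_nodes_B, inner_nodes_B>}} with a layout where inner nodes live contiguously in their own file.

```java
// Hypothetical model, not Lucene code: all names and sizes are assumptions.
final class BkdLayoutSketch {

    // Start offset of each field's inner nodes when leaves and inner nodes
    // share one file: <leaf_A, inner_A, leaf_B, inner_B, ...>.
    static long[] interleavedIndexOffsets(long[] leafBytes, long[] innerBytes) {
        long[] offsets = new long[leafBytes.length];
        long pos = 0;
        for (int i = 0; i < leafBytes.length; i++) {
            pos += leafBytes[i];   // skip over this field's leaf nodes
            offsets[i] = pos;      // inner nodes start right after them
            pos += innerBytes[i];
        }
        return offsets;
    }

    // Start offset of each field's inner nodes when they are stored
    // back to back in a separate file: <inner_A, inner_B, ...>.
    static long[] splitIndexOffsets(long[] innerBytes) {
        long[] offsets = new long[innerBytes.length];
        long pos = 0;
        for (int i = 0; i < innerBytes.length; i++) {
            offsets[i] = pos;
            pos += innerBytes[i];
        }
        return offsets;
    }

    // Counts reads that do not begin where the previous read ended.
    static int countSeeks(long[] offsets, long[] lengths) {
        int seeks = 0;
        long cursor = -1; // no position yet, so the first read always seeks
        for (int i = 0; i < offsets.length; i++) {
            if (offsets[i] != cursor) {
                seeks++;
            }
            cursor = offsets[i] + lengths[i];
        }
        return seeks;
    }

    public static void main(String[] args) {
        long[] leaf = {4096, 8192};   // made-up leaf sizes for fields A and B
        long[] inner = {128, 256};    // made-up inner-node sizes
        System.out.println("interleaved seeks = "
            + countSeeks(interleavedIndexOffsets(leaf, inner), inner)); // 2
        System.out.println("split seeks = "
            + countSeeks(splitIndexOffsets(inner), inner));             // 1
    }
}
```

In this model the interleaved layout needs one seek per field to load every field's inner nodes, while the split layout needs a single seek followed by a sequential read. Real I/O behavior depends on the directory implementation and the OS page cache, which the sketch does not attempt to capture.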