[ 
https://issues.apache.org/jira/browse/LUCENE-9148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17097370#comment-17097370
 ] 

Adrien Grand commented on LUCENE-9148:
--------------------------------------

Another annoyance with this file organization is that opening a reader 
performs O(numFields) seeks. So I have started working on splitting the 
file into multiple files.
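
To make the seek count concrete, here is a minimal toy model in plain Java. It does not use Lucene's reader code: the class name, the method names, and the byte sizes are all hypothetical, and a "seek" is simply counted whenever the next index region does not start where the previous read ended.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene code): counts seeks needed to read every field's inner nodes. */
final class BkdLayoutSketch {

    /** Interleaved layout: leaves_A, inner_A, leaves_B, inner_B, ... in one file. */
    static List<long[]> interleavedIndexRegions(int numFields, long leafBytes, long innerBytes) {
        List<long[]> regions = new ArrayList<>();
        long offset = 0;
        for (int field = 0; field < numFields; field++) {
            offset += leafBytes;                           // this field's leaf nodes come first
            regions.add(new long[] {offset, innerBytes});  // {start, length} of its inner nodes
            offset += innerBytes;
        }
        return regions;
    }

    /** Split layout: a separate file holds only inner nodes, stored back to back. */
    static List<long[]> splitIndexRegions(int numFields, long innerBytes) {
        List<long[]> regions = new ArrayList<>();
        for (int field = 0; field < numFields; field++) {
            regions.add(new long[] {field * innerBytes, innerBytes});
        }
        return regions;
    }

    /** Counts a seek whenever a region does not start at the current read position. */
    static int countSeeks(List<long[]> regions) {
        int seeks = 0;
        long position = 0; // the file is assumed to be opened at offset 0
        for (long[] region : regions) {
            if (region[0] != position) {
                seeks++;
            }
            position = region[0] + region[1];
        }
        return seeks;
    }

    public static void main(String[] args) {
        int numFields = 3;
        System.out.println("interleaved seeks: "
            + countSeeks(interleavedIndexRegions(numFields, 1000, 100)));
        System.out.println("split seeks: "
            + countSeeks(splitIndexRegions(numFields, 100)));
    }
}
```

In this model the interleaved layout costs one seek per field, while the split layout reads the index regions sequentially with no seeks at all.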

> Move the BKD index to its own file.
> -----------------------------------
>
>                 Key: LUCENE-9148
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9148
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>
> Lucene60PointsWriter stores both inner nodes and leaf nodes in the same file, 
> interleaved. For instance if you have two fields, you would have 
> {{<leaf_nodes_A, inner_nodes_A, leaf_nodes_B, inner_nodes_B>}}. It's not 
> ideal since leaves and inner nodes have quite different access patterns. 
> Should we split this into two files? When the BKD index is off-heap, 
> this would also help force it into RAM with 
> {{MMapDirectory#setPreload}}.
> Note that Lucene60PointsFormat already has a file that it calls "index", but 
> that file only maps fields to file pointers in the other file, which is not 
> the index I'm discussing here. We could possibly store the BKD indices in 
> this existing file if we want to avoid creating a new one.
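
As a rough illustration of the preload idea, the sketch below maps a file and pages it into memory using only JDK classes (`FileChannel#map` and `MappedByteBuffer#load`). It does not call Lucene's `MMapDirectory`; the class name, method names, and temp file are hypothetical. The sketch assumes the inner nodes sit in their own small file, so preloading it does not also pull the much larger leaf data into RAM.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** JDK-only sketch of preloading a memory-mapped file; not Lucene's implementation. */
final class PreloadSketch {

    /** Maps the whole file read-only and asks the OS to bring its pages into memory. */
    static MappedByteBuffer mapAndPreload(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            buffer.load(); // best effort: the JDK gives no guarantee the pages stay resident
            return buffer; // the mapping stays valid after the channel is closed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes content to a temp file standing in for a small index file, then preloads it. */
    static int preloadedSize(byte[] content) {
        try {
            Path index = Files.createTempFile("bkd-index-sketch", ".bin");
            try {
                Files.write(index, content);
                return mapAndPreload(index).capacity();
            } finally {
                Files.deleteIfExists(index);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("preloaded bytes: " + preloadedSize(new byte[] {1, 2, 3, 4}));
    }
}
```

Because the whole mapped file is loaded, a dedicated index file keeps the preloaded footprint limited to the inner nodes.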



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
