[ 
https://issues.apache.org/jira/browse/LUCENE-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-9694.
----------------------------------------
    Fix Version/s: 8.9
                   master (9.0)
       Resolution: Fixed

Thank you [~zhai7631]!

> New tool for creating a deterministic index
> -------------------------------------------
>
>                 Key: LUCENE-9694
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9694
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: general/tools
>            Reporter: Haoyu Zhai
>            Priority: Minor
>             Fix For: master (9.0), 8.9
>
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Lucene's index is segmented, and the number of segments and the arrangement 
> of documents within them can greatly impact performance.
> Given a stable index sort, our team created a tool that records the document 
> arrangement (called an index map) of one index and rearranges another index 
> (consisting of the same documents) into the same structure (the same number 
> of segments, with the same documents in each segment).
> This tool could also be used in Lucene benchmarks for faster deterministic 
> index construction (if I understand correctly, the Lucene benchmark currently 
> indexes with a single thread to achieve this).
>  
> We've already had some discussion by email:
> [https://markmail.org/message/lbtdntclpnocmfuf]
> I've implemented the first method, using {{IndexWriter.addIndexes}} and a 
> customized {{FilteredCodecReader}} to achieve the goal. Index construction 
> takes about 25 min, and running this tool takes about 10 min.
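
The index map idea in the description above can be sketched in plain Java, without any Lucene classes. This is only an illustration under stated assumptions, not the tool attached to the issue: segments are modeled as lists of document keys, the names IndexMapSketch, recordIndexMap and rearrange are hypothetical, and natural string ordering stands in for the stable index sort. The actual implementation works on real segments through {{IndexWriter.addIndexes}}, which this sketch does not attempt to reproduce.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexMapSketch {

    /** Records, for every document key, the ordinal of the segment holding it. */
    public static Map<String, Integer> recordIndexMap(List<List<String>> segments) {
        Map<String, Integer> indexMap = new LinkedHashMap<>();
        for (int seg = 0; seg < segments.size(); seg++) {
            for (String docKey : segments.get(seg)) {
                indexMap.put(docKey, seg);
            }
        }
        return indexMap;
    }

    /** Regroups the documents of another index into the recorded structure. */
    public static List<List<String>> rearrange(List<List<String>> other,
                                               Map<String, Integer> indexMap) {
        // The number of target segments is one more than the largest recorded ordinal.
        int segmentCount = 0;
        for (int seg : indexMap.values()) {
            segmentCount = Math.max(segmentCount, seg + 1);
        }
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < segmentCount; i++) {
            result.add(new ArrayList<>());
        }
        // Route each document to the segment the index map assigns to it.
        for (List<String> segment : other) {
            for (String docKey : segment) {
                Integer target = indexMap.get(docKey);
                if (target == null) {
                    throw new IllegalArgumentException("Document not in index map: " + docKey);
                }
                result.get(target).add(docKey);
            }
        }
        // Assumption: natural ordering plays the role of the stable index sort,
        // so the order inside each segment does not depend on the input layout.
        for (List<String> segment : result) {
            Collections.sort(segment);
        }
        return result;
    }
}
```

Because the sort fixes the order inside each segment, the result depends only on the index map and the set of documents, not on how the second index happened to be segmented when it was built.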



