[ https://issues.apache.org/jira/browse/LUCENE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17196521#comment-17196521 ]

ASF subversion and git services commented on LUCENE-9510:
---------------------------------------------------------

Commit 97a4af6890d3efc64bfb4a4d4dd6ffc02d8b7240 in lucene-solr's branch 
refs/heads/SOLR-14866 from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=97a4af6 ]

LUCENE-9510: Don't pull a merge instance when flushing stored fields 
out-of-order. (#1872)

With recent changes to stored fields that split blocks into several
sub-blocks, the merge instance has become much slower at random access
because it decompresses every sub-block of a block when accessing a
single document. Since stored fields are likely accessed in random order
at flush time when index sorting is enabled, it is better not to use the
merge instance there.
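
To make the cost difference concrete, here is a minimal sketch of a toy
model. It is not Lucene's stored fields format or API: the class
SubBlockAccessSketch, its methods, and the sizes (2 documents per
sub-block, 8 sub-blocks per block) are all hypothetical. It assumes that a
merge-style reader decompresses a whole block and keeps only the current
block cached, while a per-document reader decompresses one sub-block per
read. It only counts decompressed sub-blocks for two access orders.

```java
// Toy model only: NOT Lucene's stored fields format or API.
// All names and sizes here are hypothetical, chosen for illustration.
final class SubBlockAccessSketch {
    private final int docsPerBlock;
    private final int subBlocksPerBlock;
    private int cachedBlock = -1;        // block held decompressed (merge-style only)
    private long subBlocksDecompressed;  // running cost counter

    SubBlockAccessSketch(int docsPerSubBlock, int subBlocksPerBlock) {
        this.docsPerBlock = docsPerSubBlock * subBlocksPerBlock;
        this.subBlocksPerBlock = subBlocksPerBlock;
    }

    // Assumed merge-style read: decompress every sub-block of the block
    // unless that block is the one already cached.
    void readMergeStyle(int docId) {
        int block = docId / docsPerBlock;
        if (block != cachedBlock) {
            subBlocksDecompressed += subBlocksPerBlock;
            cachedBlock = block;
        }
    }

    // Assumed per-document read: decompress only the sub-block holding
    // the document, with no caching between reads.
    void readPerDocument(int docId) {
        subBlocksDecompressed += 1;
    }

    // Total sub-blocks decompressed when reading docs in the given order.
    static long cost(int[] order, boolean mergeStyle) {
        SubBlockAccessSketch store = new SubBlockAccessSketch(2, 8);
        for (int docId : order) {
            if (mergeStyle) {
                store.readMergeStyle(docId);
            } else {
                store.readPerDocument(docId);
            }
        }
        return store.subBlocksDecompressed;
    }

    // Doc IDs in increasing order, as a merge would read them.
    static int[] sequentialOrder(int numDocs) {
        int[] order = new int[numDocs];
        for (int i = 0; i < numDocs; i++) {
            order[i] = i;
        }
        return order;
    }

    // A scattered order standing in for sorted-index flush order. It visits
    // each doc once when numDocs shares no factor with 17; with 64 docs and
    // 16 docs per block, consecutive reads always land in different blocks.
    static int[] scatteredOrder(int numDocs) {
        int[] order = new int[numDocs];
        for (int i = 0; i < numDocs; i++) {
            order[i] = (i * 17) % numDocs;
        }
        return order;
    }

    public static void main(String[] args) {
        int[] seq = sequentialOrder(64);
        int[] scattered = scatteredOrder(64);
        System.out.println("sequential, merge-style:  " + cost(seq, true));
        System.out.println("sequential, per-document: " + cost(seq, false));
        System.out.println("scattered,  merge-style:  " + cost(scattered, true));
        System.out.println("scattered,  per-document: " + cost(scattered, false));
    }
}
```

In this model the merge-style reader is cheapest for sequential reads and
most expensive for scattered reads. The counts are illustrative only and
say nothing about the benchmark numbers reported in this commit.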

On a synthetic benchmark that has one stored field and one numeric
doc-value field that is used for sorting and fed with random values,
this made indexing more than 4x faster.

> SortingStoredFieldsConsumer should use a format that has better random-access
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-9510
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9510
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We noticed some indexing rate regressions in Elasticsearch after upgrading to 
> a new Lucene snapshot. The cause is that SortingStoredFieldsConsumer uses the 
> default codec to write stored fields on flush. Compression doesn't matter much 
> here, since these are temporary files that get removed on flush after the 
> segment is sorted anyway, so we could switch to a format that has faster 
> random access.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

