[jira] [Comment Edited] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

kkewwei (Jira) Fri, 18 Jun 2021 02:49:05 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365411#comment-17365411
 ]


kkewwei edited comment on LUCENE-10004 at 6/18/21, 9:48 AM:
------------------------------------------------------------

I read the merge logic, including DocIdMerger. In the new segment, it doesn't 
matter whether a chunk is flushed to the file earlier or later. 

The order of stored documents matters only while we read them from the old 
segment. Once they have been read from the old segment into memory, the old 
order is no longer needed; the new order is determined by the global 
parameter `docBase`.
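
To illustrate what "determined by `docBase`" means here, below is a minimal sketch, assuming a merge with no index sort and no deleted documents, where each source segment is appended in turn. The class `DocBaseSketch` and the method `remap` are hypothetical names for illustration only and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;

public class DocBaseSketch {

    // Hypothetical helper: given the doc count of each source segment,
    // return the new doc IDs in the merged segment, in assignment order.
    // Assumes segments are appended one after another (no index sort,
    // no deletions), so newDocId = docBase + oldDocId.
    static List<Integer> remap(int[] segmentSizes) {
        List<Integer> newIds = new ArrayList<>();
        int docBase = 0; // running offset of the current source segment
        for (int size : segmentSizes) {
            for (int oldDocId = 0; oldDocId < size; oldDocId++) {
                newIds.add(docBase + oldDocId);
            }
            docBase += size; // the next segment starts after this one
        }
        return newIds;
    }

    public static void main(String[] args) {
        // Two source segments with 3 and 2 documents.
        System.out.println(remap(new int[] {3, 2})); // prints [0, 1, 2, 3, 4]
    }
}
```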


was (Author: kkewwei):
I read merge logic, including DocIdMerger. In the new segment, it doesn't 
matter if the chunk is flush to file earlier or later. 

The order of stored document is important when we read it from old segment, 
when reading it from old segment and put into memory, the old order of stored 
document is useless. the new 
 order is determined by the global parameter `docBase`.

> Delete unnecessary flush in 
> Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks
> -----------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-10004
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10004
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: 8.8.2
>            Reporter: kkewwei
>            Priority: Major
>
> In CompressingStoredFieldsWriter.merge(), if the segment meets the following 
> conditions:
> {code:java}
> else if (matchingFieldsReader.getCompressionMode() == compressionMode &&
>          matchingFieldsReader.getChunkSize() == chunkSize &&
>          matchingFieldsReader.getPackedIntsVersion() == PackedInts.VERSION_CURRENT &&
>          liveDocs == null &&
>          !tooDirty(matchingFieldsReader)) {
>   ......
>   // flush any pending chunks
>   if (numBufferedDocs > 0) {
>     flush();
>     numDirtyChunks++; // incomplete: we had to force this flush
>   }
>   ......
> }
> {code}
> we copy all of its chunks to the new fdt file. Before copying the chunks, we 
> flush the buffered docs if numBufferedDocs > 0, but this flush is 
> unnecessary.
> The buffered docs in memory have nothing to do with copying chunks. We only 
> need to ensure that they are flushed at the end of the merge (in finish()).
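
For reference, the current behaviour described in the issue can be modelled with a small toy class. This is only a sketch of the counting logic in the quoted snippet, not the Lucene implementation: `DirtyChunkSketch` and all of its members are hypothetical, and it assumes that a forced flush writes exactly one chunk and that bulk-copied chunks are written unchanged.

```java
public class DirtyChunkSketch {
    private int numBufferedDocs = 0;   // docs waiting in the in-memory buffer
    private int numDirtyChunks = 0;    // chunks written by a forced flush
    private int numChunksWritten = 0;  // all chunks written so far

    // Buffer one document in memory (no I/O in this toy model).
    void bufferDoc() {
        numBufferedDocs++;
    }

    // Write the buffered docs as one chunk and clear the buffer.
    void flush() {
        numChunksWritten++;
        numBufferedDocs = 0;
    }

    // Model of the quoted branch: force a flush of pending docs,
    // count it as dirty, then bulk-copy the source segment's chunks.
    void copyChunks(int chunkCount) {
        if (numBufferedDocs > 0) {
            flush();
            numDirtyChunks++; // incomplete: we had to force this flush
        }
        numChunksWritten += chunkCount;
    }

    int numDirtyChunks() {
        return numDirtyChunks;
    }

    int numChunksWritten() {
        return numChunksWritten;
    }

    public static void main(String[] args) {
        DirtyChunkSketch writer = new DirtyChunkSketch();
        writer.bufferDoc();
        writer.bufferDoc();
        writer.copyChunks(3); // pending docs force one dirty chunk
        writer.copyChunks(2); // nothing buffered, so no forced flush
        System.out.println(writer.numDirtyChunks() + " dirty of "
            + writer.numChunksWritten() + " chunks"); // prints "1 dirty of 6 chunks"
    }
}
```

In this model, every bulk copy that starts while docs are still buffered adds one dirty chunk, which is the effect the issue is concerned with.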



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
