Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

via GitHub Sun, 07 Jul 2024 14:20:58 -0700


jpountz commented on code in PR #13535:
URL: https://github.com/apache/lucene/pull/13535#discussion_r1667767007



##########
lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java:
##########
@@ -578,7 +578,8 @@ public synchronized boolean writeFieldUpdates(
       // IndexWriter.commitMergedDeletes).
       final SegmentReader reader;
       if (this.reader == null) {
-        reader = new SegmentReader(info, indexCreatedVersionMajor, 
IOContext.READONCE);
+        IOContext context = info.info.getUseCompoundFile() ? IOContext.DEFAULT 
: IOContext.READONCE;

Review Comment:
   > This looks correct with READONCE, as we close the segment reader down 
there, so it is only open for short time and we never look into it by different 
threads.
   
   I keep being confused by `READONCE` because the name suggests that data is 
read "once" but often it's used to mean "mostly sequentially" but not 
necessarily "once", such as here: we'd pass the doc values reader from this 
`SegmentReader` to a doc-values consumer, which typically performs multiple 
passes on the data: a first one to collect stastistics, and a second one to 
actually write the data. So data is not actually read once, but multiple times. 
(We do have files that get read once, e.g. all metadata files: `.tmd`, `.fdm`, 
`.kdm`, etc.)
   
   I'm unsure if we should rename `READONCE` or introduce a new context, but it 
looks like there is something worth fixing here.
   
   Separately, `OrdsBlockTreeTermsReader` is bad: it stores index data and 
metadata next to one another in the same file, and needs to seek towards the 
end of the file at open time to read the offset where metadata starts. We 
should fix it by storing data and metadata in separate files.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

Reply via email to