jpountz commented on code in PR #12685:
URL: https://github.com/apache/lucene/pull/12685#discussion_r1361188738


##########
lucene/core/src/java/org/apache/lucene/index/SegmentInfo.java:
##########
@@ -153,6 +157,16 @@ public boolean getUseCompoundFile() {
     return isCompoundFile;
   }
 
+  /** Returns true if this segment contains documents written as blocks. */

Review Comment:
   Add a link to `addDocuments` and `updateDocuments`? I wonder if this should 
be a bit more specific, e.g. "as blocks of 2 docs or more" to clarify that 
calling `addDocuments` with a single document doesn't count.



##########
lucene/core/src/test/org/apache/lucene/index/TestAddIndexes.java:
##########
@@ -1815,4 +1815,71 @@ public void testAddIndicesWithSoftDeletes() throws IOException {
     assertEquals(wrappedReader.numDocs(), writer.getDocStats().maxDoc);
     IOUtils.close(reader, writer, dir3, dir2, dir1);
   }
+
+  public void testAddIndicesWithBlocks() throws IOException {
+    boolean addHasBlocks = random().nextBoolean();
+    boolean baseHasBlocks = rarely();

Review Comment:
   All these cases look worth testing every time instead of randomly picking a 
single combination?
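
   A minimal sketch of what exhaustive coverage could look like, in plain Java 
with no Lucene dependency. `BlockCombinations` and `all()` are hypothetical 
names used only for illustration; in the real test, each pair would drive the 
existing body of `testAddIndicesWithBlocks` in place of the two random flags.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockCombinations {
  /**
   * Enumerates every {addHasBlocks, baseHasBlocks} pair instead of picking a
   * single combination at random. Hypothetical helper, not part of the PR.
   */
  public static List<boolean[]> all() {
    List<boolean[]> cases = new ArrayList<>();
    for (boolean addHasBlocks : new boolean[] {false, true}) {
      for (boolean baseHasBlocks : new boolean[] {false, true}) {
        cases.add(new boolean[] {addHasBlocks, baseHasBlocks});
      }
    }
    return cases;
  }
}
```

   Looping over `BlockCombinations.all()` runs all four cases on every test 
execution, so a failure no longer depends on the random seed.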



##########
lucene/core/src/java/org/apache/lucene/index/SegmentInfo.java:
##########
@@ -153,6 +157,16 @@ public boolean getUseCompoundFile() {
     return isCompoundFile;
   }
 
+  /** Returns true if this segment contains documents written as blocks. */

Review Comment:
   Maybe also mention that this started being recorded in 9.9 and that indexes 
created earlier than that will return `false` regardless?
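
   One possible wording for the Javadoc, shown on a stand-alone stub so it 
compiles without Lucene. `SegmentInfoDocSketch` is a hypothetical class, and 
the getter name `getHasBlocks` is taken from the suggestion further down in 
this review; the exact phrasing is only a proposal.

```java
public class SegmentInfoDocSketch {
  private final boolean hasBlocks;

  public SegmentInfoDocSketch(boolean hasBlocks) {
    this.hasBlocks = hasBlocks;
  }

  /**
   * Returns true if this segment contains documents written as blocks of 2 docs or more, e.g.
   * via {@link org.apache.lucene.index.IndexWriter#addDocuments} or {@link
   * org.apache.lucene.index.IndexWriter#updateDocuments}. Calling these methods with a single
   * document does not count as a block.
   *
   * <p>This information started being recorded in 9.9; segments from indexes created earlier
   * than that return {@code false} regardless of how their documents were written.
   */
  public boolean getHasBlocks() {
    return hasBlocks;
  }
}
```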



##########
lucene/core/src/java/org/apache/lucene/index/IndexWriter.java:
##########
@@ -3368,9 +3368,15 @@ public void addIndexesReaderMerge(MergePolicy.OneMerge merge) throws IOException
     String mergedName = newSegmentName();
     Directory mergeDirectory = mergeScheduler.wrapForMerge(merge, directory);
     int numSoftDeleted = 0;
+    boolean hasBlocks = false;
     for (MergePolicy.MergeReader reader : merge.getMergeReader()) {
       CodecReader leaf = reader.codecReader;
       numDocs += leaf.numDocs();
+      if (reader.reader == null) {
+        hasBlocks = true; // NOCOMMIT: can we just assume that it has blocks and go with worst case here?

Review Comment:
   Maybe we could expose `getHasBlocks` in `LeafMetaData` to be able to get this 
information from a `CodecReader`?
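
   A rough sketch of the idea, assuming each leaf can report whether it has 
blocks. `LeafMeta` below is a hypothetical stand-in and not Lucene's 
`LeafMetaData`; `mergedHasBlocks` is likewise an illustrative name. The point 
is only that the merged flag becomes the OR over all incoming leaves instead 
of a worst-case `true`.

```java
import java.util.List;

public class MergedHasBlocks {
  /** Hypothetical stand-in for per-leaf metadata; not Lucene's LeafMetaData. */
  public static final class LeafMeta {
    private final boolean hasBlocks;

    public LeafMeta(boolean hasBlocks) {
      this.hasBlocks = hasBlocks;
    }

    public boolean hasBlocks() {
      return hasBlocks;
    }
  }

  /** The merged segment has blocks if at least one incoming leaf has blocks. */
  public static boolean mergedHasBlocks(List<LeafMeta> leaves) {
    boolean hasBlocks = false;
    for (LeafMeta leaf : leaves) {
      hasBlocks |= leaf.hasBlocks();
    }
    return hasBlocks;
  }
}
```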



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

