mikemccand commented on a change in pull request #128:
URL: https://github.com/apache/lucene/pull/128#discussion_r638784407



##########
File path: lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
##########
@@ -926,17 +1100,19 @@ public Status checkIndex(List<String> onlySegments) throws IOException {
    * @lucene.experimental
    */
   public static Status.LiveDocStatus testLiveDocs(
-      CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException {
+      CodecReader reader, PrintStream infoStream, String segmentId) {
     long startNS = System.nanoTime();
+    String segmentPartId = segmentId + "[LiveDocs]";
     final Status.LiveDocStatus status = new Status.LiveDocStatus();
 
     try {
-      if (infoStream != null) infoStream.print("    test: check live docs.....");
+      if (infoStream != null) infoStream.print(segmentPartId + "    test: check live docs.....");

Review comment:
       > > Ideally, all infoStream.print calls for a given "part" of the index 
checking would first append to a per-part log, and then (under lock) print to 
the console/main infoStream as a single "block" of output? (So that we don't see 
confusing interleaved output across segment/part checks.)
   > 
   > Oh I see, I hadn't thought about this approach before, and it sounds 
interesting! I assume by "per-part log" you meant an array of in-memory, 
per-part buffers that accumulate messages during the concurrent check, right? 
If we combine these buffers at the end of / after the concurrent index check, we 
should be OK to just print them out to the main InfoStream without locking?
   
   Yes, exactly!  We won't see the logged output in real time as 
the checks happen, as `CheckIndex` does today.  Instead, all checks run 
concurrently, and once the concurrent checker threads have been joined 
back to the main thread, the main thread prints every per-part output 
message to the console.  The user still sees the same coherent-looking 
output, just with some delay, since we wait for all concurrent checks 
to finish.
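
   A minimal sketch of this "buffer per part, print after join" idea, to make 
the discussion concrete.  It is not the PR's code: `BufferedPartOutput`, 
`PartCheck` and `runAll` are hypothetical names, and each check is assumed to 
write only to the `PrintStream` it is handed.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of CheckIndex.
final class BufferedPartOutput {

  /** Hypothetical stand-in for one check of one index "part". */
  interface PartCheck {
    void run(PrintStream out) throws Exception;
  }

  static void runAll(List<PartCheck> checks, PrintStream mainStream, int threads)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<ByteArrayOutputStream> buffers = new ArrayList<>();
    List<Future<Void>> futures = new ArrayList<>();
    try {
      for (PartCheck check : checks) {
        // Each part gets its own private in-memory buffer.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        buffers.add(buffer);
        Callable<Void> task = () -> {
          try (PrintStream out = new PrintStream(buffer, true, "UTF-8")) {
            check.run(out);
          }
          return null;
        };
        futures.add(pool.submit(task));
      }
      // Join every checker; get() rethrows a failure as ExecutionException,
      // in which case this sketch prints nothing.
      for (Future<Void> future : futures) {
        future.get();
      }
    } finally {
      pool.shutdown();
    }
    // Only the calling thread prints, in submission order, so no lock is needed.
    for (ByteArrayOutputStream buffer : buffers) {
      mainStream.print(buffer.toString("UTF-8"));
    }
    mainStream.flush();
  }
}
```

   Because printing happens only after the join, the blocks come out in 
submission order no matter which check finished first.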
   
   Alternatively, as soon as any concurrent check finishes, it immediately 
acquires a "console printing lock" and prints its full output.  This is a bit 
better because you see the output as each part finishes, and the slow, long-pole 
checker parts don't delay the output of the fast parts.  Less nail-biting for 
the user ...
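
   A rough sketch of this second variant, under the same assumptions as before: 
`LockedBlockOutput`, `PartCheck` and `runAll` are hypothetical names, not 
anything in `CheckIndex`.  Each worker still buffers privately, but flushes its 
whole block to the main stream under a shared lock as soon as it is done.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of CheckIndex.
final class LockedBlockOutput {

  /** Hypothetical stand-in for one check of one index "part". */
  interface PartCheck {
    void run(PrintStream out) throws Exception;
  }

  static void runAll(List<PartCheck> checks, PrintStream mainStream, int threads)
      throws Exception {
    final Object consoleLock = new Object(); // the "console printing lock"
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Void>> futures = new ArrayList<>();
      for (PartCheck check : checks) {
        Callable<Void> task = () -> {
          ByteArrayOutputStream buffer = new ByteArrayOutputStream();
          PrintStream out = new PrintStream(buffer, true, "UTF-8");
          try {
            check.run(out);
          } finally {
            out.flush();
            // Print this part's whole block atomically, even if the check
            // failed, so its partial output is not lost.
            synchronized (consoleLock) {
              mainStream.print(buffer.toString("UTF-8"));
              mainStream.flush();
            }
          }
          return null;
        };
        futures.add(pool.submit(task));
      }
      // Wait for all checks; get() rethrows a failure as ExecutionException.
      for (Future<Void> future : futures) {
        future.get();
      }
    } finally {
      pool.shutdown();
    }
  }
}
```

   Here the block order follows completion order rather than submission order, 
but the lines of one part are never interleaved with those of another.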




