[GitHub] [lucene] zacharymorn commented on a change in pull request #128: LUCENE-9662: [WIP] CheckIndex should be concurrent

GitBox Tue, 18 May 2021 23:49:44 -0700


zacharymorn commented on a change in pull request #128:
URL: https://github.com/apache/lucene/pull/128#discussion_r634957136




##########
File path: lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
##########
@@ -926,17 +1100,19 @@ public Status checkIndex(List<String> onlySegments) throws IOException {
    * @lucene.experimental
    */
   public static Status.LiveDocStatus testLiveDocs(
-      CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException {
+      CodecReader reader, PrintStream infoStream, String segmentId) {
     long startNS = System.nanoTime();
+    String segmentPartId = segmentId + "[LiveDocs]";
     final Status.LiveDocStatus status = new Status.LiveDocStatus();
 
     try {
-      if (infoStream != null) infoStream.print("    test: check live docs.....");
+      if (infoStream != null) infoStream.print(segmentPartId + "    test: check live docs.....");

Review comment:
       > Sorry about not answering the // nocommit question before.
   
   No problem, and thanks again for the review and feedback!
   
   > Ideally, all infoStream.print for a given "part" of the index checking would first append to a per-part log, and then (under lock) print to console/main infoStream as a single "block" of output? (So that we don't see confusing interleaved across segments/parts checks)?
   
   Oh I see. I hadn't thought of this approach before, and it sounds interesting! I assume by "per-part log" you mean an array of in-memory, per-part buffers that accumulate messages during the concurrent check, right? If we combine these buffers only after the concurrent index check has finished, we should be able to print them to the main infoStream without locking?
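
   To make the question concrete, here is a minimal sketch of the "per-part buffer" idea as I understand it. It is not code from this PR: `PerPartLogSketch`, `checkPart`, and `runChecks` are hypothetical names, and the check itself is a stand-in that only writes a status line. Each part writes to its own in-memory stream, and a single thread prints the finished buffers in submission order, so no lock is taken around the main stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of per-part log buffers; not part of CheckIndex. */
final class PerPartLogSketch {

  /** Stand-in for one part check: writes only to its own private buffer. */
  static String checkPart(String segmentPartId) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    PrintStream partStream = new PrintStream(buffer);
    partStream.print(segmentPartId + "    test: check.....");
    partStream.println("OK");
    partStream.flush();
    return buffer.toString();
  }

  /** Runs all part checks concurrently, then prints each buffer as one block. */
  static void runChecks(List<String> segmentPartIds, PrintStream mainStream) {
    ExecutorService executor = Executors.newFixedThreadPool(4);
    try {
      List<CompletableFuture<String>> pending = new ArrayList<>();
      for (String id : segmentPartIds) {
        pending.add(CompletableFuture.supplyAsync(() -> checkPart(id), executor));
      }
      // Only this thread touches mainStream, so blocks cannot interleave.
      for (CompletableFuture<String> result : pending) {
        mainStream.print(result.join());
      }
    } finally {
      executor.shutdown();
    }
  }
}
```

   One trade-off of this shape is that output for a part appears only once that part, and every part submitted before it, has finished.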

##########
File path: lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
##########
@@ -926,17 +1100,19 @@ public Status checkIndex(List<String> onlySegments) throws IOException {
    * @lucene.experimental
    */
   public static Status.LiveDocStatus testLiveDocs(
-      CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException {
+      CodecReader reader, PrintStream infoStream, String segmentId) {
     long startNS = System.nanoTime();
+    String segmentPartId = segmentId + "[LiveDocs]";
     final Status.LiveDocStatus status = new Status.LiveDocStatus();
 
     try {
-      if (infoStream != null) infoStream.print("    test: check live docs.....");
+      if (infoStream != null) infoStream.print(segmentPartId + "    test: check live docs.....");
       final int numDocs = reader.numDocs();
       if (reader.hasDeletions()) {
         Bits liveDocs = reader.getLiveDocs();
         if (liveDocs == null) {
-          throw new RuntimeException("segment should have deletions, but liveDocs is null");
+          throw new RuntimeException(

Review comment:
       Done.

##########
File path: lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
##########
@@ -2106,16 +2286,6 @@ static void checkImpacts(Impacts impacts, int lastTarget) {
     }
   }
 
-  /**
-   * Test the term index.
-   *
-   * @lucene.experimental
-   */
-  public static Status.TermIndexStatus testPostings(CodecReader reader, PrintStream infoStream)

Review comment:
       I think I removed it by accident. I've restored it, along with another one.

##########
File path: lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
##########
@@ -2737,13 +2910,14 @@ public Relation compare(byte[] minPackedValue, byte[] maxPackedValue) {
    * @lucene.experimental
    */
   public static Status.StoredFieldStatus testStoredFields(
-      CodecReader reader, PrintStream infoStream, boolean failFast) throws IOException {
+      CodecReader reader, PrintStream infoStream, String segmentId) {
     long startNS = System.nanoTime();
+    String segmentPartId = segmentId + "[StoredFields]";

Review comment:
       Done (I used `CheckIndexException` instead of `CheckIndexFailure` for naming consistency). I also replaced every `RuntimeException` thrown in `CheckIndex` with this new exception class.
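
       For readers without the diff at hand, a dedicated failure type of this kind could look roughly like the sketch below. This is an illustration under assumptions, not the class from the PR: the enclosing `CheckIndexExceptionSketch`, the unchecked `RuntimeException` supertype, the two constructors, and the `requireLiveDocs` helper are all hypothetical. Only the name `CheckIndexException` and the error message come from this thread.

```java
/** Hypothetical sketch of a dedicated check-failure type; the PR's class may differ. */
final class CheckIndexExceptionSketch {

  /** Assumed to be unchecked so call sites that threw RuntimeException keep compiling. */
  static class CheckIndexException extends RuntimeException {
    CheckIndexException(String message) {
      super(message);
    }

    CheckIndexException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Illustrative helper: fails with the dedicated type instead of a bare RuntimeException. */
  static void requireLiveDocs(Object liveDocs) {
    if (liveDocs == null) {
      throw new CheckIndexException("segment should have deletions, but liveDocs is null");
    }
  }
}
```

       A caller can then catch `CheckIndexException` to tell a failed index check apart from an unrelated runtime error.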




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
