[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1552: LUCENE-8962

GitBox Fri, 12 Jun 2020 09:03:28 -0700


mikemccand commented on a change in pull request #1552:
URL: https://github.com/apache/lucene-solr/pull/1552#discussion_r439492017




##########
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##########
@@ -4483,6 +4593,7 @@ public int length() {
         // Merge would produce a 0-doc segment, so we do nothing except commit 
the merge to remove all the 0-doc segments that we "merged":
         assert merge.info.info.maxDoc() == 0;
         commitMerge(merge, mergeState);
+        success = true;

Review comment:
       I think this was a small pre-existing bug.
   
   I.e. the merge has in fact succeeded on this path.  Before this change we 
are calling `closeMergeReaders` twice (once in the line above this, then again 
on line 4720 below.  Maybe that is harmless, but code-wise I think this path 
did succeed.
   
   If necessary, we could pull this out into its own PR?  But I think it's a 
good, if subtle, catch.  The merge did succeed in this path.

##########
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##########
@@ -3228,15 +3268,38 @@ private long prepareCommitInternal() throws IOException 
{
               // sneak into the commit point:
               toCommit = segmentInfos.clone();
 
+              if (anyChanges) {
+                // Find any merges that can execute on commit (per 
MergePolicy).
+                MergePolicy.MergeSpecification mergeSpec =

Review comment:
       I think what makes this tricky is that this is a combination of 
`MergePolicy` (to pick the small merges) and `MergeScheduler` (to run them and 
await their completion, subject to a time limit) purposes.
   
   I do not think you can achieve this by just wrapping in `MergePolicy`, but I 
agree it would be better if we could find a simpler way to achieve it.

##########
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java
##########
@@ -459,6 +463,31 @@ public IndexWriterConfig setCommitOnClose(boolean 
commitOnClose) {
     return this;
   }
 
+  /**
+   * Expert: sets the amount of time to wait for merges returned by 
MergePolicy.findFullFlushMerges(...).
+   * If this time is reached, we proceed with the commit based on segments 
merged up to that point.
+   * The merges are not cancelled, and may still run to completion independent 
of the commit.
+   */
+  public IndexWriterConfig setMaxCommitMergeWaitSeconds(double 
maxCommitMergeWaitSeconds) {

Review comment:
       I think @msfroh had considered a separate `IndexWriter` method before 
but something went wrong with that approach?
   
   I don't think this should be a separate method, actually.
   
   We have a `MergePolicy` that governs which merges should happen upon which 
events/triggers and what this change is adding is a new trigger (on commit) at 
which merging could conceivably occur.  If we added this method, the 
implication to fresh eyes would be that the existing `prepareCommit` will also 
wait for merges with some default parameter, while this new method lets you 
change the default.
   
   Anyway, let's hear from @msfroh if there was some wrinkle on making a 
dedicated method for this, but I still think that's a messy API.  We should 
rather use our existing `MergePolicy` API correctly.

##########
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##########
@@ -3257,6 +3320,52 @@ private long prepareCommitInternal() throws IOException {
       } finally {
         maybeCloseOnTragicEvent();
       }
+
+      if (mergeAwaitLatchRef != null) {
+        CountDownLatch mergeAwaitLatch = mergeAwaitLatchRef.get();
+        // If we found and registered any merges above, within the flushLock, 
then we want to ensure that they
+        // complete execution. Note that since we released the lock, other 
merges may have been scheduled. We will
+        // block until  the merges that we registered complete. As they 
complete, they will update toCommit to
+        // replace merged segments with the result of each merge.
+        config.getIndexWriterEvents().beginMergeOnCommit();
+        mergeScheduler.merge(mergeSource, MergeTrigger.COMMIT);
+        long mergeWaitStart = System.nanoTime();
+        int abandonedCount = 0;
+        long waitTimeMillis = (long) (config.getMaxCommitMergeWaitSeconds() * 
1000.0);
+        try {
+          if (mergeAwaitLatch.await(waitTimeMillis, TimeUnit.MILLISECONDS) == 
false) {
+            synchronized (this) {
+              // Need to do this in a synchronized block, to make sure none of 
our commit merges are currently
+              // executing mergeFinished (since mergeFinished itself is called 
from within the IndexWriter lock).
+              // After we clear the value from mergeAwaitLatchRef, the merges 
we schedule will still execute as
+              // usual, but when they finish, they won't attempt to update 
toCommit or modify segment reference
+              // counts.
+              mergeAwaitLatchRef.set(null);
+              for (MergePolicy.OneMerge commitMerge : commitMerges) {
+                if (runningMerges.contains(commitMerge) || 
pendingMerges.contains(commitMerge)) {
+                  abandonedCount++;
+                }
+              }
+            }
+          }
+        } catch (InterruptedException ie) {
+          throw new ThreadInterruptedException(ie);
+        } finally {
+          if (infoStream.isEnabled("IW")) {
+            infoStream.message("IW", String.format(Locale.ROOT, "Waited %.1f 
ms for commit merges",
+                (System.nanoTime() - mergeWaitStart)/1_000_000.0));
+            infoStream.message("IW", "After executing commit merges, had " + 
toCommit.size() + " segments");
+            if (abandonedCount > 0) {
+              infoStream.message("IW", "Abandoned " + abandonedCount + " 
commit merges after " + waitTimeMillis + " ms");
+            }
+          }
+          if (abandonedCount > 0) {
+            
config.getIndexWriterEvents().abandonedMergesOnCommit(abandonedCount);

Review comment:
       The idea with this is to be able to track externally to `IndexWriter` 
how many merges completed during the commit window, how many did not, how long 
the commit waited for small merges to finish, etc.  This is useful telemetry 
for understanding how the feature is actually working in your production 
cluster.  I don't think we can get the equivalent telemetry by e.g. wrapping 
`MergePolicy` or `MergeScheduler`, because it is `IndexWriter` that knows how 
long it will wait and knows which merges made it and which did not.
   
   I think `testPoints` only run under assertion, and are really designed for 
unit tests to do interesting things.  But maybe we could somehow abuse it for 
this use case?
   
   I agree it should be tested better if we want to include it here.
   
   If it is really controversial then +1 to remove the `IndexWriterEvents` 
entirely here and pursue it as a separate issue.  It was useful for us (Amazon 
product search) to understand how this feature impacts our production clusters, 
and to help us tune a reasonable timeout value.

##########
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java
##########
@@ -109,6 +110,9 @@
   
   /** Default value for whether calls to {@link IndexWriter#close()} include a 
commit. */
   public final static boolean DEFAULT_COMMIT_ON_CLOSE = true;
+
+  /** Default value for time to wait for merges on commit (when using a {@link 
MergePolicy} that implements findFullFlushMerges). */
+  public static final double DEFAULT_MAX_COMMIT_MERGE_WAIT_SECONDS = 30.0;

Review comment:
       Maybe 5.0 or 10.0 seconds instead?  A "typical" commit, writing new 
Lucene segments and `fsync`ing many files can often take several seconds, but 
30 seems high.

##########
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##########
@@ -3152,6 +3154,42 @@ public final boolean flushNextBuffer() throws 
IOException {
     }
   }
 
+  private MergePolicy.OneMerge 
updateSegmentInfosOnMergeFinish(MergePolicy.OneMerge merge, final SegmentInfos 
toCommit,
+                                                                
AtomicReference<CountDownLatch> mergeLatchRef) {
+    return new MergePolicy.OneMerge(merge.segments) {
+      public void mergeFinished() throws IOException {
+        super.mergeFinished();
+        CountDownLatch mergeAwaitLatch = mergeLatchRef.get();
+        if (mergeAwaitLatch == null) {
+          // Commit thread timed out waiting for this merge and moved on. No 
need to manipulate toCommit.
+          return;
+        }
+        if (committed) {
+          deleter.incRef(this.info.files());

Review comment:
       I think @msfroh might remember some context on why he settled on this 
approach.
   
   This is inherently a complex problem: we want to let `MergePolicy` pick the 
"right" merges to do on commit (the smallish ones).  But, it may pick poorly, 
or maybe the box is IO starved currently, and so we want to 1) ask 
`MergeScheduler` to kick off the merges it had requested, but 2) those merges 
that complete it time will make it into the commit point, while those that do 
not will still be allowed to run to completion and be switched in the live 
`SegmentInfos` (no wasted work), but will not be in the current commit point.
   
   Anyway, +1 if we can re-use existing code that "merges the merged segments" 
into a live or the pending commit `SegmentInfos`.  Maybe we can factor out the 
common code to reduce the added complexity here?

##########
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java
##########
@@ -459,6 +463,31 @@ public IndexWriterConfig setCommitOnClose(boolean 
commitOnClose) {
     return this;
   }
 
+  /**
+   * Expert: sets the amount of time to wait for merges returned by 
MergePolicy.findFullFlushMerges(...).
+   * If this time is reached, we proceed with the commit based on segments 
merged up to that point.
+   * The merges are not cancelled, and may still run to completion independent 
of the commit.
+   */
+  public IndexWriterConfig setMaxCommitMergeWaitSeconds(double 
maxCommitMergeWaitSeconds) {
+    this.maxCommitMergeWaitSeconds = maxCommitMergeWaitSeconds;
+    return this;
+  }
+
+  /**
+   * Set the callback that gets invoked when IndexWriter performs various 
actions.
+   */
+  public IndexWriterConfig setIndexWriterEvents(IndexWriterEvents 
indexWriterEvents) {

Review comment:
       Yeah for sure if we include it in this PR it should be tested by the PR.
   
   See my comment above about this.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1552: LUCENE-8962

Reply via email to