luyuncheng opened a new pull request, #13822:
URL: https://github.com/apache/lucene/pull/13822

   ### Description
   
   We see scenarios like #13354 and
   https://github.com/elastic/elasticsearch/issues/107513, where Elasticsearch
   calls `removeIndex` or `stop shard` and the abort has to wait for a running
   merge to finish. Although closing the shard can be made asynchronous, a
   long-running merge such as an HNSW graph build still takes a long time to
   complete, even though all we want in the end is to remove the index.
   
   The abort is waiting here:
   ```
        at java.lang.Object.wait(java.base@17.0.2/Native Method)
        - waiting on <no object reference available>
        at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:5410)
        - locked <0x0000001022b0abe8> (a org.apache.lucene.index.IndexWriter)
        at org.apache.lucene.index.IndexWriter.abortMerges(IndexWriter.java:2721)
        - locked <0x0000001022b0abe8> (a org.apache.lucene.index.IndexWriter)
        at org.apache.lucene.index.IndexWriter.rollbackInternalNoCommit(IndexWriter.java:2469)
        - locked <0x0000001022b0abe8> (a org.apache.lucene.index.IndexWriter)
        at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2449)
        - locked <0x0000001022bae6d0> (a java.lang.Object)
   ```
   
   Meanwhile, Lucene is building the HNSW graph as part of the merge:
   ```
        at org.apache.lucene.codecs.lucene95.OffHeapFloatVectorValues.vectorValue(OffHeapFloatVectorValues.java:61)
        at org.apache.lucene.codecs.lucene95.OffHeapFloatVectorValues$DenseOffHeapVectorValues.vectorValue(OffHeapFloatVectorValues.java:86)
        at org.apache.lucene.util.hnsw.HnswGraphSearcher.compare(HnswGraphSearcher.java:333)
        at org.apache.lucene.util.hnsw.HnswGraphSearcher.searchLevel(HnswGraphSearcher.java:310)
        at org.apache.lucene.util.hnsw.HnswGraphSearcher.searchLevel(HnswGraphSearcher.java:251)
        at org.apache.lucene.util.hnsw.HnswGraphBuilder.addGraphNode(HnswGraphBuilder.java:289)
        at org.apache.lucene.util.hnsw.HnswGraphBuilder.addGraphNode(HnswGraphBuilder.java:297)
        at org.apache.lucene.util.hnsw.HnswGraphBuilder.addVectors(HnswGraphBuilder.java:246)
        at org.apache.lucene.util.hnsw.HnswGraphBuilder.build(HnswGraphBuilder.java:171)
   ```
   
   ### ISSUE 
   #13354
   
   ### Proposal
   
   I think we can pass `MergePolicy.OneMergeProgress` into `MergeState`.
   `OneMergeProgress#aborted` is a volatile variable, so any codec that does
   heavy merge processing can add probes to its merge loop that check the
   status and throw `MergeAbortedException` once the merge has been aborted.
   
   This PR only passes the progress into the state.
   
   As a sample, in `Lucene90CompressingStoredFieldsWriter` we could add probe
   code like this:
   ```
   public int merge(MergeState mergeState) throws IOException {
     ...
     while (sub != null) {
       ... // process and compress the stored fields for one sub

       // Probe: stop early if the merge has been aborted.
       if (mergeState.mergeProgress.isAborted()) {
         throw new MergePolicy.MergeAbortedException("Merge aborted.");
       }
     }
   }
   ```
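
To show the probe pattern end to end without depending on Lucene, here is a minimal, self-contained sketch. `Progress` and `AbortedException` are hypothetical stand-ins for `MergePolicy.OneMergeProgress` and `MergePolicy.MergeAbortedException`, and the loop body is a placeholder for real per-sub merge work. It is an illustration of the idea under those assumptions, not the code in this PR.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class AbortableMergeSketch {

  /** Hypothetical stand-in for OneMergeProgress: only a volatile abort flag. */
  static final class Progress {
    private volatile boolean aborted;

    void abort() {
      aborted = true;
    }

    boolean isAborted() {
      return aborted;
    }
  }

  /** Hypothetical stand-in for MergeAbortedException. */
  static final class AbortedException extends RuntimeException {
    AbortedException(String message) {
      super(message);
    }
  }

  /** Simulates a merge of {@code units} work items, probing the flag after each one. */
  static long merge(Progress progress, long units, CountDownLatch started) {
    long done = 0;
    while (done < units) {
      done++; // placeholder for one unit of heavy work
      started.countDown(); // signal that the merge is running
      if (progress.isAborted()) {
        throw new AbortedException("Merge aborted.");
      }
    }
    return done;
  }

  /** Starts an effectively endless merge, aborts it, and reports whether it stopped. */
  static boolean abortStopsMerge() throws InterruptedException {
    Progress progress = new Progress();
    CountDownLatch started = new CountDownLatch(1);
    AtomicReference<Throwable> failure = new AtomicReference<>();
    Thread worker = new Thread(() -> {
      try {
        merge(progress, Long.MAX_VALUE, started);
      } catch (AbortedException e) {
        failure.set(e);
      }
    });
    worker.start();
    started.await(); // wait until the merge loop is actually running
    progress.abort(); // the volatile write becomes visible to the worker's next probe
    worker.join(10_000);
    return !worker.isAlive() && failure.get() instanceof AbortedException;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("unaborted units: " + merge(new Progress(), 5, new CountDownLatch(1)));
    System.out.println("abort stops merge: " + abortStopsMerge());
  }
}
```

The more often a codec probes inside its heavy loops, the sooner the thread waiting in the abort path can proceed, so the probe placement decides how quickly an abort takes effect.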
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

