Hi,

- Solr 6.5.1
- SSD disk
- 23M docs, 64G index, single shard

I'm trying to do around 4M in-place docValues updates to a collection
(a single shard of around 23M docs). These are ALL in-place updates.
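
For context, a minimal sketch of what a batch of such updates looks like as a
JSON payload for the collection's /update handler. The field name
"popularity", the doc ids and the batch size are assumptions for
illustration, not the actual schema. As far as I understand, Solr only applies
a "set" as an in-place update when the target field is a single-valued,
non-indexed, non-stored numeric docValues field; otherwise it falls back to a
regular atomic update.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def build_inplace_updates(values: Iterable[Tuple[str, int]], field: str) -> List[Dict]:
    """Turn (doc id, new value) pairs into atomic-update documents using "set"."""
    return [{"id": doc_id, field: {"set": value}} for doc_id, value in values]


def batches(docs: List[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


# Hypothetical data: three documents and a made-up field name.
docs = build_inplace_updates([("doc1", 10), ("doc2", 20), ("doc3", 30)], "popularity")

# Each payload is one JSON array, ready to be sent as the body of an update request.
payloads = [json.dumps(batch) for batch in batches(docs, 2)]
```

Sending the updates in batches rather than one request per document is only a
transport choice here; it does not change how the updates are buffered or
flushed on the server side.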

I can add the updates in around 7 minutes, but flushing to disk takes around
40 minutes! I've been able to add the updates quickly by setting:

<indexConfig>
  <ramBufferSizeMB>4000</ramBufferSizeMB>
</indexConfig>

autoSoftCommit/autoCommit currently disabled.

From the thread dump I see that the flush runs in a single thread and is
extremely slow. Dump below; the culprit seems to be
org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdates(BufferedUpdatesStream.java:666):


   org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:256)
   org.apache.lucene.codecs.blocktree.SegmentTermsEnum.pushFrame(SegmentTermsEnum.java:248)
   org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:538)

   org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdates(BufferedUpdatesStream.java:666)
   org.apache.lucene.index.BufferedUpdatesStream.applyDocValuesUpdatesList(BufferedUpdatesStream.java:612)
   org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:269)
   org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3454)
   org.apache.lucene.index.IndexWriter.applyDeletesAndPurge(IndexWriter.java:4990)
   org.apache.lucene.index.DocumentsWriter$ApplyDeletesEvent.process(DocumentsWriter.java:717)
   org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5040)
   org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:5031)
   org.apache.lucene.index.IndexWriter.updateDocValues(IndexWriter.java:1731)
   org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:911)
   org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:302)
   org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
   org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)


I think this is related to
SOLR-6838 [https://issues.apache.org/jira/browse/SOLR-6838]
and
LUCENE-6161 [https://issues.apache.org/jira/browse/LUCENE-6161]

I need to make the flush faster so that the whole update completes sooner.
Does anyone have a workaround or any suggestions?

Many thanks,
Dan
