Re: large scale indexing issues / single threaded bottleneck

Roman Alekseenkov Fri, 28 Oct 2011 11:56:45 -0700

I'm wondering if this is relevant:
https://issues.apache.org/jira/browse/LUCENE-2680 - Improve how
IndexWriter flushes deletes against existing segments


Roman

On Fri, Oct 28, 2011 at 11:38 AM, Roman Alekseenkov
<ralekseen...@gmail.com> wrote:
> Hi everyone,
>
> I'm looking for some help with Solr indexing issues on a large scale.
>
> We are indexing a few terabytes/month on a sizeable Solr cluster (8
> masters serving writes, 16 slaves serving reads). After a certain
> amount of tuning we got to the point where a single Solr instance can
> handle an index size of 100GB without many issues, but beyond that we
> are starting to observe noticeable delays on index flush, and they are
> getting larger. See the attached picture for details; it was captured
> for a single JVM on a single machine.
>
> We are posting data in 8 threads using the javabin format and
> committing every 5K documents, with merge factor 20 and a RAM buffer
> size of about 384MB. From the picture it can be seen that
> single-threaded index-flushing code kicks in on every commit and
> blocks all other indexing threads. The hardware is decent (12
> physical / 24 virtual cores per machine) and it is mostly idle while
> the index is flushing: very little CPU utilization and disk I/O
> (<5%), with the exception of the single CPU core that actually does
> the index flush (95% CPU, 5% I/O wait).
>
> My questions are:
>
> 1) will the Solr changes from the real-time branch help to resolve
> these issues? I was reading
> http://blog.mikemccandless.com/2011/05/265-indexing-speedup-with-lucenes.html
> and it looks like we have exactly the same problem
>
> 2) what would be the best way to port these (and only these) changes
> to 3.4.0? I tried to dig into the branching and revisions, but got
> lost quickly. Tried something like "svn diff
> […]realtime_search@r953476 […]realtime_search@r1097767", but I'm not
> sure if it's even possible to merge these into 3.4.0
>
> 3) what would you recommend for production 24/7 use? 3.4.0?
>
> 4) is there a workaround that can be used? also, I listed the stack
> trace below
>
> Thank you!
> Roman
>
> P.S. This single "index flushing" thread spends 99% of its time in
> "org.apache.lucene.index.BufferedDeletesStream.applyDeletes", and then
> the merge seems to go quickly. I looked it up and it looks like the
> intent here is deleting old commit points (we are keeping only 1
> non-optimized commit point per config). Not sure why it is taking
> that long.
>
> pool-2-thread-1 [RUNNABLE] CPU time: 3:31
> java.nio.Bits.copyToByteArray(long, Object, long, long)
> java.nio.DirectByteBuffer.get(byte[], int, int)
> org.apache.lucene.store.MMapDirectory$MMapIndexInput.readBytes(byte[], int, int)
> org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
> org.apache.lucene.index.SegmentTermEnum.next()
> org.apache.lucene.index.TermInfosReader.<init>(Directory, String, FieldInfos, int, int)
> org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentReader, Directory, SegmentInfo, int, int)
> org.apache.lucene.index.SegmentReader.get(boolean, Directory, SegmentInfo, int, boolean, int)
> org.apache.lucene.index.IndexWriter$ReaderPool.get(SegmentInfo, boolean, int, int)
> org.apache.lucene.index.IndexWriter$ReaderPool.get(SegmentInfo, boolean)
> org.apache.lucene.index.BufferedDeletesStream.applyDeletes(IndexWriter$ReaderPool, List)
> org.apache.lucene.index.IndexWriter.doFlush(boolean)
> org.apache.lucene.index.IndexWriter.flush(boolean, boolean)
> org.apache.lucene.index.IndexWriter.closeInternal(boolean)
> org.apache.lucene.index.IndexWriter.close(boolean)
> org.apache.lucene.index.IndexWriter.close()
> org.apache.solr.update.SolrIndexWriter.close()
> org.apache.solr.update.DirectUpdateHandler2.closeWriter()
> org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)
> org.apache.solr.update.DirectUpdateHandler2$CommitTracker.run()
> java.util.concurrent.Executors$RunnableAdapter.call()
> java.util.concurrent.FutureTask$Sync.innerRun()
> java.util.concurrent.FutureTask.run()
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor$ScheduledFutureTask)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
>
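
For reference, the posting pattern described in the quoted message (8
threads, one commit every 5K documents) could be sketched roughly as
follows. This is only an illustration, not the actual indexing client:
the Indexer interface and CountingIndexer class are hypothetical
stand-ins that merely count calls, so nothing here talks to Solr or
uses SolrJ.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class CommitEveryN {
    /** Hypothetical stand-in for a Solr client; not a real SolrJ type. */
    interface Indexer {
        void add(String doc);
        void commit();
    }

    /** Counts calls only, so the sketch runs without a Solr server. */
    static final class CountingIndexer implements Indexer {
        final AtomicLong added = new AtomicLong();
        final AtomicInteger commits = new AtomicInteger();
        public void add(String doc) { added.incrementAndGet(); }
        public void commit() { commits.incrementAndGet(); }
    }

    /** Posts docs from several threads; commits once per commitEvery docs. */
    static void run(Indexer indexer, int threads, int docsPerThread, int commitEvery) {
        AtomicLong posted = new AtomicLong(); // shared count across all threads
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    indexer.add("doc-" + id + "-" + i);
                    // Exactly one thread observes each multiple of commitEvery.
                    if (posted.incrementAndGet() % commitEvery == 0) {
                        indexer.commit();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("indexing threads timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CountingIndexer indexer = new CountingIndexer();
        run(indexer, 8, 5_000, 5_000); // 8 threads, commit every 5K docs
        System.out.println(indexer.added.get() + " docs, "
                + indexer.commits.get() + " commits");
    }
}
```

With 8 threads posting 5,000 documents each, this prints
"40000 docs, 8 commits". In the real setup every one of those commits
is a point where the flush in the stack trace above would run.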
