As an update, I have confirmed that it doesn't seem to have anything to do with child documents, or standard deletes, just deleteByQuery. If I do a deleteByQuery on any collection while also adding/updating in separate threads I am experiencing this blocking behavior on the non-leader replica.
Has anyone else experienced this, or have any thoughts on what to try?

On Sun, Nov 5, 2017 at 2:20 PM, Chris Troullis <cptroul...@gmail.com> wrote:

> Hi,
>
> I am experiencing an issue where threads are blocking for an extremely
> long time when I am indexing while deleteByQuery is also running.
>
> Setup info:
> -Solr Cloud 6.6.0
> -Simple 2 Node, 1 Shard, 2 replica setup
> -~12 million docs in the collection in question
> -Nodes have 64 GB RAM, 8 CPUs, spinning disks
> -Soft commit interval 10 seconds, Hard commit (open searcher false) 60 seconds
> -Default merge policy settings (Which I think is 10/10).
>
> We have a query-heavy, index-heavyish use case. Indexing is constantly
> running throughout the day and can be bursty. The indexing process handles
> both updates and deletes, can spin up to 15 simultaneous threads, and sends
> to Solr in batches of 3000 (seems to be the optimal number per trial and
> error).
>
> I can build the entire collection from scratch using this method in < 40
> mins and indexing is in general super fast (averages about 3 seconds to
> send a batch of 3000 docs to Solr). The issue I am seeing is that when some
> threads are adding/updating documents while other threads are issuing
> deletes (using deleteByQuery), Solr seems to get into a state of extreme
> blocking on the replica, which results in some threads taking 30+ minutes
> just to send 1 batch of 3000 docs. This collection does use child documents
> (hence the delete by query on _root_); I am not sure if that makes a
> difference, so I am trying to duplicate on a non-child doc collection.
> CPU/IO wait seems minimal on both nodes, so I am not sure what is causing
> the blocking.
>
> Here is part of the stack trace on one of the blocked threads on the
> replica:
>
> qtp592179046-576 (576)
> java.lang.Object@608fe9b5
> org.apache.solr.update.DirectUpdateHandler2.addAndDelete(DirectUpdateHandler2.java:354)
> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:237)
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:979)
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1192)
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
> org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
>
> A cursory search led me to this JIRA:
> https://issues.apache.org/jira/browse/SOLR-7836
> I am not sure if it is related, though.
>
> Can anyone shed some light on this issue? We don't do deletes very
> frequently, but it is bringing Solr to its knees when we do, which is
> causing some big problems.
>
> Thanks,
>
> Chris
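
To gauge how widespread the blocking is, it may help to count how many threads in a full thread dump are sitting at the same frame as the one quoted above. The following is only a rough sketch: BlockedFrameCounter is a hypothetical helper (not part of Solr), it assumes the dump is plain text with a blank line between thread stanzas (as jstack produces), and the built-in sample, including the thread named "example-thread-2", is synthetic.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Solr: counts the threads in a plain-text
// thread dump whose stack contains a given frame.
public class BlockedFrameCounter {

    // Assumption: each thread's stanza is separated from the next by at
    // least one blank line, as in jstack output.
    static int countThreadsAt(String dump, String frame) {
        int count = 0;
        for (String stanza : dump.split("\\r?\\n\\s*\\r?\\n")) {
            if (stanza.contains(frame)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java BlockedFrameCounter [dump-file] [frame-substring]
        String frame = args.length > 1 ? args[1] : "DirectUpdateHandler2.addAndDelete";
        String dump;
        if (args.length > 0) {
            dump = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        } else {
            // Synthetic sample: the first stanza reuses the top frames from
            // the trace in this thread; the second stanza is made up.
            dump = "qtp592179046-576 (576)\n"
                 + "org.apache.solr.update.DirectUpdateHandler2.addAndDelete(DirectUpdateHandler2.java:354)\n"
                 + "org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:237)\n"
                 + "\n"
                 + "example-thread-2\n"
                 + "java.lang.Thread.sleep(Native Method)\n";
        }
        System.out.println(countThreadsAt(dump, frame) + " thread(s) at " + frame);
    }
}
```

Taking a few dumps some seconds apart on the replica and comparing the counts would show whether the same threads stay parked at addAndDelete or whether they are slowly making progress.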