Hi,

I am experiencing an issue where threads are blocking for an extremely long
time when I am indexing while deleteByQuery is also running.

Setup info:
-Solr Cloud 6.6.0
-Simple 2 Node, 1 Shard, 2 replica setup
-~12 million docs in the collection in question
-Nodes have 64 GB RAM, 8 CPUs, spinning disks
-Soft commit interval 10 seconds, Hard commit (open searcher false) 60
seconds
-Default merge policy settings (which I think is 10/10).

We have a query-heavy, index-heavyish use case. Indexing runs constantly
throughout the day and can be bursty. The indexing process handles
both updates and deletes, can spin up to 15 simultaneous threads, and sends
to Solr in batches of 3000 (which seems to be the optimal number per trial
and error).

I can build the entire collection from scratch using this method in < 40
mins, and indexing is in general super fast (averages about 3 seconds to
send a batch of 3000 docs to Solr). The issue I am seeing is that when some
threads are adding/updating documents while other threads are issuing
deletes (using deleteByQuery), Solr seems to get into a state of extreme
blocking on the replica, which results in some threads taking 30+ minutes
just to send 1 batch of 3000 docs. This collection does use child documents
(hence the deleteByQuery on _root_). I am not sure whether that makes a
difference, so I am trying to reproduce the problem on a collection without
child docs. CPU/IO wait seems minimal on both nodes, so I am not sure what
is causing the blocking.
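
To make the delete pattern concrete, here is a minimal sketch of how a
deleteByQuery string on _root_ might be assembled for a set of parent
documents. It is an illustration only, not the actual indexing code: the
class name RootDeleteQuery, the methods quote and forParents, and the sample
ids are all hypothetical. Only the _root_ field comes from the setup
described above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not from the original indexing code): builds one
// deleteByQuery string that matches parent documents and their children
// through the _root_ field.
public class RootDeleteQuery {

    // Quote an id as a phrase, escaping backslashes and double quotes so
    // the query parser reads it as a literal value.
    static String quote(String id) {
        return "\"" + id.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Joins the quoted ids into a single _root_:( ... OR ... ) clause.
    static String forParents(List<String> parentIds) {
        if (parentIds.isEmpty()) {
            throw new IllegalArgumentException("no parent ids given");
        }
        return parentIds.stream()
                .map(RootDeleteQuery::quote)
                .collect(Collectors.joining(" OR ", "_root_:(", ")"));
    }

    public static void main(String[] args) {
        // Sample ids are made up for the example.
        System.out.println(forParents(Arrays.asList("parent-1", "parent-2")));
    }
}
```

The resulting string is what would be handed to deleteByQuery. Grouping
several parent ids into one query is an assumption made for the sketch;
whether the deletes are batched this way or sent one per parent may matter
for the blocking described here.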

Here is part of the stack trace on one of the blocked threads on the
replica:

qtp592179046-576 (576)
java.lang.Object@608fe9b5
org.apache.solr.update.DirectUpdateHandler2.addAndDelete(DirectUpdateHandler2.java:354)
org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:237)
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:979)
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1192)
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)

A cursory search led me to this JIRA,
https://issues.apache.org/jira/browse/SOLR-7836, but I am not sure whether
it is related.

Can anyone shed some light on this issue? We don't do deletes very
frequently, but they bring Solr to its knees when we do, which is
causing some big problems.

Thanks,

Chris
