Varun Thacker created SOLR-14909:
------------------------------------

             Summary: Add replica is very slow on a large cluster
                 Key: SOLR-14909
                 URL: https://issues.apache.org/jira/browse/SOLR-14909
             Project: Solr
          Issue Type: Task
      Security Level: Public (Default Security Level. Issues are Public)
    Affects Versions: 7.6
            Reporter: Varun Thacker


We create ~100 collections every day for new incoming data.

We first issue a create-collection request for each collection (4 shards 
and createNodeSet=EMPTY). This creates collections with no replicas.

We then issue async add-replica calls for all the shards, creating 1 replica 
each: 100 collections x 4 shards = 400 add-replica calls. All the add-replica 
calls pass the node parameter, telling Solr where the replica should be created.
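
The request pattern described above can be sketched as follows. This is only an illustration of the calls involved, not the tooling we actually run: the class name, the helper names, the round-robin node choice, and the async id scheme are all hypothetical, and the shard names shard1..shardN assume Solr's default naming.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds Collections API URLs for the workflow above.
class AddReplicaPlan {
    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // CREATE with createNodeSet=EMPTY: the collection exists but has no replicas.
    static String createEmptyCollectionUrl(String baseUrl, String collection, int numShards) {
        return baseUrl + "/admin/collections?action=CREATE"
                + "&name=" + enc(collection)
                + "&numShards=" + numShards
                + "&createNodeSet=EMPTY";
    }

    // ADDREPLICA with an explicit node and an async request id.
    static String addReplicaUrl(String baseUrl, String collection, String shard,
                                String node, String asyncId) {
        return baseUrl + "/admin/collections?action=ADDREPLICA"
                + "&collection=" + enc(collection)
                + "&shard=" + enc(shard)
                + "&node=" + enc(node)
                + "&async=" + enc(asyncId);
    }

    // One ADDREPLICA call per shard of every collection.
    static List<String> addReplicaUrls(String baseUrl, List<String> collections,
                                       int numShards, List<String> nodes) {
        List<String> urls = new ArrayList<>();
        int next = 0;
        for (String collection : collections) {
            for (int s = 1; s <= numShards; s++) {
                String shard = "shard" + s;                        // assumed default shard naming
                String node = nodes.get(next++ % nodes.size());    // round robin is an assumption
                urls.add(addReplicaUrl(baseUrl, collection, shard, node,
                        collection + "-" + shard));                // hypothetical async id scheme
            }
        }
        return urls;
    }
}
```

With 100 collection names and numShards=4, addReplicaUrls returns 400 URLs, matching the count above.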

The cluster currently has 190 nodes, and when we upgraded to Solr 7.7.3 we 
noticed that the add-replica calls took 2 hours and 45 minutes to complete! 
Clearly something was wrong, as the same cluster previously running Solr 7.3.1 
took only a few minutes.

A thread dump of the overseer showed 100 threads stuck here. (Why 100? That's 
the Solr default thread pool size set by MAX_PARALLEL_TASKS in 
OverseerTaskProcessor.)

 
{code:java}
"OverseerThreadFactory-13-thread-1226-processing-n:10.128.18.69:8983_solr" #11163 prio=5 os_prio=0 cpu=0.69ms elapsed=987.97s tid=0x00007f01f8051000 nid=0xd7a waiting for monitor entry [0x00007f01c1121000]
 java.lang.Thread.State: BLOCKED (on object monitor)
 at java.lang.Object.wait(java.base@11.0.5/Native Method)
 - waiting on <no object reference available>
 at org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper$SessionRef.get(PolicyHelper.java:449)
 - waiting to re-lock in wait() <0x00000007259e6a98> (a java.lang.Object)
 at org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getSession(PolicyHelper.java:493)
 at org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getReplicaLocations(PolicyHelper.java:121)
 at org.apache.solr.cloud.api.collections.Assign.getPositionsUsingPolicy(Assign.java:382)
 at org.apache.solr.cloud.api.collections.Assign$PolicyBasedAssignStrategy.assign(Assign.java:630)
 at org.apache.solr.cloud.api.collections.Assign.getNodesForNewReplicas(Assign.java:368)
 at org.apache.solr.cloud.api.collections.AddReplicaCmd.buildReplicaPositions(AddReplicaCmd.java:360)
 at org.apache.solr.cloud.api.collections.AddReplicaCmd.addReplica(AddReplicaCmd.java:146)
 at org.apache.solr.cloud.api.collections.AddReplicaCmd.call(AddReplicaCmd.java:91)
 at org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:294)
{code}
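
To illustrate why a pool of 100 threads does not help when every task has to go through one shared monitor, here is a minimal simulation. It is not the PolicyHelper code: the class name, the lock object, and the sleep that stands in for the placement computation are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model: many pool threads, one shared lock guarding the "session".
class SharedSessionContention {
    private static final Object SESSION_LOCK = new Object();

    // Runs `tasks` jobs on `poolSize` threads and reports how many jobs were
    // ever inside the guarded section at the same time.
    static int maxConcurrentHolders(int poolSize, int tasks, long workMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxInside = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(() -> {
                    synchronized (SESSION_LOCK) {
                        maxInside.accumulateAndGet(inside.incrementAndGet(), Math::max);
                        try {
                            Thread.sleep(workMillis); // stand-in for the placement work
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                        inside.decrementAndGet();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every task to finish
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return maxInside.get();
    }
}
```

However large the pool is, only one task holds the lock at a time, so total time grows with the number of tasks multiplied by the time each one spends in the guarded section.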
 

 

It's strange because each add-replica API call creates a single replica 
and specifies which node it must be created on.

 

Assign.getNodesForNewReplicas is where the slowdown was, and we noticed that 
the SKIP_NODE_ASSIGNMENT flag ( 
https://github.com/apache/lucene-solr/commit/17cb1b17172926d0d9aed3dfd3b9adb90cf65e0f#diff-ee29887eff6e474e58fcf3c02077f179R355
 ), which the overseer reads, could have kept the method from being called.

So we started passing SKIP_NODE_ASSIGNMENT=true and still no luck! The replicas 
took just as long to create. It turned out that the Collections Handler wasn't 
passing the SKIP_NODE_ASSIGNMENT parameter to the overseer.

The add replica call only passes a specific set of params to the overseer 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.7.3/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L823
 . We changed this to also pass SKIP_NODE_ASSIGNMENT.
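
The effect of that change can be modeled with a small allow-list copy. This is a simplified sketch, not the CollectionsHandler code: the class and method names are hypothetical, and the key "SKIP_NODE_ASSIGNMENT" is used as written in this issue rather than as the literal request parameter string.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a handler that forwards only allow-listed params.
class ParamForwarding {
    // Copies request params into the overseer message, keeping only allowed keys.
    static Map<String, String> copyAllowed(Map<String, String> request, Set<String> allowed) {
        Map<String, String> message = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : request.entrySet()) {
            if (allowed.contains(e.getKey())) {
                message.put(e.getKey(), e.getValue());
            }
        }
        return message;
    }
}
```

If the flag is not in the allowed set, it is dropped silently and the overseer never sees it, which matches the behavior we observed before the change.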

Now when we create the replicas it takes approximately 4 minutes, versus the 
2 hours 45 minutes that it was taking previously.

Only master passes that param to the overseer ( 
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L938
 ). However, it doesn't matter in master because the autoscaling framework is 
gone ( https://github.com/apache/lucene-solr/commit/cc0c111/ ).

I believe this will be seen in all versions from Solr 7.6 ( 
https://issues.apache.org/jira/browse/SOLR-12739 ) through every 8.x release.

Lastly, I manually tried to add a replica with and without the flag. Without 
the flag it took 20 seconds, and with the flag 2 seconds.
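
As a rough cross-check of these numbers, the sketch below computes how long 400 calls would take if they ran strictly one after another at the manually measured per-call times. The class and method names are hypothetical, and the assumption that a single manual call is representative of calls made under load is mine.

```java
// Hypothetical back-of-the-envelope estimate for fully serialized add-replica calls.
class AddReplicaEstimate {
    // Total minutes if every call runs strictly after the previous one.
    static double serializedMinutes(int calls, double secondsPerCall) {
        return calls * secondsPerCall / 60.0;
    }

    public static void main(String[] args) {
        int calls = 100 * 4; // 100 collections x 4 shards
        System.out.printf("without flag, serialized: %.0f min (observed: 165 min)%n",
                serializedMinutes(calls, 20));
        System.out.printf("with flag, serialized: %.0f min (observed: about 4 min)%n",
                serializedMinutes(calls, 2));
    }
}
```

Without the flag, the serialized estimate is in the same range as the observed 2 hours 45 minutes, which fits the picture of threads queuing on one monitor. With the flag, the observed 4 minutes is well below the serialized estimate, which suggests the calls actually run in parallel.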



