FW: Question about Overseer calling SPLITSHARD collection API command during autoscaling
I sent this a few minutes ago, but wasn't yet subscribed. Forwarding the message along to make sure it's received!

From: Matthew Faw
Date: Thursday, March 15, 2018 at 12:28 PM
To: "solr-user@lucene.apache.org"
Cc: Matthew Faw, Alex Meijer
Subject: Question about Overseer calling SPLITSHARD collection API command during autoscaling

Hi,

So I've been trying out the new autoscaling features in Solr 7.2.1. I run the following commands when creating my Solr cluster.

Set up the overseer role:

curl -s "solr-service-core:8983/solr/admin/collections?action=ADDROLE&role=overseer&node=$thenode"

Create the cluster preferences:

clusterprefs=$(cat <<-EOF
{
  "set-cluster-preferences" : [
    {"minimize":"sysLoadAvg"},
    {"minimize":"cores"}
  ]
}
EOF
)
echo "The cluster prefs request body is: $clusterprefs"
curl -H "Content-Type: application/json" -X POST -d "$clusterprefs" solr-service-core:8983/api/cluster/autoscaling

Set the cluster policy:

clusterpolicy=$(cat <<-EOF
{
  "set-cluster-policy": [
    {"replica": 0, "nodeRole": "overseer"},
    {"replica": "<2", "shard": "#EACH", "node": "#ANY"},
    {"cores": ">0", "node": "#ANY"},
    {"cores": "<5", "node": "#ANY"},
    {"replica": 0, "sysLoadAvg": ">80"}
  ]
}
EOF
)
echo "The cluster policy is $clusterpolicy"
curl -H "Content-Type: application/json" -X POST -d "$clusterpolicy" solr-service-core:8983/api/cluster/autoscaling

Set a node-added trigger:

nodeaddtrigger=$(cat <<-EOF
{
  "set-trigger": {
    "name" : "node_added_trigger",
    "event" : "nodeAdded",
    "waitFor" : "1s"
  }
}
EOF
)
echo "The node added trigger request: $nodeaddtrigger"
curl -H "Content-Type: application/json" -X POST -d "$nodeaddtrigger" solr-service-core:8983/api/cluster/autoscaling

I then create a collection with 2 shards and 3 replicas, on a set of nodes in an autoscaling group (initially 4, scaling up to 10):

curl -s "solr-service-core:8983/solr/admin/collections?action=CREATE&name=${COLLECTION_NAME}&numShards=${NUM_SHARDS}&replicationFactor=${NUM_REPLICAS}&autoAddReplicas=${AUTO_ADD_REPLICAS}&collection.configName=${COLLECTION_NAME}&waitForFinalState=true"

I've observed several autoscaling actions being performed: automatically re-adding replicas, and moving shards to nodes based on my cluster policy/preferences. However, I have not observed a SPLITSHARD operation. My questions are:

1) Should I expect the Overseer to be able to call the SPLITSHARD command, or is this feature not yet implemented?
2) If it is possible, do you have any recommendations as to how I might force this type of behavior to happen?
3) If it's not implemented yet, when could I expect the feature to be available?

If you need any more details, please let me know! Really excited about these new features.

Thanks,
Matthew
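For reference, the shard split in question can also be issued by hand through the Collections API. A minimal sketch against the cluster described above; the shard name and async id are illustrative, not taken from the original message:

# Manually split one shard of the collection. The question in the message above is
# whether the Overseer's autoscaling framework will ever issue this call on its own.
curl -s "solr-service-core:8983/solr/admin/collections?action=SPLITSHARD&collection=${COLLECTION_NAME}&shard=shard1&async=manual-split-1"

SPLITSHARD normally creates the sub-shards with the same number of replicas as the parent shard, so the cluster policy above still constrains where those new replicas can be placed.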
Trouble using the MIGRATE command in the collections API on solr 7.3.1
Hello,

For background, I'm using Solr version 7.3.1 and Lucene version 7.3.1.

I have a Solr collection with 2 shards and 3 replicas using the compositeId router. Each Solr document has "id" as its unique key, where each id is of the format DERP_${X}, where ${X} is some 24-character alphanumeric string. I create this collection in the following way:

curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=derp&collection.configName=derp&numShards=2&replicationFactor=3&maxShardsPerNode=0&autoAddReplicas=true"

Suppose I have some other collection named herp, created in the same fashion, and a collection named blurp, with 1 shard, but otherwise created in the same fashion. Also suppose that there are 2000 documents in the derp collection, but none in the herp or blurp collections.

I've been attempting to do two things with the MIGRATE Collections API:

1. Migrate all documents from the derp collection to the herp collection using the following command:

   curl "http://localhost:8983/solr/admin/collections?action=MIGRATE&collection=derp&target.collection=herp&split.key=DERP/0\!&async=30" | jq

2. Migrate all documents from the derp collection to the blurp collection using the same MIGRATE command, swapping herp for blurp.

(I chose split.key=DERP/0! with the intent of capturing all documents in my source collection, since the /0 should tell the MIGRATE command to only look at the hash of the id field, since I'm not using a shard key.)

In both cases, the response of the corresponding REQUESTSTATUS indicates success. For example:

╰─$ curl "localhost:8985/solr/admin/collections?action=REQUESTSTATUS&requestid=30"
{
  "responseHeader":{"status":0,"QTime":2},
  "success":{
    "100.109.8.33:8983_solr":{"responseHeader":{"status":0,"QTime":10}},
    "100.109.8.33:8983_solr":{"responseHeader":{"status":0,"QTime":0}},
    "100.109.8.33:8983_solr":{"responseHeader":{"status":0,"QTime":0}},
    "100.109.8.33:8983_solr":{"responseHeader":{"status":0,"QTime":0}},
    "100.109.8.33:8983_solr":{"responseHeader":{"status":0,"QTime":0}},
    "100.109.8.33:8983_solr":{"responseHeader":{"status":0,"QTime":0}},
    "100.109.8.33:8983_solr":{"responseHeader":{"status":0,"QTime":94}},
    "100.109.8.33:8983_solr":{"responseHeader":{"status":0,"QTime":0}},
    "100.109.8.33:8983_solr":{"responseHeader":{"status":0,"QTime":82}},
    "100.109.8.33:8983_solr":{"responseHeader":{"status":0,"QTime":85}}},
  "3023875288733778":{
    "responseHeader":{"status":0,"QTime":0},
    "STATUS":"completed",
    "Response":"TaskId: 3023875288733778 webapp=null path=/admin/cores params={async=3023875288733778&qt=/admin/cores&name=herp_shard2_replica_n8&action=REQUESTBUFFERUPDATES&wt=javabin&version=2} status=0 QTime=10"},
  "302387540601230023875636935846":{
    "responseHeader":{"status":0,"QTime":0},
    "STATUS":"completed",
    "Response":"TaskId: 302387540601230023875636935846 webapp=null path=/admin/cores params={qt=/admin/cores&collection.configName=derp&newCollection=true&collection=split_shard2_temp_shard2&version=2&replicaType=NRT&async=302387540601230023875636935846&coreNodeName=core_node2&name=split_shard2_temp_shard2_shard1_replica_n1&action=CREATE&numShards=1&shard=shard1&wt=javabin} status=0 QTime=0"},
  "3023878903291448":{
    "responseHeader":{"status":0,"QTime":0},
    "STATUS":"completed",
    "Response":"TaskId: 3023878903291448 webapp=null path=/admin/cores params={core=derp_shard2_replica_n8&async=3023878903291448&split.key=Z!&qt=/admin/cores&ranges=3dba-3dba&action=SPLIT&targetCore=split_shard2_temp_shard2_shard1_replica_n1&wt=javabin&version=2} status=0 QTime=0"},
  "3023880308944216":{
    "responseHeader":{"status":0,"QTime":0},
    "STATUS":"completed",
    "Response":"TaskId: 3023880308944216 webapp=null path=/admin/cores params={async=3023880308944216&qt=/admin/cores&coreNodeName=core_node4&collection.configName=derp&name=split_shard2_temp_shard2_shard1_replica_n3&action=CREATE&collection=split_shard2_temp_shard2&shard=shard1&wt=javabin&version=2&replicaType=NRT} status=0 QTime=0"},
  "3023882401961074":{
    "responseHeader":{"status":0,"QTime":0},
    "STATUS":"completed",
    "Response":"TaskId: 3023882401961074 webapp=null path=/admin/cores params={nodeName=100.109.8.33:8983_solr&core=split_shard2_temp_shard2_shard1_replica_n1&async=3023882401961074&qt=/admin/cores&coreNodeName=core_node4&action=PREPRECOVERY&checkLive=true&state=active&onlyIfLeader=true&wt=javabin&version=2} status=0 QTime=0"},
  "3023885405877119":{
    "responseHeader":{"status":0,"QTime":0},
    "STATUS":"completed",
    "Response":"TaskId: 30238854058
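Independently of REQUESTSTATUS, the quickest check on whether any documents actually arrived is to ask the source and target collections for their document counts. A minimal sketch, assuming the collection names above:

# rows=0 returns no documents, only numFound (the total count)
curl "http://localhost:8983/solr/derp/select?q=*:*&rows=0"
curl "http://localhost:8983/solr/herp/select?q=*:*&rows=0"

If the target's numFound stays at 0 while the async task reports "completed", that would suggest the MIGRATE selected no documents rather than failing outright.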
Re: Trouble using the MIGRATE command in the collections API on solr 7.3.1
Hi Shawn,

Thanks for your reply. According to the MIGRATE documentation, the split.key parameter is required, and removing it returns a missing-parameter exception. I've tried setting "split.key=DERP_", and after doing that I still see no documents in the destination collection. Additionally, the CLUSTERSTATUS command indicates that the route ranges using this split key are "routeRanges": "16f98178-16f98178", but when I use split.key=DERP/0!, I get the route ranges I expect (8000- on one shard, and 0-7fff on the other).

So, to me, it seems like this particular API endpoint does not work. I'd love for someone to prove me wrong.

Thanks,
Matthew

On 6/21/18, 11:02 AM, "Shawn Heisey" wrote:

On 6/21/2018 7:08 AM, Matthew Faw wrote:
> For background, I'm using Solr version 7.3.1 and Lucene version 7.3.1
>
> I have a Solr collection with 2 shards and 3 replicas using the compositeId router. Each Solr document has "id" as its unique key, where each id is of the format DERP_${X}, where ${X} is some 24-character alphanumeric string. I create this collection in the following way:
>
> curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=derp&collection.configName=derp&numShards=2&replicationFactor=3&maxShardsPerNode=0&autoAddReplicas=true"
>
> Suppose I have some other collection named herp, created in the same fashion, and a collection named blurp, with 1 shard, but otherwise created in the same fashion. Also suppose that there are 2000 documents in the derp collection, but none in the herp or blurp collections.
>
> I've been attempting to do two things with the MIGRATE Collections API:
>
> 1. Migrate all documents from the derp collection to the herp collection using the following command:
>    curl "http://localhost:8983/solr/admin/collections?action=MIGRATE&collection=derp&target.collection=herp&split.key=DERP/0\!&async=30" | jq
> 2. Migrate all documents from the derp collection to the blurp collection using the same MIGRATE command, swapping herp for blurp.
>
> (I chose split.key=DERP/0! with the intent of capturing all documents in my source collection, since the /0 should tell the MIGRATE command to only look at the hash of the id field, since I'm not using a shard key.)

The Collections API documentation doesn't mention any ability to use /N with split.key. Which may mean that it is looking for the literal text "DERP/0!" or "DERP/0\!" in your source documents, and since it's not there, not choosing any documents to migrate. The reason I have mentioned two possible strings there is that the ! character doesn't need escaping in a URL. The URL-encoded version of that string is this: DERP%2f0!

Because you want to choose all documents, I don't think you need the split.key parameter for this, or you may need to use split.key=DERP_ instead. Because you're not using routing prefixes in your indexing, I am leaning more towards just removing the parameter entirely.

I have never actually used the MIGRATE action, so I'm basing all this on the documentation.

Thanks,
Shawn
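To make the encoding point concrete, here is what the MIGRATE call looks like with the "/" in the split key percent-encoded instead of shell-escaped; the collection names are the ones from the thread and the async id is arbitrary:

# %2F is the URL encoding of "/"; "!" does not need encoding in a URL
curl "http://localhost:8983/solr/admin/collections?action=MIGRATE&collection=derp&target.collection=herp&split.key=DERP%2F0!&async=31"

How the key is encoded is a separate question from whether it matches anything; this only rules out the shell or URL parsing mangling the parameter.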
Re: Trouble using the MIGRATE command in the collections API on solr 7.3.1
Hi Shawn,

So I've tried running MIGRATE on Solr 7.3.1 using the following parameters:

1) "split.key="
2) "split.key=!"
3) "split.key=DERP_"
4) "split.key=DERP/0!"

For 1-3, I am seeing the same ERRORs you see. For 4, I do not see any ERRORs.

Interestingly, I'm seeing this WARN message for all 4 scenarios:

org.apache.solr.common.SolrException: SolrCore not found:split_shard2_temp_shard2_shard1_replica_n3 in [derp_shard1_replica_n1, derp_shard2_replica_n6, herp_shard1_replica_n1, herp_shard2_replica_n6]
        at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:312)
        at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:170)
        at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:135)
        at org.apache.solr.cloud.LeaderElector.access$200(LeaderElector.java:56)
        at org.apache.solr.cloud.LeaderElector$ElectionWatcher.process(LeaderElector.java:348)
        at org.apache.solr.common.cloud.SolrZkClient$1.lambda$process$1(SolrZkClient.java:269)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

I wonder if this is part of the issue?

Thanks,
Matthew

On 6/22/18, 12:53 PM, "Shawn Heisey" wrote:

On 6/21/2018 9:41 AM, Matthew Faw wrote:
> Thanks for your reply. According to the MIGRATE documentation, the split.key parameter is required, and removing it returns a missing-parameter exception. I've tried setting "split.key=DERP_", and after doing that I still see no documents in the destination collection. Additionally, the CLUSTERSTATUS command indicates that the route ranges using this split key are "routeRanges": "16f98178-16f98178", but when I use split.key=DERP/0!, I get the route ranges I expect (8000- on one shard, and 0-7fff on the other).
>
> So, to me, it seems like this particular API endpoint does not work. I'd love for someone to prove me wrong.

I have no idea how I managed to NOT see that parameter in the MIGRATE section of the docs. I was looking through the parameter list in the 6.6 version earlier and didn't even see it there, but when I just looked at it right now, it's there plain as day, and says it's required. I did see it mentioned in the *text* of the MIGRATE section, just not in the list of parameters.

The fact that this parameter is required makes no sense to me. I cannot say for sure, but since using routing prefixes with SolrCloud is not required, requiring a split key when migrating seems wrong to me.

I tried a test on the 7.3.0 cloud example where I manually created a bunch of documents in one collection and then tried to do a MIGRATE with "split.key=". I got some errors in the log:

ERROR - 2018-06-21 20:17:13.611; [   ] org.apache.solr.update.SolrIndexSplitter; Splitting _0(7.3.0):C1: 1 documents belong to no shards and will be dropped
ERROR - 2018-06-21 20:17:13.611; [   ] org.apache.solr.update.SolrIndexSplitter; Splitting _1(7.3.0):C1: 1 documents belong to no shards and will be dropped
ERROR - 2018-06-21 20:17:13.611; [   ] org.apache.solr.update.SolrIndexSplitter; Splitting _3(7.3.0):C1: 1 documents belong to no shards and will be dropped
ERROR - 2018-06-21 20:17:13.611; [   ] org.apache.solr.update.SolrIndexSplitter; Splitting _4(7.3.0):C1: 1 documents belong to no shards and will be dropped

Are you seeing ERROR or WARN in your log?

Just to be sure, I also reloaded the target collection, and I didn't see any documents in it.

Thanks,
Shawn
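One additional thing worth checking after a run like this is whether the temporary collection named in the WARN (split_shard2_temp_shard2), which MIGRATE normally deletes when it finishes, is still registered in the cluster. A purely illustrative check:

# lists every collection currently known to the cluster, including any leftover temporaries
curl "http://localhost:8983/solr/admin/collections?action=LIST"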
Re: Trouble using the MIGRATE command in the collections API on solr 7.3.1
Basically, we have an environment that has a large number of Solr nodes (~100) and an environment with fewer Solr nodes (~10). In the "big" environment, we have lots of smaller cores (around 3 GB), and in the smaller environment, we have fewer, bigger cores (around 30 GB). We transfer data between these two environments around once per month or so.

We've traditionally followed the model of 1 core per Solr node, so we typically reindex Solr when we move between environments, which typically takes 2 days, whereas Solr's BACKUP and RESTORE APIs each typically take a few minutes to run. I'm planning to investigate performance differences between having several small cores on a single Solr node versus having one big Solr core on each node. In the meantime, however, I was interested to see if it would be possible, at least in the short term, to replace our current procedure with the following:

1) BACKUP the Solr collection in the big environment
2) RESTORE the collection in the small environment
3) MIGRATE the collection in the small environment to another collection in the same environment with 1 shard per Solr node

I've also heard mention of an API to combine shards (https://github.com/bloomreach/solrcloud-rebalance-api and https://issues.apache.org/jira/browse/SOLR-9241). It doesn't seem like there's been any development on integrating this work into the official Solr distribution, but it also seems like it would probably solve my requirements.

Let me know if anything is still unclear.

Thanks,
Matthew

On 6/25/18, 1:38 PM, "Shawn Heisey" wrote:

On 6/22/2018 12:14 PM, Matthew Faw wrote:
> So I've tried running MIGRATE on Solr 7.3.1 using the following parameters:
> 1) "split.key="
> 2) "split.key=!"
> 3) "split.key=DERP_"
> 4) "split.key=DERP/0!"
>
> For 1-3, I am seeing the same ERRORs you see. For 4, I do not see any ERRORs.
>
> Interestingly, I'm seeing this WARN message for all 4 scenarios:
>
> org.apache.solr.common.SolrException: SolrCore not found:split_shard2_temp_shard2_shard1_replica_n3 in [derp_shard1_replica_n1, derp_shard2_replica_n6, herp_shard1_replica_n1, herp_shard2_replica_n6]

I saw something similar as well. I think the way that MIGRATE works internally is to copy data from the source collection to a temporary index, and then from there to the final target.

I think I've figured out why split.key is required. The entire reason the MIGRATE API was created was for people who use route keys to split one of those route keys into a separate collection. It does not appear to have been intended for handling everything in a collection, but only for splitting indexes where such keys are in use.

https://issues.apache.org/jira/browse/SOLR-5308

With id values like DERP_3e5bc047f13f6c562f985f00, you're not using routing prefix keys, so I think you probably aren't able to use the MIGRATE API at all.

So let's back up a couple of steps so we can find you a workable solution. Is this a one-time migration that you're trying to do, or are you expecting to do this frequently? What requirement are you trying to satisfy by copying data from one collection to another, and what are the details of the requirement?

Thanks,
Shawn
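For reference, steps 1 and 2 of the procedure described above map onto the Collections API BACKUP and RESTORE actions. A rough sketch, assuming a backup location on shared storage visible to every node; the hostnames, backup name, and location are illustrative:

# in the big environment: write a backup of the collection to shared storage
curl "http://bignode:8983/solr/admin/collections?action=BACKUP&name=derp-backup&collection=derp&location=/mnt/solr-backups&async=backup-derp"

# in the small environment: restore that backup into a new collection
curl "http://smallnode:8983/solr/admin/collections?action=RESTORE&name=derp-backup&location=/mnt/solr-backups&collection=derp&async=restore-derp"

RESTORE recreates the collection with the same number of shards as the backup, which is why step 3 (reshaping the restored collection to 1 shard per node) still needs a separate operation.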