> <solr persistent="false"

You have to set that to true. When a core starts up, it's assigned a coreNodeName, and that coreNodeName is persisted in solr.xml. With persistent="false" the assigned name is never saved, so this will happen every time you restart.

As far as fixing it: yes, you simply want shard1 back and the replica info removed. You would also need to add coreNodeName="node1:8080_x_col1" to the <core> tag; that is how the core will match up with its entry in ZK rather than create a new replica.
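Something like this, sticking with the config you posted below (just a sketch; the zkHost list is elided, and the coreNodeName value is per node, so node2 would get node2:8080_x_col1):

<solr persistent="true"
      zkHost="xx.xx.xx.xx:2181,...">
  <cores adminPath="/admin/cores"
         host="${host:}"
         hostPort="8080"
         hostContext="${hostContext:/x}"
         zkClientTimeout="${zkClientTimeout:15000}"
         defaultCoreName="c1"
         shareSchema="true">
    <core name="c1"
          collection="col1"
          coreNodeName="node1:8080_x_col1"
          instanceDir="/dir/x"
          config="solrconfig.xml"
          dataDir="/dir/x/data/y"/>
  </cores>
</solr>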
- Mark

http://about.me/markrmiller

On Jan 31, 2014, at 11:11 AM, David Santamauro <david.santama...@gmail.com> wrote:

>
> There is nothing of note in the zookeeper logs. My solr.xml (sanitized for privacy) is identical on all 4 nodes.
>
> <solr persistent="false"
>       zkHost="xx.xx.xx.xx:2181,xx.xx.xx.xx:2181,xx.xx.xx.xx:2181,xx.xx.xx.xx:2181,xx.xx.xx.xx:2181">
>   <cores adminPath="/admin/cores"
>          host="${host:}"
>          hostPort="8080"
>          hostContext="${hostContext:/x}"
>          zkClientTimeout="${zkClientTimeout:15000}"
>          defaultCoreName="c1"
>          shareSchema="true">
>     <core name="c1"
>           collection="col1"
>           instanceDir="/dir/x"
>           config="solrconfig.xml"
>           dataDir="/dir/x/data/y"/>
>   </cores>
> </solr>
>
> I don't specify coreNodeName nor a genericCoreNodeNames default value ... should I?
>
> The tomcat log is basically just a replay of what happened.
>
> 16443 [coreLoadExecutor-4-thread-2] INFO org.apache.solr.core.CoreContainer – registering core: ...
>
> # this is, I think, what you are talking about above with the new coreNodeName
> 16444 [coreLoadExecutor-4-thread-2] INFO org.apache.solr.cloud.ZkController – Register replica - core:c1 address:http://xx.xx.xx.xx:8080/x collection: col1 shard:shard4
>
> 16453 [coreLoadExecutor-4-thread-2] INFO org.apache.solr.client.solrj.impl.HttpClientUtil – Creating new http client, config:maxConnections=10000&maxConnectionsPerHost=20&connTimeout=30000&socketTimeout=30000&retry=false
>
> 16505 [coreLoadExecutor-4-thread-2] INFO org.apache.solr.cloud.ZkController – We are http://node1:8080/x and leader is http://node2:8080/x
>
> Then it just starts replicating.
>
> If there is anything specific I should be grokking for in these logs, let me know.
>
> Also, given that my clusterstate.json now looks like this:
>
> assume:
>   node1=xx.xx.xx.1
>   node2=xx.xx.xx.2
>
> "shard4":{
>   "range":"20000000-3fffffff",
>   "state":"active",
>   "replicas":{
>     "node2:8080_x_col1":{
>       "state":"active",
>       "core":"c1",
>       "node_name":"node2:8080_x",
>       "base_url":"http://node2:8080/x",
>       "leader":"true"},
>     **** this should not be a replica of shard4 but its own shard1
>     "node1:8080_x_col1":{
>       "state":"recovering",
>       "core":"c1",
>       "node_name":"node1:8080_x",
>       "base_url":"http://node1:8080/x"}},
>
> Can I just recreate shard1
>
> "shard1":{
>   ***** NOTE: range is assumed based on the ranges of the other nodes
>   "range":"0-1fffffff",
>   "state":"active",
>   "replicas":{
>     "node1:8080_x_col1":{
>       "state":"active",
>       "core":"c1",
>       "node_name":"node1:8080_x",
>       "base_url":"http://node1:8080/x",
>       "leader":"true"}},
>
> ... and then remove the replica ...
>
> "shard4":{
>   "range":"20000000-3fffffff",
>   "state":"active",
>   "replicas":{
>     "node2:8080_x_col1":{
>       "state":"active",
>       "core":"c1",
>       "node_name":"node2:8080_x",
>       "base_url":"http://node2:8080/x",
>       "leader":"true"}},
>
> That would be great...
>
> thanks for your help
>
> David
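PS: once clusterstate.json is hand-edited like that, the new version still has to be written back to ZooKeeper, since the live copy is the one in ZK, not on disk. A rough sketch with the zkcli.sh script from Solr's cloud-scripts directory (the ZK address and local file path here are placeholders; stop the nodes before pushing it):

  ./zkcli.sh -zkhost xx.xx.xx.xx:2181 -cmd getfile /clusterstate.json /tmp/clusterstate.json
  # edit /tmp/clusterstate.json as sketched above
  ./zkcli.sh -zkhost xx.xx.xx.xx:2181 -cmd putfile /clusterstate.json /tmp/clusterstate.json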