Thank you Shawn. We will be adjusting solr.solr.home to point somewhere else so that our puppet module will work. We actually didn't lose any data since the indexes are in HDFS. Our configuration for our largest collection is 100 shards with 3 replicas each on top of HDFS with 3x replication. Perhaps overkill. It was only the core.properties files that we lost. I ended up writing a program that uses CloudSolrClient to get all the info from ZooKeeper and then rebuild the core.properties files, and it looks like it is working. For example, for a collection called COL1 with a config called COL1:

        // Walks the live cluster state in ZooKeeper. mainServer is a CloudSolrClient
        // connected to the cluster; setContents(File, String) is a small helper that
        // writes a string to a file (a sketch follows the snippet).
        // Needed imports: java.io.File, java.io.IOException, java.util.Iterator,
        // org.apache.solr.common.cloud.Slice, org.apache.solr.common.cloud.Replica.
        File output;
        Iterator<Slice> iSlice = mainServer.getZkStateReader().getClusterState()
                .getCollection("COL1").getActiveSlices().iterator();
        while (iSlice != null && iSlice.hasNext()) {
            Slice s = iSlice.next();
            Iterator<Replica> replicaIt = s.getReplicas().iterator();
            while (replicaIt != null && replicaIt.hasNext()) {
                Replica r = replicaIt.next();
                System.out.println("Name: " + r.getCoreName());
                System.out.println("CoreNodeName: " + r.getName());
                System.out.println("Node name: " + r.getNodeName());
                System.out.println("Shard: " + s.getName());

                // One directory per node, one subdirectory per core, each holding
                // a rebuilt core.properties.
                output = new File(r.getNodeName() + "/" + r.getCoreName());
                output.mkdirs();
                output = new File(r.getNodeName() + "/" + r.getCoreName() + "/core.properties");
                StringBuilder buff = new StringBuilder();
                buff.append("collection.configName=COL1\n");
                buff.append("name=").append(r.getCoreName());
                buff.append("\nshard=").append(s.getName());
                buff.append("\ncollection=COL1");
                buff.append("\ncoreNodeName=").append(r.getName());
                try {
                    setContents(output, buff.toString());
                } catch (IOException ex) {
                    System.out.println("Error writing: " + ex);
                }
            }
        }
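
The setContents helper isn't shown above; a minimal sketch of what it might look like, assuming it simply writes the string out to the given file:

        // Hypothetical helper (not part of the original snippet): writes the string
        // to the file with java.io.FileWriter; the caller handles the IOException.
        static void setContents(File file, String contents) throws IOException {
            try (FileWriter writer = new FileWriter(file)) {
                writer.write(contents);
            }
        }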


Then I copied the files to the 45 servers and restarted Solr 6.6.0 on each. It came back up OK, and it has been indexing all night long.
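
For illustration, the rebuilt core.properties for one replica of COL1 ends up looking something like this (the name, shard, and coreNodeName values here are placeholders; the real ones come from ZooKeeper):

        collection.configName=COL1
        name=COL1_shard1_replica1
        shard=shard1
        collection=COL1
        coreNodeName=core_node1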

-Joe

On 7/18/2017 12:31 PM, Shawn Heisey wrote:
On 7/17/2017 11:39 AM, Joe Obernberger wrote:
We use puppet to deploy the Solr instance to all the nodes. I changed what was deployed to use the CDH jars, but our puppet module deletes the old directory and replaces it, so all the core configuration files under server/solr/ were removed. ZooKeeper still has the configuration, but the nodes won't come up.

Is there a way around this? Re-creating these files manually isn't realistic; do I need to re-index?

Put the solr home elsewhere so it's not under the program directory and doesn't get deleted when you re-deploy Solr. When starting Solr manually with bin/solr, this is done with the -s option.

If you install Solr as a service, which works on operating systems with a strong GNU presence (such as Linux), then the solr home will typically not be in the program directory. The configuration script (default filename is /etc/default/solr.in.sh) should not get deleted if Solr is reinstalled, but I have not confirmed that this is the case. The service installer script is included in the Solr download.
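
(As an illustration only: after the service installer runs, the relevant line in /etc/default/solr.in.sh typically looks something like the following; the exact path varies by install.)

        # solr home outside the program directory, so a re-deploy can't wipe it
        SOLR_HOME=/var/solr/data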

With SolrCloud, deleting all the core data like that will NOT be automatically fixed by restarting Solr. SolrCloud will have lost part of its data. If you have enough replicas left after a loss like that to remain fully operational, then you'll need to use the DELETEREPLICA and ADDREPLICA actions on the Collections API to rebuild the data on that server from the leader of each shard.
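
(For reference, a rough SolrJ sketch of that repair using CollectionAdminRequest; the collection, shard, replica, and node names below are placeholders, and error handling is omitted.)

        // Remove the damaged replica, then add a fresh one that will sync from the
        // shard leader. Uses org.apache.solr.client.solrj.impl.CloudSolrClient and
        // org.apache.solr.client.solrj.request.CollectionAdminRequest.
        CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("zk1:2181,zk2:2181,zk3:2181/solr")
                .build();
        CollectionAdminRequest.deleteReplica("COL1", "shard1", "core_node1")
                .process(client);
        CollectionAdminRequest.AddReplica addReplica =
                CollectionAdminRequest.addReplicaToShard("COL1", "shard1");
        addReplica.setNode("host1:8983_solr");
        addReplica.process(client);
        client.close();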

If the collection is incomplete after the solr home on a server gets deleted, you'll probably need to completely delete the collection, then recreate it, and reindex. And you'll need to look into adding servers/replicas so the loss of a single server cannot take you offline.

Thanks,
Shawn

