Thank you, Shawn. We will be adjusting solr.solr.home to point
somewhere else so that our puppet module will work. We actually didn't
lose any data since the indexes are in HDFS. Our configuration for our
largest collection is 100 shards with 3 replicas each on top of HDFS
with 3x replication. Perhaps overkill. It's just the core.properties
files that we lost. I ended up writing a program that uses
CloudSolrClient to get all the info from ZooKeeper and then rebuild
the core.properties files. Looks like it is working. For example, for
a collection called COL1 with a config called COL1:
// Rebuild each replica's core.properties for collection COL1 from the
// cluster state stored in ZooKeeper. mainServer is an already-connected
// CloudSolrClient; setContents() writes a String to a File.
import java.io.File;
import java.io.IOException;
import org.apache.solr.common.cloud.Replica;
import org.apache.solr.common.cloud.Slice;

for (Slice s : mainServer.getZkStateReader().getClusterState()
        .getCollection("COL1").getActiveSlices()) {
    for (Replica r : s.getReplicas()) {
        System.out.println("Name: " + r.getCoreName());
        System.out.println("CoreNodeName: " + r.getName());
        System.out.println("Node name: " + r.getNodeName());
        System.out.println("Shard: " + s.getName());

        // One directory per core, <nodeName>/<coreName>, each holding a
        // rebuilt core.properties file.
        File coreDir = new File(r.getNodeName() + "/" + r.getCoreName());
        coreDir.mkdirs();
        File output = new File(coreDir, "core.properties");

        StringBuilder buff = new StringBuilder();
        buff.append("collection.configName=COL1\n");
        buff.append("name=").append(r.getCoreName());
        buff.append("\nshard=").append(s.getName());
        buff.append("\ncollection=COL1");
        buff.append("\ncoreNodeName=").append(r.getName());
        try {
            setContents(output, buff.toString());
        } catch (IOException ex) {
            System.out.println("Error writing: " + ex);
        }
    }
}
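setContents() is just a small helper of ours that writes a String to a
File, not anything from the JDK or SolrJ. If anyone wants to reuse the
snippet, a minimal sketch of such a helper (the exact implementation
here is assumed, not necessarily what we ran) could look like:

static void setContents(File file, String contents) throws IOException {
    // Overwrite the file with the given text; try-with-resources closes it.
    try (java.io.FileWriter writer = new java.io.FileWriter(file)) {
        writer.write(contents);
    }
}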
Then I copied the files to the 45 servers and restarted Solr 6.6.0 on
each. It came back up OK, and it has been indexing all night long.
-Joe
On 7/17/2017 3:15 PM, Erick Erickson wrote:
On 7/18/2017 12:31 PM, Shawn Heisey wrote:
On 7/17/2017 11:39 AM, Joe Obernberger wrote:
We use puppet to deploy the solr instance to all the nodes. I
changed what was deployed to use the CDH jars, but our puppet module
deletes the old directory and replaces it. So, all the core
configuration files under server/solr/ were removed. Zookeeper still
has the configuration, but the nodes won't come up.
Is there a way around this? Re-creating these files manually isn't
realistic; do I need to re-index?
Put the solr home elsewhere so it's not under the program directory
and doesn't get deleted when you re-deploy Solr. When starting Solr
manually with bin/solr, this is done with the -s option.
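For example, starting with something like "bin/solr start -s
/var/solr/data" (that path is only an illustration) keeps the cores and
their core.properties files outside the install directory, so a
re-deploy of the program directory can't wipe them.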
If you install Solr as a service, which works on operating systems
with a strong GNU presence (such as Linux), then the solr home will
typically not be in the program directory. The configuration script
(default filename is /etc/default/solr.in.sh) should not get deleted
if Solr is reinstalled, but I have not confirmed that this is the
case. The service installer script is included in the Solr download.
With SolrCloud, deleting all the core data like that will NOT be
automatically fixed by restarting Solr. SolrCloud will have lost part
of its data. If you have enough replicas left after a loss like that
to remain fully operational, then you'll need to use the DELETEREPLICA
and ADDREPLICA actions on the Collections API to rebuild the data on
that server from the leader of each shard.
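For illustration, the same repair can be scripted with SolrJ; the
collection, shard, and replica names below are placeholders, and the
ZooKeeper address is assumed:

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

// Placeholder values -- substitute your own ZK hosts and
// collection/shard/replica names.
CloudSolrClient client = new CloudSolrClient.Builder()
        .withZkHost("zk1:2181,zk2:2181,zk3:2181/solr").build();

// Remove the replica whose data was lost, then add a fresh one for that
// shard; process() throws SolrServerException/IOException, so handle or
// declare those as needed.
CollectionAdminRequest.deleteReplica("COL1", "shard1", "core_node7").process(client);
CollectionAdminRequest.addReplicaToShard("COL1", "shard1").process(client);
client.close();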
If the collection is incomplete after the solr home on a server gets
deleted, you'll probably need to completely delete the collection,
then recreate it, and reindex. And you'll need to look into adding
servers/replicas so the loss of a single server cannot take you offline.
Thanks,
Shawn