Hello,

I'm trying to figure out whether this is a bug or whether I'm doing
something wrong. With Solr 6.6.1 running on HDFS (Hadoop 2.7.4), if I
create a one-shard collection with replicationFactor=1 (on one node)
and then add a second node, Solr creates a new replica on the new
node. I didn't expect this: I thought Solr would only try to keep the
number of replicas at the value given by replicationFactor.

Here are the steps to reproduce:
- have HDFS set up according to
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
- have Solr 6.6.1 downloaded and extracted
- I also have ZooKeeper set up separately, if it matters (just
extracted and started)
- then, start Solr like

bin/solr start -c -Dsolr.directoryFactory=HdfsDirectoryFactory \
  -Dsolr.lock.type=hdfs -Dsolr.hdfs.home=hdfs://localhost:9000/solr

- upload a config and create a collection:

bin/solr zk upconfig -n hdfs \
  -d ./server/solr/configsets/data_driven_schema_configs/conf \
  -z localhost:2181

curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=hdfs1&numShards=1&replicationFactor=1&autoAddReplicas=true&collection.configName=hdfs"

- add a document (again, it shouldn't matter)
- start a second node (I copied the whole extracted Solr, just in case):

bin/solr start -c -p 8984 -z localhost:2181 \
  -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs \
  -Dsolr.hdfs.home=hdfs://localhost:9000/solr

At this point I have two replicas of my shard (one on each node).
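
One way to double-check the replica count is the Collections API's
CLUSTERSTATUS action. Below is a minimal Python sketch, assuming the
JSON response nests as cluster -> collections -> shards -> replicas
and that each replica carries a node_name field. The function names
and the sample data are made up for illustration; the sample is not
real output from my cluster.

```python
import json
import urllib.request


def fetch_cluster_status(base_url, collection):
    """Fetch CLUSTERSTATUS for one collection from a running Solr node."""
    url = (f"{base_url}/admin/collections"
           f"?action=CLUSTERSTATUS&collection={collection}&wt=json")
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def replicas_per_shard(status, collection):
    """Map each shard name to the sorted node names hosting its replicas."""
    shards = status["cluster"]["collections"][collection]["shards"]
    return {
        shard: sorted(r["node_name"] for r in info["replicas"].values())
        for shard, info in shards.items()
    }


# Hand-written sample in the assumed response shape (illustrative only).
SAMPLE_STATUS = {
    "cluster": {
        "collections": {
            "hdfs1": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"node_name": "localhost:8983_solr"},
                            "core_node2": {"node_name": "localhost:8984_solr"},
                        }
                    }
                }
            }
        }
    }
}

# Lists both nodes for shard1, i.e. two replicas despite replicationFactor=1.
print(replicas_per_shard(SAMPLE_STATUS, "hdfs1"))
```

Against a live cluster, the same check would be
replicas_per_shard(fetch_cluster_status("http://localhost:8983/solr", "hdfs1"), "hdfs1").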

Am I missing something or is this a bug? Maybe replicationFactor=1 is
an edge case?

Best regards,
Radu
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
