You can't just add a new core to an existing collection. You can add the new node to the cloud, but it won't be part of any collection, and you won't be able to just slide it in as a 4th shard of an established 3-shard collection.
The root of that comes from routing (I'll assume you use default routing rather than any custom routing). When you index a document into the cloud, its unique id is hashed, and each shard owns a contiguous slice of the hash range. With 3 shards, each shard gets 1/3 of the range of possible hashes. Inserts and/or updates for the same document produce the same hash and so are routed to the same shard. Shard splitting just divides one shard's range in half and copies its documents to the 2 new shards based upon where their hashes fall in the new ranges. (There's a small sketch of this at the end of this message.)

That's a lot easier to manage than the more complex process of adding one shard, then adjusting the ranges on all the other shards, and then copying every entry that has to move -- all the while ensuring that new adds/updates/deletes are routed to the correct location based upon whether the original has been copied over to the new ranges yet, yada, yada, yada. I believe there have been some discussions about adding a capability like that to Solr (i.e. adjust shard ranges and have documents moved and handled correctly), but I don't think it's even in 5.0.

Now, if you feel the need to go down this path of adding a single shard to a 3-shard collection, here's something similar. Add your new Solr node to the cloud. Then create a 1-shard, 2-replica collection called "collectionPart2". Also add a query alias "TotalCollection" that points to "collectionPart1,collectionPart2". That way a query against the alias will be processed by all 4 of your shards. (The Collections API calls for this are sketched at the end of this message.) This will make indexing more difficult, because you'll have to send your new documents to "collectionPart2" until that collection's shard gets about as big as the shards in your 3-shard collection.

But some source data can be split up like that fairly easily, especially sequential data sources. For example, if indexing Twitter or email feeds, you can create a new collection with an appropriate shard/replica configuration and feed in a day (or month, or whatever) of data, then repeat with a new collection for the next set. Keep the query alias updated to span the collections you're interested in.

-----Original Message-----
From: tuxedomoon [mailto:dancolem...@yahoo.com]
Sent: Friday, February 27, 2015 12:43 PM
To: solr-user@lucene.apache.org
Subject: Re: Does shard splitting double host count

What about adding one new leader/replica pair? It seems that would entail

a) creating the r3.large instances and volumes
b) adding 2 new Zookeeper hosts?
c) updating my Zookeeper configs (new hosts, new ids, new SOLR config)
d) restarting all ZKs
e) restarting SOLR hosts in the sequence needed for correct shard/replica assignment
f) starting indexing again

So shards 1, 2, and 3 start with 33% of the docs each. As I start indexing, new documents get sharded at 25% per shard. If I reindex a document that already exists in shard2, does it remain in shard2, or could it migrate to another shard, thus removing it from shard2? I'm looking for a migration strategy to achieve 25% of docs per shard. I would also consider deleting docs by date range from shards 1, 2, and 3 and reindexing them to redistribute evenly.

--
View this message in context: http://lucene.472066.n3.nabble.com/Does-shard-splitting-double-host-count-tp4189595p4189672.html
Sent from the Solr - User mailing list archive at Nabble.com.
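To make the range arithmetic above concrete, here's a minimal Python sketch of default routing and of what splitting does to one shard's range. It is not Solr's implementation -- Solr hashes ids with MurmurHash3, so zlib.crc32 stands in here just to keep the example self-contained:

    # A sketch of hash-range routing and shard splitting, NOT Solr's actual
    # implementation: Solr hashes ids with MurmurHash3; zlib.crc32 is a
    # stand-in so this runs with the standard library only.
    import zlib

    HASH_SPACE = 2 ** 32  # treat hashes as unsigned 32-bit values


    def make_ranges(num_shards):
        """Divide the hash space into contiguous ranges, one per shard."""
        bounds = [i * HASH_SPACE // num_shards for i in range(num_shards + 1)]
        return [(bounds[i], bounds[i + 1] - 1) for i in range(num_shards)]


    def route(doc_id, ranges):
        """Return the shard whose range contains hash(doc_id)."""
        h = zlib.crc32(doc_id.encode("utf-8"))  # stand-in for MurmurHash3
        for shard, (lo, hi) in enumerate(ranges):
            if lo <= h <= hi:
                return shard


    ranges = make_ranges(3)
    print(route("doc-42", ranges))  # the same id always routes to the same shard

    # Splitting shard 0 just halves its range; each document lands in
    # whichever half its existing hash falls into, and no other shard moves.
    lo, hi = ranges[0]
    mid = (lo + hi) // 2
    print([(lo, mid), (mid + 1, hi)])

Since an update carries the same id (and therefore the same hash) as the original, it always lands on the shard that already holds the document -- which is why documents never migrate between shards on reindex.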
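And here's a sketch of the alias workaround using the Collections API. The CREATE and CREATEALIAS actions and their parameters are real; the host/port and the name of your existing collection ("collectionPart1") are assumptions -- adjust for your cloud:

    # A sketch of the alias workaround via the Collections API.
    # localhost:8983 and "collectionPart1" are assumptions for illustration.
    from urllib.request import urlopen
    from urllib.parse import urlencode

    SOLR = "http://localhost:8983/solr/admin/collections"


    def collections_api(**params):
        """Issue a Collections API call and return the raw JSON response."""
        params.setdefault("wt", "json")
        with urlopen(SOLR + "?" + urlencode(params)) as resp:
            return resp.read().decode("utf-8")


    # The new 1-shard, 2-replica collection for fresh documents.
    print(collections_api(action="CREATE", name="collectionPart2",
                          numShards=1, replicationFactor=2))

    # An alias spanning both collections, so one query hits all 4 shards.
    print(collections_api(action="CREATEALIAS", name="TotalCollection",
                          collections="collectionPart1,collectionPart2"))

Re-issuing CREATEALIAS for an existing alias name replaces its collection list in place, which is also how you'd keep a time-sliced alias (one collection per day or month) pointing at just the windows you care about.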