These are excellent questions and give me a good sense of why you suggest using the collections api.
In our case we have 8 shards of product data with an even distribution of data per shard and no hot spots. Our load varies widely over the year (Cyber Monday, for example), and we tend to have very little traffic at night. I'm thinking of two use cases:

1) We are seeing increased latency due to load and want to add 8 more replicas to handle the query volume. Once the volume subsides, we'd remove the nodes.

2) We lose a node due to some unexpected failure (EC2 tends to do this). We want auto scaling to detect the failure and add a node to replace the failed one.

In both cases the core admin API makes it easy: it adds nodes to the shards evenly. Otherwise we have to write a fairly involved script, subject to race conditions, to determine which shard to add nodes to.

Let me know if I'm making dangerous or uninformed assumptions, as I'm new to Solr.

Thanks,
Paul

> On Feb 14, 2016, at 10:35 AM, Susheel Kumar <susheel2...@gmail.com> wrote:
>
> Hi Paul,
>
> For auto-scaling, it depends on how you are thinking of designing it and
> what/how you want to scale. Which scenario do you think makes the
> coreadmin API easy to use for a sharded SolrCloud environment?
>
> Isn't it the case that in a sharded environment (assume 3 shards A, B & C)
> where shard B has higher load, you want to add a replica for shard B to
> distribute the load, or if a particular shard replica goes down, you
> want to add another replica back for that shard, in which case ADDREPLICA
> requires a shard name?
>
> Can you describe your scenario / provide more detail?
>
> Thanks,
> Susheel
>
> On Sun, Feb 14, 2016 at 11:51 AM, McCallick, Paul <paul.e.mccall...@nordstrom.com> wrote:
>
>> Hi all,
>>
>> This doesn't really answer the following question:
>>
>> What is the suggested way to add a new node to a collection via the
>> apis? I am specifically thinking of autoscale scenarios where a node has
>> gone down or more nodes are needed to handle load.
>>
>> The coreadmin api makes this easy.
>> The collections api (ADDREPLICA) makes this very difficult.
>>
>>> On 2/14/16, 8:19 AM, "Susheel Kumar" <susheel2...@gmail.com> wrote:
>>>
>>> Hi Paul,
>>>
>>> Shawn is referring to using the Collections API
>>> https://cwiki.apache.org/confluence/display/solr/Collections+API
>>> rather than the Core Admin API
>>> https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API
>>> for SolrCloud.
>>>
>>> Hope that clarifies. You mentioned ADDREPLICA, which is part of the
>>> Collections API, so you are on the right track.
>>>
>>> Thanks,
>>> Susheel
>>>
>>> On Sun, Feb 14, 2016 at 10:51 AM, McCallick, Paul <paul.e.mccall...@nordstrom.com> wrote:
>>>
>>>> Then what is the suggested way to add a new node to a collection via the
>>>> apis? I am specifically thinking of autoscale scenarios where a node has
>>>> gone down or more nodes are needed to handle load.
>>>>
>>>> Note that the ADDREPLICA endpoint requires a shard name, which puts the
>>>> onus of how to scale out on the user. This can be challenging in an
>>>> autoscale scenario.
>>>>
>>>> Thanks,
>>>> Paul
>>>>
>>>>> On Feb 14, 2016, at 12:25 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>>>>>
>>>>>> On 2/13/2016 6:01 PM, McCallick, Paul wrote:
>>>>>> - When creating a new collection, SolrCloud will use all available
>>>>>> nodes for the collection, adding cores to each. This assumes that you
>>>>>> do not specify a replicationFactor.
>>>>>
>>>>> The number of nodes that will be used is numShards multiplied by
>>>>> replicationFactor. The default value for replicationFactor is 1. If
>>>>> you do not specify numShards, there is no default -- the CREATE call
>>>>> will fail. The value of maxShardsPerNode can also affect the overall
>>>>> result.
>>>>>
>>>>>> - When adding new nodes to the cluster AFTER the collection is created,
>>>>>> one must use the core admin api to add the node to the collection.
>>>>>
>>>>> Using the CoreAdmin API is strongly discouraged when running SolrCloud.
>>>>> It works, but it is an expert API when in cloud mode, and can cause
>>>>> serious problems if not used correctly. Instead, use the Collections
>>>>> API. It can handle all normal maintenance needs.
>>>>>
>>>>>> I would really like to see the second case behave more like the first.
>>>>>> If I add a node to the cluster, it is automatically used as a replica
>>>>>> for existing clusters without my having to do so. This would really
>>>>>> simplify things.
>>>>>
>>>>> I've added a FAQ entry to address why this is a bad idea.
>>>>> https://wiki.apache.org/solr/FAQ#Why_doesn.27t_SolrCloud_automatically_create_replicas_when_I_add_nodes.3F
>>>>>
>>>>> Thanks,
>>>>> Shawn
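
For reference, the shard-selection script discussed at the top of this message might look roughly like the following. This is only a minimal sketch, not a tested tool: it assumes the cluster state has already been fetched from the Collections API (action=CLUSTERSTATUS) and parsed from JSON, and that each replica entry carries a "state" field. The function names, the "products" collection name, and the base URL are all made up for illustration. It picks the shard with the fewest active replicas and builds the ADDREPLICA request for it.

```python
from urllib.parse import urlencode


def pick_shard(cluster_status, collection):
    """Return the shard with the fewest active replicas (ties broken by name).

    cluster_status is assumed to be the parsed JSON of a CLUSTERSTATUS call.
    """
    shards = cluster_status["cluster"]["collections"][collection]["shards"]

    def active_count(name):
        replicas = shards[name].get("replicas", {})
        return sum(1 for r in replicas.values() if r.get("state") == "active")

    # sorted() makes the tie-break deterministic; min() keeps the first minimum.
    return min(sorted(shards), key=active_count)


def addreplica_url(base_url, collection, shard, node=None):
    """Build a Collections API ADDREPLICA URL (hypothetical helper)."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    if node:
        # Optional: pin the new replica to a specific node.
        params["node"] = node
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hand-written sample state; a real one would come from CLUSTERSTATUS.
    status = {"cluster": {"collections": {"products": {"shards": {
        "shard1": {"replicas": {"n1": {"state": "active"},
                                "n2": {"state": "active"}}},
        "shard2": {"replicas": {"n3": {"state": "active"},
                                "n4": {"state": "down"}}},
    }}}}}
    shard = pick_shard(status, "products")
    print(addreplica_url("http://localhost:8983/solr", "products", shard))
```

Note that this does nothing about the race condition mentioned above: two scaling events that read the cluster state at the same time would both pick the same shard, so some form of locking or serialization around the read-then-add step would still be needed.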