The main advantage of SolrCloud in your setup is HA/DR. You say you
have multiple replicas and shards. Either you have to index to each
replica separately or you use master/slave replication. In either case
you have to manage and fix the case where some node goes down. With
master/slave, if the master goes down you need to get in there and fix
it: reassign the master, make config changes, restart Solr to pick
them up, make sure you pick up any missed updates and all that.

In SolrCloud that is managed for you. Plus, let's say you want to
increase QPS capacity. In SolrCloud all you do is use the Collections
API ADDREPLICA command and you're done. The replica gets created (and
you can specify exactly what node if you want), the index gets copied,
new updates are automatically routed to it, and it starts serving
requests once it's synchronized, all automagically. Symmetrically, you
can DELETEREPLICA if you have too much capacity.
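
To make that concrete, here's a rough sketch in Python that just
builds the two Collections API URLs (it doesn't call Solr). The host,
the collection name "docs", the node name and the replica name
"core_node5" are all made up for illustration; substitute whatever
your cluster actually reports.

```python
from urllib.parse import urlencode

# Assumed Solr location; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def collections_api_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{SOLR_BASE}/admin/collections?{query}"


# Add a replica of shard1; "node" is optional and pins the placement.
add_url = collections_api_url(
    "ADDREPLICA",
    collection="docs",        # hypothetical collection name
    shard="shard1",
    node="host2:8983_solr",   # hypothetical node name
)

# Remove a replica again when there is too much capacity.
delete_url = collections_api_url(
    "DELETEREPLICA",
    collection="docs",
    shard="shard1",
    replica="core_node5",     # hypothetical replica name
)

print(add_url)
print(delete_url)
```

Issuing a GET against either URL (curl, a browser, whatever) is all
it takes; there are no config edits or restarts involved.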

The price here, admittedly, is that you have to get comfortable with
maintaining ZooKeeper.

Also, in the 7.x world you have different types of replicas (TLOG,
PULL and NRT) that combine some of the features of master/slave with
SolrCloud.
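
As a sketch of how the replica types are requested, the snippet below
builds a Collections API CREATE URL using the per-type count
parameters from 7.x (nrtReplicas, tlogReplicas, pullReplicas). The
host, the collection name "docs" and the counts are assumptions for
illustration only, not a recommendation for your data.

```python
from urllib.parse import urlencode

# Assumed Solr location; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def create_collection_url(name, num_shards, nrt=0, tlog=0, pull=0):
    """Build a CREATE URL that states how many replicas of each type to make."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "nrtReplicas": nrt,    # index locally, support near-real-time search
        "tlogReplicas": tlog,  # keep a transaction log, replicate the index
        "pullReplicas": pull,  # only pull the index, never become leader
    }
    return f"{SOLR_BASE}/admin/collections?{urlencode(params)}"


# Hypothetical layout: 2 shards, each with 2 TLOG and 1 PULL replica.
url = create_collection_url("docs", num_shards=2, tlog=2, pull=1)
print(url)
```

A TLOG-plus-PULL layout like this is the closest analogue to
master/slave: the leader indexes and the others copy the index from
it.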

Generally my rule of thumb is that the minute you get beyond a single
shard you should move to SolrCloud. If all your data fits in one Solr
core then it's less clear-cut; master/slave can work just fine. It
Depends (tm) of course.

Your use case corresponds to "implicit" (being renamed "manual")
routing, which you specify when you create your Solr collection. There
are pros and cons here, but that's beyond the scope of your question.
Your infrastructure should port pretty directly to SolrCloud. The
short form is that with manual routing, the indexing and/or querying
for any given request happens on a single node rather than in
parallel across shards. Of course, executing parallel sub-queries
imposes its own overhead...
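
Here's a minimal sketch of how an external assigner like yours could
map onto implicit routing. It only builds URLs: a CREATE with
router.name=implicit and named shards, an update that names the
target shard via the _route_ parameter, and a query restricted to one
shard via the shards parameter. shard_for_origin() is a made-up
stand-in for your external system, and the host, collection and shard
names are assumptions.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumed Solr location
COLLECTION = "docs"                       # hypothetical collection name


def shard_for_origin(origin):
    """Hypothetical stand-in for the external document-to-shard assigner."""
    return {"web": "shard_web", "mobile": "shard_mobile"}[origin]


def create_url(shard_names):
    """CREATE a collection whose shards are named explicitly (implicit router)."""
    params = {
        "action": "CREATE",
        "name": COLLECTION,
        "router.name": "implicit",
        "shards": ",".join(shard_names),
    }
    return f"{SOLR_BASE}/admin/collections?{urlencode(params)}"


def update_url(origin):
    """Update URL that sends the posted documents to the shard for this origin."""
    params = {"_route_": shard_for_origin(origin)}
    return f"{SOLR_BASE}/{COLLECTION}/update?{urlencode(params)}"


def select_url(origin, q):
    """Query URL restricted to the single shard for this origin."""
    params = {"q": q, "shards": shard_for_origin(origin)}
    return f"{SOLR_BASE}/{COLLECTION}/select?{urlencode(params)}"


print(create_url(["shard_web", "shard_mobile"]))
print(update_url("web"))
print(select_url("mobile", "title:solr"))
```

The point is that the routing decision stays exactly where it is
today; SolrCloud just needs to be told the shard name on each request.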

If your reason for keeping these on a single shard is to segregate the
data by some set (say users), you might want to consider just using
separate _collections_ in SolrCloud where old_shard == new_collection;
basically all your routing stays the same. You can create aliases
pointing to multiple collections or specify multiple collections on
the query. I don't know if that fits your use case or not, though.
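
For the collection-per-set idea, the two options look roughly like
this. The sketch builds a CREATEALIAS URL and a query URL that lists
several collections via the collection parameter; the alias name and
the collection names "users_a" and "users_b" are invented for
illustration.

```python
from urllib.parse import urlencode

# Assumed Solr location; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def create_alias_url(alias, collections):
    """Build a CREATEALIAS URL pointing one alias at several collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    return f"{SOLR_BASE}/admin/collections?{urlencode(params)}"


def multi_collection_query_url(collections, q):
    """Build a query URL that searches several collections in one request."""
    params = {"q": q, "collection": ",".join(collections)}
    # The request is sent to the first collection; the "collection"
    # parameter lists everything that should be searched.
    return f"{SOLR_BASE}/{collections[0]}/select?{urlencode(params)}"


old_shards = ["users_a", "users_b"]  # hypothetical: one collection per old shard
alias_url = create_alias_url("all_users", old_shards)
query_url = multi_collection_query_url(old_shards, "title:solr")
print(alias_url)
print(query_url)
```

Once the alias exists, clients can query "all_users" as if it were a
single collection, while single-set queries keep going to the one
collection they always did.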


Best,
Erick

On Fri, Dec 15, 2017 at 9:03 AM, John Davis <johndavis925...@gmail.com> wrote:
> Hello,
> We are thinking about migrating to SolrCloud. Our current setup is:
> 1. Multiple replicas and shards.
> 2. Each query typically hits a single shard only.
> 3. We have an external system that assigns a document to a shard based on
> its origin and is also used by solr clients when querying to find the
> correct shard to query.
>
> It looks like the biggest advantage of SolrCloud is #3 - to route document
> to the correct shard & replicas when indexing and to route query similarly.
> Given we already have a fairly reliable system to do this, are there other
> benefits from migrating to SolrCloud?
>
> Thanks,
> John
