Hi Erick,

I am looking into the rule-based replica placement documentation and am confused. How can I ensure that there is no more than one replica of any shard on the same host? The example rule shard:*,replica:<2,node:* seems to serve the purpose, but I am not sure whether 'node' refers to a Solr instance or to the actual physical host. Is there an example of how to define a node?
Thanks

On Sun, Aug 26, 2018 at 8:37 PM Erick Erickson <erickerick...@gmail.com> wrote:

> Yes, you can use the "node placement rules", see:
> https://lucene.apache.org/solr/guide/6_6/rule-based-replica-placement.html
>
> This is a variant of "rack awareness".
>
> Of course the simplest way if you're not doing very many collections is to
> create the collection with the special "EMPTY" createNodeSet then just
> build out your collection with ADDREPLICA, placing each replica on a
> particular node. The idea of that capability was exactly to explicitly
> control where each and every replica landed.
>
> As a third alternative, just create the collection and let Solr put
> the replicas where it will, then use MOVEREPLICA to position replicas
> as you want.
>
> The node placement rules are primarily intended for automated or very large
> setups. Manually placing replicas is simpler for limited numbers.
>
> Best,
> Erick
> On Sun, Aug 26, 2018 at 8:10 PM Wei <weiwan...@gmail.com> wrote:
> >
> > Thanks Shawn. When using multiple Solr instances per host, is there any way
> > to prevent solrcloud from putting multiple replicas of the same shard on
> > same host?
> > I see it makes sense if we can splitting into multiple instances with
> > smaller heap size. Besides that, do you think multiple instances will be
> > able to get better CPU utilization on multi-core server?
> >
> > Thanks,
> > Wei
> >
> > On Sun, Aug 26, 2018 at 4:37 AM Shawn Heisey <apa...@elyograg.org> wrote:
> >
> > > On 8/26/2018 12:00 AM, Wei wrote:
> > > > I have a question about the deployment configuration in solr cloud. When
> > > > we need to increase the number of shards in solr cloud, there are two
> > > > options:
> > > >
> > > > 1. Run multiple solr instances per host, each with a different port and
> > > > hosting a single core for one shard.
> > > >
> > > > 2. Run one solr instance per host, and have multiple cores(shards) in the
> > > > same solr instance.
> > > >
> > > > Which would be better performance wise? For the first option I think JVM
> > > > size for each solr instance can be smaller, but deployment is more
> > > > complicated? Are there any differences for cpu utilization?
> > >
> > > My general advice is to only have one Solr instance per machine. One
> > > Solr instance can handle many indexes, and usually will do so with less
> > > overhead than two or more instances.
> > >
> > > I can think of *ONE* exception to this -- when a single Solr instance
> > > would require a heap that's extremely large. Splitting that into two or
> > > more instances MIGHT greatly reduce garbage collection pauses. But
> > > there's a caveat to the caveat -- in my strong opinion, if your Solr
> > > instance is so big that it requires a huge heap and you're considering
> > > splitting into multiple Solr instances on one machine, you very likely
> > > need to run each of those instances on *separate* machines, so that each
> > > one can have access to all the resources of the machine it's running on.
> > >
> > > For SolrCloud, when you're running multiple instances per machine, Solr
> > > will consider those to be completely separate instances, and you may end
> > > up with all of the replicas for a shard on a single machine, which is a
> > > problem for high availability.
> > >
> > > Thanks,
> > > Shawn
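
For reference, here is a rough sketch of the manual approach Erick describes above (create the collection with the "EMPTY" createNodeSet, then place each replica with ADDREPLICA). It only builds the Collections API URLs and does not call Solr. Everything specific in it is an assumption for illustration: the helper names (host_of, plan_replicas, collection_api_urls), the node names, the base URL, and the collection and config names are hypothetical. It also assumes node names have the form host:port_solr and that shards get the default names shard1, shard2, and so on.

```python
from collections import defaultdict
from urllib.parse import urlencode


def host_of(node_name):
    # Assumption: node names look like "host:port_solr"; the host is the part before ":".
    return node_name.split(":", 1)[0]


def plan_replicas(nodes, num_shards, replicas_per_shard):
    """Return [(shard, node), ...] with at most one replica of a shard per host."""
    by_host = defaultdict(list)
    for node in nodes:
        by_host[host_of(node)].append(node)
    hosts = sorted(by_host)
    if replicas_per_shard > len(hosts):
        raise ValueError("need at least as many hosts as replicas per shard")

    load = {node: 0 for node in nodes}  # replicas assigned to each Solr instance so far
    plan = []
    for s in range(1, num_shards + 1):
        for r in range(replicas_per_shard):
            # Consecutive host indexes are distinct because replicas_per_shard <= len(hosts).
            host = hosts[(s - 1 + r) % len(hosts)]
            # Within the chosen host, pick the least-loaded instance.
            node = min(by_host[host], key=lambda n: load[n])
            load[node] += 1
            plan.append((f"shard{s}", node))
    return plan


def collection_api_urls(base_url, collection, config_name, num_shards, plan):
    """Build one CREATE URL (no replicas placed) plus one ADDREPLICA URL per planned replica."""
    create = {
        "action": "CREATE",
        "name": collection,
        "numShards": num_shards,
        "collection.configName": config_name,
        "createNodeSet": "EMPTY",
    }
    urls = [f"{base_url}/admin/collections?{urlencode(create)}"]
    for shard, node in plan:
        add = {"action": "ADDREPLICA", "collection": collection, "shard": shard, "node": node}
        urls.append(f"{base_url}/admin/collections?{urlencode(add)}")
    return urls


if __name__ == "__main__":
    # Hypothetical cluster: two hosts, two Solr instances on each.
    nodes = ["hostA:8983_solr", "hostA:8984_solr", "hostB:8983_solr", "hostB:8984_solr"]
    plan = plan_replicas(nodes, num_shards=2, replicas_per_shard=2)
    for url in collection_api_urls("http://hostA:8983/solr", "mycoll", "myconf", 2, plan):
        print(url)
```

The plan is computed per host rather than per Solr instance, which is the point of the question: two instances on the same machine count as one placement target for any given shard.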