On 8/12/2019 1:42 PM, Erie Data Systems wrote:
I am starting the planning stages of moving from a single instance of
Solr 8 to a SolrCloud implementation.

Currently I have a 148GB index on a single dedicated server with 96GB
RAM, 16 cores @ 2.4GHz each, and SSD storage. Search is fast, but
obviously the index size is greater than physical memory, which to my
understanding is not a good thing.

An *IDEAL* setup would have enough memory available (not assigned to programs) to be able to fit the entire index in the disk cache.
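As a rough back-of-the-envelope check, you can compare the on-disk index size against the RAM left over for the OS page cache once the Solr heap is carved out. This is only a sketch; the paths in the comments and the 8GB heap figure are assumptions, not your actual numbers:

```shell
#!/bin/sh
# Rough page-cache sizing check. The numbers below are assumptions --
# substitute your own, e.g. from `du -s --block-size=1G /var/solr/data`
# for the index and `free -g` for total RAM.
INDEX_GB=148      # on-disk index size
TOTAL_RAM_GB=96   # physical RAM in the machine
SOLR_HEAP_GB=8    # hypothetical -Xmx given to the Solr JVM

# Whatever the JVM doesn't claim is what the OS can use for disk cache.
CACHE_GB=$((TOTAL_RAM_GB - SOLR_HEAP_GB))

if [ "$INDEX_GB" -le "$CACHE_GB" ]; then
    echo "index (${INDEX_GB}GB) can be fully cached in ${CACHE_GB}GB"
else
    echo "index (${INDEX_GB}GB) exceeds available cache (${CACHE_GB}GB) by $((INDEX_GB - CACHE_GB))GB"
fi
```

With those example numbers the index exceeds the available cache by 60GB, which matches the situation described above — and, as noted below, that alone does not mean performance will be unacceptable.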

Lots of people run systems that aren't ideal and have perfectly acceptable performance. I did that for several years. I would have loved to have more memory, but the budget wasn't there, and the machines I was using were already maxed out at 64GB.

If performance is already acceptable, then not being able to fit the entire index into available memory is not, by itself, enough reason to make significant changes. Switching to SolrCloud could require significant development time, both for your other software and for the systems that keep Solr operational.

My issue is that I'm not sure where to go to learn how to set this up:
how many shards, how many replicas, etc. I would rather hire somebody
or find something (a detailed video or document) to guide me through
the process and the decisions along the way. For example, I think a
shard is a piece of the index, but I don't even know what replicas are
or how to decide how many to use.

There are no standardized rules for making these decisions. Typically you have to make an educated guess and try it to see whether it works.

https://lucidworks.com/post/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

If it's done in the typical way, telling a SolrCloud setup to create a collection with 3 shards and 2 replicas will create six individual indexes that make up the whole collection. The index will be split into three pieces (shards), and each of those pieces will have two copies (replicas). For each shard, an election will choose one of its replicas as leader.
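As a concrete sketch, the three-shard, two-replica layout described above is created through the Collections API. The host, port, and collection name here are placeholders for your own cluster:

```shell
# Create a collection split into 3 shards, with replicationFactor=2
# (two copies of each shard, six cores total across the cluster).
# Host, port, and "mycollection" are placeholders.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=2"

# Roughly equivalent using the bin/solr script that ships with Solr:
#   bin/solr create -c mycollection -shards 3 -replicationFactor 2

# Afterward, CLUSTERSTATUS shows which node hosts each replica and
# which replica was elected leader for each shard:
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=mycollection"
```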

Sharding adds overhead. In some cases with extremely large indexes, the overhead is less than the performance gained by splitting the index onto separate machines and letting those machines work in parallel. In other cases, the overhead may result in things actually getting slower.

Thanks,
Shawn
