Hi Shawn,

Thanks for your detailed explanation. I will do a POC and finalize the
architecture.

With Regards,
Santanu

On Tue, Jul 30, 2013 at 12:20 PM, Shawn Heisey <s...@elyograg.org> wrote:

> On 7/30/2013 12:23 AM, Santanu8939967892 wrote:
> > Yes, your assumption is correct. The index size is around 250 GB and
> > we index 20/30 meta data and store around 50.
> > We have a plan for a Solr cloud architecture having two nodes, one
> > Master and the other one a replica of the master (replication factor 2),
> > with multiple zookeeper ensemble. We will have multiple shards for each
> > Master and replica node.
> > Is the above architecture a fit for production deployment for
> > improved index and query performance?
> > Do we require 64 GB RAM, or will less work for us?
>
> It sounds like you're planning to put the entire index on one server,
> and then have a replica on another server. You'll have multiple shards,
> but they won't be running on separate hardware. Running multiple shards
> per server is a strategy that can work well if you have a lot of CPU
> cores and a low query volume. When the query volume gets really high,
> you will want fewer shards per server and more servers.
>
> If your index is on spinning disks, I wouldn't try to run an index of
> that size on a host with less than 128GB RAM, and I'd try to get 256GB.
> If you have to choose between super-high-end CPUs and memory, choose
> memory ... but don't skimp TOO much on the CPUs. The amount of RAM
> required for each server will go down if you spread the shards out
> across more servers.
>
> If the index is on SSD, 64GB might work OK, but 128GB would be better.
> If your query volume is low, 64GB might even work for spinning disks,
> but the query latency might be fairly high.
>
> If you require a very high query volume, two replicas might not be
> enough, and you wouldn't want to run a lot of shards per server. You'd
> have to actually set up a proof of concept and run tests with real data
> and real queries to find out for sure what you need.
>
> In case it isn't clear by now - assuming you've got enough RAM for good
> disk caching, query volume will dictate how many actual servers you need.
>
> Thanks,
> Shawn
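
For the POC, the sizing reasoning quoted above can be turned into a rough back-of-the-envelope calculation. The sketch below is only an illustration, not a sizing rule from the thread: the function names, the even split of the index across servers, and the `heap_and_os_gb` parameter are all assumptions made for the example. It shows how the amount of index held per server, and the share of it that could sit in the OS disk cache, changes as one copy of the 250 GB index is spread over more servers.

```python
# Rough sizing sketch. All names and the even-split assumption are
# hypothetical; only the 250 GB index size and the 64/128/256 GB RAM
# figures come from the thread.

def index_gb_per_server(total_index_gb, servers_per_copy):
    """Index data held by each server when one full copy of the index
    is split evenly across `servers_per_copy` servers."""
    if servers_per_copy < 1:
        raise ValueError("servers_per_copy must be at least 1")
    return total_index_gb / servers_per_copy


def cache_coverage(ram_gb, index_gb_on_server, heap_and_os_gb=0.0):
    """Fraction of the local index that could fit in the OS disk cache.

    `heap_and_os_gb` is an assumed allowance for the JVM heap and the
    operating system; it is subtracted from RAM before comparing.
    """
    if index_gb_on_server <= 0:
        raise ValueError("index_gb_on_server must be positive")
    available_gb = max(ram_gb - heap_and_os_gb, 0.0)
    return available_gb / index_gb_on_server


TOTAL_INDEX_GB = 250  # index size mentioned in the thread

if __name__ == "__main__":
    for servers in (1, 2, 4):
        local_gb = index_gb_per_server(TOTAL_INDEX_GB, servers)
        for ram_gb in (64, 128, 256):
            coverage = cache_coverage(ram_gb, local_gb)
            print(f"{servers} server(s) per copy, {ram_gb} GB RAM: "
                  f"{local_gb:.1f} GB index each, cache coverage {coverage:.2f}")
```

A coverage value near or above 1.0 means most of the local index could be cached; lower values mean more disk reads, which matters more on spinning disks than on SSD. Numbers like these only narrow the starting point, and testing with real data and real queries is still needed to settle the server count.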