Hi Jens,

Could you please also forward the diagram attachment that Ephraim sent? :-)

Thanks,
Tirthankar

-----Original Message-----
From: Jens Mueller [mailto:supidupi...@googlemail.com]
Sent: Tuesday, April 05, 2011 10:30 PM
To: solr-user@lucene.apache.org
Subject: Re: FW: Very very large scale Solr Deployment = how to do (Expert Question)?

Hello Ephraim,

Thank you so much for the great document and scaling concept! First, I think you really should publish this on the Solr wiki. This approach is not documented anywhere there and is not obvious to newbies, and your document explains it very well.

Please allow me two further questions regarding your document:

1.) Is it correct that by "DB" you mean the original data source of the data that is fed into the Solr "Cloud" for searching?

2.) Solr Aggregator: This term did not yield any Google results, but it is a very important aspect of your design (and it was the missing piece for me when thinking about Solr architectures). Is it correct that the "aggregators" are simply Tomcat instances with the Solr webapp deployed? These aggregators do not have their own index but only run the Solr webapp, and I access them via the ?shards= parameter, listing the shards I want to query? (So in the end they aggregate the data of the shards but do not have their own data.)

This is really an important aspect that is not documented well enough in the Solr documentation.

Thank you very much!
Jens

2011/4/5 Ephraim Ofir <ephra...@icq.com>

> Of course the attachment didn't get to the list, so here it is if you
> want it...
>
> Ephraim Ofir
>
>
> -----Original Message-----
> From: Ephraim Ofir
> Sent: Tuesday, April 05, 2011 10:20 AM
> To: 'solr-user@lucene.apache.org'
> Subject: RE: Very very large scale Solr Deployment = how to do (Expert
> Question)?
>
> I'm not sure about the scale you're aiming for, but you probably want
> to do both sharding and replication. There's no central server which
> would be the bottleneck. The guidelines should probably be something like:
> 1. Split your index into enough shards so it can keep up with the
> update rate.
> 2. Have enough replicas of each shard master to keep up with the
> rate of queries.
> 3. Have enough aggregators in front of the shard replicas so the
> aggregation doesn't become a bottleneck.
> 4. Make sure you have good load balancing across your system.
>
> Attached is a diagram of the setup we have. You might want to look
> into SolrCloud as well.
>
> Ephraim Ofir
>
>
> -----Original Message-----
> From: Jens Mueller [mailto:supidupi...@googlemail.com]
> Sent: Tuesday, April 05, 2011 4:25 AM
> To: solr-user@lucene.apache.org
> Subject: Very very large scale Solr Deployment = how to do (Expert
> Question)?
>
> Hello Experts,
>
> I am a Solr newbie but have read quite a lot of docs. I still do not
> understand what would be the best way to set up very large scale
> deployments:
>
> Goal (theoretical):
>
> A.) Index size: 1 petabyte (1 document is about 5 KB in size)
> B.) Queries: 100000 queries per second
> C.) Updates: 100000 updates per second
>
> Solr offers:
>
> 1.) Replication => scales well for B), BUT A) and C) are not
> satisfied
>
> 2.) Sharding => scales well for A), BUT B) and C) are not satisfied
> (=> As I understand the sharding approach, everything goes through a
> central server that dispatches the updates and assembles the query
> results retrieved from the different shards. But this central server
> also has capacity limits...)
>
> What is the right approach to handle such large deployments? I would
> be thankful for just a rough sketch of the concepts so I can
> experiment/search further...
>
> Maybe I am missing something very trivial, as I think some of the "Solr
> Users/Use Cases" on the homepage are large deployments of that kind.
> How are they implemented?
>
> Thank you very much!!!
>
> Jens
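
To make the aggregator idea discussed in this thread concrete, here is a minimal sketch of how a client might build a distributed query that an index-less aggregator fans out to the shards. It assumes Solr's standard `shards` request parameter (a comma-separated list of host:port/path entries given without the scheme). The host names, the `/solr/select` path, and the helper `build_distributed_query_url` are hypothetical and only for illustration; this is not Ephraim's actual setup.

```python
from urllib.parse import urlencode


def build_distributed_query_url(aggregator, shards, query, rows=10):
    """Build a select URL asking `aggregator` to query all `shards`.

    `aggregator` is a hypothetical host:port of a Solr instance without
    its own index; `shards` is a list of host:port/path shard addresses.
    """
    if not shards:
        raise ValueError("at least one shard is required")
    params = {
        "q": query,
        "rows": rows,
        # Solr expects the shard addresses as one comma-separated value.
        "shards": ",".join(shards),
    }
    # Assumed request path; adjust to the deployment's actual handler.
    return f"http://{aggregator}/solr/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical hosts, shown only to illustrate the call.
    print(build_distributed_query_url(
        "agg1.example.com:8080",
        ["shard1.example.com:8983/solr", "shard2.example.com:8983/solr"],
        "title:solr",
    ))
```

In a setup like the one described above, each shard address would typically point at a load balancer in front of that shard's replicas, and the client would itself reach the aggregators through a load balancer, so that no single aggregator becomes the bottleneck.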