Maybe not the most up-to-date or relevant example for your usage, but https://sbdevel.wordpress.com/2016/11/30/70tb-16b-docs-4-machines-1-solrcloud/ is one that sticks in my mind.

I definitely remember seeing a list of these sorts of blogs somewhere a long time ago... don't know where, though.
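
For the PoC Erick describes below (two shards, leader-only, index ~10M docs, then look at how big the index is), something like this rough sketch might help get numbers in front of management. Assumptions on my part: a local SolrCloud at http://localhost:8983/solr, a made-up collection name `calllogs_poc`, and a placeholder measured index size of 25 GB. The linear extrapolation is only a first guess; real index growth depends on your schema and data.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards=2, replication_factor=1):
    """Build a Collections API CREATE URL.

    replication_factor=1 means one replica per shard, i.e. leader-only.
    The base URL and collection name are whatever your PoC uses.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


def extrapolate_index_size_gb(poc_docs, poc_index_gb, target_docs):
    """Naive linear extrapolation from the PoC index size (an assumption,
    not a sizing rule): size scales with document count."""
    if poc_docs <= 0:
        raise ValueError("poc_docs must be positive")
    return poc_index_gb * target_docs / poc_docs


if __name__ == "__main__":
    # Hypothetical host and collection name.
    print(create_collection_url("http://localhost:8983/solr", "calllogs_poc"))
    # 25.0 GB is a placeholder, not a measurement; plug in your own number.
    print(extrapolate_index_size_gb(10_000_000, 25.0, 300_000_000))
```

Open the printed URL (or curl it) against a running SolrCloud to create the collection, index your sample, then read the real index size off the admin UI and feed it into the second function.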
On Tue, 28 Jul 2020 at 13:50, Prashant Jyoti <jtprash...@gmail.com> wrote:

> Thanks Erick.
>
> > 1> does Solr do what you want? You’re talking about reporting, and Solr is
> > primarily a search engine. That said, it has tons of analytics capabilities
> > built in. Depends on what “reporting” means in your situation.
>
> There is a reporting UI which has various criteria the user can filter on,
> the data for this UI will be indexed to and fetched from Solr. These are
> basically call logs of the user's interaction with tech support. The
> documents would be at max a few MBs in size.
>
> > 2> how expensive is it?
>
> I am looking at what kind of a setup is considered okay to handle, let's
> say, average loads to start with (I am not considering billions of
> documents/day to be an average load at my place, that would be higher-end),
> with the scope of scaling as and when the load increases.
>
> I went through the linked article in your answer and understand your
> viewpoint, but that said even I am looking for some averages ;)
> Unable to find any authentic blogs which detail their usage of Solr.
>
> On Tue, Jul 28, 2020 at 5:19 PM Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > Here’s a list of some sites using Solr:
> > https://cwiki.apache.org/confluence/display/solr/PublicServers
> >
> > It’s not really what you’re looking for though, it doesn’t really have the
> > details you’d like.
> >
> > There are two dimensions here:
> >
> > 1> does Solr do what you want? You’re talking about reporting, and Solr is
> > primarily a search engine. That said, it has tons of analytics capabilities
> > built in. Depends on what “reporting” means in your situation.
> >
> > 2> how expensive is it? Here “expensive” means hardware and support.
> > Unfortunately that’s un-answerable. This is really “the sizing question”,
> > and there are too many variables to work with. If you want some backup for
> > why this is an unfair question to answer in the abstract, see:
> > https://lucidworks.com/post/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
> >
> > What I’d recommend is to ask for enough resources to create a PoC on an
> > existing bit of hardware, your workstation/laptop would do. For a PoC,
> > there’s no reason to even have 3 Zookeepers, I routinely run with just one
> > (although I do use an external-to-Solr zookeeper). I’d start with two
> > shards, leader-only, just to be sure you take into account how SolrCloud
> > works. I wouldn’t get fancy here, just take your first guess at how it will
> > all work and index a bunch of documents (say 10,000,000) and see if you can
> > get Solr to create the data for your reports. At that point, you have some
> > data to work with, i.e. how big your indexes are, whether Solr’s
> > capabilities meet your functional requirements etc.
> >
> > You can infer that I consider 10,000,000 documents a small Solr
> > installation, with the caveat that if the docs are each gigabytes in length
> > all bets are off. I’ve worked with clients who index billions of
> > documents/day (yes billion) admittedly they had a very large hardware
> > budget ;). I’ve seen 300M docs (each reasonably complex and a few K each)
> > fit comfortably on a machine with 12G allocated to Solr (64G total physical
> > memory IIRC).
> >
> > So, It Depends (tm)...
> >
> > Good luck!
> > Erick
> >
> > > On Jul 28, 2020, at 7:26 AM, Prashant Jyoti <jtprash...@gmail.com> wrote:
> > >
> > > Hi,
> > > I wanted to check if anybody has any references for tech companies' blogs
> > > detailing their Solr setup in production. I am more interested in storage
> > > and scaling guidelines. I intend to use Solr for one of my projects at
> > > work(back-end for a reporting tool) and need to convince higher management
> > > that it is indeed the right solution. I have gone through the material
> > > available in the Solr reference guide, I am looking for some details from a
> > > working production setup.
> > >
> > > Thanks!
> > > --
> > > Regards,
> > > Prashant.
>
> --
> Regards,
> Prashant.