Thanks for that, Colvin. Even though it's a bit dated, it sure does help in getting an idea.
> I definitely remember seeing a list of these sorts of blogs somewhere a
> long time ago... don't know where though

If by any chance you stumble upon it, please feel free to share, even at a later date.

On Tue, Jul 28, 2020 at 8:32 PM Colvin Cowie <colvin.cowie....@gmail.com> wrote:

> Maybe not the most up to date or relevant example for your usage but
> https://sbdevel.wordpress.com/2016/11/30/70tb-16b-docs-4-machines-1-solrcloud/
> is one that sticks in my mind
> I definitely remember seeing a list of these sorts of blogs somewhere a
> long time ago... don't know where though
>
> On Tue, 28 Jul 2020 at 13:50, Prashant Jyoti <jtprash...@gmail.com> wrote:
>
> > Thanks Erick.
> >
> > > 1> does Solr do what you want? You’re talking about reporting, and Solr is
> > > primarily a search engine. That said, it has tons of analytics capabilities
> > > built in. Depends on what “reporting” means in your situation.
> >
> > There is a reporting UI which has various criteria the user can filter on,
> > the data for this UI will be indexed to and fetched from Solr. These are
> > basically call logs of the user's interaction with tech support. The
> > documents would be at max a few MBs in size.
> >
> > > 2> how expensive is it?
> >
> > I am looking at what kind of a setup is considered okay to handle, let's
> > say, average loads to start with (I am not considering billions of
> > documents/day to be an average load at my place, that would be higher-end),
> > with the scope of scaling as and when the load increases.
> >
> > I went through the linked article in your answer and understand your
> > viewpoint, but that said even I am looking for some averages ;)
> > Unable to find any authentic blogs which detail their usage of Solr.
> >
> > On Tue, Jul 28, 2020 at 5:19 PM Erick Erickson <erickerick...@gmail.com> wrote:
> >
> > > Here’s a list of some sites using Solr:
> > > https://cwiki.apache.org/confluence/display/solr/PublicServers
> > >
> > > It’s not really what you’re looking for though, it doesn’t really have the
> > > details you’d like.
> > >
> > > There are two dimensions here:
> > >
> > > 1> does Solr do what you want? You’re talking about reporting, and Solr is
> > > primarily a search engine. That said, it has tons of analytics capabilities
> > > built in. Depends on what “reporting” means in your situation.
> > >
> > > 2> how expensive is it? Here “expensive” means hardware and support.
> > > Unfortunately that’s un-answerable. This is really “the sizing question”,
> > > and there are too many variables to work with. If you want some backup for
> > > why this is an unfair question to answer in the abstract, see:
> > >
> > > https://lucidworks.com/post/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
> > >
> > > What I’d recommend is to ask for enough resources to create a PoC on an
> > > existing bit of hardware, your workstation/laptop would do. For a PoC,
> > > there’s no reason to even have 3 Zookeepers, I routinely run with just one
> > > (although I do use an external-to-Solr zookeeper). I’d start with two
> > > shards, leader-only, just to be sure you take into account how SolrCloud
> > > works. I wouldn’t get fancy here, just take your first guess at how it will
> > > all work and index a bunch of documents (say 10,000,000) and see if you can
> > > get Solr to create the data for your reports. At that point, you have some
> > > data to work with, i.e. how big your indexes are, whether Solr’s
> > > capabilities meet your functional requirements etc.
> > >
> > > You can infer that I consider 10,000,000 documents a small Solr
> > > installation, with the caveat that if the docs are each gigabytes in length
> > > all bets are off. I’ve worked with clients who index billions of
> > > documents/day (yes billion) admittedly they had a very large hardware
> > > budget ;). I’ve seen 300M docs (each reasonably complex and a few K each)
> > > fit comfortably on a machine with 12G allocated to Solr (64G total physical
> > > memory IIRC).
> > >
> > > So, It Depends (tm)...
> > >
> > > Good luck!
> > > Erick
> > >
> > > > On Jul 28, 2020, at 7:26 AM, Prashant Jyoti <jtprash...@gmail.com> wrote:
> > > >
> > > > Hi,
> > > > I wanted to check if anybody has any references for tech companies' blogs
> > > > detailing their Solr setup in production. I am more interested in storage
> > > > and scaling guidelines. I intend to use Solr for one of my projects at
> > > > work(back-end for a reporting tool) and need to convince higher management
> > > > that it is indeed the right solution. I have gone through the material
> > > > available in the Solr reference guide, I am looking for some details from a
> > > > working production setup.
> > > >
> > > > Thanks!
> > > > --
> > > > Regards,
> > > > Prashant.
> >
> > --
> > Regards,
> > Prashant.

--
Regards,
Prashant.
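
For reference, the PoC Erick suggests in the quoted reply (two shards, leader-only, index a sample, then look at how big the index is) could be sketched roughly as below. This is only an illustration under assumptions: the base URL `http://localhost:8983/solr`, the collection name `calllogs_poc`, and the sample sizes are hypothetical, and the linear projection is a crude first guess, not a sizing answer (the thread itself says sizing cannot be settled in the abstract). The request uses the Collections API `CREATE` action with `numShards` and `replicationFactor`; the code only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode


def create_collection_url(base_url: str, name: str, shards: int = 2) -> str:
    """Build a Collections API CREATE request URL for a leader-only collection."""
    params = {
        "action": "CREATE",
        "name": name,  # hypothetical collection name chosen by the caller
        "numShards": shards,
        "replicationFactor": 1,  # one replica per shard, i.e. leaders only
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


def project_index_bytes(poc_docs: int, poc_index_bytes: int, target_docs: int) -> float:
    """Naively scale the measured PoC index size to a target document count.

    Assumes index size grows linearly with document count, which is only a
    rough starting point for a conversation about hardware.
    """
    if poc_docs <= 0:
        raise ValueError("poc_docs must be positive")
    return poc_index_bytes * target_docs / poc_docs


if __name__ == "__main__":
    # Hypothetical local Solr instance and collection name.
    print(create_collection_url("http://localhost:8983/solr", "calllogs_poc"))

    # Hypothetical measurement: 10,000,000 docs produced a 20 GiB index.
    measured = 20 * 1024**3
    projected = project_index_bytes(10_000_000, measured, 300_000_000)
    print(f"projected index size: {projected / 1024**3:.1f} GiB")
```

The real per-document cost depends on the schema, stored fields, and analysis chain, so the projection is only meaningful once the PoC has indexed documents that look like the production call logs.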