Bram Van Dam [bram.van...@intix.eu] wrote:
> I'm trying to get a feel of how large Solr can grow without slowing down
> too much. We're looking into a use-case with up to 100 billion documents
> (SolrCloud), and we're a little afraid that we'll end up requiring 100
> servers to pull it off.
One recurring theme on this list is that it is very hard to compare indexes. Even if the data structure happens to be the same, performance will vary drastically depending on the types of queries and the processing requested. That being said, I acknowledge that it helps to have stories to get a feel for what can be done.

A second caveat is that I find it an exercise in futility to talk about scale without an idea of the expected response times as well as the expected number of concurrent users. If you are just doing some nightly batch processing, you could probably run your 100TB index (scaling up from your description) off spinning drives on a couple of boxes. If you expect to be hammered with millions of requests per day, you would have to put a zero or two after that number. End of sermon.

At Lucene/Solr Revolution 2014, Grant Ingersoll also asked for user stories and pointed to https://wiki.apache.org/solr/SolrUseCases - sadly it has not caught on. The only entry is for our (State and University Library, Denmark) setup with 21TB / 7 billion documents on a single machine.

To follow my own advice, I can elaborate that we have 1-3 concurrent users and a design goal of median response times below 2 seconds for faceted search. I guess that is at the larger end of the spectrum for pure size, but at the very low end for usage.

- Toke Eskildsen
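PS: To make the "faceted search with a response time goal" part concrete, a timed request of that kind could look roughly like the minimal SolrJ sketch below. The collection URL and the facet fields "author" and "year" are placeholders, not our actual schema, and a real median measurement would of course repeat many such queries and take the middle value.

// Minimal sketch, assuming SolrJ on the classpath.
// URL, collection name and field names are hypothetical placeholders.
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FacetTiming {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/books").build()) {
            SolrQuery q = new SolrQuery("*:*");
            q.setFacet(true);
            q.addFacetField("author", "year"); // hypothetical facet fields
            q.setFacetMinCount(1);
            q.setRows(0); // only facet counts and timing matter here
            QueryResponse rsp = client.query(q);
            // QTime is the server-side processing time; elapsed time also
            // includes network transfer and response parsing on the client.
            System.out.println("QTime (server, ms): " + rsp.getQTime());
            System.out.println("Elapsed (client, ms): " + rsp.getElapsedTime());
        }
    }
}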