Bram Van Dam [bram.van...@intix.eu] wrote:
> I'm trying to get a feel of how large Solr can grow without slowing down
> too much. We're looking into a use-case with up to 100 billion documents
> (SolrCloud), and we're a little afraid that we'll end up requiring 100
> servers to pull it off.

One recurring theme on this list is that it is very hard to compare indexes. 
Even if the data structure happens to be the same, performance will vary 
drastically depending on the types of queries and the processing requested. 
That being said, I acknowledge that user stories help to get a feel of what 
can be done.

A second caveat is that I find it an exercise in futility to talk about scale 
without an idea of expected response times as well as the expected number of 
concurrent users. If you are just doing some nightly batch processing, you 
could probably run your (scaling up from your description) 100TB index off 
spinning drives on a couple of boxes. If you expect to be hammered with 
millions of requests per day, you would have to put a zero or two behind that 
number.
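To make the "100TB" figure above concrete, here is a back-of-envelope sizing sketch. The 100 billion documents come from the original question; the ~1 KB average indexed size per document is my assumption to make the arithmetic line up with the 100TB estimate:

```python
# Back-of-envelope index sizing. Document count is from the thread;
# the 1 KB/doc average indexed size is an assumption for illustration.
DOCS = 100_000_000_000       # 100 billion documents
BYTES_PER_DOC = 1_000        # assumed average on-disk index size per doc

index_bytes = DOCS * BYTES_PER_DOC
index_tb = index_bytes / 1_000_000_000_000   # decimal terabytes

print(f"Estimated index size: {index_tb:.0f} TB")
```

Of course, the real per-document footprint depends heavily on stored fields, docValues, and term statistics, so treat this as an order-of-magnitude check only.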

End of sermon.

At Lucene/Solr Revolution 2014, Grant Ingersoll also asked for user stories and 
pointed to https://wiki.apache.org/solr/SolrUseCases - sadly it has not caught 
on. The only entry is for our (State and University Library, Denmark) setup 
with 21TB / 7 billion documents on a single machine. To follow my own advice, I 
can elaborate that we have 1-3 concurrent users and a design goal of median 
response times below 2 seconds for faceted search. I guess that is at the 
larger end of the spectrum for pure size, but at the very low end for usage.
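For what it is worth, a naive linear extrapolation from our setup to the 100-billion-document target looks like this. The 21TB / 7 billion figures are ours; the assumption that machine count scales linearly with document count is crude, since query load, faceting and latency goals will dominate the real number:

```python
# Crude linear scale-up from the single-machine setup described above
# (21 TB / 7 billion docs) to a 100-billion-document target.
docs_per_machine = 7_000_000_000
tb_per_machine = 21

target_docs = 100_000_000_000
machines = target_docs / docs_per_machine    # roughly 14 such machines
total_tb = machines * tb_per_machine         # roughly 300 TB in total

print(f"~{machines:.0f} machines, ~{total_tb:.0f} TB total")
```

Again: this only holds for our very light usage pattern (1-3 concurrent users); heavier query traffic would multiply the machine count.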

- Toke Eskildsen
