No one can answer that, because it depends on how you configure Solr. How many fields do you want to search? Are you using fuzzy search? Facets? Highlighting?
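As a rough illustration of how those choices change the request Solr has to serve, here is a minimal Python sketch that builds the query parameters for a plain multi-field search and for one with fuzzy matching, faceting, and highlighting switched on. The parameter names (`q`, `facet`, `facet.field`, `hl`, `hl.fl`) and the trailing `~` fuzzy operator are standard Solr/Lucene syntax; the helper `build_solr_params` and the field names `title`, `body`, and `category` are hypothetical, and the sketch assumes a single-word search term.

```python
from urllib.parse import urlencode


def build_solr_params(text, fields, fuzzy=False, facet_fields=(), highlight_fields=()):
    """Build query-string parameters for a Solr select request.

    Hypothetical helper: field names are placeholders for your own schema,
    and `text` is assumed to be a single word (no escaping is done).
    """
    # A trailing ~ asks the Lucene query parser for a fuzzy match.
    term = f"{text}~" if fuzzy else text
    # One clause per searched field, OR-ed together.
    query = " OR ".join(f"{field}:{term}" for field in fields)
    params = [("q", query)]
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", name) for name in facet_fields)
    if highlight_fields:
        params.append(("hl", "true"))
        params.append(("hl.fl", ",".join(highlight_fields)))
    return params


# A plain search: nothing but the per-field clauses.
plain = build_solr_params("lucene", ["title", "body"])

# The same search with every extra feature enabled.
heavy = build_solr_params(
    "lucene",
    ["title", "body"],
    fuzzy=True,
    facet_fields=["category"],
    highlight_fields=["body"],
)

print(urlencode(plain))
print(urlencode(heavy))
```

Each extra parameter in the second request is additional work per query, which is why a single throughput number cannot be quoted without knowing which of them you need.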
We are searching a much smaller collection, about 250K docs, with great success. We see 80 queries/sec on each of four servers, and response times under 100ms. Each query searches against seven fields and we don't use any of the features I listed above.

wunder

On 9/26/07 10:50 AM, "Law, John" <[EMAIL PROTECTED]> wrote:

> Thanks all! One last question...
>
> If I had a collection of 2.5 billion docs and a demand averaging 200
> queries per second, what's the confidence that Solr/Lucene could handle
> this volume and execute search with sub-second response times?
>
>
> -----Original Message-----
> From: Charlie Jackson [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, September 26, 2007 1:32 PM
> To: solr-user@lucene.apache.org
> Subject: RE: dataset parameters suitable for lucene application
>
> Sorry, I meant that it maxed out in the sense that my maxDoc field on
> the stats page was 8.8 million, which indicates that the most docs it
> has ever had was around 8.8 million. It's down to about 7.8 million
> currently. I have seen no signs of a "maximum" number of docs Solr can
> handle.
>
>
> -----Original Message-----
> From: Chris Harris [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, September 26, 2007 11:49 AM
> To: solr-user@lucene.apache.org
> Subject: Re: dataset parameters suitable for lucene application
>
> By "maxed out" do you mean that Solr's performance became unacceptable
> beyond 8.8M records, or that you only had 8.8M records to index? If
> the former, can you share the particular symptoms?
>
> On 9/26/07, Charlie Jackson <[EMAIL PROTECTED]> wrote:
>> My experiences so far with this level of data have been good.
>>
>> Number of records: Maxed out at 8.8 million
>> Database size: friggin huge (100+ GB)
>> Index size: ~24 GB
>>
>> 1) It took me about a day to index 8 million docs using a non-optimized
>> program I wrote. It's non-optimized in the sense that it's not
>> multi-threaded. It batched together groups of about 5,000 docs at a time
>> to be indexed.
>>
>> 2) Search times for a basic search are almost always sub-second. If we
>> toss in some faceting, it takes a little longer, but I've hardly ever
>> seen it go above 1-2 seconds even with the most advanced queries.
>>
>> Hope that helps.
>>
>>
>> Charlie
>>
>> ____________________________________________
>>
>> -----Original Message-----
>> From: Law, John [mailto:[EMAIL PROTECTED]
>> Sent: Wednesday, September 26, 2007 9:28 AM
>> To: solr-user@lucene.apache.org
>> Subject: dataset parameters suitable for lucene application
>>
>> I am new to the list and new to lucene and solr. I am considering Lucene
>> for a potential new application and need to know how well it scales.
>>
>> Following are the parameters of the dataset.
>>
>> Number of records: 7+ million
>> Database size: 13.3 GB
>> Index Size: 10.9 GB
>>
>> My questions are simply:
>>
>> 1) Approximately how long would it take Lucene to index these documents?
>> 2) What would the approximate retrieval time be (i.e. search response
>> time)?
>>
>> Can someone provide me with some informed guidance in this regard?
>>
>> Thanks in advance,
>> John
>>
>> ______________________________________________
>> John Law
>> Director, Platform Management
>> ProQuest
>> 789 Eisenhower Parkway
>> Ann Arbor, MI 48106
>> 734-997-4877
>> [EMAIL PROTECTED]
>> www.proquest.com
>> www.csa.com
>>
>> ProQuest... Start here.
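
The figures quoted in this thread can be turned into a back-of-envelope estimate. The Python sketch below is only that: it assumes "about a day" means exactly 24 hours, that indexing time scales linearly with document count, and that per-server query rates simply add up. None of those assumptions is guaranteed to hold for a different schema, hardware, or query mix, and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def docs_per_second(docs_indexed, elapsed_seconds):
    """Average indexing rate observed over a run."""
    return docs_indexed / elapsed_seconds


def estimated_hours(doc_count, rate):
    """Hours to index doc_count docs at a constant rate (linear-scaling assumption)."""
    return doc_count / rate / 3600


# Charlie's report: 8 million docs in about a day, in batches of about 5,000.
rate = docs_per_second(8_000_000, SECONDS_PER_DAY)
batches = 8_000_000 // 5_000

# John's dataset: 7+ million records, extrapolated at the same rate.
hours_for_7m = estimated_hours(7_000_000, rate)

# The 250K-doc deployment: 80 queries/sec on each of four servers.
aggregate_qps = 80 * 4

# How much larger the 2.5 billion doc collection is than the 250K one.
size_ratio = 2_500_000_000 / 250_000

print(f"indexing rate: {rate:.1f} docs/sec over {batches} batches")
print(f"7M docs at that rate: {hours_for_7m:.1f} hours")
print(f"aggregate throughput on 250K docs: {aggregate_qps} queries/sec")
print(f"collection size ratio: {size_ratio:,.0f}x")
```

At that single-threaded rate (roughly 93 docs/sec), 7 million docs would take about 21 hours. The 320 queries/sec aggregate was measured on a collection 10,000 times smaller than 2.5 billion docs, so it says little about the larger case.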