That seems well within Solr's capabilities, though you should come up
with a desired queries/sec figure.
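
One rough way to arrive at that figure is sketched below. It assumes you can
estimate peak concurrent users and how often each one searches. The function
name, the headroom multiplier, and the sample numbers are all made up for
illustration and do not come from your dataset.

```python
def target_qps(peak_users, searches_per_user_per_minute, headroom=2.0):
    """Back-of-envelope peak query rate in queries/sec.

    All inputs are estimates you supply; headroom is a safety
    multiplier to cover bursts (an assumption, tune to taste).
    """
    return peak_users * searches_per_user_per_minute / 60.0 * headroom


# Hypothetical numbers, purely illustrative.
print(target_qps(peak_users=3000, searches_per_user_per_minute=2))
```

Whatever number comes out is only a starting point to test against.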

Solr's query rate varies widely with the configuration -- how many
fields, fuzzy search, highlighting, facets, etc.
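
Because the rate depends so much on configuration, it is worth measuring
with your own setup. A minimal timing sketch follows. `run_query` is a
hypothetical callable that you would write to send one query to your own
Solr instance and wait for the response; nothing here is a Solr API. It
issues queries one at a time, so it measures what a single client sees, not
the server's maximum throughput under concurrent load.

```python
import time


def measure_qps(run_query, queries):
    """Run each query once, in order.

    Returns (queries per second, slowest single latency in seconds).
    run_query is supplied by the caller and is assumed to block until
    the search response has arrived.
    """
    if not queries:
        raise ValueError("need at least one query to measure")
    latencies = []
    start = time.perf_counter()
    for q in queries:
        t0 = time.perf_counter()
        run_query(q)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed, max(latencies)
```

Feed it a sample of realistic queries (ideally taken from logs) with the
same fields, highlighting, and facets you plan to use in production.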

Essentially, Solr is built on Lucene, a modern search core. Its
performance and scaling are comparable to the commercial products I know
about, and I built enterprise search for nine years. If you need to
search more than 100M docs or handle more than 1000 queries/second, you
may need fancier distributed search than is available in Solr or
commercially.

Solr's big weaknesses are the quality of the stemmers, parsing of
document formats (PDF, MS Word), and access control on queries. If you
can live with the stemmers, Solr will probably do the job.

I worked at Infoseek, Inktomi, Verity, and Autonomy, and I'm using
Solr here at Netflix.

wunder

On 9/26/07 7:27 AM, "Law, John" <[EMAIL PROTECTED]> wrote:

> I am new to the list and new to lucene and solr. I am considering Lucene
> for a potential new application and need to know how well it scales.
> 
> Following are the parameters of the dataset.
> 
> Number of records: 7+ million
> Database size: 13.3 GB
> Index Size:  10.9 GB
> 
> My questions are simply:
> 
> 1) Approximately how long would it take Lucene to index these documents?
> 2) What would the approximate retrieval time be (i.e. search response
> time)?
> 
> Can someone provide me with some informed guidance in this regard?
> 
> Thanks in advance,
> John
> 
> ______________________________________________
> John Law
> Director, Platform Management
> ProQuest
> 789 Eisenhower Parkway
> Ann Arbor, MI 48106
> 734-997-4877
> [EMAIL PROTECTED]
> www.proquest.com
> www.csa.com
> 
> ProQuest... Start here.

