That seems well within Solr's capabilities, though you should come up with a desired queries/sec figure.
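One rough way to arrive at a queries/sec target is to work backward from expected daily query volume. The sketch below is only an illustration: the function name `peak_qps`, the 10% peak-hour share, and the 2x burst factor are all assumptions, not figures from this thread, and should be replaced with numbers from your own traffic logs.

```python
def peak_qps(queries_per_day: float,
             peak_hour_share: float = 0.10,
             burst_factor: float = 2.0) -> float:
    """Estimate a peak queries/sec target from daily query volume.

    peak_hour_share: assumed fraction of the day's queries that arrive
                     in the busiest hour (hypothetical default).
    burst_factor:    assumed multiplier for short bursts within that
                     hour (hypothetical default).
    """
    if queries_per_day < 0 or peak_hour_share < 0 or burst_factor < 0:
        raise ValueError("inputs must be non-negative")
    # Queries expected during the busiest hour of the day.
    peak_hour_queries = queries_per_day * peak_hour_share
    # Average rate over that hour, scaled up to allow for bursts.
    return peak_hour_queries / 3600.0 * burst_factor


if __name__ == "__main__":
    # Hypothetical volume: one million queries per day.
    print(f"target: {peak_qps(1_000_000):.1f} queries/sec")
```

Whatever figure comes out is a sizing target to test against, not a prediction of what a given Solr configuration will sustain.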
Solr's query rate varies widely with the configuration -- how many fields, fuzzy search, highlighting, facets, etc.

Essentially, Solr uses Lucene, a modern search core. It has performance and scaling comparable to the commercial products I know about, and I was building enterprise search for nine years. If you need to search over 100M docs or over 1000 queries/second, you may need fancier distributed search than is available in Solr or commercially.

Solr's big weaknesses are the quality of the stemmers, parsing document formats (PDF, MS Word), and access control on queries. If you can live with the stemmers, Solr will probably do the job.

I worked at Infoseek, Inktomi, Verity, and Autonomy, and I'm using Solr here at Netflix.

wunder

On 9/26/07 7:27 AM, "Law, John" <[EMAIL PROTECTED]> wrote:

> I am new to the list and new to lucene and solr. I am considering Lucene
> for a potential new application and need to know how well it scales.
>
> Following are the parameters of the dataset.
>
> Number of records: 7+ million
> Database size: 13.3 GB
> Index Size: 10.9 GB
>
> My questions are simply:
>
> 1) Approximately how long would it take Lucene to index these documents?
> 2) What would the approximate retrieval time be (i.e. search response
> time)?
>
> Can someone provide me with some informed guidance in this regard?
>
> Thanks in advance,
> John
>
> ______________________________________________
> John Law
> Director, Platform Management
> ProQuest
> 789 Eisenhower Parkway
> Ann Arbor, MI 48106
> 734-997-4877
> [EMAIL PROTECTED]
> www.proquest.com
> www.csa.com
>
> ProQuest... Start here.