Greetings,

I'm evaluating Solr under Tomcat as a replacement for several text-search projects that currently use UMASS's INQUERY, an older search engine.

One nice feature of INQUERY is that you can create one large SGML file, containing lots of records, each bracketed with <DOC> and </DOC> tags. Submitting that big SGML document for indexing goes very fast. I believe that Solr indexes one document at a time; each document requires a separate HTTP POST.
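
To make the question concrete, here is a rough Python sketch of the per-document approach as I understand it: split the big SGML file into its <DOC> records and build one Solr XML add message per record. The field names "id" and "text" are placeholders I made up (they would depend on the schema), and the sketch only builds the message bodies; it does not send anything.

```python
import re
from xml.sax.saxutils import escape

# Assumption: records are bracketed by <DOC> ... </DOC>, as in the SGML file described above.
DOC_PATTERN = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)


def split_records(sgml_text):
    """Return the stripped text of each <DOC>...</DOC> record."""
    return [body.strip() for body in DOC_PATTERN.findall(sgml_text)]


def to_add_message(doc_id, text):
    """Build one XML <add> message holding a single document.

    The field names "id" and "text" are hypothetical placeholders.
    """
    return (
        "<add><doc>"
        f'<field name="id">{escape(str(doc_id))}</field>'
        f'<field name="text">{escape(text)}</field>'
        "</doc></add>"
    )


def build_messages(sgml_text):
    """One message per record, numbered from 1; each would be its own HTTP POST."""
    records = split_records(sgml_text)
    return [to_add_message(i, body) for i, body in enumerate(records, start=1)]
```

With millions of records, this means millions of message bodies and, as I understand it, millions of separate requests to the update URL, which is the overhead I'm asking about.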

How efficient is it to make a separate HTTP request per document when there are millions of documents? Do people ever use Solr's or Lucene's API directly to index large numbers of documents, and if so, what are the pros and cons?

Thanks to Yonik, Chris, and everyone else for all your work; Solr looks really great.
