Greetings,
I'm evaluating Solr under Tomcat as a replacement for a number of
text-searching projects that currently use UMass's INQUERY, an older
search engine.
One nice feature of INQUERY is that you can create one large SGML file,
containing lots of records, each bracketed with <DOC> and </DOC> tags.
Submitting that big SGML document for indexing goes very fast.
I believe that Solr indexes one document at a time; each document
requires a separate HTTP POST.
How efficient is it to make a separate HTTP request per document when
there are millions of documents? Do people ever use Solr's or Lucene's
API directly to index large numbers of documents, and if so, what are
the pros and cons?
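
To make the question concrete, here is a rough sketch of the kind of batching I have in mind as a middle ground between one POST per document and one giant file. It assumes that Solr's XML update handler will accept several <doc> elements inside a single <add> message. The field names, the batch size, and the update URL are all hypothetical placeholders, not taken from any real configuration.

```python
import urllib.request
import xml.etree.ElementTree as ET
from itertools import islice


def batches(records, size):
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_add_message(records):
    """Build one <add> message holding a <doc> per record.

    Each record is a dict mapping a (hypothetical) field name to a value.
    """
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def post_batch(url, payload):
    """POST one XML payload; `url` is an assumed update endpoint."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

With something like this, a million records at a batch size of 1000 would mean about a thousand HTTP requests rather than a million. Whether that is actually faster in practice is part of what I am asking.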
Thanks to Yonik, Chris, and everyone else for all your work; Solr looks
really great.
- One big XML file vs. many HTTP requests Michael Levy