On 5-Apr-08, at 7:09 AM, Britske wrote:

Indexing of these documents takes a long time. Because of the size of the
documents (due to the indexed fields) I am currently batching 50
documents at once, which takes about 2 seconds. Without adding the 10000
indexed fields to the document, indexing flies at about 15 ms for these 50
documents. Indexing is done using SolrJ.

This is on an Intel Core 2 6400 @ 2.13 GHz with 2 GB RAM.

To speed this up I let 2 threads do the indexing in parallel. What happens is that Solr just takes double the time (about 4 seconds) to complete these two jobs of 50 docs each in parallel. I figured that because of the multi-core
setup indexing should improve, which it doesn't.

Multiple processors really only help indexing speeds when there is heavy analysis.
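
For reference, the timings quoted above work out to the same throughput in both cases: 50 docs in about 2 seconds is roughly 25 docs/s, and 100 docs in about 4 seconds is also roughly 25 docs/s, so the second thread added nothing. A minimal sketch of how one might measure this from the client side follows. It uses only the Java standard library; `BatchIndexer` and `ParallelBatchTimer` are hypothetical names, with `BatchIndexer` standing in for whatever call actually sends one batch to Solr (it is not part of SolrJ).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the real indexing call (e.g. sending one batch via SolrJ).
interface BatchIndexer {
    void indexBatch(List<String> docs) throws Exception;
}

final class ParallelBatchTimer {

    // Documents per second for a given document count and elapsed wall-clock time.
    static double docsPerSecond(int docs, long elapsedMillis) {
        return docs * 1000.0 / elapsedMillis;
    }

    // Splits docs into consecutive batches of at most batchSize documents.
    static List<List<String>> batches(List<String> docs, int batchSize) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            int end = Math.min(i + batchSize, docs.size());
            out.add(new ArrayList<>(docs.subList(i, end)));
        }
        return out;
    }

    // Sends every batch through the indexer on a pool of `threads` workers
    // and returns the elapsed wall-clock time in milliseconds.
    static long timeIndexing(BatchIndexer indexer, List<List<String>> batches, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<Void>> pending = new ArrayList<>();
            for (List<String> batch : batches) {
                Callable<Void> task = () -> {
                    indexer.indexBatch(batch);
                    return null;
                };
                pending.add(pool.submit(task));
            }
            for (Future<Void> f : pending) {
                f.get(); // wait for completion and surface any indexing failure
            }
        } finally {
            pool.shutdown();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

Running `timeIndexing` with 1 thread and then with 2 threads over the same batches, and comparing `docsPerSecond` for each run, shows whether the extra thread buys any throughput. If the two figures match, as they do in the numbers above, the bottleneck is presumably on the server side rather than in the client.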

Does this perhaps indicate that the setup is IO-bound? What would be your best guess (given that the schema has a large number of indexed
fields) as to what to try next to improve indexing performance?

Use Lucene 2.3 with Solr 1.2, or simply try out Solr trunk. The indexing has been reworked to be considerably faster (it also makes better use of multiple processors by spawning a background merging thread).

-Mike
