On Fri, Oct 28, 2011 at 3:27 PM, Simon Willnauer <simon.willna...@googlemail.com> wrote:
> one more thing, after somebody (thanks robert) pointed me at the
> stacktrace it seems kind of obvious what the root cause of your
> problem is. It's solr :) Solr closes the IndexWriter on commit which is
> very wasteful since you basically wait until all merges are done. Solr
> trunk has solved this problem.

That is very wasteful, but I don't think it's actually the cause of
the slowdown here?

The cause looks like it's in applying deletes, which will still occur
even once Solr stops closing the IW (ie, IW.commit must also resolve
all deletes). When IW resolves deletes it 1) opens a SegmentReader for
each segment in the index, and 2) looks up each deleted term and marks
its document(s) as deleted.

I saw a mention somewhere that you can tell Solr to use IW.addDocument
(not IW.updateDocument) when you add a document, if you are certain
it's not replacing a previous document with the same ID -- I don't
know how to do that, but if it's true, and you are truly only adding
documents, that could be the easiest fix here.

Failing that... you could try increasing
IndexWriterConfig.setReaderTermsIndexDivisor (not sure if/how this is
exposed in Solr's config). This reduces the init time and RAM usage of
each SegmentReader, but makes term lookups slower; whether it helps
depends on whether your slowness is in opening the SegmentReader (how
long does IR.open take on your index?) or in resolving the deletes
once the SR is open.

Do you have a great many terms in your index? (If so, that might mean
you have an analysis problem, ie, putting too many terms into the
index.) Can you run CheckIndex and post the output?

> We should maybe try to fix this in 3.x too?

+1; having to wait for running merges to complete when the app calls
commit is crazy (Lucene long ago removed that limitation).

Mike McCandless

http://blog.mikemccandless.com
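
To make the addDocument-vs-updateDocument point and the terms index
divisor knob concrete, here is a rough sketch against the Lucene
3.x-era API. This is plain Lucene, not Solr's configuration; the index
path, field names, and divisor value are illustrative assumptions, not
taken from the message above:

// Illustrative sketch only (Lucene 3.x-era API); the index path,
// field names, and divisor value are made-up examples.
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class AddVsUpdateSketch {
  public static void main(String[] args) throws Exception {
    Directory dir = FSDirectory.open(new File("/path/to/index"));

    IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_34,
        new StandardAnalyzer(Version.LUCENE_34));
    // Larger divisor = less RAM and faster init for each
    // SegmentReader's terms index, at the cost of slower term lookups.
    iwc.setReaderTermsIndexDivisor(4);

    IndexWriter writer = new IndexWriter(dir, iwc);

    Document doc = new Document();
    doc.add(new Field("id", "doc-42", Field.Store.YES,
        Field.Index.NOT_ANALYZED));
    doc.add(new Field("body", "some text", Field.Store.NO,
        Field.Index.ANALYZED));

    boolean trulyNewDocument = true;  // caller knows the id is unique
    if (trulyNewDocument) {
      // No delete term is buffered, so commit has nothing to resolve.
      writer.addDocument(doc);
    } else {
      // Buffers a delete-by-term that commit must later resolve
      // against every segment in the index.
      writer.updateDocument(new Term("id", "doc-42"), doc);
    }

    writer.commit();
    writer.close();
  }
}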
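
And a small sketch, again with a placeholder index path, of timing
IR.open and running CheckIndex programmatically; the same check can
also be run from the command line via
java org.apache.lucene.index.CheckIndex, pointing it at the index
directory:

// Illustrative sketch: measure how long IndexReader.open takes and
// inspect the index with CheckIndex. The index path is a placeholder.
import java.io.File;

import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class DiagnoseIndex {
  public static void main(String[] args) throws Exception {
    Directory dir = FSDirectory.open(new File("/path/to/index"));

    // How long does opening a reader (and its SegmentReaders) take?
    long t0 = System.currentTimeMillis();
    IndexReader reader = IndexReader.open(dir);
    System.out.println("IR.open took "
        + (System.currentTimeMillis() - t0)
        + " ms; maxDoc=" + reader.maxDoc());
    reader.close();

    // CheckIndex prints per-segment doc/term statistics, which helps
    // spot an index with far too many terms.
    CheckIndex checker = new CheckIndex(dir);
    checker.setInfoStream(System.out);
    CheckIndex.Status status = checker.checkIndex();
    System.out.println("index is "
        + (status.clean ? "clean" : "corrupt"));
  }
}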