On Fri, Oct 28, 2011 at 3:27 PM, Simon Willnauer
<simon.willna...@googlemail.com> wrote:

> one more thing, after somebody (thanks robert) pointed me at the
> stacktrace it seems kind of obvious what the root cause of your
> problem is. It's Solr :) Solr closes the IndexWriter on commit, which is
> very wasteful since you basically wait until all merges are done. Solr
> trunk has solved this problem.

That is very wasteful but I don't think it's actually the cause of the
slowdown here?

The cause looks like it's in applying deletes, which will still occur
even once Solr stops closing the IW (ie, IW.commit must also resolve
all buffered deletes).

When IW resolves deletes, it 1) opens a SegmentReader for each segment
in the index, and 2) looks up each deleted term and marks its
document(s) as deleted.
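
Conceptually it's something like this (just a sketch against the
public 3.x IndexReader API; IW's real buffered-deletes code is
internal and works segment by segment, and the method and field names
here are made up):

  // roughly what resolving one buffered delete term amounts to
  void resolveDelete(Directory dir, Term deleteTerm) throws IOException {
    IndexReader reader = IndexReader.open(dir, false); // writable reader
    TermDocs td = reader.termDocs(deleteTerm);         // seek to the term
    while (td.next()) {
      reader.deleteDocument(td.doc());                 // mark each matching doc deleted
    }
    td.close();
    reader.close();
  }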

I saw a mention somewhere that you can tell Solr to use
IW.addDocument (not IW.updateDocument) when you add a document if you
are certain it's not replacing a previous document with the same ID;
I don't know how to do that, but if that's true, and you really are
only adding documents, that could be the easiest fix here.
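
At the Lucene level the difference is just which IndexWriter method
gets called; roughly (writer is your IndexWriter, doc the Document
you're adding, and the "id" field is made up; I'm not sure where Solr
exposes this choice in its config):

  // pure add: no delete term is buffered, nothing for commit to resolve
  writer.addDocument(doc);

  // add-or-replace: buffers a delete-by-term that commit must resolve
  writer.updateDocument(new Term("id", "42"), doc);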

Failing that... you could try increasing
IndexWriterConfig.setReaderTermsIndexDivisor (not sure if/how this is
exposed in Solr's config)... this lowers init time and RAM usage for
each SegmentReader, but makes term lookups slower; whether it helps
depends on whether your slowness is in opening the SegmentReader (how
long does IR.open take on your index?) or in resolving the deletes
once the SR is open.
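
If you can get at the IndexWriterConfig it's just this (the divisor
value and Version here are only examples, match them to your setup):

  IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_34, analyzer);
  iwc.setReaderTermsIndexDivisor(4);  // load 1/4 of the terms index: less RAM/init, slower seeks
  IndexWriter writer = new IndexWriter(dir, iwc);

And to time a plain reader open on your index:

  long t0 = System.currentTimeMillis();
  IndexReader r = IndexReader.open(dir);
  System.out.println("IR.open took " + (System.currentTimeMillis() - t0) + " ms");
  r.close();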

Do you have a great many terms in your index?  Can you run CheckIndex
and post the output?  (If you do have a huge number of terms, it might
mean you have an analysis problem, ie, you are putting too many terms
into the index.)
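
You can run CheckIndex from the command line or from Java; roughly
(the path is made up):

  CheckIndex ci = new CheckIndex(FSDirectory.open(new File("/path/to/index")));
  ci.setInfoStream(System.out);               // print per-segment details, incl. term counts
  CheckIndex.Status status = ci.checkIndex();
  System.out.println("index is clean? " + status.clean);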

> We should maybe try to fix this in 3.x too?

+1; having to wait for running merges to complete when the app calls
commit is crazy (Lucene long ago removed that limitation).

Mike McCandless

http://blog.mikemccandless.com
