On 11/2/06, Mike Klaas <[EMAIL PROTECTED]> wrote:
The one thing I'm worried about is closing the writer while documents
are being added to it. IndexWriter is nominally thread-safe, but I'm
not sure what happens to documents that are being added at the time.
Looking at IndexWriter.java, it seems that if close() is called after
addDocument() has been entered but before it reaches the synchronized
block, the document could be lost or an exception raised.

This seems harder to address in "user code" while still maintaining parallelism.
Perhaps a Lucene patch would be more appropriate?

Perhaps IndexWriter should have a close flag, and addDocument() should
return a boolean indicating whether the document was added.  Then we
could move addDocument() outside the sync block, and wrap the whole
thing in a big do { ... } while (!addDocument()) retry loop.
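
To make the close-flag idea concrete, here is a rough sketch. It is not
Lucene's IndexWriter API: FlaggedWriter and WriterHolder are hypothetical
stand-ins, and documents are plain strings. The only point is that an add
which loses the race with close() gets false back and retries against
the replacement writer instead of dropping the document.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a writer with a close flag (not Lucene's IndexWriter).
class FlaggedWriter {
    private final List<String> docs = new ArrayList<>();
    private boolean closed = false;

    // Returns false, rather than losing the document, once the writer is closed.
    synchronized boolean addDocument(String doc) {
        if (closed) {
            return false;
        }
        docs.add(doc);
        return true;
    }

    synchronized void close() {
        closed = true;
    }

    synchronized int size() {
        return docs.size();
    }
}

// Hypothetical holder that swaps writers on commit; adds do not take the holder's lock.
class WriterHolder {
    private volatile FlaggedWriter writer = new FlaggedWriter();

    // Retry until the document lands in a writer that was still open.
    void add(String doc) {
        FlaggedWriter w;
        do {
            w = writer; // re-read each time: commit() may have replaced it
        } while (!w.addDocument(doc));
    }

    // Publish a fresh writer first, then close the old one, so a failed add
    // finds the replacement on its next attempt.
    synchronized FlaggedWriter commit() {
        FlaggedWriter old = writer;
        writer = new FlaggedWriter();
        old.close();
        return old;
    }
}
```

Adds to the same writer still serialize on its monitor here; the sketch
only removes the need to hold an outer lock across the add.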

There is still another case to consider: if a commit happens between
adding the id to the pset and adding the document to the index, and
the add succeeds, the id will no longer be in the pset so we will end
up with a duplicate after the next commit.
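
One possible way to close that window, sketched under assumptions: let
adds share a read lock and let commit take the write lock, so a commit
can never fall between the pset update and the index add, while adds
still run in parallel with each other. PendingIndex is a hypothetical
model, not the actual update handler: the "index" is just a list of ids,
and commit is assumed to drop older copies of every id in the pset.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model: ids in pset have older copies that commit must remove.
class PendingIndex {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Set<String> pset = Collections.synchronizedSet(new HashSet<String>());
    private final List<String> index = Collections.synchronizedList(new ArrayList<String>());

    // Many adds may hold the read lock at once; a commit cannot interleave
    // between the two steps below.
    void add(String id) {
        lock.readLock().lock();
        try {
            pset.add(id);
            index.add(id);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock excludes all adds, so pset and index are consistent here.
    void commit() {
        lock.writeLock().lock();
        try {
            for (String id : pset) {
                int last = index.lastIndexOf(id);
                // Walk downward so removals do not shift unvisited positions.
                for (int i = last - 1; i >= 0; i--) {
                    if (index.get(i).equals(id)) {
                        index.remove(i);
                    }
                }
            }
            pset.clear();
        } finally {
            lock.writeLock().unlock();
        }
    }

    List<String> snapshot() {
        synchronized (index) {
            return new ArrayList<String>(index);
        }
    }
}
```

The cost is that adds block for the duration of a commit, which may or
may not be acceptable.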

I was going to try to put in some basic autoCommit logic while I was
mucking about here.  One question: did you intend for maxCommitTime to
trigger deterministically (regardless of any events occurring or not)?

I hadn't thought through the whole thing, but it seems like it should
only trigger if it would make a difference.

I had in mind checking these constraints only when documents are
added, but this could result in maxCommitTime elapsing without a
commit.

If there is nothing to commit, that should be fine.
I think the type of guarantee we should make is that if you add a
document, it will be committed within a certain period of time
(leaving out variances for autowarming time, etc).
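
A minimal sketch of that guarantee, with made-up names: AutoCommitTracker
and its methods are hypothetical, and only maxCommitTime comes from this
thread. The deadline is measured from the oldest uncommitted add, so
nothing is due when there is nothing to commit. Something other than the
add path (a timer thread, say) would have to poll commitDue(), since
checking only on adds can let the deadline pass unnoticed.

```java
// Hypothetical tracker; times are passed in so the logic is independent of any clock.
class AutoCommitTracker {
    private final long maxCommitTimeMs;
    private long oldestPendingMs = -1; // -1 means no uncommitted documents

    AutoCommitTracker(long maxCommitTimeMs) {
        this.maxCommitTimeMs = maxCommitTimeMs;
    }

    // Only the first add after a commit starts the clock.
    synchronized void onAdd(long nowMs) {
        if (oldestPendingMs < 0) {
            oldestPendingMs = nowMs;
        }
    }

    // Due only if something is pending and it has waited at least maxCommitTimeMs.
    synchronized boolean commitDue(long nowMs) {
        return oldestPendingMs >= 0 && nowMs - oldestPendingMs >= maxCommitTimeMs;
    }

    // Call after a successful commit; nothing is pending any more.
    synchronized void onCommit() {
        oldestPendingMs = -1;
    }
}
```

With maxCommitTimeMs = 1000 and adds at t=100 and t=900, a commit is due
from t=1100 onward: the second add does not push the deadline back.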

-Yonik
