On 12/4/2012 5:33 PM, Shawn Heisey wrote:
I am doing a DIH full import on a very recent checkout from branch_4x. Something I've recently done differently is enabling autocommit. I am seeing that there are deleted documents in some of the indexes. See "Development Build Indexes" at the bottom of the following screenshot. When the import is complete, the numbered shards will contain 13 million documents.

http://dl.dropbox.com/u/97770508/statuspage-deletes-import.png

The MySQL database that this imports from has a unique index on the field that Solr is using for its UniqueKey, so it's not possible to have duplicates. Each import uses one SELECT statement for the entire 13 million document import. What might be leading to these deleted docs?

Interesting development: The imports have now passed 11 million documents, and the number of deleted documents on all shards is now zero.

I calculate deleted documents on my stats page by subtracting numDocs from maxDoc, information gathered from /admin/mbeans?stats=true.
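
For anyone wanting to reproduce the number, here is a minimal sketch of that subtraction in Python. The function name and the sample values are made up for illustration; the dictionary simply stands in for the numDocs and maxDoc values reported for a core's searcher, and how you pull them out of the mbeans response is left to you.

```python
def deleted_docs(searcher_stats):
    """Return the deleted-document count for one core: maxDoc - numDocs."""
    # int() tolerates values that arrive as strings from a parsed response.
    num_docs = int(searcher_stats["numDocs"])
    max_doc = int(searcher_stats["maxDoc"])
    return max_doc - num_docs


# Hypothetical numbers, not taken from the real indexes.
sample = {"numDocs": 1000, "maxDoc": 1025}
print(deleted_docs(sample))
```

A result of zero means maxDoc and numDocs agree, so the index currently holds no documents marked as deleted.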

Thanks,
Shawn
