Re: Synchronize large number of records with Solr

climbingrose Fri, 14 Sep 2007 06:35:48 -0700

Hi Erik,

>>So in your case #1, documents are reindexed with this scheme - so if you
>>truly need to skip a reindexing for some reason (why, though?) you'll
>>need to come up with some other mechanism.  [perhaps update could be
>>enhanced to allow ignoring a duplicate id rather than reindexing?]


It's pretty easy to ignore duplicate ids during indexing, but that won't
solve my problem. I think the batch number works well in your case because
you reindex existing documents, which then get the updated batch number. In
my case, I can't update existing documents, so even if I use this approach
there is no way to know whether a document should be deleted. I think I
will need to store all ids in the batch in a DocSet and then compare it with
the list of all ids after indexing. That way I can at least get rid of all
expired documents. It's just not as elegant as using a batch identifier.
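
The comparison itself is roughly a set difference. Here is a minimal sketch, assuming both id lists fit in memory; it uses plain Python sets rather than a Solr DocSet, and `find_expired_ids` and the sample ids are hypothetical. Fetching the ids from the index and issuing the deletes are left out.

```python
def find_expired_ids(batch_ids, indexed_ids):
    """Return ids that are in the index but were not part of the current batch."""
    return set(indexed_ids) - set(batch_ids)


# Hypothetical sample data: ids seen in this batch vs. all ids in the index
# after indexing has finished.
batch_ids = ["doc1", "doc2", "doc4"]
indexed_ids = ["doc1", "doc2", "doc3", "doc4", "doc5"]

expired = find_expired_ids(batch_ids, indexed_ids)
print(sorted(expired))  # ['doc3', 'doc5']
```

Anything in the result would then be deleted by id. The cost is holding both id sets in memory at once, which is the part that feels less elegant than a batch identifier.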
