Hi Philippa,

The migrate command splits the Lucene index of the source collection
and merges it directly into the target collection. De-duplication, on
the other hand, is applied only to incoming updates. Migrate therefore
operates at a lower level than de-duplication, and the two cannot work
together. If you want de-duplication, your only option is to index the
documents into the target collection instead of using the migrate
command.
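
As a rough illustration of what "index instead of migrate" could look
like, here is a minimal sketch that reads documents from HotDocuments
with a cursor and re-sends them to ColdDocuments as ordinary updates.
The base URL, the uniqueKey name "id", the batch size and the helper
names are all assumptions of mine, not something from your setup. It
also assumes every field you need is stored (a field that is only
indexed cannot be read back), and that your de-duplication chain is
actually the one applied to updates on the target collection.

```python
import json
import urllib.parse
import urllib.request

# Assumed values for illustration only; adjust to your cluster.
SOLR = "http://localhost:8983/solr"
SOURCE = "HotDocuments"
TARGET = "ColdDocuments"


def select_url(base, collection, query, cursor, rows=500, sort="id asc"):
    """Build a /select URL that pages with cursorMark.

    Cursor paging needs a sort that includes the uniqueKey field,
    assumed here to be called "id".
    """
    params = {"q": query, "sort": sort, "rows": rows,
              "cursorMark": cursor, "wt": "json"}
    return f"{base}/{collection}/select?{urllib.parse.urlencode(params)}"


def prepare(doc):
    """Drop _version_ so the target collection assigns its own version."""
    return {k: v for k, v in doc.items() if k != "_version_"}


def copy_documents(query, base=SOLR):
    """Re-index every document matching `query` from SOURCE into TARGET."""
    cursor = "*"
    copied = 0
    while True:
        with urllib.request.urlopen(select_url(base, SOURCE, query, cursor)) as resp:
            body = json.load(resp)
        docs = [prepare(d) for d in body["response"]["docs"]]
        if docs:
            # Sent as a normal update, so it goes through the update path.
            req = urllib.request.Request(
                f"{base}/{TARGET}/update",
                data=json.dumps(docs).encode("utf-8"),
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req).close()
            copied += len(docs)
        next_cursor = body["nextCursorMark"]
        if next_cursor == cursor:  # cursor did not advance: no more results
            break
        cursor = next_cursor
    return copied
```

You would call something like copy_documents("<your nightly query>"),
then commit on ColdDocuments and delete the same documents from
HotDocuments yourself; the sketch does neither.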

On Wed, Dec 2, 2015 at 4:29 PM, philippa griggs
<philippa.gri...@hotmail.co.uk> wrote:
> Hello,
>
>
> I'm using Solr 5.2.1 and Zookeeper 3.4.6.
>
>
> I'm implementing two collections - HotDocuments and ColdDocuments . New 
> documents will only be written to HotDocuments and every night I will migrate 
> a chunk of documents into ColdDocuments.
>
>
> In the test environment, I have the Collection API migrate statement working 
> fine. I know this won't handle duplicates ending up in the ColdDocuments 
> collection and I don't expect to have duplicate documents but I would like to 
> protect against it- just in case.
>
>
> We have a unique key and I've tried to implement de-duplication 
> (https://cwiki.apache.org/confluence/display/solr/De-Duplication) but I still 
> end up with duplicates in the ColdDocuments collection.
>
>
>
> Does anyone have any suggestions on how I can protect against duplicates with 
> the migrate statement?  Any ideas would be greatly appreciated.
>
>
> Many thanks
>
> Philippa



-- 
Regards,
Shalin Shekhar Mangar.
