I used two fields to set up the signature: the unique ID and a timestamp field.

As it's in test, I set it up, cleared all the data out of both collections and
reloaded it. I could see the signature that was created. I then migrated into
the cold collection, which already had documents with the same unique ID and
signature.
I ended up with duplicates in the cold collection.
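
To make the duplicate check concrete, here is a minimal sketch of how one might compare signatures across the two collections offline. It is an illustration only, not Solr's own signature implementation: the field names `id` and `timestamp`, the sample documents, and the helper functions are all hypothetical, and MD5 is used simply as a stand-in hash.

```python
import hashlib
from collections import Counter

def signature(doc, fields=("id", "timestamp")):
    """Hash the chosen field values into a hex digest (illustrative only)."""
    # Join with a separator that is unlikely to occur in the values themselves.
    joined = "\x1f".join(str(doc[f]) for f in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

def duplicate_signatures(docs):
    """Return the set of signatures that occur more than once in docs."""
    counts = Counter(signature(d) for d in docs)
    return {sig for sig, n in counts.items() if n > 1}

# Hypothetical sample data: doc-1 exists in both collections.
hot = [{"id": "doc-1", "timestamp": "2015-12-01T10:00:00Z"}]
cold = [
    {"id": "doc-1", "timestamp": "2015-12-01T10:00:00Z"},
    {"id": "doc-2", "timestamp": "2015-12-01T11:00:00Z"},
]

# Signatures that would collide if the hot documents were moved into cold.
dupes = duplicate_signatures(hot + cold)
print(len(dupes), "duplicate signature(s) found")
```

Running a check like this over exported documents before a migration would show which documents share both the unique ID and the timestamp, which is the situation described above.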

Thanks for your help,

Philippa

________________________________________
From: Zheng Lin Edwin Yeo <edwinye...@gmail.com>
Sent: 03 December 2015 02:30:31
To: solr-user@lucene.apache.org
Subject: Re: Protect against duplicates with the Migrate statement

Hi Philippa,

Which field did you set as the SignatureField in your ColdDocuments
collection when you implemented the de-duplication?

Regards,
Edwin


On 2 December 2015 at 18:59, philippa griggs <philippa.gri...@hotmail.co.uk>
wrote:

> Hello,
>
>
> I'm using Solr 5.2.1 and Zookeeper 3.4.6.
>
>
> I'm implementing two collections, HotDocuments and ColdDocuments. New
> documents will only be written to HotDocuments and every night I will
> migrate a chunk of documents into ColdDocuments.
>
>
> In the test environment, I have the Collection API migrate statement
> working fine. I know this won't handle duplicates ending up in the
> ColdDocuments collection and I don't expect to have duplicate documents, but
> I would like to protect against it, just in case.
>
>
> We have a unique key and I've tried to implement de-duplication (
> https://cwiki.apache.org/confluence/display/solr/De-Duplication) but I
> still end up with duplicates in the ColdDocuments collection.
>
>
>
> Does anyone have any suggestions on how I can protect against duplicates
> with the migrate statement?  Any ideas would be greatly appreciated.
>
>
> Many thanks
>
> Philippa
>
