Neeb wrote:
> 
> Just wondering if you ever managed to run TextProfileSignature based
> deduplication. I would appreciate it if you could send me the code
> fragment for it from solrconfig.
> 

Actually, the project that was for got postponed, and I got distracted
by other things, for now at least.

Re. your config: I don't see a minTokenLength on the wiki page for
deduplication. Is this a recent addition that's not documented yet?

It looks okay to me though -- perhaps you could run some empirical tests
to see if it's working? For example, add some near-dupes to a collection
manually and see whether it finds them.
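
One way to pick test documents for that kind of check is to work out
beforehand which texts a fuzzy, profile-style signature ought to treat as
the same. Below is a rough Python sketch of the general idea (quantised
token frequencies, then an MD5 hash). It is not Solr's code, and the real
TextProfileSignature may differ in its details; the function name
profile_signature and the parameters min_token_len and quant_rate are
made up for illustration.

```python
import hashlib
import re
from collections import Counter


def profile_signature(text, min_token_len=2, quant_rate=0.01):
    """Hypothetical, simplified imitation of a text-profile signature."""
    # Lower-case the text and keep runs of letters/digits as tokens,
    # dropping tokens shorter than min_token_len.
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower())
              if len(t) >= min_token_len]
    counts = Counter(tokens)
    if not counts:
        return hashlib.md5(b"").hexdigest()

    # Quantisation step: frequencies are rounded down to a multiple of
    # quant, so small differences in token counts are ignored.
    max_freq = max(counts.values())
    quant = round(max_freq * quant_rate)
    if quant < 2:
        quant = 2 if max_freq > 1 else 1

    profile = []
    for token, freq in counts.items():
        freq = (freq // quant) * quant
        if freq >= quant:  # rare tokens fall out of the profile
            profile.append((token, freq))

    # Sort by frequency (descending), then token, for a stable order.
    profile.sort(key=lambda p: (-p[1], p[0]))
    joined = "\n".join(f"{tok} {freq}" for tok, freq in profile)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    original = "solr solr solr lucene lucene search search index"
    near_dupe = "solr solr lucene lucene search search index extra"
    different = "tomcat tomcat jetty jetty servlet"
    print(profile_signature(original) == profile_signature(near_dupe))
    print(profile_signature(original) == profile_signature(different))
```

If the sketch gives two texts the same signature, they are reasonable
candidates to index as a near-dupe pair; whether Solr then collapses
them depends on its own implementation and on your config.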

Andrew.

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Filtering-near-duplicates-using-TextProfileSignature-tp479039p880379.html
Sent from the Solr - User mailing list archive at Nabble.com.
