Neeb wrote:
>
> Just wondering if you ever managed to run TextProfileSignature based
> deduplication. I would appreciate it if you could send me the code
> fragment for it from solrconfig.
>

Actually the project that was for got postponed and I got distracted by other things, for now at least.

Re. your config, I don't see a minTokenLength in the wiki page for deduplication, is this a recent addition that's not documented yet? It looks okay to me though -- perhaps you could do some empirical tests to see if it's working? i.e. add some near-dupes to a collection manually and see if it finds them?

Andrew.

--
View this message in context: http://lucene.472066.n3.nabble.com/Filtering-near-duplicates-using-TextProfileSignature-tp479039p880379.html
Sent from the Solr - User mailing list archive at Nabble.com.
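
For the empirical test suggested above, it can help to know in advance which hand-made near-dupes ought to collide. The following is a rough standalone sketch of a text-profile style signature, loosely modelled on the general idea (token frequencies quantized, rare tokens dropped, the remainder hashed). It is not Solr's code: the function name, the parameter names quant_rate and min_token_len, the tokenizer and the tie-breaking order are all assumptions made for illustration, so real signatures from Solr will not match these values.

```python
import hashlib
import math
import re
from collections import Counter


def text_profile_signature(text, quant_rate=0.01, min_token_len=2):
    """Hypothetical, simplified text-profile signature (not Solr's implementation)."""
    # Lowercase, split into alphanumeric tokens, and drop very short tokens.
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower())
              if len(t) >= min_token_len]
    counts = Counter(tokens)
    if not counts:
        return hashlib.md5(b"").hexdigest()

    # Quantization step derived from the most frequent token (rounded half up).
    max_freq = max(counts.values())
    quant = math.floor(max_freq * quant_rate + 0.5)
    if quant < 2:
        quant = 2 if max_freq > 1 else 1

    # Round each frequency down to a multiple of quant; drop tokens below quant.
    profile = []
    for token, freq in counts.items():
        quantized = (freq // quant) * quant
        if quantized >= quant:
            profile.append((token, quantized))

    # Most frequent first; ties broken alphabetically so the result is deterministic.
    profile.sort(key=lambda item: (-item[1], item[0]))
    joined = "\n".join(f"{token} {freq}" for token, freq in profile)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


doc_a = "solr solr solr dedupe dedupe signature"
doc_b = "solr solr solr dedupe dedupe profile"   # differs only in a rare word
doc_c = "lucene lucene index index index"        # unrelated content

print(text_profile_signature(doc_a) == text_profile_signature(doc_b))
print(text_profile_signature(doc_a) == text_profile_signature(doc_c))
```

In this sketch doc_a and doc_b get the same signature because the words that differ occur only once and are dropped by the quantization, while doc_c gets a different one. Note that if every token in a document occurs exactly once, the step becomes 1 and nothing is dropped, so very short test documents with no repeated words would not be expected to collide under this scheme.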