Re: How to install DuplicatesDetectorService

2010-09-15 Thread Erick Erickson
Have you looked at: http://wiki.apache.org/solr/Deduplication
Best
Erick
On Wed, Sep 15, 2010 at 4:58 AM, hellboy wrote:
> Is it possible to rewrite this code in Python:
>
> private static String getFuzzyHashing(MediaUnit unit) {

Re: How to install DuplicatesDetectorService

2010-09-15 Thread hellboy
Is it possible to rewrite this code in Python:
private static String getFuzzyHashing(MediaUnit unit) {
    TextProfileSignature tps = new TextProfileSignature();
    // initialise with empty parameters to force default values of TextProfileSignature attributes
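For anyone landing on this thread with the same question, here is a rough Python sketch of the idea behind TextProfileSignature: build a quantized word-frequency profile of the text and hash the profile with MD5. It is written from memory of the Solr/Nutch algorithm and is not guaranteed to produce the same digests as Solr. The function names `text_profile` and `fuzzy_hash` are made up for illustration, the defaults `quant_rate=0.01` and `min_token_len=2` are assumed to match Solr's, and the alphabetical tie-break when sorting is an addition to keep the output deterministic.

```python
import hashlib
from collections import Counter


def text_profile(text, quant_rate=0.01, min_token_len=2):
    """Return a list of (token, quantized_count) pairs for `text`.

    Hypothetical helper; parameter names and defaults are assumptions.
    """
    tokens = []
    current = []
    for ch in text:
        if ch.isalnum():
            current.append(ch.lower())
        else:
            # Assumption: only tokens longer than min_token_len are kept.
            if len(current) > min_token_len:
                tokens.append("".join(current))
            current = []
    if len(current) > min_token_len:
        tokens.append("".join(current))

    counts = Counter(tokens)
    if not counts:
        return []

    max_freq = max(counts.values())
    # Round half up; Python's built-in round() rounds half to even.
    quant = int(max_freq * quant_rate + 0.5)
    if quant < 2:
        quant = 2 if max_freq > 1 else 1

    profile = []
    for token, count in counts.items():
        # Round each count down to a multiple of quant, drop rare tokens.
        quantized = (count // quant) * quant
        if quantized >= quant:
            profile.append((token, quantized))

    # Most frequent first; alphabetical tie-break is an added assumption.
    profile.sort(key=lambda pair: (-pair[1], pair[0]))
    return profile


def fuzzy_hash(text, quant_rate=0.01, min_token_len=2):
    """Return an MD5 hex digest of the quantized profile of `text`."""
    profile = text_profile(text, quant_rate, min_token_len)
    canonical = "\n".join(f"{token} {count}" for token, count in profile)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()
```

Two texts that differ only in case, punctuation, or rarely used words should map to the same digest, which is what makes the signature useful for near-duplicate detection. If the digests have to match what Solr writes into its signature field, it is safer to let Solr compute them than to rely on a port like this.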

Re: How to install DuplicatesDetectorService

2010-09-15 Thread hellboy
OK. I need to find/prevent duplicates in a database using the Solr index. I use Django with Haystack integration, and I use TextProfileSignature to smartly detect duplicates in text fields.
solrconfig.xml wrote:
> class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">

Re: How to install DuplicatesDetectorService

2010-09-14 Thread Erick Erickson
Why do you want to? Perhaps there's a better solution for your underlying problem if you'd explain what it is...
Best
Erick
On Tue, Sep 14, 2010 at 8:05 AM, hellboy wrote:
> I found
>
> http://www.jarvana.com/jarvana/browse/org/ow2/weblab/service/solr-duplicates-detector/2.0/
>
> Is anybody