How to install DuplicatesDetectorService
I found http://www.jarvana.com/jarvana/browse/org/ow2/weblab/service/solr-duplicates-detector/2.0/

Does anybody know how to install and use this library on an existing Solr instance?

-- View this message in context: http://lucene.472066.n3.nabble.com/How-to-install-DuplicatesDetectorService-tp1472561p1472561.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to install DuplicatesDetectorService
OK. I need to find and prevent duplicates in a database using the Solr index. I use Django with Haystack integration, and I use TextProfileSignature to detect near-duplicates in text fields.

solrconfig.xml wrote:
> <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
>   <bool name="enabled">true</bool>
>   <str name="signatureField">sig</str>
>   <bool name="overwriteDupes">false</bool>
>   <str name="fields">title,description</str>
>   <str name="signatureClass">org.apache.solr.update.processor.TextProfileSignature</str>
> </processor>

But there are also some other fields. How can I calculate the TextProfileSignature value for custom title and description values on the Django side WITHOUT adding them to the Solr index? I only need to detect "possible duplicates" for the title and description entered by a user, i.e. select all records from Solr with user_sig=TextProfileSignature(user_title,user_description). Is there a web service interface in Solr to do this?
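The lookup half of the question (selecting records by signature once a signature value is known) could be sketched roughly as below. This is only an illustration under assumptions: the select URL `http://localhost:8983/solr/select` and the helper names `build_duplicate_query` and `find_possible_duplicates` are hypothetical, the field name `sig` is taken from the config above and must be indexed to be searchable, and the sketch does not address whether Solr itself can compute the signature without indexing a document.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a local Solr instance with the standard select handler.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_duplicate_query(signature, field="sig", rows=10):
    """Build a select URL matching documents whose signature field equals `signature`."""
    params = {
        "q": f'{field}:"{signature}"',  # exact match on the signature field
        "wt": "json",                   # ask Solr for a JSON response
        "rows": rows,
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


def find_possible_duplicates(signature):
    """Query Solr over HTTP and return the matching documents (needs a running Solr)."""
    with urlopen(build_duplicate_query(signature)) as response:
        return json.load(response)["response"]["docs"]
```

On the Django side, `find_possible_duplicates(user_sig)` would then return the candidate records, provided `user_sig` was computed the same way as the stored `sig` values.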
Re: How to install DuplicatesDetectorService
Is it possible to rewrite this code in Python?

private static String getFuzzyHashing(MediaUnit unit) {
    TextProfileSignature tps = new TextProfileSignature();
    // initialise with empty parameters to force default values
    // of TextProfileSignature attributes
    tps.init(SolrParams.toSolrParams(new NamedList()));
    // The following lines are copied from the Solr class
    // SignatureUpdateProcessorFactory
    tps.add("text");
    tps.add(SolrComponent.extractTextFromResource(unit));
    byte[] signature = tps.getSignature();
    // Hex-encode the signature bytes, two characters per byte
    char[] arr = new char[signature.length << 1];
    for (int i = 0; i < signature.length; i++) {
        int b = signature[i];
        int idx = i << 1;
        arr[idx] = StrUtils.HEX_DIGITS[(b >> 4) & 0xf];
        arr[idx + 1] = StrUtils.HEX_DIGITS[b & 0xf];
    }
    return new String(arr);
}
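A rough Python sketch of the same idea might look like the following. It is not a verified port: the function names `text_profile` and `fuzzy_hash` are hypothetical, the defaults `quant_rate=0.01` and `min_token_len=2` are assumptions that should be checked against the TextProfileSignature source, and ties between equally frequent tokens are ordered alphabetically here, which may differ from the order the Java class produces. The resulting hashes may therefore not match the `sig` values Solr stores. The hex loop in the Java code corresponds to `hexdigest()`, assuming `StrUtils.HEX_DIGITS` holds lowercase hex digits.

```python
import hashlib
from collections import Counter


def text_profile(content, quant_rate=0.01, min_token_len=2):
    """Build a quantized token-frequency profile of `content` (sketch)."""
    # Lowercase the text and split it on non-alphanumeric characters.
    tokens, current = [], []
    for ch in content:
        if ch.isalnum():
            current.append(ch.lower())
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))

    # Count only tokens strictly longer than min_token_len.
    counts = Counter(t for t in tokens if len(t) > min_token_len)
    if not counts:
        return ""

    # Derive the quantization step from the highest frequency.
    max_freq = max(counts.values())
    quant = int(max_freq * quant_rate + 0.5)
    if quant < 2:
        quant = 2 if max_freq > 1 else 1

    # Round counts down to a multiple of quant; drop tokens below quant.
    profile = []
    for token, cnt in counts.items():
        cnt = (cnt // quant) * quant
        if cnt >= quant:
            profile.append((token, cnt))

    # Descending count; alphabetical tie-break is an assumption of this sketch.
    profile.sort(key=lambda item: (-item[1], item[0]))
    return "\n".join(f"{token} {cnt}" for token, cnt in profile)


def fuzzy_hash(*texts):
    """MD5 over the profiles of all `texts`, returned as lowercase hex."""
    md5 = hashlib.md5()
    for text in texts:
        md5.update(text_profile(text).encode("utf-8"))
    return md5.hexdigest()
```

Calling `fuzzy_hash("text", user_text)` mirrors the two `tps.add(...)` calls in the Java code. If byte-identical signatures are required, computing both the stored and the user-side values with the same code is the safer route.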