> I am doing dedup for my solr instance which works on the content and
> the url fields. My question is if I want to eliminate the records
> which are 80% matching or 90% matching in the content field then how
> I should proceed for that?
> Already I have changed my solrconfig.xml and have changed the part of
> file which is required for the dedup (update Request Processor chain)
> and that part is working fine.
You can use TextProfileSignature, which is a fuzzy hashing implementation, as the signatureClass instead of Lookup3Signature.
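
As a rough sketch, the dedup chain in solrconfig.xml might look like the fragment below once the signature class is swapped. The chain name "dedupe" and the field name "signature" are assumptions for illustration, so keep whatever names your working chain already uses. Note that, as far as I know, TextProfileSignature has no direct "80% match" or "90% match" setting: it hashes a quantized profile of term frequencies, and near-duplicates collide only when their profiles come out identical. The two tuning knobs are quantRate and minTokenLen, shown here with what I believe are their defaults; a higher quantRate makes the profile coarser, so more documents are treated as duplicates.

```xml
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <!-- assumed field name; it must exist in your schema -->
    <str name="signatureField">signature</str>
    <bool name="overwriteDupes">true</bool>
    <!-- fuzzy matching only makes sense on the long text field -->
    <str name="fields">content</str>
    <str name="signatureClass">solr.processor.TextProfileSignature</str>
    <!-- tuning knobs; values shown are assumed to be the defaults -->
    <str name="quantRate">0.01</str>
    <str name="minTokenLen">2</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

Since the url field is an exact-match case, it is left out of the fuzzy signature here. The fuzzy signature tends to work best on longer text, so it is worth reindexing a sample and checking which documents actually collapse before settling on a quantRate.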