You could try using a hash of the content?

On Sep 4, 2015 9:00 AM, "Zheng Lin Edwin Yeo" <edwinye...@gmail.com> wrote:
> Hi,
>
> I'm trying out De-Duplication. I've tried to create a new signature
> field in schema.xml:
>
> <field name="signature" type="string" stored="true" indexed="true"
>        multiValued="false" />
>
> I've also added the following in solrconfig.xml:
>
> <updateRequestProcessorChain name="dedupe">
>   <processor class="solr.processor.SignatureUpdateProcessorFactory">
>     <bool name="enabled">true</bool>
>     <str name="signatureField">signature</str>
>     <bool name="overwriteDupes">false</bool>
>     <str name="fields">content</str>
>     <str name="signatureClass">solr.processor.Lookup3Signature</str>
>   </processor>
>   <processor class="solr.DistributedUpdateProcessorFactory" />
>   <processor class="solr.LogUpdateProcessorFactory" />
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
>
> However, I can't do a copyField of content into this signature field, as
> some of my contents are more than 32766 characters in length. Previously, I
> tried to point the signatureField directly to content, but that is not
> working either.
>
> Is there anything else I can do to group on a new signatureField?
>
> Regards,
> Edwin
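
To make the hash suggestion a bit more concrete, here is a minimal sketch of computing the hash on the client before the document is sent to Solr. It assumes you are free to populate the signature field yourself at index time; the function name `content_signature`, the sample document, and the choice of MD5 are illustrative assumptions, not anything taken from your setup.

```python
import hashlib


def content_signature(content: str) -> str:
    """Return a fixed-length hex digest of the content.

    Hypothetical helper: MD5 is used only as a compact fingerprint for
    grouping, not for security. The digest is always 32 hex characters,
    no matter how long the content is.
    """
    return hashlib.md5(content.encode("utf-8")).hexdigest()


# Illustrative document whose content exceeds 32766 characters.
doc = {"id": "doc1", "content": "x" * 40000}

# Store the short hash in the signature field instead of copying the
# full content into it.
doc["signature"] = content_signature(doc["content"])
```

Identical content always yields the same 32-character value, so the signature field stays far below the length limit and can still be used for grouping.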