On 3/19/2015 2:09 AM, Derek Poh wrote:
> Am I right to say we need to do the combine of duplicate records into 1
> before feeding it to Solr to index?
>
> I am coming from Endeca which support the combine of duplicate records
> into 1 record during indexing. Was wondering if Solr support this.
If you index multiple documents with the same uniqueKey field value, Solr will delete the previous document and index the new one. The data in the previous document is discarded; none of it carries over into the new document.

You could in theory write a custom UpdateRequestProcessor that looks up the previous document and merges it with the incoming one in whatever way you desire, so that the combined information is what gets indexed, and then configure Solr to use that update processor. This capability is not available out of the box, though.

An update processor that does this should probably be included with Solr, but it would either need to be highly configurable, or everyone would need to agree on exactly what rules should be followed when combining duplicate records.

Thanks,
Shawn
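
Until such a processor exists, the merge has to happen on the client side before the records are sent to Solr. The following is only a rough sketch of one possible approach in Python, not anything that ships with Solr. The function name `merge_duplicates`, the `id` key field, and the sample records are all made up for illustration, and the merge rule (keep every distinct value per field, in first-seen order) is just one of many rules you might choose.

```python
from collections import OrderedDict


def merge_duplicates(records, key_field="id"):
    """Combine records sharing the same key_field value into one record.

    Assumed merge rule (hypothetical, pick your own): every distinct value
    seen for a field is kept, in first-seen order. A field that ends up
    with exactly one value is emitted as a scalar, otherwise as a list.
    """
    merged = OrderedDict()  # key value -> {field: [distinct values]}
    for record in records:
        key = record[key_field]
        target = merged.setdefault(key, {key_field: key})
        for field, value in record.items():
            if field == key_field:
                continue
            # Treat scalars and lists uniformly while accumulating.
            values = value if isinstance(value, list) else [value]
            existing = target.setdefault(field, [])
            for v in values:
                if v not in existing:
                    existing.append(v)

    # Collapse single-element lists back to scalars for the final documents.
    result = []
    for doc in merged.values():
        result.append({
            field: (v[0] if isinstance(v, list) and len(v) == 1 else v)
            for field, v in doc.items()
        })
    return result


# Made-up sample data: two records share the key "p1".
records = [
    {"id": "p1", "name": "Widget", "color": "red"},
    {"id": "p1", "name": "Widget", "color": "blue"},
    {"id": "p2", "name": "Gadget"},
]
docs = merge_duplicates(records)
```

With the sample data above, the two "p1" records become a single document whose `color` field holds both values, so that field would need to be multiValued in the schema. The resulting `docs` list is what you would hand to whatever client you use to post documents to Solr.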