On 3/20/2015 4:03 AM, Toke Eskildsen wrote:
> On Thu, 2015-03-19 at 15:44 +0100, Shawn Heisey wrote:
>> You could in theory write a custom UpdateRequestProcessor that looks for
>> the previous document and merges it in whatever way you desire, so the
>> combined information is what will be indexed,
On Thu, 2015-03-19 at 15:44 +0100, Shawn Heisey wrote:
> You could in theory write a custom UpdateRequestProcessor that looks for
> the previous document and merges it in whatever way you desire, so the
> combined information is what will be indexed, and configure Solr to use
> that update processo
Oh that is how Solr works...
On 3/19/2015 10:44 PM, Shawn Heisey wrote:
On 3/19/2015 2:09 AM, Derek Poh wrote:
Am I right to say we need to combine duplicate records into 1
before feeding them to Solr to index?
I am coming from Endeca, which supports combining duplicate records
into 1
bq: Am I right to say we need to combine duplicate records
into 1 before feeding them to Solr to index?
That's what I'd do. As Shawn says, if you simply fire them both at
Solr the more recent one will replace the older one.
Best,
Erick
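
To make the replace-on-same-key behaviour described above concrete, here is a toy model in Python. It is not Solr code: a plain dict stands in for the index, and it assumes the product ID is configured as the uniqueKey field. The field names `id` and `business_type` are made up for illustration.

```python
# Toy model of uniqueKey overwrite; a dict stands in for the Solr index.
# Assumption: the product ID is the uniqueKey field.
rows = [
    ("12345", "Exporter"),
    ("12345", "Agent"),
    ("12366", "Manufacturer"),
    ("12377", "Exporter"),
    ("12377", "Distributor"),
]

index = {}
for product_id, business_type in rows:
    # A later document with the same key replaces the earlier one.
    index[product_id] = {"id": product_id, "business_type": business_type}

print(index["12345"]["business_type"])  # prints: Agent
```

Only the last row seen for each product ID survives, which is why the records have to be combined before they are sent.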
On Thu, Mar 19, 2015 at 7:44 AM, Shawn Heisey wrote:
On 3/19/2015 2:09 AM, Derek Poh wrote:
> Am I right to say we need to combine duplicate records into 1
> before feeding them to Solr to index?
>
> I am coming from Endeca, which supports combining duplicate records
> into 1 record during indexing. Was wondering if Solr supports this.
If you
Hi Erick
Am I right to say we need to combine duplicate records into 1
before feeding them to Solr to index?
I am coming from Endeca, which supports combining duplicate records
into 1 record during indexing. Was wondering if Solr supports this.
-Derek
On 3/18/2015 11:21 PM, Erick Eri
I'd use SolrJ, pull the docs by productId order and combine records
with the same product ID into a single doc.
Here's a starter set for indexing from a DB with SolrJ. It has Tika
processing in it as well, but you can pull that out pretty easily.
https://lucidworks.com/blog/indexing-with-solrj/
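
A minimal sketch of that combine step, written in plain Python rather than SolrJ so that it runs on its own. The helper `combine_by_product_id` and the field names `id` and `business_type` are hypothetical, and the rows are assumed to arrive already sorted by product ID (for example via an ORDER BY on the DB query). The sample rows are the ones from the original question.

```python
from itertools import groupby
from operator import itemgetter


def combine_by_product_id(rows):
    """Yield one doc per product ID from rows sorted by product ID.

    rows: iterable of (product_id, business_type) tuples.
    """
    for product_id, group in groupby(rows, key=itemgetter(0)):
        # Collect distinct values, keeping first-seen order.
        types = list(dict.fromkeys(business_type for _, business_type in group))
        yield {"id": product_id, "business_type": types}


# Sample rows, already ordered by product ID (assumption).
rows = [
    ("12345", "Exporter"),
    ("12345", "Agent"),
    ("12366", "Manufacturer"),
    ("12377", "Exporter"),
    ("12377", "Distributor"),
]

docs = list(combine_by_product_id(rows))
for doc in docs:
    print(doc)
```

With SolrJ, each resulting dict would correspond to one SolrInputDocument, and the field holding the list of values would need to be declared multiValued in the schema.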
Hi
If I have duplicate records in my source data (DB or delimited files).
For simplicity's sake they are of the following nature
Product Id   Business Type
----------   -------------
12345        Exporter
12345        Agent
12366        Manufacturer
12377        Exporter
12377        Distributor