Re: Question on index time de-duplication

2015-11-01 Thread shamik

That's what I observed as well. Perhaps there's a way to customize SignatureUpdateProcessorFactory to support my use case. I'll look into the source code and figure out whether there's a way to do it. -- View this message in context: http://lucene.472066.n3.nabble.com/Quest

Re: Question on index time de-duplication

2015-10-31 Thread Zheng Lin Edwin Yeo


Re: Question on index time de-duplication

2015-10-30 Thread shamik

sible using SignatureUpdateProcessorFactory. -- View this message in context: http://lucene.472066.n3.nabble.com/Question-on-index-time-de-duplication-tp4237306p4237409.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: Question on index time de-duplication

2015-10-30 Thread shamik

rocessorFactory? -- View this message in context: http://lucene.472066.n3.nabble.com/Question-on-index-time-de-duplication-tp4237306p4237403.html

Re: Question on index time de-duplication

2015-10-30 Thread shamik

alent, which is a requirement for me. -- View this message in context: http://lucene.472066.n3.nabble.com/Question-on-index-time-de-duplication-tp4237306p4237401.html

RE: Question on index time de-duplication

2015-10-30 Thread Markus Jelsma

solr-user@lucene.apache.org
> Subject: Re: Question on index time de-duplication
>
> At the top of the De-Duplication wiki page is a note about collapsing results. Once you have the signature (identical for each of the duplicates) you'll want to collapse your results, keeping the

Re: Question on index time de-duplication

2015-10-30 Thread Scott Stults

At the top of the De-Duplication wiki page is a note about collapsing results. Once you have the signature (identical for each of the duplicates) you'll want to collapse your results, keeping the one with max date. https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results k/r,
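As a rough illustration of the collapsing suggestion above, here is a minimal Python sketch. It builds request parameters that use Solr's collapse query parser in a filter query, and also mimics the "keep the one with max date" behaviour on a small in-memory list. The field names `signature` and `release_year`, the sample documents, and both helper functions are assumptions made up for this example; they do not come from the thread. The sketch assumes the "date" is stored in a numeric field.

```python
from urllib.parse import urlencode


def collapse_params(query, signature_field="signature", date_field="release_year"):
    """Build request parameters that collapse on the signature field,
    keeping the document with the largest value of date_field per group.
    Field names are hypothetical placeholders."""
    return {
        "q": query,
        "fq": "{!collapse field=%s max=%s}" % (signature_field, date_field),
    }


def collapse_locally(docs, signature_field="signature", date_field="release_year"):
    """Client-side imitation of the same idea: one document per signature,
    choosing the one with the highest date_field value."""
    best = {}
    for doc in docs:
        key = doc[signature_field]
        if key not in best or doc[date_field] > best[key][date_field]:
            best[key] = doc
    return sorted(best.values(), key=lambda d: d["id"])


# Hypothetical documents: two share a signature, one is unique.
docs = [
    {"id": "a-2014", "signature": "sig-1", "release_year": 2014},
    {"id": "a-2015", "signature": "sig-1", "release_year": 2015},
    {"id": "b-2015", "signature": "sig-2", "release_year": 2015},
]

params = collapse_params("install guide")
query_string = urlencode(params)  # ready to append to a /select URL
kept_ids = [d["id"] for d in collapse_locally(docs)]
print(query_string)
print(kept_ids)
```

Because the collapse happens at query time, every yearly copy stays in the index and can still be retrieved when the collapse filter is left out of a year-specific search.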

Re: Question on index time de-duplication

2015-10-29 Thread Zheng Lin Edwin Yeo

Yes, you can try using the SignatureUpdateProcessorFactory to hash the content into a signature field, and then group on the signature field at search time. You can find more information here: https://cwiki.apache.org/confluence/display/solr/De-Duplication I have been using this method to g
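To make the idea concrete, the following Python sketch hashes selected fields of each document into a signature and then groups documents by that signature. It is only a conceptual stand-in: it does not reproduce the hashing that SignatureUpdateProcessorFactory actually performs, and the field names (`title`, `body`, `release_year`), the sample documents, and the `content_signature` helper are all assumptions invented for illustration.

```python
import hashlib
from collections import defaultdict


def content_signature(doc, fields):
    """Hash the chosen fields into one hex digest.
    A conceptual stand-in for a signature field, not Solr's own algorithm."""
    digest = hashlib.md5()
    for name in fields:
        digest.update(str(doc.get(name, "")).encode("utf-8"))
        digest.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    return digest.hexdigest()


# Hypothetical documents: the same content published for two release years.
docs = [
    {"id": "a-2014", "title": "Install guide", "body": "Run setup.", "release_year": 2014},
    {"id": "a-2015", "title": "Install guide", "body": "Run setup.", "release_year": 2015},
    {"id": "b-2015", "title": "Release notes", "body": "New features.", "release_year": 2015},
]

# The year is deliberately left out of the hashed fields, so identical
# content from different years ends up with the same signature.
groups = defaultdict(list)
for doc in docs:
    doc["signature"] = content_signature(doc, ["title", "body"])
    groups[doc["signature"]].append(doc["id"])

grouped_ids = sorted(groups.values())
print(grouped_ids)
```

All three documents are kept; the signature only marks which ones are duplicates of each other, which is what grouping at search time relies on.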

Question on index time de-duplication

2015-10-29 Thread Shamik Bandopadhyay

Hi, I'm looking to customize index-time de-duplication. Here's my use case and what I'm trying to achieve. I have identical documents coming from different release years of a given product. I need to index them in Solr, as they are required in the context of each individual year. But there's a generic search
