That's what I observed as well. Perhaps there's a way to customize
SignatureUpdateProcessorFactory to support my use case. I'll look into the
source code and figure out whether there's a way to do it.
sible using
SignatureUpdateProcessorFactory.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Question-on-index-time-de-duplication-tp4237306p4237409.html
Sent from the Solr - User mailing list archive at Nabble.com.
rocessorFactory ?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Question-on-index-time-de-duplication-tp4237306p4237403.html
alent, which is a requirement for me.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Question-on-index-time-de-duplication-tp4237306p4237401.html
solr-user@lucene.apache.org
> Subject: Re: Question on index time de-duplication
At the top of the De-Duplication wiki page is a note about collapsing
results. Once you have the signature (identical for each of the duplicates)
you'll want to collapse your results, keeping the one with max date.
https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results
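As a rough sketch of what that collapse could look like from a client, the snippet below only builds the request parameters. The field names `signature` and `release_year` are assumptions (they are not from this thread), and it assumes the release year is stored in a numeric field so that the collapse parser's `max` parameter can pick the newest document in each group.

```python
from urllib.parse import urlencode


def collapse_params(query, signature_field="signature", year_field="release_year"):
    """Build Solr query parameters that collapse duplicates on a signature field.

    Both field names are hypothetical; substitute the ones from your schema.
    """
    # {!collapse} keeps one document per distinct value of `field`;
    # `max` selects the document with the largest value of a numeric field.
    collapse = "{!collapse field=%s max=%s}" % (signature_field, year_field)
    return {"q": query, "fq": collapse}


params = collapse_params("installation guide")
# URL-encoded query string that could be appended to a /select request.
print(urlencode(params))
```

Year-specific searches would simply leave out the `fq` collapse filter, so every indexed copy stays visible there.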
k/r,
Yes, you can use the SignatureUpdateProcessorFactory to hash the content
into a signature field, and then group on the signature field at search
time.
You can find more information here:
https://cwiki.apache.org/confluence/display/solr/De-Duplication
I have been using this method to g
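To illustrate the idea outside Solr, here is a minimal sketch of hashing content into a signature and grouping on it. This is not how SignatureUpdateProcessorFactory is implemented; SHA-256 is only a stand-in for whichever signature class is configured, and the field names (`title`, `body`, `year`) and sample documents are made up for the example.

```python
import hashlib
from collections import defaultdict


def signature(doc, fields=("title", "body")):
    """Hash the chosen content fields into one hex digest (stand-in hash)."""
    # Join with a separator so ("ab", "c") and ("a", "bc") hash differently.
    joined = "\x1f".join(str(doc.get(f, "")) for f in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def group_by_signature(docs):
    """Map each signature to the list of documents that share it."""
    groups = defaultdict(list)
    for doc in docs:
        groups[signature(doc)].append(doc)
    return dict(groups)


# Hypothetical documents: the first two are identical apart from the year.
docs = [
    {"id": "a-2014", "year": 2014, "title": "Install", "body": "Run setup."},
    {"id": "a-2015", "year": 2015, "title": "Install", "body": "Run setup."},
    {"id": "b-2015", "year": 2015, "title": "Upgrade", "body": "Run migrate."},
]

groups = group_by_signature(docs)
# For a generic search, keep only the newest document from each group.
latest = sorted(max(g, key=lambda d: d["year"])["id"] for g in groups.values())
print(latest)  # ['a-2015', 'b-2015']
```

Because the year is left out of the hashed fields, all copies remain available individually while still sharing one signature.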
Hi,
I'm looking to customize index-time de-duplication. Here's my use case
and what I'm trying to achieve.
I have identical documents coming from different release years of a given
product. I need to index them in Solr as they are required in the context
of each individual year. But there's a generic search