I'm not sure I understand your question...
A "near duplicate document" could mean a LOT of things depending on the
context.
Perhaps you just need "fuzzy searching"?
http://lucene.apache.org/java/docs/queryparsersyntax.html#Fuzzy%20Searches
Or "proximity searches"?
http://lucene.apache.org/java/docs/queryparsersyntax.html#Proximity%20Searches
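
To make the two options concrete, here is a rough sketch of how such queries could be built and sent to Solr's standard select handler. The `term~` and `"phrase"~N` forms come from the Lucene query parser syntax linked above; the helper names, the field names, and the base URL are only assumptions for illustration, so adjust them to your own schema and install.

```python
from urllib.parse import urlencode


def fuzzy_query(field, term, similarity=None):
    """Build a single-term fuzzy query, e.g. title:roam~ or title:roam~0.8."""
    suffix = "~" if similarity is None else f"~{similarity}"
    return f"{field}:{term}{suffix}"


def proximity_query(field, phrase, slop):
    """Build a proximity query: the phrase's words within `slop` positions."""
    return f'{field}:"{phrase}"~{slop}'


def select_url(base_url, query, rows=10):
    """URL-encode the query for the select handler (base_url is an assumption)."""
    return f"{base_url}/select?{urlencode({'q': query, 'rows': rows})}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # assumed default location
    print(select_url(base, fuzzy_query("title", "roam")))
    print(select_url(base, proximity_query("text", "jakarta apache", 10)))
```

Neither of these detects near duplicates as such: fuzzy matching works per term and proximity matching per phrase, so they only help if "near dup" in your context means small spelling or word-order variations.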
MoreLikeThisHandler (added in 1.3-dev) may be able to help, but note that
it searches for documents similar to the results of another query.
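
As a rough sketch of what a MoreLikeThisHandler request might look like: this assumes the handler has been registered under the path /mlt in solrconfig.xml, and that "title" and "body" are fields in your schema (both are assumptions, not something I know about your setup). The `mlt.fl`, `mlt.mintf` and `mlt.mindf` parameters are the handler's field list and minimum term/document frequency settings.

```python
from urllib.parse import urlencode


def more_like_this_url(base_url, seed_query, fields,
                       min_term_freq=1, min_doc_freq=1, rows=5):
    """Build a request URL for a MoreLikeThis handler assumed to live at /mlt.

    seed_query selects the document to compare against; fields lists the
    (hypothetical) schema fields used to measure similarity.
    """
    params = {
        "q": seed_query,
        "mlt.fl": ",".join(fields),
        "mlt.mintf": min_term_freq,
        "mlt.mindf": min_doc_freq,
        "rows": rows,
    }
    return f"{base_url}/mlt?{urlencode(params)}"


if __name__ == "__main__":
    # id:42 is a made-up seed document; the base URL is an assumed default.
    print(more_like_this_url("http://localhost:8983/solr", "id:42",
                             ["title", "body"]))
```

The results are ranked by similarity to the seed document, so you would still have to decide for yourself where "similar" ends and "near duplicate" begins.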
ryan
rishabh9 wrote:
Can anyone help me?
Rishabh
rishabh9 wrote:
Hi,
I am evaluating "Solr 1.2" for my project and wanted to know if it can
return near duplicate documents (near dups), and if so, how I should go
about it. I am not sure, but is "MoreLikeThisHandler" the implementation
for near dups?
Rishabh