Matt,

The Deduplication feature in Solr does support near-duplicate detection.  It 
comes with a few components to help you detect near-duplicates, and you should 
be able to write a custom near-dupe detection component and plug it in. 
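
To illustrate the general idea behind near-duplicate detection, here is a
minimal Python sketch of a "fuzzy" signature: it ignores rare tokens so that
documents differing only in a few words hash to the same value. This is not
Solr's implementation or API; the function name fuzzy_signature and its
parameters are made up for this example.

```python
import hashlib
import re
from collections import Counter


def fuzzy_signature(text, min_token_len=2, quant_rate=0.01):
    """Return a hex digest that is stable across small wording changes.

    Hypothetical helper for illustration only; not part of Solr.
    """
    # Lowercase, split on non-alphanumerics, and drop very short tokens.
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower())
              if len(t) >= min_token_len]
    counts = Counter(tokens)
    if not counts:
        return hashlib.md5(b"").hexdigest()

    # Quantize frequencies so small count differences disappear.
    max_freq = max(counts.values())
    quant = max(1, round(max_freq * quant_rate))
    if quant < 2 and max_freq > 1:
        quant = 2

    # Round each count down to a multiple of quant; drop tokens below quant.
    profile = []
    for token, freq in counts.items():
        rounded = (freq // quant) * quant
        if rounded >= quant:
            profile.append((rounded, token))

    # Sort for a deterministic profile, then hash it.
    profile.sort(key=lambda item: (-item[0], item[1]))
    joined = "\n".join(f"{freq} {token}" for freq, token in profile)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()
```

Two documents would then be treated as near-duplicates when their signatures
are equal. How aggressively rare tokens are dropped (quant_rate here) decides
how "near" two documents have to be.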

 
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Matt Mitchell <goodie...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Monday, May 25, 2009 3:30:42 PM
> Subject: Re: grouping response docs together
> 
> Thanks guys. I looked at the dedup stuff, but the documents I'm adding
> aren't really duplicates. They're very similar, but different.
> 
> I checked out the field collapsing feature patch, applied the patch but
> can't get it to build successfully. Will this patch work with a nightly
> build?
> 
> Thanks!
> 
> On Fri, May 15, 2009 at 7:47 PM, Otis Gospodnetic <
> otis_gospodne...@yahoo.com> wrote:
> 
> >
> > Matt - you may also want to detect near duplicates at index time:
> >
> > http://wiki.apache.org/solr/Deduplication
> >
> >  Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> >
> >
> > ----- Original Message ----
> > > From: Matt Mitchell 
> > > To: solr-user@lucene.apache.org
> > > Sent: Friday, May 15, 2009 6:52:48 PM
> > > Subject: grouping response docs together
> > >
> > > > Is there a built-in mechanism for grouping similar documents together in
> > > > the response? I'd like to make it look like there is only one document
> > > > with multiple "hits".
> > >
> > > Matt
> >
> >
