Matt,

The Deduplication feature in Solr does support the near-duplicate scenario. It comes with a few components to help you detect near-duplicates, and you should also be able to write a custom near-dupe detection component and plug it in.
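
As a rough sketch of what that could look like in solrconfig.xml, assuming a fuzzy signature is what you want: the chain name "neardupe", the "signature" field, and the "title,body" field list below are placeholders I made up for illustration, so substitute your own. TextProfileSignature is the fuzzy signature implementation that ships with the dedup code; with overwriteDupes set to false, similar documents are all kept and should simply end up sharing a signature value.

```xml
<!-- Sketch only: chain name, signatureField, and fields are assumed values. -->
<updateRequestProcessorChain name="neardupe">
  <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <!-- Field that receives the computed signature; it must exist in schema.xml -->
    <str name="signatureField">signature</str>
    <!-- false = keep near-duplicates instead of replacing earlier ones -->
    <bool name="overwriteDupes">false</bool>
    <!-- Fields whose content feeds the signature (placeholders) -->
    <str name="fields">title,body</str>
    <!-- Fuzzy hashing, so similar text tends to produce the same signature -->
    <str name="signatureClass">org.apache.solr.update.processor.TextProfileSignature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

The chain still has to be attached to your update handler, and the request parameter for selecting it has differed between Solr versions, so check the wiki page for the version you are running. A custom component would go in place of the signatureClass value.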
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Matt Mitchell <goodie...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Monday, May 25, 2009 3:30:42 PM
> Subject: Re: grouping response docs together
>
> Thanks guys. I looked at the dedup stuff, but the documents I'm adding
> aren't really duplicates. They're very similar, but different.
>
> I checked out the field collapsing feature patch, applied the patch but
> can't get it to build successfully. Will this patch work with a nightly
> build?
>
> Thanks!
>
> On Fri, May 15, 2009 at 7:47 PM, Otis Gospodnetic <
> otis_gospodne...@yahoo.com> wrote:
>
> >
> > Matt - you may also want to detect near duplicates at index time:
> >
> > http://wiki.apache.org/solr/Deduplication
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> > ----- Original Message ----
> > > From: Matt Mitchell
> > > To: solr-user@lucene.apache.org
> > > Sent: Friday, May 15, 2009 6:52:48 PM
> > > Subject: grouping response docs together
> > >
> > > Is there a built-in mechanism for grouping similar documents together in
> > > the response? I'd like to make it look like there is only one document
> > > with multiple "hits".
> > >
> > > Matt