The first question I'd ask is "why are there duplicates in your index in the first place?" If you're denormalizing, that would account for it. Mostly, I'm just asking to be sure that you expect duplicate product IDs. If you make your productid a <uniqueKey>, there'll only be one of each.
You'll have to re-index if you make this change though.

But grouping/field collapsing would, indeed, apply to this problem. Deduplication isn't applicable, since you know exactly what the duplicates are. Deduplication is more for "fuzzy" removal of near-duplicates.

Hope this helps
Erick

On Wed, Aug 31, 2011 at 12:01 AM, Aaron Bains <aaronba...@gmail.com> wrote:
> Hello,
>
> What is the best way to remove duplicate values on output. I am using the
> following query:
>
> /solr/select/?q=wrt54g2&version=2.2&start=0&rows=10&indent=on&*fl=productid*
>
> And I get the following results:
>
> <doc>
> <int name="productid">1011630553</int>
> </doc>
> <doc>
> <int name="productid">1011630553</int>
> </doc>
> <doc><int name="productid">1011630553</int>
> </doc>
> <doc><int name="productid">1011630553</int>
> </doc>
> <doc><int name="productid">1011630553</int>
> </doc>
> <doc><int name="productid">1011630553</int>
> </doc>
> <doc><int name="productid">1011630553</int>
> </doc>
> <doc><int name="productid">1013033708</int>
> </doc>
> <doc><int name="productid">1013033708</int>
> </doc>
> <doc><int name="productid">1013033708</int>
> </doc>
>
>
> But I don't want those results because there are duplicates. I am looking
> for results like below:
>
> <doc>
> <int name="productid">1011630553</int>
> </doc>
> <doc>
> <int name="productid">1013033708</int>
> </doc>
>
> I know there is deduplication and field collapsing but I am not sure if they
> are applicable in this situation. Thanks for your help!
>
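
For reference, here is a rough sketch of what the grouping request could look like. It assumes a Solr version that supports result grouping through the `group` and `group.field` parameters. The host and port (`localhost:8983`) and the helper name `grouped_query_url` are made up for illustration; only the query term and the `productid` field come from the thread.

```python
from urllib.parse import urlencode


def grouped_query_url(base_url, term, field, rows=10):
    """Build a Solr select URL that groups (collapses) results on one field.

    base_url is a hypothetical Solr select endpoint; adjust it to your setup.
    """
    params = [
        ("q", term),
        ("start", 0),
        ("rows", rows),            # with grouping on, this limits the number of groups
        ("indent", "on"),
        ("fl", field),
        ("group", "true"),         # turn on result grouping / field collapsing
        ("group.field", field),    # one group per distinct value of this field
    ]
    return base_url + "?" + urlencode(params)


# Assumed local Solr instance, same search term as in the original question
url = grouped_query_url("http://localhost:8983/solr/select/", "wrt54g2", "productid")
print(url)
```

Note that the response comes back in a grouped layout (one entry per distinct productid, each with its own small document list) rather than the flat list of <doc> elements shown in the question, so the client code reading the results would need to change accordingly.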