Re: how to remove duplicate data while facet?

蒋明原 Tue, 11 Dec 2012 09:42:22 -0800

Thank you, first of all.
Yes, if no two documents shared a unique key, I would not have this problem.
But I can't reindex my data right now: it's too big, and it's in a
production environment.
So, does anyone have a solution?


Thank you.

On Wednesday, December 12, 2012, Pawel wrote:

> I think the solution is quite obvious. Make sure that you don't have items
> with the same unique key in more than one shard :)
>
> On Tue, Dec 11, 2012 at 5:24 PM, 蒋明原 <mailtojiangmingy...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > I'm running a distributed facet query, and there is duplicate data
> > across the shards of the cluster.
> > For example:
> >
> > Server A holds these documents:
> >
> > Doc1: uniqueKey=1 userid=a
> > Doc2: uniqueKey=2 userid=b
> > Doc3: uniqueKey=3 userid=c
> >
> > Server B holds these documents:
> > Doc1: uniqueKey=1 userid=a
> > Doc2: uniqueKey=4 userid=b
> > Doc3: uniqueKey=5 userid=c
> >
> > When I make a facet query on the field "userid", the expected result is:
> > a:1
> > b:2
> > c:2
> >
> > but Solr gives me:
> >
> > a:2
> > b:2
> > c:2
> >
> > However, when I make a normal query with userid:a,
> > Solr gives me a total of 1 result.
> >
> > It seems that when a facet query is run, documents with a duplicate key
> > are all still counted, but when a normal query is run, Solr chooses only
> > one document among the duplicates.
> >
> > So my question is: how can I remove duplicate documents during a
> > distributed facet search?
> >
> > Thanks!
> >
>
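
To make the mismatch in the quoted example concrete, here is a minimal
sketch in plain Python. It is not Solr code; it only models the behavior
described above, under the assumption that per-shard facet counts are
added together while documents are merged by unique key. The function
names summed_shard_facets and deduplicated_facets are hypothetical, and
the data is copied from the example in the thread.

```python
from collections import Counter

# Documents as (uniqueKey, userid) pairs, copied from the example above.
server_a = [(1, "a"), (2, "b"), (3, "c")]
server_b = [(1, "a"), (4, "b"), (5, "c")]


def summed_shard_facets(shards):
    """Count userid on each shard, then add the per-shard counts together.

    Assumption: this models the facet counts reported in the thread.
    """
    total = Counter()
    for docs in shards:
        total.update(userid for _key, userid in docs)
    return dict(total)


def deduplicated_facets(shards):
    """Keep one document per uniqueKey across all shards, then count userid.

    This models the result the poster expected.
    """
    by_key = {}
    for docs in shards:
        for key, userid in docs:
            # First occurrence of a uniqueKey wins; later copies are ignored.
            by_key.setdefault(key, userid)
    return dict(Counter(by_key.values()))


print(summed_shard_facets([server_a, server_b]))  # {'a': 2, 'b': 2, 'c': 2}
print(deduplicated_facets([server_a, server_b]))  # {'a': 1, 'b': 2, 'c': 2}
```

The first result matches what Solr returned in the example, and the second
matches the expected counts; the only difference is the document with
uniqueKey=1, which exists on both servers.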
