Hi,

This query will help you identify the duplicates. It facets on the field and
returns only the values that occur in two or more documents:
q=*:*&fl=id&facet=true&facet.mincount=2&rows=0&facet.field=<One_Of_The_Duplicated_Fields>
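
Since you asked for snippets, here is a rough Python sketch of building that
query and reading the facet counts out of the response. It is only an
illustration under some assumptions: the core URL and the field name
"dup_field" are made up, and I have added wt=json and facet.limit=-1, which
are not in the query above (facet.limit defaults to 100, so without it you
would only see the first 100 values). The parser assumes Solr's default flat
JSON facet layout, [value, count, value, count, ...].

```python
from urllib.parse import urlencode


def facet_query_url(select_url, field):
    """Build a facet query that reports values shared by 2+ documents."""
    params = {
        "q": "*:*",
        "rows": 0,              # we only want the facet counts, not documents
        "wt": "json",           # assumption: JSON response writer
        "facet": "true",
        "facet.field": field,
        "facet.mincount": 2,    # only values that appear at least twice
        "facet.limit": -1,      # assumption: lift the default cap of 100
    }
    return select_url + "?" + urlencode(params)


def duplicated_values(response, field):
    """Turn the flat [value, count, ...] facet list into {value: count}."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


# Hypothetical core URL and field name, for illustration only.
url = facet_query_url("http://localhost:8983/solr/collection1/select", "dup_field")

# Hand-written sample in the shape of a Solr JSON facet response.
sample = {"facet_counts": {"facet_fields": {"dup_field": ["1001", 3, "1002", 2]}}}
dupes = duplicated_values(sample, "dup_field")
```

Each key in the resulting dict is a field value that has duplicates, and the
count tells you how many documents share it.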

To actually remove them from Solr, you will have to do something like what
Robert suggested: write an application that uses the facet results to build a
delete-by-ID request, keeping one ID per duplicated value and deleting the
rest. The update message format is described here:
http://wiki.apache.org/solr/UpdateXmlMessages#A.22delete.22_documents_by_ID_and_by_Query
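
A minimal sketch of that last step in Python, assuming you have already
collected the document IDs for each duplicated value (for example by querying
<field>:<value> with fl=id for every value the facet returned). The function
names and the sample IDs are made up for illustration, and "keep the smallest
ID" is just one arbitrary rule for choosing the survivor. The sketch only
builds the XML message; it does not send anything to Solr.

```python
from xml.sax.saxutils import escape


def ids_to_delete(ids_by_value):
    """For each duplicated value keep one ID (the smallest) and return the rest."""
    doomed = []
    for ids in ids_by_value.values():
        doomed.extend(sorted(ids)[1:])   # everything except the first is dropped
    return doomed


def delete_by_id_xml(ids):
    """Build a Solr XML update message that deletes the given IDs."""
    body = "".join("<id>%s</id>" % escape(str(doc_id)) for doc_id in ids)
    return "<delete>" + body + "</delete>"


# Hypothetical data: field value -> IDs of the documents that share it.
groups = {"1001": ["doc-b", "doc-a", "doc-c"], "1002": ["doc-y", "doc-x"]}
doomed = ids_to_delete(groups)
message = delete_by_id_xml(doomed)
```

You would POST the resulting message to your core's /update handler with a
text/xml content type and then send a commit so the deletes become visible.
Try it on a test copy of the index first, since the deletes cannot be undone.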

Regards,
Aloke


On Thu, Aug 22, 2013 at 3:04 AM, Ali, Saqib <docbook....@gmail.com> wrote:

> Thanks Aloke and Robert. Can you please give me code/query snippets?
> (newbie here)
>
>
> On Wed, Aug 21, 2013 at 2:31 PM, Aloke Ghoshal <alghos...@gmail.com>
> wrote:
>
> > Hi,
> >
> > Facet by one of the duplicate fields (probably by the numeric field that
> > you mentioned) and set facet.mincount=2.
> >
> > Regards,
> > Aloke
> >
> >
> > On Thu, Aug 22, 2013 at 2:44 AM, Ali, Saqib <docbook....@gmail.com> wrote:
> >
> > > hello,
> > >
> > > We have documents that are duplicates, i.e. the ID is different but the
> > > rest of the fields are the same. Is there a query that can remove the
> > > duplicates and leave just one copy of each document in Solr? There is
> > > one numeric field that we can key off to find duplicates.
> > >
> > > Please advise.
> > >
> > > Thanks
> > >
> >
>
