Hi,

This will help you identify the duplicates:

q=*:*&fl=id&facet=true&facet.mincount=2&rows=0&facet.field=<One_Of_The_Duplicated_Fields>

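
Since snippets were requested, here is a rough Python sketch of the identification step. It only builds the facet URL and reads the facet counts out of an already-parsed JSON response; it does not talk to Solr itself. The function names, the select URL and the field name `price` are placeholders I made up. I have also added wt=json and facet.limit=-1 (the default facet limit is 100, which could hide some duplicated values); neither is in the query above.

```python
from urllib.parse import urlencode


def duplicate_facet_url(select_url, field):
    """Build the facet query above for a given select handler URL (hypothetical helper)."""
    params = {
        "q": "*:*",
        "rows": 0,             # no documents needed, only the facet counts
        "wt": "json",          # assumption: ask for a JSON response
        "facet": "true",
        "facet.field": field,
        "facet.mincount": 2,   # only values shared by two or more documents
        "facet.limit": -1,     # assumption: return every value, not just the first 100
    }
    return select_url + "?" + urlencode(params)


def duplicated_values(response, field):
    """Turn Solr's flat [value, count, value, count, ...] facet list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


# Example with a hand-written response in the default (flat) JSON facet layout.
sample = {"facet_counts": {"facet_fields": {"price": ["10", 3, "25", 2]}}}
print(duplicate_facet_url("http://localhost:8983/solr/select", "price"))
print(duplicated_values(sample, "price"))
```

Every key in the resulting dict is a field value that occurs in more than one document, and the count tells you how many copies there are.
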
To actually remove them from Solr, you will have to do something like Robert suggested: write an application that uses the results to build a delete-by-id query (http://wiki.apache.org/solr/UpdateXmlMessages#A.22delete.22_documents_by_ID_and_by_Query).

Regards,
Aloke

On Thu, Aug 22, 2013 at 3:04 AM, Ali, Saqib <docbook....@gmail.com> wrote:

> Thanks Aloke and Robert. Can you please give me code/query snippets?
> (newbie here)
>
> On Wed, Aug 21, 2013 at 2:31 PM, Aloke Ghoshal <alghos...@gmail.com> wrote:
>
> > Hi,
> >
> > Facet by one of the duplicate fields (probably by the numeric field that
> > you mentioned) and set facet.mincount=2.
> >
> > Regards,
> > Aloke
> >
> > On Thu, Aug 22, 2013 at 2:44 AM, Ali, Saqib <docbook....@gmail.com> wrote:
> >
> > > hello,
> > >
> > > We have documents that are duplicates, i.e. the ID is different, but the
> > > rest of the fields are the same. Is there a query that can remove
> > > duplicates and just leave one copy of the document in Solr? There is one
> > > numeric field that we can key off to find duplicates.
> > >
> > > Please advise.
> > >
> > > Thanks
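
P.S. For the delete step in my reply above, a rough Python sketch follows. It assumes you have already collected, for each duplicated value, the ids of the documents that carry it (for example with a follow-up query on that field asking for fl=id). The function names and the sample ids are made up for illustration, and keeping the lowest id of each group is an arbitrary choice. The sketch only builds the XML delete message; it does not send it.

```python
import xml.etree.ElementTree as ET


def ids_to_delete(ids_by_value):
    """For each group of duplicate ids, keep the first (sorted) id and return the rest."""
    doomed = []
    for ids in ids_by_value.values():
        doomed.extend(sorted(ids)[1:])  # arbitrary rule: the lowest id survives
    return doomed


def delete_by_id_xml(ids):
    """Build a <delete><id>...</id>...</delete> update message; ElementTree escapes the ids."""
    root = ET.Element("delete")
    for doc_id in ids:
        ET.SubElement(root, "id").text = str(doc_id)
    return ET.tostring(root, encoding="unicode")


# Hypothetical input: duplicated field value -> ids of the documents sharing it.
groups = {"10": ["doc7", "doc2", "doc9"], "25": ["doc4", "doc5"]}
print(delete_by_id_xml(ids_to_delete(groups)))
```

The resulting message would then be posted to the update handler, followed by a commit, as described on the wiki page linked above. Try it on a copy of the index first.
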