On 05/06/2013 06:03 AM, Michael Sokolov wrote:
On 5/5/13 7:48 PM, Mingfeng Yang wrote:
Dear Solr Users,

Does anyone know the best way to iterate through every document in a
Solr index with a billion entries?

I tried using select?q=*:*&start=xx&rows=500 to get 500 docs at a time,
increasing the start value on each request, but it got very slow after
getting through about 10 million docs.

Thanks,
Ming-

You need to use a unique and stable sort key and request only documents
whose key is greater than the last sort key you saw. For example, if you
have a unique key, retrieve documents ordered by the unique key, and for
each batch get the documents with key > max(key) from the previous batch.

-Mike
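
The approach Mike describes can be sketched roughly as below. This is only
an illustration under some assumptions: the unique key field is called "id"
and is a sortable string field, the JSON response writer is available, and
the helper names (batch_params, solr_fetch, iterate_all) are made up for
this example. Each request keeps start at 0 and filters on the key instead,
so Solr should not have to collect and skip over millions of leading
documents.

```python
import json
import urllib.parse
import urllib.request


def batch_params(last_key=None, rows=500, key_field="id"):
    """Build select parameters for the batch that follows last_key.

    Assumption: key_field is the unique key and sorts consistently.
    """
    params = {
        "q": "*:*",
        "sort": f"{key_field} asc",  # stable order on the unique key
        "rows": str(rows),
        "wt": "json",
    }
    if last_key is not None:
        # Quote the key and escape backslashes and quotes inside it.
        escaped = str(last_key).replace("\\", "\\\\").replace('"', '\\"')
        # "{" makes the lower bound exclusive: only keys > last_key match.
        params["fq"] = f'{key_field}:{{"{escaped}" TO *]'
    return params


def solr_fetch(base_url):
    """Return a function that runs one select request and returns its docs."""
    def fetch(params):
        url = base_url + "/select?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            return json.load(resp)["response"]["docs"]
    return fetch


def iterate_all(fetch, rows=500, key_field="id"):
    """Yield every document, one batch at a time, in key order."""
    last_key = None
    while True:
        docs = fetch(batch_params(last_key, rows, key_field))
        if not docs:
            return  # an empty batch means nothing is left past last_key
        yield from docs
        last_key = docs[-1][key_field]  # max(key) of this batch
```

It might be used as "for doc in iterate_all(solr_fetch(base_url)): ...",
where base_url points at your own core, for example
http://localhost:8983/solr/collection1 (a placeholder, not a value from
this thread).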

There are more details on the wiki:
http://wiki.apache.org/solr/CommonQueryParameters#pageDoc_and_pageScore


--
André Bois-Crettez

Search technology, Kelkoo
http://www.kelkoo.com/


