Re: iterate through each document in Solr

2013-05-06 Thread Dmitry Kan
Hi Ming, Quoting my answer on a different thread ( http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201210.mbox/%3ccaonbidbuzzsaqctdhtlxlgeoori_ghrjbt-84bm0zb-fsps...@mail.gmail.com%3E ): > > [code] > > Directory indexDir = FSDirectory.open(new File(pathToDir)); > > IndexReader input = Index

Re: iterate through each document in Solr

2013-05-06 Thread Mingfeng Yang
Andre, Thanks for the info! Unfortunately, my Solr is on version 3.6, and it looks like those options are not available. :( Ming- On Mon, May 6, 2013 at 5:32 AM, Andre Bois-Crettez wrote: > On 05/06/2013 06:03 AM, Michael Sokolov wrote: > >> On 5/5/13 7:48 PM, Mingfeng Yang wrote: >> >>> Dear So

Re: iterate through each document in Solr

2013-05-06 Thread Mingfeng Yang
Hi Dmitry, My index is not sharded, and since it is so big, sharding won't help much with the paging issue. Do you know of any API that can read from the Lucene binary index directly? It would be nice if we could just scan through the docs directly. Thanks! Ming- On Mon, May 6, 2013 at 3:33

Re: iterate through each document in Solr

2013-05-06 Thread Andre Bois-Crettez
On 05/06/2013 06:03 AM, Michael Sokolov wrote: On 5/5/13 7:48 PM, Mingfeng Yang wrote: Dear Solr Users, Does anyone know the best way to iterate through each document in a Solr index with a billion entries? I tried to use select?q=*:*&start=xx&rows=500 to get 500 docs each time and the

Re: iterate through each document in Solr

2013-05-06 Thread Dmitry Kan
Are you doing it once? Is your index sharded? If so, can you ask each shard individually? Another way would be to do it at the Lucene level, i.e. read from the binary indices (an API exists). Dmitry On Mon, May 6, 2013 at 5:48 AM, Mingfeng Yang wrote: > Dear Solr Users, > > Does anyone know what is t

Re: iterate through each document in Solr

2013-05-05 Thread Michael Sokolov
On 5/5/13 7:48 PM, Mingfeng Yang wrote: Dear Solr Users, Does anyone know the best way to iterate through each document in a Solr index with a billion entries? I tried to use select?q=*:*&start=xx&rows=500 to get 500 docs each time and then change the start value, but it got very slow after

iterate through each document in Solr

2013-05-05 Thread Mingfeng Yang
Dear Solr Users, Does anyone know the best way to iterate through each document in a Solr index with a billion entries? I tried to use select?q=*:*&start=xx&rows=500 to get 500 docs each time and then change the start value, but it got very slow after getting through about 10 million docs. T
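
To illustrate why the start/rows approach in the question slows down, here is a rough Java sketch. It is not from the thread. It assumes a simplified cost model in which serving one page makes the server collect and rank start + rows entries before returning the last rows of them. The class name, the method names, and the base URL are hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.List;

public class DeepPagingSketch {

    // Builds the request URLs for the first `pages` pages, following the
    // select?q=*:*&start=xx&rows=500 pattern from the question.
    // baseUrl is a placeholder, not a value taken from the thread.
    static List<String> pageUrls(String baseUrl, long rows, int pages) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < pages; i++) {
            long start = i * rows;
            urls.add(baseUrl + "/select?q=*:*&start=" + start + "&rows=" + rows);
        }
        return urls;
    }

    // Assumed cost model (a simplification): each request ranks
    // start + (rows actually returned) entries. Summing over all pages
    // gives the total work needed to walk totalDocs documents.
    static long entriesRanked(long totalDocs, long rows) {
        long total = 0;
        for (long start = 0; start < totalDocs; start += rows) {
            total += start + Math.min(rows, totalDocs - start);
        }
        return total;
    }

    public static void main(String[] args) {
        for (String url : pageUrls("http://localhost:8983/solr", 500, 3)) {
            System.out.println(url);
        }
        // Total entries ranked to page through 10 million docs, 500 at a time.
        System.out.println(entriesRanked(10_000_000L, 500));
    }
}
```

Under this model the total work grows roughly with the square of the number of documents visited: paging through 10 million docs in steps of 500 ranks about 10^11 entries, around ten thousand times the number of documents actually returned. That matches the slowdown described above and is the motivation for the replies that suggest querying shards individually or reading the index at the Lucene level.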