Hi Toke,

Thanks for explaining the Solr internals behind my problem. I will definitely try cursors, but my current Solr version is 4.6.1, which I believe does not have cursor support. Is there any other option for this problem?
Also, as per your suggestion, I will avoid regional units in my posts.

Thanks
Naresh

On Sun, Jan 18, 2015 at 4:19 PM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

> Naresh Yadav [nyadav....@gmail.com] wrote:
> > In both setups, we are reading in batches of 50k and each batch taking
> > Setup1 : approx 7 seconds and for completing all batches of total 10 lakh
> > results takes 1 to 2 minutes.
> > Setup2 : approx 2-3 minutes and for completing all batches of total 10 lakh
> > results takes 114 minutes.
>
> Deep paging across shards without cursors means that for each request, the
> full result set up to that point must be requested from each shard. The
> deeper your page, the longer it takes for each request. If you only
> extracted 500K results instead of the 1M in setup 2, it would likely take a
> lot less than 114/2 minutes.
>
> Since you are exporting the full result set, you should be using a cursor:
> https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
> This should make your extraction linear to the number of documents and
> hopefully a lot faster than your current setup.
>
> Also, please refrain from using regional units such as "lakh" in an
> international forum. It requires some readers (me for example) to perform a
> search in order to be sure on what you are talking about.
>
> - Toke Eskildsen
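
For reference, the cost argument in the quoted reply can be put into numbers with a small Python model. This is only a back-of-the-envelope sketch, not Solr code: the function names are made up for illustration, and it assumes that with start/rows paging each shard returns up to start + rows entries per request, while with a cursor each shard returns at most rows entries. It counts entries per shard and ignores everything else (network, merging, caches).

```python
def deep_paging_entries(total_docs, page_size):
    """Upper bound on entries one shard returns over a full start/rows export.

    Hypothetical model: for a request at offset `start`, the shard must
    return its top (start + rows) entries so the coordinator can merge them.
    """
    entries = 0
    for start in range(0, total_docs, page_size):
        entries += min(start + page_size, total_docs)
    return entries


def cursor_entries(total_docs, page_size):
    """Upper bound on entries one shard returns over a full cursor export.

    Hypothetical model: each request only needs the next `rows` entries
    after the cursor position.
    """
    entries = 0
    for start in range(0, total_docs, page_size):
        entries += min(page_size, total_docs - start)
    return entries


if __name__ == "__main__":
    # Batch size and totals taken from the thread: 50k per batch, 1M results.
    print(deep_paging_entries(1_000_000, 50_000))  # 10500000
    print(deep_paging_entries(500_000, 50_000))    # 2750000
    print(cursor_entries(1_000_000, 50_000))       # 1000000
```

Under this model, exporting 500K results with start/rows costs roughly a quarter of the 1M export rather than half, which matches the "a lot less than 114/2 minutes" estimate, while the cursor cost grows linearly with the number of documents.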