Goutham, I suggest you read Hossman's excellent article on deep paging and why returning rows=(some large number) is a bad idea. It provides a thorough overview of the concept and will explain it better than I ever could: https://lucidworks.com/post/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/#update_2013_12_18

In short, if you want to extract that many documents from your corpus, use cursorMark, streaming expressions, or Solr's Parallel SQL interface (which uses streaming expressions under the hood): https://lucene.apache.org/solr/guide/8_6/streaming-expressions.html
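For reference, a cursorMark loop with pysolr looks roughly like the sketch below. This is untested and makes a few assumptions: your uniqueKey field is called "id", the collection URL is a placeholder, and your pysolr version exposes nextCursorMark on the Results object (recent versions do).

    # Minimal sketch of cursor-based iteration with pysolr (untested).
    # Assumptions: uniqueKey field is "id", collection URL is a placeholder,
    # and pysolr surfaces nextCursorMark on its Results object.
    import pysolr

    solr = pysolr.Solr('http://localhost:8983/solr/mycollection', timeout=120)

    cursor = '*'   # the initial cursorMark is always "*"
    fetched = 0
    while True:
        results = solr.search(
            '*:*',
            rows=10000,
            sort='id asc',        # cursorMark requires a stable sort on the uniqueKey
            cursorMark=cursor,    # do NOT combine cursorMark with the start parameter
        )
        fetched += len(results.docs)   # do your per-document work here instead

        next_cursor = results.nextCursorMark
        if next_cursor == cursor:      # cursor stopped advancing -> all docs consumed
            break
        cursor = next_cursor

Unlike start/rows paging, each request only has to pick up where the previous cursor left off, so the cost per page stays roughly constant no matter how deep into the result set you are.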
Thanks,
Dwane

________________________________
From: Goutham Tholpadi <gtholp...@gmail.com>
Sent: Friday, 25 September 2020 4:19 PM
To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
Subject: Solr queries slow down over time

Hi,

I have around 30M documents in Solr, and I am doing repeated *:* queries with rows=10000, changing start to 0, 10000, 20000, and so on, in a loop in my script (using pysolr).

At the start of the iteration, the calls to Solr were taking less than 1 sec each. After running for a few hours (with start at around 27M), I found that each call was taking around 30-60 secs.

Any pointers on why the same fetch of 10000 records takes much longer now? Does Solr need to load all 27M preceding records before returning the last 10000? Is there a better way to do this operation using Solr?

Thanks!
Goutham