On 8/20/2018 9:55 PM, Ash Ramesh wrote:
We ran a bunch of deep paginated queries (offset of 1,000,000) with a
filter query. We set the timeout to 5 seconds and it did timeout. We aren't
sure if this is what caused the irrecoverable failure, but by reading
https://lucene.apache.org/solr/guide/7_4/pagination-of-results.html#performance-problems-with-deep-paging
we feel that this was the cause.
Yes, this is most likely the cause.
Since you have three shards, the problem is even worse than Erick
described. Those 1,000,010 results will be returned by EVERY shard and
consolidated on the machine that actually handles the query, so that
node ends up with roughly three million results in memory that it must
sort.
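
For illustration, here is roughly what such a request looks like and the
merge math behind it. This is only a sketch: the URL, collection name, and
filter query below are made-up placeholders, not taken from your setup.

    import requests

    # Hypothetical endpoint/collection -- substitute your own.
    SOLR = "http://localhost:8983/solr/mycollection/select"

    params = {
        "q": "*:*",
        "fq": "type:document",   # placeholder filter query
        "start": 1_000_000,      # deep offset
        "rows": 10,
        "timeAllowed": 5000,     # 5 second limit, in milliseconds
    }
    resp = requests.get(SOLR, params=params)
    print(resp.json()["response"]["numFound"])

    # Each shard must build and return its own top (start + rows) list,
    # so the coordinating node has to merge roughly:
    num_shards = 3
    print((1_000_000 + 10) * num_shards)   # 3,000,030 sorted entries
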
Unless you're running on Windows, the bin/solr script configures Java to
kill itself when OutOfMemoryError occurs. It does this because program
behavior after an OOME is completely unpredictable; if the process keeps
running, there's a good chance it will corrupt the index.
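
For reference, the mechanism is the standard HotSpot OnOutOfMemoryError
hook. On a typical 7.x install the start script passes something like the
following (the exact path and arguments will differ on your machine):

    -XX:OnOutOfMemoryError="/opt/solr/bin/oom_solr.sh 8983 /var/solr/logs"

The oom_solr.sh script logs the event and then kills the Solr process.
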
If you're going to be doing queries like this, you need a larger heap.
There's no way around that.
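
The heap is normally raised either in solr.in.sh (solr.in.cmd on Windows)
or on the command line; the 4g below is just an example value, not a
recommendation sized for your data:

    # in solr.in.sh
    SOLR_HEAP="4g"

    # or at startup
    bin/solr start -m 4g

Both forms set -Xms and -Xmx to the same value.
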
Thanks,
Shawn