Thanks for being willing to file a JIRA and contribute a fix! This seems like a straightforward problem/fix. Please create a JIRA account if you need one to create an issue (assuming an issue doesn't already exist for this problem).
On Tue, Feb 11, 2025 at 3:32 AM gaojiabao1...@qq.com.INVALID
<gaojiabao1...@qq.com.invalid> wrote:

> The reRank function has a reRankDocs parameter that specifies the number
> of documents to re-rank. I've observed that increasing this parameter to
> test its performance impact causes queries to become progressively slower.
> Even when the parameter value exceeds the total number of documents in
> the index, further increases continue to slow down the query, which is
> counterintuitive.
>
> Therefore, I investigated the code.
>
> For a query containing re-ranking, such as:
>
> {
>   "start": "0",
>   "rows": 10,
>   "fl": "ID,score",
>   "q": "*:*",
>   "rq": "{!rerank reRankQuery='{!func} 100' reRankDocs=1000000000 reRankWeight=2}"
> }
>
> the current execution logic is as follows:
>
> 1. Perform the normal retrieval using the q parameter.
> 2. Re-score all documents retrieved in phase 1 using the rq parameter.
>
> During the retrieval in phase 1 (using q), a TopScoreDocCollector is
> created. Underneath, this creates a PriorityQueue backed by an Object[].
> The length of this Object[] grows with reRankDocs without any limit.
>
> On my local test cluster with limited JVM memory, this can even trigger
> an OOM, causing the Solr node to crash. I can also reproduce the OOM
> with a SolrCloudTestCase unit test.
>
> I think limiting the length of the Object[] using
> searcher.getIndexReader().maxDoc() in ReRankCollector would resolve this
> issue. That way, when reRankDocs exceeds maxDoc, the memory allocation
> would no longer grow without bound.
>
> This is my first attempt at identifying a problem in the Solr codebase.
> I'm wondering: is this truly a bug? If it is, I'm happy to create a Jira
> issue and try to fix it.
>
> gaojiabao1...@qq.com
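For anyone following along, here is a minimal, self-contained sketch of the clamp proposed above. The class name ReRankClampSketch and the variable mainQueryRows are illustrative assumptions, not the actual ReRankCollector source; the real change would apply the same Math.min against searcher.getIndexReader().maxDoc() at the point where the collector size is computed.

    public class ReRankClampSketch {

        // Sketch of the proposed guard: keep at least as many slots as the
        // main query needs, but never more slots than the index has documents.
        static int clampedCollectorSize(int reRankDocs, int mainQueryRows, int maxDoc) {
            return Math.min(Math.max(reRankDocs, mainQueryRows), maxDoc);
        }

        public static void main(String[] args) {
            int maxDoc = 50_000;             // pretend the index holds 50k docs
            int reRankDocs = 1_000_000_000;  // the value from the example query above

            // With the clamp, the collector's backing array stays bounded by maxDoc:
            System.out.println(clampedCollectorSize(reRankDocs, 10, maxDoc)); // prints 50000

            // Without it, TopScoreDocCollector sizes its priority queue (an
            // Object[]) from the requested hit count, so reRankDocs=1000000000
            // means an attempt to allocate an Object[1000000000] -- the OOM risk.
        }
    }

The point the sketch makes is that the queue size is an up-front allocation request, not just a logical limit, so clamping it changes memory behavior even when the actual result set is tiny.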
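A reproduction along the lines the reporter mentions could take roughly this shape as a SolrCloudTestCase. The collection name, configset, and document count are illustrative assumptions, and actually observing the OutOfMemoryError requires running the test JVM with a deliberately small heap:

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;
    import org.apache.solr.cloud.SolrCloudTestCase;
    import org.apache.solr.common.SolrInputDocument;
    import org.apache.solr.common.params.ModifiableSolrParams;
    import org.junit.BeforeClass;
    import org.junit.Test;

    public class ReRankDocsOomTest extends SolrCloudTestCase {

        @BeforeClass
        public static void setupCluster() throws Exception {
            configureCluster(1)
                .addConfig("conf", configset("cloud-minimal"))
                .configure();
        }

        @Test
        public void testHugeReRankDocs() throws Exception {
            CloudSolrClient client = cluster.getSolrClient();
            CollectionAdminRequest.createCollection("rerank", "conf", 1, 1).process(client);

            // A handful of documents is enough; the allocation depends on
            // reRankDocs, not on how many documents actually exist.
            for (int i = 0; i < 10; i++) {
                client.add("rerank", new SolrInputDocument("id", Integer.toString(i)));
            }
            client.commit("rerank");

            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("q", "*:*");
            params.set("rows", "10");
            params.set("rq", "{!rerank reRankQuery='{!func} 100' reRankDocs=1000000000 reRankWeight=2}");

            // With a small test heap, this query can fail with OutOfMemoryError
            // while TopScoreDocCollector allocates its priority queue.
            client.query("rerank", params);
        }
    }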