Thanks for being willing to file a JIRA and contribute a fix! This seems like a straightforward problem/fix. Please create a JIRA account if you need one to create an issue (assuming an issue doesn't already exist for this problem).
On Tue, Feb 11, 2025 at 3:32 AM gaojiabao1...@qq.com.INVALID
<gaojiabao1...@qq.com.invalid> wrote:

> The reRank function has a reRankDocs parameter that specifies the number
> of documents to re-rank. I've observed that increasing this parameter to
> test its performance impact causes queries to become progressively slower.
> Even when the parameter value exceeds the total number of documents in
> the index, further increases continue to slow down the query, which is
> counterintuitive.
>
> Therefore, I investigated the code.
>
> For a query containing re-ranking, such as:
>
> {
>   "start": "0",
>   "rows": 10,
>   "fl": "ID,score",
>   "q": "*:*",
>   "rq": "{!rerank reRankQuery='{!func} 100' reRankDocs=1000000000 reRankWeight=2}"
> }
>
> the current execution logic is as follows:
>
> 1. Perform the normal retrieval using the q parameter.
> 2. Re-score all documents retrieved in phase 1 using the rq parameter.
>
> During the retrieval in phase 1 (using q), a TopScoreDocCollector is
> created. Underneath, this creates a PriorityQueue backed by an Object[].
> The length of this Object[] grows with reRankDocs without any limit.
>
> On my local test cluster with limited JVM memory, this can even trigger
> an OOM, causing the Solr node to crash. I can also reproduce the OOM
> with a SolrCloudTestCase unit test.
>
> I think limiting the length of the Object[] using
> searcher.getIndexReader().maxDoc() in ReRankCollector would resolve this
> issue. That way, when reRankDocs exceeds maxDoc, the memory allocation
> would no longer grow without bound.
>
> This is my first attempt at identifying a problem in the Solr codebase.
> I'm wondering: is this truly a bug? If it is, I'm happy to create a Jira
> issue and try to fix it.
>
> gaojiabao1...@qq.com
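For anyone following along, here is a minimal, self-contained sketch of the clamp proposed above. The class name ReRankClampSketch and the variable mainQueryRows are illustrative assumptions, not the actual ReRankCollector source; the real change would apply the same Math.min against searcher.getIndexReader().maxDoc() at the point where the collector size is computed.

    public class ReRankClampSketch {

        // Sketch of the proposed guard: keep at least as many slots as the
        // main query needs, but never more slots than the index has documents.
        static int clampedCollectorSize(int reRankDocs, int mainQueryRows, int maxDoc) {
            return Math.min(Math.max(reRankDocs, mainQueryRows), maxDoc);
        }

        public static void main(String[] args) {
            int maxDoc = 50_000;             // pretend the index holds 50k docs
            int reRankDocs = 1_000_000_000;  // the value from the example query above

            // With the clamp, the collector's backing array stays bounded by maxDoc:
            System.out.println(clampedCollectorSize(reRankDocs, 10, maxDoc)); // prints 50000

            // Without it, TopScoreDocCollector sizes its priority queue (an
            // Object[]) from the requested hit count, so reRankDocs=1000000000
            // means an attempt to allocate an Object[1000000000] -- the OOM risk.
        }
    }

The point the sketch makes is that the queue size is an up-front allocation request, not just a logical limit, so clamping it changes memory behavior even when the actual result set is tiny.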
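A reproduction along the lines the reporter mentions could take roughly this shape as a SolrCloudTestCase. The collection name, configset, and document count are illustrative assumptions, and actually observing the OutOfMemoryError requires running the test JVM with a deliberately small heap:

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;
    import org.apache.solr.cloud.SolrCloudTestCase;
    import org.apache.solr.common.SolrInputDocument;
    import org.apache.solr.common.params.ModifiableSolrParams;
    import org.junit.BeforeClass;
    import org.junit.Test;

    public class ReRankDocsOomTest extends SolrCloudTestCase {

        @BeforeClass
        public static void setupCluster() throws Exception {
            configureCluster(1)
                .addConfig("conf", configset("cloud-minimal"))
                .configure();
        }

        @Test
        public void testHugeReRankDocs() throws Exception {
            CloudSolrClient client = cluster.getSolrClient();
            CollectionAdminRequest.createCollection("rerank", "conf", 1, 1).process(client);

            // A handful of documents is enough; the allocation depends on
            // reRankDocs, not on how many documents actually exist.
            for (int i = 0; i < 10; i++) {
                client.add("rerank", new SolrInputDocument("id", Integer.toString(i)));
            }
            client.commit("rerank");

            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("q", "*:*");
            params.set("rows", "10");
            params.set("rq", "{!rerank reRankQuery='{!func} 100' reRankDocs=1000000000 reRankWeight=2}");

            // With a small test heap, this query can fail with OutOfMemoryError
            // while TopScoreDocCollector allocates its priority queue.
            client.query("rerank", params);
        }
    }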