[
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17499566#comment-17499566
]
Adrien Grand commented on LUCENE-10428:
---------------------------------------
Sorry, the PR should have been linked automatically in JIRA given the naming
convention; I don't know why it didn't work this time. Here it is:
https://github.com/apache/lucene/pull/711. It does capture debug information, as
you suggested.
> getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge
> leading to busy threads in infinite loop
> -----------------------------------------------------------------------------------------------------------------
>
> Key: LUCENE-10428
> URL: https://issues.apache.org/jira/browse/LUCENE-10428
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/query/scoring, core/search
> Reporter: Ankit Jain
> Priority: Major
> Attachments: Flame_graph.png
>
>
> Customers complained about high CPU usage on an Elasticsearch cluster in
> production. We noticed that a few search requests had been stuck for a long time:
> {code:java}
> % curl -s localhost:9200/_cat/tasks?v
> indices:data/read/search[phase/query] AmMLzDQ4RrOJievRDeGFZw:569205
> AmMLzDQ4RrOJievRDeGFZw:569204 direct 1645195007282 14:36:47 6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:502075
> emjWc5bUTG6lgnCGLulq-Q:502074 direct 1645195037259 14:37:17 6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:583270
> emjWc5bUTG6lgnCGLulq-Q:583269 direct 1645201316981 16:21:56 4.5h
> {code}
> Flame graphs indicated that CPU time was mostly going into the
> *getMinCompetitiveScore method in MaxScoreSumPropagator*. Live JVM debugging
> showed that the
> org.apache.lucene.search.MaxScoreSumPropagator.scoreSumUpperBound method had
> around 4 million invocations every second.
> We determined the values of some parameters from live debugging:
> {code:java}
> minScoreSum = 3.5541441
> minScore + sumOfOtherMaxScores (params[0] scoreSumUpperBound) = 3.554144322872162
> returnObj scoreSumUpperBound = 3.5541444
> Math.ulp(minScoreSum) = 2.3841858E-7
> {code}
> Example code snippet:
> {code:java}
> // scoreSumUpperBound is the MaxScoreSumPropagator method named above
> double sumOfOtherMaxScores = 3.554144322872162;
> double minScoreSum = 3.5541441;
> float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
> while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) {
>   minScore -= Math.ulp(minScoreSum);
>   System.out.printf("%.20f, %.100f\n", minScore, Math.ulp(minScoreSum));
> }
> {code}
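
The snippet above declares minScoreSum as a double, whereas the reported
Math.ulp(minScoreSum) = 2.3841858E-7 is the spacing of a float. The bounded
sketch below shows what that difference does to the loop. Several things in it
are assumptions and do not come from the report: the class name UlpStepSketch,
the helpers runWithDoubleSum and runWithFloatSum, the maxIters cap, and above
all the body of scoreSumUpperBound, which here is only a cast to float standing
in for the real Lucene method. It illustrates why the snippet as written makes
no progress; it does not claim to reproduce the production failure.

```java
class UlpStepSketch {

    // Hypothetical stand-in for MaxScoreSumPropagator.scoreSumUpperBound, whose
    // body is not shown in the report: this sketch only rounds the sum to a float.
    static float scoreSumUpperBound(double sum) {
        return (float) sum;
    }

    // The reported loop with a double minScoreSum, capped at maxIters.
    // Returns the number of decrements performed, or -1 if the cap was reached.
    static int runWithDoubleSum(double minScoreSum, double sumOfOtherMaxScores, int maxIters) {
        float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
        for (int iters = 0; iters <= maxIters; iters++) {
            if (scoreSumUpperBound(minScore + sumOfOtherMaxScores) <= minScoreSum) {
                return iters;
            }
            // Resolves to Math.ulp(double): a step of about 4.4e-16 here.
            minScore -= Math.ulp(minScoreSum);
        }
        return -1;
    }

    // The same loop with a float minScoreSum, so Math.ulp(float) is used.
    static int runWithFloatSum(float minScoreSum, double sumOfOtherMaxScores, int maxIters) {
        float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
        for (int iters = 0; iters <= maxIters; iters++) {
            if (scoreSumUpperBound(minScore + sumOfOtherMaxScores) <= minScoreSum) {
                return iters;
            }
            // Resolves to Math.ulp(float): a step of about 2.4e-7 here.
            minScore -= Math.ulp(minScoreSum);
        }
        return -1;
    }

    public static void main(String[] args) {
        double sumOfOtherMaxScores = 3.554144322872162;
        System.out.println("ulp(double) = " + Math.ulp(3.5541441));
        System.out.println("ulp(float)  = " + Math.ulp(3.5541441f));
        System.out.println("double run: " + runWithDoubleSum(3.5541441, sumOfOtherMaxScores, 1000));
        System.out.println("float run:  " + runWithFloatSum(3.5541441f, sumOfOtherMaxScores, 1000));
    }
}
```

In the double variant the step is far smaller than the spacing between
neighbouring floats around minScore, so each subtraction rounds back to the
same float and the cap is reached. In the float variant the loop exits.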