dweiss commented on code in PR #11990:
URL: https://github.com/apache/lucene/pull/11990#discussion_r1036458813


##########
lucene/highlighter/src/java/org/apache/lucene/search/matchhighlight/PassageSelector.java:
##########
@@ -89,8 +89,9 @@ public List<Passage> pickBest(
     }
 
     // Best passages so far.
+    int pqSize = Math.max(markers.size(), maxPassages);

Review Comment:
   I looked at it and I think this was actually a deliberate decision: the 
selection of passages is independent from merging and capped at the 
user-requested max, so that regardless of how many markers there are, the 
overhead of the pq remains fairly low. I realize viewpoints will vary. I use 
this code in cases where the highlighter takes hits from multiple queries (and, 
as I said, there can be hundreds of markers). Sizing the pq by markers.size() 
would degrade performance significantly in that scenario, for no gain.
   
   I'd try overestimating pqSize based on maxPassages: say, min(markers.size(), 
maxPassages * 3). The parameter could even be configurable so that the overhead 
can be tuned from the outside (with a reasonable default). WDYT?
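
   A minimal sketch of the capped sizing suggested above, not the actual 
PassageSelector code. The class name, the `oversizeFactor` parameter and its 
default of 3 are hypothetical, and `java.util.PriorityQueue` stands in for 
whatever queue the selector really uses. It only illustrates that the queue 
capacity, and so the per-marker cost, stays tied to maxPassages rather than to 
the number of markers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class PqSizeSketch {
  // Hypothetical default; the comment above mentions 3 only as an example.
  static final int DEFAULT_OVERSIZE_FACTOR = 3;

  // min(markerCount, maxPassages * oversizeFactor), as proposed in the review.
  static int pqSize(int markerCount, int maxPassages, int oversizeFactor) {
    if (markerCount < 0 || maxPassages < 0 || oversizeFactor < 1) {
      throw new IllegalArgumentException("negative size or factor < 1");
    }
    long scaled = (long) maxPassages * oversizeFactor; // long avoids int overflow
    return (int) Math.min((long) markerCount, scaled);
  }

  // Bounded min-heap that keeps the `capacity` highest scores.
  // Each offer costs O(log capacity), independent of how many scores are seen.
  static List<Integer> topScores(List<Integer> scores, int capacity) {
    PriorityQueue<Integer> pq = new PriorityQueue<>(Math.max(1, capacity));
    for (int score : scores) {
      if (pq.size() < capacity) {
        pq.add(score);
      } else if (capacity > 0 && score > pq.peek()) {
        pq.poll(); // evict the current lowest score
        pq.add(score);
      }
    }
    List<Integer> result = new ArrayList<>(pq);
    result.sort(Collections.reverseOrder());
    return result;
  }
}
```

   With 500 markers and maxPassages = 5, `pqSize(500, 5, 3)` gives a capacity 
of 15, whereas sizing by `Math.max(markers.size(), maxPassages)` would give 500.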



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

