javanna commented on code in PR #13542:
URL: https://github.com/apache/lucene/pull/13542#discussion_r1741835108
##########
lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollectorManager.java:
##########
@@ -28,17 +31,77 @@
*/
public class TotalHitCountCollectorManager
implements CollectorManager<TotalHitCountCollector, Integer> {
+
+ /**
+ * Internal state shared across the different collectors that this collector manager creates. It
+ * tracks leaves seen as an argument of {@link Collector#getLeafCollector(LeafReaderContext)}
+ * calls, to ensure correctness: if the first partition of a segment early terminates, count has
+ * been already retrieved for the entire segment hence subsequent partitions of the same segment
+ * should also early terminate. If the first partition of a segment computes hit counts,
+ * subsequent partitions of the same segment should do the same, to prevent their counts from
+ * being retrieved from {@link LRUQueryCache} (which returns counts for the entire segment)
+ */
+ private final Map<LeafReaderContext, Boolean> seenContexts = new HashMap<>();
Review Comment:
Scratch that: @original-brownbear suggested a different approach (now
included in the PR) that does not require a `synchronized` block. With this, it
feels like we may get away without a user option to disable the overhead when
it's not needed. Such a flag would be odd anyway: the searcher knows whether
there are partitions and could act accordingly, but the collector manager
currently has no way to know which searcher uses it, or to get info from it.
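
For readers following along, here is a minimal sketch of one way to share a
per-leaf decision across partitions without a `synchronized` block. It is an
illustration only and not necessarily what the PR does: the class
`LeafDecisionTracker`, the method `earlyTerminates`, and the `leafId` key are
hypothetical names, and no Lucene types are used. The first caller for a leaf
decides; later callers for the same leaf wait for that decision and follow it.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in, not Lucene code: shares a per-leaf decision across partitions. */
final class LeafDecisionTracker {
  // One future per leaf; the key is any stable per-leaf identity (an assumption here).
  private final Map<Object, CompletableFuture<Boolean>> decisions = new ConcurrentHashMap<>();

  /**
   * Returns true if the calling partition should early terminate. The first caller for a
   * leaf runs {@code decide}; later callers for the same leaf reuse its result.
   */
  boolean earlyTerminates(Object leafId, BooleanSupplier decide) {
    CompletableFuture<Boolean> mine = new CompletableFuture<>();
    CompletableFuture<Boolean> existing = decisions.putIfAbsent(leafId, mine);
    if (existing != null) {
      // Another partition of this leaf registered first: wait for and follow its decision.
      return existing.join();
    }
    try {
      boolean result = decide.getAsBoolean();
      mine.complete(result);
      return result;
    } catch (RuntimeException e) {
      // Propagate the failure to any partition waiting on this leaf.
      mine.completeExceptionally(e);
      throw e;
    }
  }
}
```

The map lookup itself is cheap, but it is still paid on every leaf even when
the searcher creates no partitions, which is the overhead discussed above.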
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]