gsmiller commented on code in PR #974:
URL: https://github.com/apache/lucene/pull/974#discussion_r908941811
##########
lucene/facet/src/java/org/apache/lucene/facet/range/RangeFacetCounts.java:
##########
@@ -232,20 +233,43 @@ public FacetResult getAllChildren(String dim, String...
path) throws IOException
return new FacetResult(dim, path, totCount, labelValues,
labelValues.length);
}
- // The current getTopChildren method is not returning "top" ranges. Instead,
it returns all
- // user-provided ranges in
- // the order the user specified them when instantiating. This concept is
being introduced and
- // supported in the
- // getAllChildren functionality in LUCENE-10550. getTopChildren is
temporarily calling
- // getAllChildren to maintain its
- // current behavior, and the current implementation will be replaced by an
actual "top children"
- // implementation
- // in LUCENE-10614
- // TODO: fix getTopChildren in LUCENE-10614
@Override
public FacetResult getTopChildren(int topN, String dim, String... path)
throws IOException {
validateTopN(topN);
- return getAllChildren(dim, path);
+ validateDimAndPathForGetChildren(dim, path);
+
+ int resultSize = Math.min(topN, counts.length);
+ PriorityQueue<LabelAndValue> pq =
+ new PriorityQueue<>(resultSize) {
+ @Override
+ protected boolean lessThan(LabelAndValue a, LabelAndValue b) {
+ int cmp = Integer.compare(a.value.intValue(), b.value.intValue());
+ if (cmp == 0) {
+ cmp = b.label.compareTo(a.label);
+ }
+ return cmp < 0;
+ }
+ };
+
+ for (int i = 0; i < counts.length; i++) {
+ if (pq.size() < resultSize) {
+ pq.add(new LabelAndValue(ranges[i].label, counts[i]));
Review Comment:
Hmm, good point. I don't think we should change the behavior of
`getAllChildren`. I think users will expect to get back an entry for each range
they provided, and since that's the existing behavior anyway, I'd prefer not to
change it. So now I'm actually wondering if the `getAllChildren` API should
actually provide _all_ children in all of the implementations, whether-or-not
the count is zero. I think it's actually OK that `getAllChildren` is not
identical behavior as `getTopChildren` with a huge top-n value. I think it's OK
if we ignore entries with a zero-count in the "top children" case, but now I'm
not so sure in the "all children" case. What do you think users would most
likely expect here? Should we change `getAllChildren` in all `Facets`
implementations to return children with a zero-count?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]