gortiz commented on code in PR #15095:
URL: https://github.com/apache/pinot/pull/15095#discussion_r1963038947

##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/AggregateOperator.java:
##########
@@ -85,8 +85,10 @@ public class AggregateOperator extends MultiStageOperator {
   // trimming - related members
   private final int _groupTrimSize;
+  // Comparator is used in priority queue, and the order is reversed so that peek() can be used to compare with each
+  // output row

Review Comment:
   nit: Use javadoc style comments. Alternatively, use the javadoc style introduced in [Java 23](https://openjdk.org/jeps/467). We probably won't get javadoc rendering in the IDE unless we configure it to use a new JDK version, but eventually we will migrate to a Java version that supports it ;)

##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/AggregateOperator.java:
##########
@@ -108,33 +110,28 @@ public AggregateOperator(OpChainExecutionContext context, MultiStageOperator inp
     List<Integer> groupKeys = node.getGroupKeys();
 
-    //process order trimming hint
-    int groupTrimSize = getGroupTrimSize(node.getNodeHint(), context.getOpChainMetadata());
-
-    if (groupTrimSize > -1) {
-      // limit is set to 0 if not pushed
-      int nodeLimit = node.getLimit() > 0 ? node.getLimit() : Integer.MAX_VALUE;
-      int limit = GroupByUtils.getTableCapacity(nodeLimit, groupTrimSize);
-      _groupTrimSize = limit;
-      if (limit == Integer.MAX_VALUE) {
-        // disable sorting because actual result can't realistically be bigger the limit
-        _priorityQueue = null;
+    int groupTrimSize = Integer.MAX_VALUE;
+    Comparator<Object[]> comparator = null;
+    int limit = node.getLimit();
+    if (limit > 0) {
+      List<RelFieldCollation> collations = node.getCollations();
+      if (collations.isEmpty()) {
+        groupTrimSize = limit;

Review Comment:
   This behavior is not formally correct, right? If the stage has a parallelism higher than 1 each worker may pick their own keys.
   If there is a reduce phase later (which I think is always the case when a limit is applied), the worker executing that reduce will not see the correct values for those keys. Assuming what I said is correct, I think we need the ability to disable this optimization with a config and/or a hint.
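
To make the concern concrete, here is a toy simulation of it. This is not Pinot code: `UnorderedTrimDemo`, `aggregateOnWorker`, and `reduce` are hypothetical names, and the sketch assumes that rows for the same group key can reach more than one worker. If the input to the stage is already partitioned by the group keys, the problem shown here would not occur.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UnorderedTrimDemo {
  /**
   * Hypothetical worker: sums values per key, but only admits the first
   * `limit` distinct keys it happens to see (no ordering is applied).
   */
  public static Map<String, Long> aggregateOnWorker(List<Map.Entry<String, Long>> rows, int limit) {
    Map<String, Long> groups = new LinkedHashMap<>();
    for (Map.Entry<String, Long> row : rows) {
      if (groups.containsKey(row.getKey()) || groups.size() < limit) {
        groups.merge(row.getKey(), row.getValue(), Long::sum);
      }
      // Otherwise the row belongs to a new key seen after the limit was reached, so it is dropped.
    }
    return groups;
  }

  /** Hypothetical reduce step: merges the partial results of all workers. */
  public static Map<String, Long> reduce(List<Map<String, Long>> partials) {
    Map<String, Long> merged = new LinkedHashMap<>();
    for (Map<String, Long> partial : partials) {
      partial.forEach((key, value) -> merged.merge(key, value, Long::sum));
    }
    return merged;
  }

  public static void main(String[] args) {
    // Same keys on both workers (assumption), but in a different arrival order.
    List<Map.Entry<String, Long>> worker1Rows = List.of(Map.entry("a", 1L), Map.entry("b", 1L));
    List<Map.Entry<String, Long>> worker2Rows = List.of(Map.entry("b", 1L), Map.entry("a", 1L));

    int limit = 1;
    Map<String, Long> trimmed = reduce(List.of(
        aggregateOnWorker(worker1Rows, limit),
        aggregateOnWorker(worker2Rows, limit)));
    Map<String, Long> exact = reduce(List.of(
        aggregateOnWorker(worker1Rows, Integer.MAX_VALUE),
        aggregateOnWorker(worker2Rows, Integer.MAX_VALUE)));

    System.out.println("trimmed: " + trimmed); // prints "trimmed: {a=1, b=1}"
    System.out.println("exact:   " + exact);   // prints "exact:   {a=2, b=2}"
  }
}
```

With `limit = 1`, worker 1 keeps only `a` and worker 2 keeps only `b`, so whichever key the reducer returns carries a sum of 1 instead of 2. Under the stated assumption, that is the kind of silent undercount a config or hint would let users opt out of.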