siddharthteotia commented on a change in pull request #5451: URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432678859
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctAggregationFunction.java
##########
@@ -123,21 +120,20 @@ public void aggregate(int length, AggregationResultHolder aggregationResultHolde
         columnDataTypes[i] = ColumnDataType.fromDataTypeSV(blockValSetMap.get(_inputExpressions.get(i)).getValueType());
       }
       DataSchema dataSchema = new DataSchema(_columns, columnDataTypes);
-      distinctTable = new DistinctTable(dataSchema, _orderBy, _capacity);
+      distinctTable = new DistinctTable(dataSchema, _orderBy, _limit);
       aggregationResultHolder.setValue(distinctTable);
+    } else if (distinctTable.shouldNotAddMore()) {
+      return;
     }
-    // TODO: Follow up PR will make few changes to start using DictionaryBasedAggregationOperator
-    // for DISTINCT queries without filter.
+    // TODO: Follow up PR will make few changes to start using DictionaryBasedAggregationOperator for DISTINCT queries
+    // without filter.
     RowBasedBlockValueFetcher blockValueFetcher = new RowBasedBlockValueFetcher(blockValSets);
-    // TODO: Do early termination in the operator itself which should
-    // not call aggregate function at all if the limit has reached
-    // that will require the interface change since this function
-    // has to communicate back that required number of records have
-    // been collected
     for (int i = 0; i < length; i++) {
-      distinctTable.upsert(new Record(blockValueFetcher.getRow(i)));
+      if (!distinctTable.add(new Record(blockValueFetcher.getRow(i)))) {

Review comment:
I think this for loop should be written separately for the ORDER BY and non-ORDER BY cases. With ORDER BY there is no early termination, so the if check can be avoided because the return value will always be true.
For the non-ORDER BY case, check the return value after adding each record to see whether the limit has been reached, and terminate early within the loop.
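
A minimal sketch of the split suggested above, using a simplified stand-in rather than Pinot's actual classes. `SimpleDistinctTable`, `addWithOrderBy`, `addWithoutOrderBy`, and the `aggregate` helper are hypothetical names introduced only for illustration; the sketch also assumes a single int column, an ascending ORDER BY, and a limit greater than zero, and it does not model the `shouldNotAddMore()` check from the diff.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.PriorityQueue;
import java.util.Set;

// Sketch only: SimpleDistinctTable is a hypothetical stand-in, not Pinot's DistinctTable.
class DistinctLoopSketch {

  /** Keeps up to `limit` distinct int values; assumes limit > 0. */
  static final class SimpleDistinctTable {
    private final int limit;
    private final boolean hasOrderBy;
    private final Set<Integer> values = new HashSet<>();
    // Max-heap of retained values, used only on the ORDER BY (ascending) path.
    private final PriorityQueue<Integer> heap = new PriorityQueue<>(Comparator.reverseOrder());

    SimpleDistinctTable(int limit, boolean hasOrderBy) {
      this.limit = limit;
      this.hasOrderBy = hasOrderBy;
    }

    boolean hasOrderBy() {
      return hasOrderBy;
    }

    /** Returns a copy of the retained values. */
    Set<Integer> values() {
      return new HashSet<>(values);
    }

    /** ORDER BY path: every row has to be considered, so there is no signal to return. */
    void addWithOrderBy(int value) {
      if (!values.add(value)) {
        return; // duplicate of a value that is already retained
      }
      heap.add(value);
      if (values.size() > limit) {
        values.remove(heap.poll()); // evict the largest retained value
      }
    }

    /** Non-ORDER BY path: returns false once the limit has been reached. */
    boolean addWithoutOrderBy(int value) {
      values.add(value);
      return values.size() < limit;
    }
  }

  /** Runs one of two loops depending on ORDER BY; returns the number of rows read. */
  static int aggregate(SimpleDistinctTable table, int[] rows) {
    int rowsRead = 0;
    if (table.hasOrderBy()) {
      // No early termination is possible, so no per-row check is needed.
      for (int row : rows) {
        table.addWithOrderBy(row);
        rowsRead++;
      }
    } else {
      // Stop as soon as the table reports that the limit has been reached.
      for (int row : rows) {
        rowsRead++;
        if (!table.addWithoutOrderBy(row)) {
          break;
        }
      }
    }
    return rowsRead;
  }

  public static void main(String[] args) {
    int[] rows = {5, 3, 9, 1, 5};
    SimpleDistinctTable unordered = new SimpleDistinctTable(2, false);
    System.out.println("rows read without ORDER BY: " + aggregate(unordered, rows));
    SimpleDistinctTable ordered = new SimpleDistinctTable(2, true);
    System.out.println("rows read with ORDER BY: " + aggregate(ordered, rows));
  }
}
```

With the branch hoisted out of the loop, the ORDER BY path carries no per-row return-value check, while the non-ORDER BY path stops reading rows as soon as the limit is hit.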