vvivekiyer commented on code in PR #14001:
URL: https://github.com/apache/pinot/pull/14001#discussion_r1761509972


##########
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/DefaultGroupByExecutor.java:
##########
@@ -183,4 +194,26 @@ public GroupKeyGenerator getGroupKeyGenerator() {
   public GroupByResultHolder[] getGroupByResultHolders() {
     return _groupByResultHolders;
   }
+
+  private Integer 
getGroupByResultHolderCapacityBasedOnFilterPredicate(QueryContext queryContext) 
{
+    if (queryContext.getFilter() == null || 
queryContext.getGroupByExpressions() == null) {
+      return null;
+    }
+
+    List<FilterContext> filterContexts = 
queryContext.getFilter().getChildren() != null
+        ? queryContext.getFilter().getChildren()
+        : List.of(queryContext.getFilter());
+
+    Map<ExpressionContext, InPredicate> predicateMap = filterContexts.stream()
+        .map(FilterContext::getPredicate)
+        .filter(predicate -> predicate.getType() == Predicate.Type.IN)
+        .collect(Collectors.toMap(Predicate::getLhs, predicate -> 
(InPredicate) predicate));
+
+    OptionalInt result = queryContext.getGroupByExpressions().stream()

Review Comment:
   If there is a single groupBy expression with a filter on the groupBy, this 
is correct. 
   
   However, consider the case where there are 2 groupByExpressions but only a 
filter on 1 column. In this case this optimization will not work.



##########
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/DefaultGroupByExecutor.java:
##########
@@ -183,4 +194,26 @@ public GroupKeyGenerator getGroupKeyGenerator() {
   public GroupByResultHolder[] getGroupByResultHolders() {
     return _groupByResultHolders;
   }
+
+  private Integer 
getGroupByResultHolderCapacityBasedOnFilterPredicate(QueryContext queryContext) 
{
+    if (queryContext.getFilter() == null || 
queryContext.getGroupByExpressions() == null) {
+      return null;
+    }
+
+    List<FilterContext> filterContexts = 
queryContext.getFilter().getChildren() != null
+        ? queryContext.getFilter().getChildren()
+        : List.of(queryContext.getFilter());
+
+    Map<ExpressionContext, InPredicate> predicateMap = filterContexts.stream()
+        .map(FilterContext::getPredicate)
+        .filter(predicate -> predicate.getType() == Predicate.Type.IN)

Review Comment:
   We should also do this for EQ predicate? 



##########
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/DefaultGroupByExecutor.java:
##########
@@ -183,4 +194,26 @@ public GroupKeyGenerator getGroupKeyGenerator() {
   public GroupByResultHolder[] getGroupByResultHolders() {
     return _groupByResultHolders;
   }
+
+  private Integer 
getGroupByResultHolderCapacityBasedOnFilterPredicate(QueryContext queryContext) 
{
+    if (queryContext.getFilter() == null || 
queryContext.getGroupByExpressions() == null) {

Review Comment:
   Unless there are groupByExpressions, we wouldn't come in to this class. It's 
better for this to be an assert if at all you want to ensure this case. 
   
   I think a better check here is to see if the groupByExpression contains only 
a single column (assuming that is the case you are optimizing)
   



##########
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/DefaultGroupByExecutor.java:
##########
@@ -110,7 +117,11 @@ public DefaultGroupByExecutor(QueryContext queryContext, 
AggregationFunction[] a
 
     // Initialize result holders
     int maxNumResults = _groupKeyGenerator.getGlobalGroupKeyUpperBound();
+    Integer optimalGroupByResultHolderCapacity = 
getGroupByResultHolderCapacityBasedOnFilterPredicate(queryContext);

Review Comment:
   Can we wire the logic of determining the filter-based result holder limit 
into  `maxInitialResultHolderCapacity()`? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to