Re: [PR] Compute all groups for group by queries with only filtered aggregations [pinot]

via GitHub Fri, 11 Oct 2024 11:42:28 -0700


Jackie-Jiang commented on code in PR #14211:
URL: https://github.com/apache/pinot/pull/14211#discussion_r1797310958



##########
pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java:
##########
@@ -458,6 +458,15 @@ public static class QueryOptionKey {
         // executed in an  Unbounded FCFS fashion. However, secondary 
workloads are executed in a constrainted FCFS
         // fashion with limited compute.
         public static final String IS_SECONDARY_WORKLOAD = 
"isSecondaryWorkload";
+
+        // For group by queries with only filtered aggregations (and no 
non-filtered aggregations), the v1 query engine
+        // does not compute all groups by default - instead, it will only 
compute the groups from the filtered result
+        // set (i.e., union of the main query filter and all the individual 
aggregation filters). This is good for
+        // performance, since indexes can be used, but the result won't match 
standard SQL behavior (where all groups
+        // are expected to be returned). If this option is set to true, the v1 
query engine will compute all groups for
+        // group by queries with only filtered aggregations. This could 
require a full scan if the main query does not
+        // have a filter and performance could be much worse, but the result 
will be SQL compliant.
+        public static final String FILTERED_AGGREGATIONS_COMPUTE_ALL_GROUPS = 
"filteredAggregationsComputeAllGroups";

Review Comment:
   Since the overhead seems fine (no extra aggregate needed), I'd suggest doing 
the standard SQL behavior by default, and add an option to skip the all groups 
computation



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MultistageGroupByExecutor.java:
##########
@@ -241,6 +241,11 @@ private void processAggregate(TransferableBlock block) {
           aggFunction.aggregateGroupBySV(numMatchedRows, filteredIntKeys, 
groupByResultHolder, blockValSetMap);
         }
       }
+      if (intKeys == null) {

Review Comment:
   Let's make this also controlled by query option to be consistent with v1



##########
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AggregationFunctionUtils.java:
##########
@@ -384,7 +385,12 @@ public static List<AggregationInfo> 
buildFilteredAggregationInfos(SegmentContext
       }
     }
 
-    if (!nonFilteredFunctions.isEmpty()) {
+    if (!nonFilteredFunctions.isEmpty() || 
QueryOptionsUtils.isFilteredAggregationsComputeAllGroups(

Review Comment:
   Wow, didn't expect this to be so simple! NIce!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Re: [PR] Compute all groups for group by queries with only filtered aggregations [pinot]

Reply via email to