siddharthteotia commented on code in PR #10120:
URL: https://github.com/apache/pinot/pull/10120#discussion_r1073802214


##########
pinot-query-runtime/src/test/resources/queries/Skew.json:
##########
@@ -0,0 +1,79 @@
+{
+  "skew": {
+    "tables": {
+      "tbl": {
+        "schema": [
+          {"name": "groupingCol", "type": "STRING"},
+          {"name": "partitionCol", "type": "STRING"},
+          {"name": "val", "type": "INT"}
+        ],
+        "inputs": [
+          ["a", "key1", 1],
+          ["a", "key2", 2],
+          ["a", "key3", 3],
+          ["a", "key1", 4],
+          ["a", "key2", 4],
+          ["a", "key3", 4],
+          ["a", "key1", 7],
+          ["a", "key2", 9],
+          ["b", "key3", 1],
+          ["b", "key1", 2],
+          ["b", "key2", 3],
+          ["b", "key3", 4],
+          ["b", "key1", 4],
+          ["b", "key2", 4],
+          ["b", "key3", 7],
+          ["b", "key1", 9]
+        ],
+        "partitionColumns": [
+          "partitionCol"
+        ]
+      },
+      "tbl2": {
+        "schema": [
+          {"name": "groupingCol", "type": "STRING"},
+          {"name": "partitionCol", "type": "STRING"},
+          {"name": "val", "type": "INT"}
+        ],
+        "inputs": [
+          ["a", "key1", 1],
+          ["a", "key2", 2],
+          ["a", "key3", 3],
+          ["a", "key1", 4],
+          ["a", "key2", 4],
+          ["a", "key3", 4],
+          ["a", "key1", 7],
+          ["a", "key2", 9],
+          ["b", "key3", 1],
+          ["b", "key1", 2],
+          ["b", "key2", 3],
+          ["b", "key3", 4],
+          ["b", "key1", 4],
+          ["b", "key2", 4],
+          ["b", "key3", 7],
+          ["b", "key1", 9]
+        ],
+        "partitionColumns": [
+          "partitionCol"
+        ]
+      }
+    },
+    "queries": [
+      {
+        "description": "skew for int column",
+        "sql": "SELECT groupingCol, SKEWNESS(val), KURTOSIS(val) FROM {tbl} GROUP BY groupingCol",
+        "outputs": [
+          ["a", 0.8647536091225356, 0.3561662049861511],
+          ["b", 0.8647536091225356, 0.3561662049861511]
+        ]
+      },
+      {
+        "sql": "SELECT t1.groupingCol, SKEWNESS(t1.val + t2.val), KURTOSIS(t1.val + t2.val) FROM {tbl} AS t1 LEFT JOIN {tbl2} AS t2 USING (partitionCol) GROUP BY t1.groupingCol",

Review Comment:
   You may also want to add `EXPLAIN PLAN` tests for these queries.
   
   IIUC, the 2nd query will not push `fourthMoment` down to the leaf, since it
   has to be computed on top of the `JOIN`, and will therefore use the new code
   you added in this PR.
   
   The previous query, by contrast, is a typical 2-stage plan where the
   aggregates are computed at the leaf layer by the current engine operators
   and then merged / reduced on the broker.
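
For reference, a minimal Python sketch of the 2-stage shape described above, using the `tbl` inputs from this file. It is not Pinot's implementation: the `Moments` class and `skew_kurtosis_by_group` helper are hypothetical names, and the bias-corrected sample skewness and excess kurtosis formulas are an assumption, chosen because they reproduce the expected outputs of the first query. Partial power sums are accumulated per `partitionCol` value (standing in for leaf-side aggregation), then merged per `groupingCol` and finalized (standing in for the broker-side reduce).

```python
import math
from dataclasses import dataclass

# Rows of "tbl" from Skew.json: (groupingCol, partitionCol, val)
ROWS = [
    ("a", "key1", 1), ("a", "key2", 2), ("a", "key3", 3), ("a", "key1", 4),
    ("a", "key2", 4), ("a", "key3", 4), ("a", "key1", 7), ("a", "key2", 9),
    ("b", "key3", 1), ("b", "key1", 2), ("b", "key2", 3), ("b", "key3", 4),
    ("b", "key1", 4), ("b", "key2", 4), ("b", "key3", 7), ("b", "key1", 9),
]


@dataclass
class Moments:
    """Mergeable raw power sums (hypothetical stand-in for a fourth-moment accumulator)."""
    n: int = 0
    s1: float = 0.0
    s2: float = 0.0
    s3: float = 0.0
    s4: float = 0.0

    def add(self, x):
        self.n += 1
        self.s1 += x
        self.s2 += x ** 2
        self.s3 += x ** 3
        self.s4 += x ** 4

    def merge(self, other):
        return Moments(self.n + other.n, self.s1 + other.s1, self.s2 + other.s2,
                       self.s3 + other.s3, self.s4 + other.s4)

    def _central_sums(self):
        # Sums of (x - mean)^k for k = 2, 3, 4, derived from the raw power sums.
        n, m = self.n, self.s1 / self.n
        c2 = self.s2 - n * m ** 2
        c3 = self.s3 - 3 * m * self.s2 + 2 * n * m ** 3
        c4 = self.s4 - 4 * m * self.s3 + 6 * m ** 2 * self.s2 - 3 * n * m ** 4
        return c2, c3, c4

    def skewness(self):
        # Bias-corrected sample skewness (assumed formula).
        n = self.n
        c2, c3, _ = self._central_sums()
        var = c2 / (n - 1)
        return n / ((n - 1) * (n - 2)) * c3 / var ** 1.5

    def kurtosis(self):
        # Bias-corrected sample excess kurtosis (assumed formula).
        n = self.n
        c2, _, c4 = self._central_sums()
        var = c2 / (n - 1)
        scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
        return scale * c4 / var ** 2 - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))


def skew_kurtosis_by_group(rows):
    # "Leaf" step: one partial accumulator per (group, partition) pair.
    partials = {}
    for group, partition, val in rows:
        partials.setdefault((group, partition), Moments()).add(val)
    # "Broker" step: merge the partials of each group, then finalize.
    merged = {}
    for (group, _), part in partials.items():
        merged[group] = merged.get(group, Moments()).merge(part)
    return {g: (m.skewness(), m.kurtosis()) for g, m in merged.items()}


if __name__ == "__main__":
    for group, (skew, kurt) in sorted(skew_kurtosis_by_group(ROWS).items()):
        print(group, skew, kurt)
```

Because the power sums merge by simple addition, the result does not depend on how rows are split across partitions, which is why the leaf-plus-broker plan and the post-`JOIN` plan should agree whenever they see the same values.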



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

