Re: [PR] Add test case for lookup join [pinot]

via GitHub Mon, 17 Mar 2025 04:18:18 -0700


yashmayya commented on code in PR #15244:
URL: https://github.com/apache/pinot/pull/15244#discussion_r1998517975



##########
pinot-integration-tests/src/test/resources/baseballStats_data.csv:
##########


Review Comment:
   This is an 8 MB file; is it really necessary to introduce it? Can't we add
   some dummy data for a dimension table that can be joined with the existing
   `On_Time_On_Time_Performance_2014_100k_subset_nonulls` data? Maybe something
   like a `DayOfWeek` integer primary key mapped to `DayOfWeekName` strings
   (just one idea; it can be anything really, since we don't care about the
   actual results, just the lookup join functionality).
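
   A rough sketch of what generating such dummy data could look like. The
   class name `DayOfWeekDimData`, the column names, and the assumption that
   `DayOfWeek` values run from 1 (Monday) to 7 (Sunday) are all illustrative
   and not taken from the PR:

```java
import java.time.DayOfWeek;
import java.time.format.TextStyle;
import java.util.Locale;

// Hypothetical helper, not part of the PR: builds a tiny dimension-table CSV
// that maps an integer DayOfWeek key to its English name.
public class DayOfWeekDimData {
  // Column names are assumptions; DayOfWeek would serve as the primary key.
  static final String HEADER = "DayOfWeek,DayOfWeekName";

  static String toCsv() {
    StringBuilder csv = new StringBuilder(HEADER).append('\n');
    // java.time.DayOfWeek follows ISO-8601 numbering: 1 = Monday ... 7 = Sunday.
    for (DayOfWeek day : DayOfWeek.values()) {
      csv.append(day.getValue())
          .append(',')
          .append(day.getDisplayName(TextStyle.FULL, Locale.ENGLISH))
          .append('\n');
    }
    return csv.toString();
  }

  public static void main(String[] args) {
    System.out.print(toCsv());
  }
}
```

   Seven rows are enough to exercise the lookup join, so a file like this
   would be a few hundred bytes rather than 8 MB.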



##########
pinot-integration-tests/src/test/resources/baseballStats_offline_table_config.json:
##########
@@ -0,0 +1,38 @@
+{
+  "tableName": "baseballStats",
+  "tableType": "OFFLINE",
+  "segmentsConfig": {
+    "segmentPushType": "APPEND",
+    "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
+    "schemaName": "baseballStats",
+    "replication": "1"
+  },
+  "tenants": {
+  },
+  "tableIndexConfig": {
+    "loadMode": "HEAP",
+    "invertedIndexColumns": [
+      "playerID",
+      "teamID"
+    ]
+  },
+  "metadata": {
+    "customConfigs": {
+    }
+  },
+  "ingestionConfig": {
+    "batchIngestionConfig": {
+      "segmentIngestionType": "APPEND",
+      "segmentIngestionFrequency": "DAILY",

Review Comment:
   Why is this required?



##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/MultiStageEngineIntegrationTest.java:
##########
@@ -1596,6 +1606,32 @@ public void testNumServersQueried() throws Exception {
   }
 
   @Test
+  public void testLookupJoin() throws Exception {
+
+    Schema lookupTableSchema = createSchema(DIM_TABLE_SCHEMA_PATH);
+    addSchema(lookupTableSchema);
+    TableConfig tableConfig = createTableConfig(DIM_TABLE_TABLE_CONFIG_PATH);
+    addTableConfig(tableConfig);
+    createAndUploadSegmentFromFile(tableConfig, lookupTableSchema, DIM_TABLE_DATA_PATH, FileFormat.CSV,
+        DIM_NUMBER_OF_RECORDS, 60_000);
+
+    Schema primaryTableSchema = createSchema(PRIMARY_TABLE_SCHEMA_PATH);
+    addSchema(primaryTableSchema);
+    TableConfig primaryTableConfig = createTableConfig(PRIMARY_TABLE_TABLE_CONFIG_PATH);
+    addTableConfig(primaryTableConfig);
+    createAndUploadSegmentFromFile(primaryTableConfig, primaryTableSchema, PRIMARY_TABLE_DATA_PATH, FileFormat.CSV,
+        PRIMARY_NUMBER_OF_RECORDS, 60_000);
+
+    String query = "select /*+ joinOptions(join_strategy='lookup') */ yearID, teamName from baseballStats "
+        + "join dimBaseballTeams ON baseballStats.teamID = dimBaseballTeams.teamID where playerId = 'aardsda01'";
+    JsonNode jsonNode = postQuery(query);
+    long result = jsonNode.get("resultTable").get("rows").size();
+    assertEquals(result, 3);

Review Comment:
   That sounds good to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


