slfan1989 commented on code in PR #13907:
URL: https://github.com/apache/iceberg/pull/13907#discussion_r2317259985


##########
spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java:
##########
@@ -133,7 +133,47 @@ public void removeTables() {
     sql("DROP TABLE IF EXISTS %s", tableName(OTHER_TABLE_NAME));
   }
 
-  // TODO: add tests for truncate transforms once SPARK-40295 is released
+  // TODO: SPJ does not currently leverage truncate(...) partition transforms 
for partition
+  // alignment.
+  // SPARK-40295 improved related areas, but full truncate support is tracked 
in SPARK-50593.
+  // This test documents current behavior; update/extend once SPARK-50593 
lands.
+  @TestTemplate
+  public void testJoinWithTruncatePartitioning() {
+    sql(
+        "CREATE TABLE %s (id BIGINT, int_col INT, dep STRING) "
+            + "USING iceberg "
+            + "PARTITIONED BY (truncate(4, dep)) "
+            + "TBLPROPERTIES (%s)",
+        tableName, tablePropsAsString(TABLE_PROPERTIES));
+
+    sql("INSERT INTO %s VALUES (1L, 100, 'software')", tableName);
+    sql("INSERT INTO %s VALUES (2L, 200, 'software')", tableName);
+    sql("INSERT INTO %s VALUES (3L, 300, 'software')", tableName);
+
+    sql(
+        "CREATE TABLE %s (id BIGINT, int_col INT, dep STRING) "
+            + "USING iceberg "
+            + "PARTITIONED BY (truncate(4, dep)) "
+            + "TBLPROPERTIES (%s)",
+        tableName(OTHER_TABLE_NAME), tablePropsAsString(TABLE_PROPERTIES));
+
+    sql("INSERT INTO %s VALUES (1L, 100, 'software')", 
tableName(OTHER_TABLE_NAME));
+    sql("INSERT INTO %s VALUES (2L, 200, 'software')", 
tableName(OTHER_TABLE_NAME));
+    sql("INSERT INTO %s VALUES (3L, 300, 'software')", 
tableName(OTHER_TABLE_NAME));
+
+    // TODO(SPARK-50593): Once truncate transforms are leveraged by SPJ, 
expected shuffles with SPJ

Review Comment:
   Thank you for your feedback! I understand your point and agree to some 
extent. However, I believe we should at least improve the comments to indicate 
that, even though SPARK-40295 is ready, we still need to wait for SPARK-50593 
before we can proceed with this TODO.
   
   Let me briefly explain the reason for this improvement:
   
   When I first discovered this TODO, I checked SPARK-40295 and found that it 
was ready. I then tested it locally and found that, even with SPARK-40295 
implemented, truncate still cannot utilize Spark's SPJ features. My initial 
thought was to directly modify the TODO, but I was concerned that a review 
might inquire about the issue I encountered. So, I decided to document my 
verification tests here. Currently, this test can only verify that, under 
SPARK-40295, truncate still does not support SPJ.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to