szehon-ho commented on code in PR #6779: URL: https://github.com/apache/iceberg/pull/6779#discussion_r1110474171
##########
spark/v3.2/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestAddFilesProcedure.java:
##########
@@ -911,6 +935,14 @@ public void testPartitionedImportFromEmptyPartitionDoesNotThrow() {
     new StructField("ts", DataTypes.DateType, true, Metadata.empty())
   };
 
+  private static final StructField[] dateHourStruct = {
+    new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
+    new StructField("name", DataTypes.StringType, true, Metadata.empty()),
+    new StructField("dept", DataTypes.StringType, true, Metadata.empty()),
+    new StructField("ts", DataTypes.DateType, true, Metadata.empty()),
+    new StructField("hour", DataTypes.StringType, true, Metadata.empty())

Review Comment:
   OK, in that case I'd prefer not to add extra structs that aren't strictly necessary, to keep the changes smaller. I don't see a string hour with a value like 01 being so much more readable than a dept with a name like 01 that it justifies a new struct. I think we can still make a separate DF if we need to.

##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java:
##########
@@ -815,9 +816,30 @@ public static String quotedFullIdentifier(String catalogName, Identifier identif
    * @param format format of the file
    * @param partitionFilter partitionFilter of the file
    * @return all table's partitions
+   * @deprecated use {@link Spark3Util#getPartitions(SparkSession, Path, String, Map, Option)}
    */
+  @Deprecated
   public static List<SparkPartition> getPartitions(
       SparkSession spark, Path rootPath, String format, Map<String, String> partitionFilter) {
+    return getPartitions(spark, rootPath, format, partitionFilter, Optional.empty());
+  }
+
+  /**
+   * Use Spark to list all partitions in the table.
+   *
+   * @param spark a Spark session
+   * @param rootPath a table identifier
+   * @param format format of the file
+   * @param partitionFilter partitionFilter of the file
+   * @param partitionSpec partitionSpec of the table
+   * @return all table's partitions
+   */
+  public static List<SparkPartition> getPartitions(
+      SparkSession spark,
+      Path rootPath,
+      String format,
+      Map<String, String> partitionFilter,
+      Optional<PartitionSpec> partitionSpec) {

Review Comment:
   As I said in my original comment, the Optional javadoc says it is mainly intended for return values, and in Java it is not often used for arguments. So I would say, let's just take a PartitionSpec that can be null. Users still have the other version that takes 4 arguments.

##########
spark/v3.2/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestAddFilesProcedure.java:
##########
@@ -926,6 +955,17 @@ private static java.sql.Date toDate(String value) {
               new StructType(dateStruct))
           .repartition(2);
 
+  private static final Dataset<Row> dateHourDF =

Review Comment:
   I think dateDF specifically serves the date partition test, and dateHourDF doesn't indicate that it is testing a case where hour is modeled as a string. Maybe something more descriptive, like testPartitionTypeDF.

##########
spark/v3.2/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestAddFilesProcedure.java:
##########
@@ -418,6 +418,27 @@ public void addDataPartitionedByDateToPartitioned() {
         sql("SELECT id, name, dept, date FROM %s ORDER BY id", tableName));
   }
 
+  @Test
+  public void addDataPartitionedByDateHourToPartitioned() {

Review Comment:
   I think the test name doesn't capture the problem it's solving; it should be something like 'testPartitionType'.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
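
For illustration only, the shape suggested in the review comment on Spark3Util.java (a nullable spec argument plus a shorter overload, instead of an Optional parameter) might look roughly like the sketch below. Every name here (NullableSpecSketch, Spec, describePartitions) is a hypothetical stand-in and not Iceberg's or Spark's API; the "inferred" fallback is an assumption made only so the example has something to show.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types; none of this is Iceberg's or Spark's real API.
public class NullableSpecSketch {

  /** Stand-in for a partition spec: maps a partition column to a declared type name. */
  public static final class Spec {
    final Map<String, String> columnTypes;

    public Spec(Map<String, String> columnTypes) {
      this.columnTypes = columnTypes;
    }
  }

  /** Shorter overload for callers that have no spec; delegates with a null spec. */
  public static List<String> describePartitions(
      String rootPath, Map<String, String> partitionValues) {
    return describePartitions(rootPath, partitionValues, null);
  }

  /**
   * Longer overload. The spec may be null, in which case each column's type is
   * reported as "inferred" (an assumption of this sketch).
   */
  public static List<String> describePartitions(
      String rootPath, Map<String, String> partitionValues, Spec spec) {
    List<String> described = new ArrayList<>();
    for (Map.Entry<String, String> entry : partitionValues.entrySet()) {
      String type = "inferred";
      // Use the declared type only when a spec is given and it knows the column.
      if (spec != null && spec.columnTypes.containsKey(entry.getKey())) {
        type = spec.columnTypes.get(entry.getKey());
      }
      described.add(rootPath + "/" + entry.getKey() + "=" + entry.getValue() + " (" + type + ")");
    }
    return described;
  }

  public static void main(String[] args) {
    Map<String, String> values = Map.of("hour", "01");
    // No spec: the shorter overload passes null internally.
    System.out.println(describePartitions("/tmp/table", values));
    // With a spec: the declared type is used for the "hour" column.
    System.out.println(
        describePartitions("/tmp/table", values, new Spec(Map.of("hour", "string"))));
  }
}
```

Callers without a spec keep using the shorter overload and never see the null, while internal callers pass the spec directly without wrapping it.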