zachdisc commented on code in PR #9731:
URL: https://github.com/apache/iceberg/pull/9731#discussion_r1512074114


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java:
##########
@@ -309,24 +312,6 @@ private List<ManifestFile> writePartitionedManifests(
       clusteredManifestEntryDF =
           manifestEntryDF.withColumn(
               CUSTOM_CLUSTERING_COLUMN_NAME, clusteringUdf.apply(col("data_file")));
-    } else if (partitionFieldSortOrder != null) {
-      LOG.info(
-          "Sorting manifests for specId {} by partition columns in order of {} ",
-          spec.specId(),
-          partitionFieldSortOrder);
-
-      // Map the top level partition column names to the column name referenced within the manifest
-      // entry dataframe
-      Column[] actualPartitionColumns =
-          partitionFieldSortOrder.stream()
-              .map(p -> col("data_file.partition." + p))
-              .toArray(Column[]::new);

Review Comment:
   This part might be the trick, though. I wonder if we can modify the `PartitionSortFunction` to return a column, which we could define as a struct of the partition column names in the standard `sort` case, and as just a String in the more customizable case.

   Actually, I'm wondering if we could make it return a struct in general, or something that we could treat as one, for the same reason I give below: wanting to deliver some level of hierarchy. I'll have to think on it and tinker.
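
   To make the "return a struct in general" idea concrete, here is a rough sketch in plain Java rather than Spark. Everything in it is hypothetical and is not the PR's code or the Iceberg API: a partition tuple is modeled as a `Map<String, String>`, a struct-like sort key as a `List<String>`, and the names `StructSortKeySketch`, `byPartitionFields`, `byCustomString`, `compareKeys`, and `sortBy` are made up for illustration. It only shows that one key type could cover both the standard partition-column ordering and a custom String key, while keeping the field-by-field hierarchy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical model, not Iceberg or Spark code: a partition tuple is a Map of
// field name to value, and a sort key is a List<String> standing in for a struct.
public class StructSortKeySketch {

  // Standard case: build the key from the partition fields, in the requested order.
  // Assumes every requested field is present and non-null in each partition.
  public static Function<Map<String, String>, List<String>> byPartitionFields(
      List<String> fieldOrder) {
    return partition -> fieldOrder.stream().map(partition::get).collect(Collectors.toList());
  }

  // Customizable case: wrap a single computed String as a one-field key.
  public static Function<Map<String, String>, List<String>> byCustomString(
      Function<Map<String, String>, String> fn) {
    return partition -> List.of(fn.apply(partition));
  }

  // Compare keys field by field, so earlier fields dominate later ones.
  public static int compareKeys(List<String> a, List<String> b) {
    int shared = Math.min(a.size(), b.size());
    for (int i = 0; i < shared; i++) {
      int cmp = a.get(i).compareTo(b.get(i));
      if (cmp != 0) {
        return cmp;
      }
    }
    return Integer.compare(a.size(), b.size());
  }

  // Return a sorted copy of the partitions, ordered by the key function's output.
  public static List<Map<String, String>> sortBy(
      List<Map<String, String>> partitions,
      Function<Map<String, String>, List<String>> keyFn) {
    List<Map<String, String>> sorted = new ArrayList<>(partitions);
    sorted.sort((x, y) -> compareKeys(keyFn.apply(x), keyFn.apply(y)));
    return sorted;
  }

  public static void main(String[] args) {
    List<Map<String, String>> partitions =
        List.of(
            Map.of("region", "us", "day", "02"),
            Map.of("region", "eu", "day", "03"),
            Map.of("region", "us", "day", "01"));

    List<String> byRegionThenDay =
        sortBy(partitions, byPartitionFields(List.of("region", "day"))).stream()
            .map(p -> p.get("region") + "/" + p.get("day"))
            .collect(Collectors.toList());
    System.out.println(byRegionThenDay); // prints [eu/03, us/01, us/02]

    List<String> byDayOnly =
        sortBy(partitions, byCustomString(p -> p.get("day"))).stream()
            .map(p -> p.get("region") + "/" + p.get("day"))
            .collect(Collectors.toList());
    System.out.println(byDayOnly); // prints [us/01, us/02, eu/03]
  }
}
```

   In Spark terms, the list-valued key would presumably correspond to a struct column built from the `data_file.partition.*` fields, and the one-element key to the custom String column. That mapping is an assumption here and would still need the tinkering mentioned above.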