zhongyujiang commented on code in PR #6118:
URL: https://github.com/apache/iceberg/pull/6118#discussion_r1015035815


##########
parquet/src/main/java/org/apache/iceberg/parquet/ParquetUtil.java:
##########
@@ -75,23 +76,27 @@ public static Metrics fileMetrics(InputFile file, 
MetricsConfig metricsConfig) {
   public static Metrics fileMetrics(
       InputFile file, MetricsConfig metricsConfig, NameMapping nameMapping) {
     try (ParquetFileReader reader = 
ParquetFileReader.open(ParquetIO.file(file))) {
-      return footerMetrics(reader.getFooter(), Stream.empty(), metricsConfig, 
nameMapping);
+      return footerMetrics(reader.getFooter(), Stream.empty(), metricsConfig, 
nameMapping, null);

Review Comment:
   This is used to migrate external files to Iceberg tables,  I am thinking if 
those files are generated by Iceberg, we may still not be able to collect 
metrics according to the correct metric mode with current change. 
   
   To solve this , I think we can change TableMigrationUtil to pass Iceberg 
schema also; 
   Or we can update MetricsConfig to make it tracks mapping between column ids 
metric modes, and use column ids to get metric modes directly when collecting 
parquet metrics, but that will need all writers to update from 
`MetricsConfig.fromProperties(Map<String, String> props)` to `MetricsConfig 
from(Map<String, String> props, Schema schema, SortOrder order)` (this is a 
private method now). 
   
   I prefer the second way and I can update the releated code in this repo, but 
I don't know if any external projects uses 
`MetricsConfig.fromProperties(Map<String, String> props)` too. What do you 
think about it? @rdblue 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to