HonahX commented on code in PR #907:
URL: https://github.com/apache/iceberg-python/pull/907#discussion_r1673486482


##########
pyiceberg/io/pyarrow.py:
##########
@@ -2026,6 +2072,8 @@ def parquet_files_to_data_files(io: FileIO, table_metadata: TableMetadata, file_
                 f"Cannot add file {file_path} because it has field IDs. `add_files` only supports addition of files without field_ids"
             )
         schema = table_metadata.schema()
+        _check_schema_compatible(schema, parquet_metadata.schema.to_arrow_schema())

Review Comment:
   My understanding is that if we enable
`downcast-ns-timestamp-to-us-on-write`, we now allow users to add parquet files
with `TIMESTAMP_NANOS` data. My concern is that we may add parquet files that
do not align with the [spec](https://iceberg.apache.org/spec/#parquet), which
states that the timestamp/timestamptz types should map to `TIMESTAMP_MICROS`.
Shall we be more restrictive when checking parquet files that will be added
directly to the table?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

