syun64 commented on code in PR #921:
URL: https://github.com/apache/iceberg-python/pull/921#discussion_r1678021509
##########
pyiceberg/io/pyarrow.py:
##########
@@ -2214,13 +2199,16 @@ def _dataframe_to_data_files(
property_name=TableProperties.WRITE_TARGET_FILE_SIZE_BYTES,
default=TableProperties.WRITE_TARGET_FILE_SIZE_BYTES_DEFAULT,
)
+ name_mapping = table_metadata.schema().name_mapping
Review Comment:
Good question - I'm not sure actually.
When we are writing a dataframe into an Iceberg table, I think we are making
the assumption that its names match the current names of the Iceberg table, so
I think using the `table_metadata.schema().name_mapping` is appropriate in
supporting that.
`table_metadata.name_mapping` is used to map the field names of written data
files hat do not have field IDs to the Iceberg SChema, so I don't think we need
to extend our write APIs to also support mapping other names in the name
mapping. Since this will be a new feature, I'm of the opinion that we should
leave it out until we can think of valid use cases for that from our user base.
I'm curious to hear what others' thoughts are, and whether anyone has a
workflow in mind that would benefit from this change!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]