syun64 commented on issue #278: URL: https://github.com/apache/iceberg-python/issues/278#issuecomment-1904289015
> what do we do with the name-mapping created in step 1 after the table is created? Do we just discard it or put it in schema.name-mapping.default? If the latter, I think we need either to update the [new_table_metadata](https://github.com/apache/iceberg-python/blob/83306104a25a4ecd1f2185ec46cd9fda247544f4/pyiceberg/table/metadata.py#L399-L409) to not assign fresh ids when a name-mapping is present or update the _SetFreshIds to respect name-mapping if given. I would appreciate any thoughts on this matter!

Great question @HonahX. My understanding is that putting a name mapping into schema.name-mapping.default isn't done automatically by any operation; the user has to insert the name-mapping JSON into the Iceberg table as a table property.

I think that regardless of whether we create this visitor to produce a name mapping (which in turn will be used to create an Iceberg schema) or an Iceberg schema directly, it will need the ability to incrementally assign new IDs by position, because we are trying to create a new Iceberg schema from an Arrow schema that does not have the field_id metadata.

Imagine we are trying to grab a 100-column Parquet file from a vendor and create an Iceberg table based on it, and its columns don't have PARQUET:FIELD_ID metadata. Currently, there's no way to create this Iceberg table and ingest the data without manually coding and labelling each and every column with the Iceberg schema types to build an Iceberg schema.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
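To make "incrementally assign a new id by position" from the comment above concrete, here is a minimal sketch in plain Python. It is not PyIceberg or PyArrow code: `Field` and `assign_ids_by_position` are hypothetical names invented for illustration, and the pre-order numbering (a parent gets its ID before its children) is an assumption rather than the ordering PyIceberg actually uses.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Iterator, List, Optional


@dataclass
class Field:
    """Hypothetical stand-in for a schema field that arrives without an ID."""
    name: str
    type: str  # e.g. "long", "string", "struct"
    children: List["Field"] = field(default_factory=list)
    field_id: Optional[int] = None


def assign_ids_by_position(
    fields: List[Field], counter: Optional[Iterator[int]] = None
) -> List[Field]:
    """Return copies of `fields` with IDs handed out by position, starting at 1."""
    counter = counter if counter is not None else count(1)
    result = []
    for f in fields:
        new_id = next(counter)  # assumption: parent is numbered before its children
        children = assign_ids_by_position(f.children, counter)
        result.append(Field(f.name, f.type, children, new_id))
    return result


# A vendor file whose columns carry no field IDs.
vendor_schema = [
    Field("id", "long"),
    Field("location", "struct", [Field("lat", "double"), Field("lon", "double")]),
    Field("name", "string"),
]
with_ids = assign_ids_by_position(vendor_schema)
```

Because the IDs depend only on position, running this twice on the same input gives the same numbering, which is what would let a 100-column file be ingested without hand-labelling every column. The input schema is left unmodified.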
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org