Fokko commented on issue #278: URL: https://github.com/apache/iceberg-python/issues/278#issuecomment-1906620668
Alright, I went to the source and talked with @danielcweeks and @rdblue. It looks like we made things more complicated than necessary. When reading and writing Parquet, we need to make sure that the field IDs are aligned properly. When we are working with runtime data (`pa.Table`s), we match everything up by name.

I also discussed adding Arrow types to the `create_table` statement with Dan. He liked the idea, whereas I was a bit reluctant. On reflection, I think it makes sense, since it would allow us to create Iceberg tables from a dataframe:

```python
catalog = load_catalog()
catalog.create_table('some.table', df=df)
```

And then:

```python
# It will wire up the schema by name
tbl.overwrite(df)
```

```python
# Should be quite easy with union by name:
tbl.append(df, merge_schema=True)
```
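To make the by-name matching concrete, here is a minimal sketch in plain Python. It is not PyIceberg's implementation: `Field`, `assign_ids_by_name`, and `union_by_name` are hypothetical names invented for illustration, and a schema is modeled as a simple list of (ID, name) pairs. Handing out new IDs as "highest existing ID plus one" is also a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for an Iceberg schema field (ID plus name)."""
    field_id: int
    name: str


def assign_ids_by_name(schema, column_names):
    """Map each incoming column name to the ID of the same-named schema field."""
    ids = {f.name: f.field_id for f in schema}
    missing = [c for c in column_names if c not in ids]
    if missing:
        raise ValueError(f"columns not in table schema: {missing}")
    return {c: ids[c] for c in column_names}


def union_by_name(schema, column_names):
    """Keep existing fields and their IDs; append unknown columns with fresh IDs."""
    known = {f.name for f in schema}
    # Assumption: new IDs continue from the highest ID already in the schema.
    next_id = max((f.field_id for f in schema), default=0) + 1
    merged = list(schema)
    for name in column_names:
        if name not in known:
            merged.append(Field(next_id, name))
            known.add(name)
            next_id += 1
    return merged


table_schema = [Field(1, "id"), Field(2, "name")]

# Column order in the dataframe does not matter; only the names do.
print(assign_ids_by_name(table_schema, ["name", "id"]))

# A dataframe with an extra column extends the schema instead of failing.
print(union_by_name(table_schema, ["id", "name", "city"]))
```

The first helper corresponds roughly to what `tbl.overwrite(df)` would need to do before writing Parquet, and the second to the `merge_schema=True` case, under the assumptions stated above.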