Fokko commented on issue #278:
URL: https://github.com/apache/iceberg-python/issues/278#issuecomment-1906620668
Alright, I went to the source and talked with @danielcweeks and @rdblue. It
looks like we made things more complicated than necessary.
So when reading and writing Parquet, we need to make sure that the field IDs
are aligned properly. When we are working with runtime data (`pa.Table`s), we
match everything up by name.
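To make the name-based matching concrete, here is a minimal sketch in plain Python. The `Field` class and `resolve_ids_by_name` helper are hypothetical stand-ins for illustration, not PyIceberg or PyArrow APIs; the point is only that runtime columns are looked up by name, and the field IDs come from the table schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a table schema field (ID plus name)."""
    field_id: int
    name: str


def resolve_ids_by_name(table_fields, column_names):
    """Map each runtime column name to the ID of the table field with that name."""
    ids_by_name = {f.name: f.field_id for f in table_fields}
    missing = [name for name in column_names if name not in ids_by_name]
    if missing:
        raise ValueError(f"Columns not found in table schema: {missing}")
    return {name: ids_by_name[name] for name in column_names}


table_fields = [Field(1, "id"), Field(2, "city"), Field(3, "population")]
# The dataframe's column order differs from the table schema; names still line up.
resolved = resolve_ids_by_name(table_fields, ["city", "population", "id"])
```

Column order in the runtime data does not matter here, only the names do.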
I also discussed adding Arrow types to the `create_table` statement with Dan.
He liked the idea, whereas I was a bit reluctant at first. On reflection, I
think it makes sense, since it would allow us to create Iceberg tables from
a dataframe:
```python
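from pyiceberg.catalog import load_catalog
catalog = load_catalog()
# Proposed API: derive the Iceberg schema from the dataframe
tbl = catalog.create_table('some.table', df=df)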
```
And then:
```python
# It will wire up the schema by name
tbl.overwrite(df)
```
```python
# Should be quite easy with union by name:
tbl.append(df, merge_schema=True)
```
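As a rough illustration of what union by name could look like, the sketch below keeps existing field IDs and gives fresh IDs to columns that are new. The `union_by_name` function and the dict-based schema are assumptions made for this example, not the PyIceberg implementation. It also simplifies ID assignment by taking the maximum of the current IDs, whereas a real table would presumably continue from the `last-column-id` in the table metadata.

```python
def union_by_name(existing, incoming_names):
    """Merge incoming column names into a {name: field_id} schema.

    Existing names keep their IDs; unseen names get the next free ID.
    Hypothetical helper for illustration only.
    """
    merged = dict(existing)
    next_id = max(existing.values(), default=0) + 1
    for name in incoming_names:
        if name not in merged:
            merged[name] = next_id
            next_id += 1
    return merged


current = {"id": 1, "city": 2}
# The dataframe adds two columns the table does not have yet.
merged = union_by_name(current, ["city", "country", "id", "zip"])
```

Existing columns are left untouched, so data files already written with those IDs remain readable.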
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]