kevinjqliu commented on issue #1798: URL: https://github.com/apache/iceberg-python/issues/1798#issuecomment-2764704243
I suspect the issue is with the schema definition

```
schema = Schema(
    NestedField(field_id=1, name="name", field_type=StringType(), required=False),
    NestedField(
        field_id=3,
        name="my_list",
        field_type=ListType(
            element_id=45, element=StringType(), element_required=False
        ),
        required=False,
    ),
)
```

or with how we handle the schema conversion internally, between the Iceberg schema and the PyArrow schema.

For example, using the example Iceberg schema provided, I get a schema mismatch:

```
# not working
from pyiceberg.catalog import load_catalog
import pyarrow as pa
from pyiceberg.schema import Schema
from pyiceberg.types import NestedField, StringType, ListType
from pyiceberg.io.pyarrow import schema_to_pyarrow

catalog = load_catalog(**dict(type="in-memory"))

schema = Schema(
    NestedField(field_id=1, name="name", field_type=StringType(), required=False),
    NestedField(
        field_id=3,
        name="my_list",
        field_type=ListType(
            element_id=45, element=StringType(), element_required=False
        ),
        required=False,
    ),
)
pyarrow_schema = schema_to_pyarrow(schema)

# create table
catalog.create_namespace_if_not_exists("test")
catalog.create_table_if_not_exists("test.table", pyarrow_schema)

# append data
df_1 = pa.Table.from_pylist([
    {"name": "one", "my_list": ["test"]},
    {"name": "another", "my_list": ["test"]},
], schema=pyarrow_schema)
catalog.load_table("test.table").append(df_1)
catalog.load_table("test.table").scan().to_arrow()

# append more data
df_2 = pa.Table.from_pylist([
    {"name": "one"},
    {"name": "another"},
], schema=pyarrow_schema)
catalog.load_table("test.table").append(df_2)
catalog.load_table("test.table").scan().to_arrow()
```

```
ValueError: Mismatch in fields:
┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Table field                       ┃ Dataframe field                   ┃
┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ ✅ │ 1: name: optional string          │ 1: name: optional string          │
│ ✅ │ 2: my_list: optional list<string> │ Missing                           │
│ ❌ │ 3: element: optional string       │ 3: my_list: optional list<string> │
└────┴───────────────────────────────────┴───────────────────────────────────┘
```
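
One possible reading of the table in that error, sketched without PyIceberg or PyArrow: the table side shows IDs 1, 2, 3, while the schema was declared with IDs 1, 3, 45. That would be consistent with field IDs being renumbered when the table is created (top-level fields first, then nested elements) while the dataframe keeps the declared IDs, so a lookup by ID pairs the wrong fields. Both of those points are assumptions on my part, and `Field`, `assign_fresh_ids`, `flatten`, and `compare` are hypothetical helpers for illustration only, not library APIs.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Field:
    """Hypothetical stand-in for a schema field; not a PyIceberg class."""
    field_id: int
    name: str
    type_name: str
    element: Optional["Field"] = None  # set only for list fields


def assign_fresh_ids(fields):
    """Assumed behavior: renumber top-level fields first, then list elements."""
    next_id = count(1)
    renumbered = [Field(next(next_id), f.name, f.type_name, f.element) for f in fields]
    for f in renumbered:
        if f.element is not None:
            f.element = Field(next(next_id), f.element.name, f.element.type_name)
    return renumbered


def flatten(fields):
    """Map field ID -> description, including nested list elements."""
    out = {}
    for f in fields:
        out[f.field_id] = f"{f.name}: {f.type_name}"
        if f.element is not None:
            out[f.element.field_id] = f"{f.element.name}: {f.element.type_name}"
    return out


def compare(table, dataframe):
    """For each table field ID, look up the dataframe field with the same ID."""
    return [(fid, desc, dataframe.get(fid, "Missing")) for fid, desc in sorted(table.items())]


# IDs as declared in the schema from the comment: 1, 3, and 45 for the element.
declared = [
    Field(1, "name", "string"),
    Field(3, "my_list", "list<string>", element=Field(45, "element", "string")),
]

table_ids = flatten(assign_fresh_ids(declared))  # what the table would hold
dataframe_ids = flatten(declared)                # what the dataframe would carry
rows = compare(table_ids, dataframe_ids)

for fid, table_field, df_field in rows:
    print(f"{fid}: {table_field:<24} | {df_field}")
```

Under these assumptions, ID 2 has no counterpart on the dataframe side and ID 3 pairs `element` with `my_list`, which matches the shape of the mismatch table. If this is the cause, declaring the schema with IDs that already follow that order, or building the dataframe from the schema of the created table rather than from the declared one, might avoid the mismatch; I have not verified either.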