Re: [I] Error reading table after appending pyarrow table [iceberg-python]

via GitHub Sun, 30 Mar 2025 18:20:33 -0700


kevinjqliu commented on issue #1798:
URL: https://github.com/apache/iceberg-python/issues/1798#issuecomment-2764848902


   OK, here's a working version, which supplies a pyarrow schema when creating the pyarrow table.
   
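   The difference is the parquet `field-id` (see [`PYARROW_PARQUET_FIELD_ID_KEY`](https://github.com/apache/iceberg-python/blob/4b15fb6fe0870071c5a23e95b811631877dd291b/pyiceberg/io/pyarrow.py#L192)): the schema returned by `tbl.schema().as_arrow()` carries it in the metadata of each field, while a schema that pyarrow infers from the Python data does not.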
   
   
   ```
   from pyiceberg.catalog import load_catalog
   
   catalog = load_catalog(**dict(type="in-memory"))
   
   from pyiceberg.schema import Schema
   from pyiceberg.types import NestedField, StringType, ListType
   
   schema = Schema(
       NestedField(field_id=1, name="name", field_type=StringType(), required=False),
       NestedField(
           field_id=3,
           name="my_list",
           field_type=ListType(
               element_id=45, element=StringType(), element_required=False
           ),
           required=False,
       ),
   )
   catalog.create_namespace_if_not_exists("test")
   tbl = catalog.create_table_if_not_exists("test.table", schema)
   
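   import pyarrow as pa
   
   # First append: both columns are populated. The table's own Arrow schema
   # is passed explicitly so every field carries its parquet field-id.
   df_1 = pa.Table.from_pylist([
       {"name": "one", "my_list": ["test"]},
       {"name": "another", "my_list": ["test"]},
   ], tbl.schema().as_arrow())
   catalog.load_table("test.table").append(df_1)
   catalog.load_table("test.table").scan().to_arrow()
   
   # Second append: `my_list` is omitted from the rows, so it is filled with
   # nulls. The same explicit schema is used again.
   df_2 = pa.Table.from_pylist([
       {"name": "one"},
       {"name": "another"},
   ], tbl.schema().as_arrow())
   catalog.load_table("test.table").append(df_2)
   catalog.load_table("test.table").scan().to_arrow()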
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


