ndrluis commented on issue #992:
URL: https://github.com/apache/iceberg-python/issues/992#issuecomment-2266204995

   Hello @jurossiar,
   
   I ran some tests and was unable to reproduce the error. Reading the 
exception, it looks like some files do not have the table_id filled in. Could 
you create a minimal example that reproduces the error? In your video, you are 
using a table that already exists. It would be good if the example includes a 
setup from scratch.
   
   This is the test I ran:
   ```python
   from pyiceberg.catalog import load_catalog
   import pyarrow as pa
   from pyiceberg.schema import Schema
   from pyiceberg.types import NestedField, StringType
   from pyiceberg.expressions import EqualTo
   
   
   catalog = load_catalog(
       "demo",
       **{
           "type": "rest",
           "uri": "http://localhost:8181",
           "s3.endpoint": "http://localhost:9000",
           "s3.access-key-id": "admin",
           "s3.secret-access-key": "password",
       },
   )
   
   catalog.create_namespace_if_not_exists("default")
   
   schema = Schema(
       NestedField(field_id=1, name="table_id", field_type=StringType(), required=True),
       NestedField(field_id=2, name="name", field_type=StringType(), required=True),
       NestedField(field_id=3, name="dataset", field_type=StringType(), required=True),
       NestedField(
           field_id=4, name="description", field_type=StringType(), required=False
       ),
       identifier_field_ids=[1],
   )
   
   data = pa.Table.from_pylist(
       [
           {
               "table_id": "table1",
               "name": "table1",
               "dataset": "default",
               "description": "table1",
           },
           {
               "table_id": "table2",
               "name": "table2",
               "dataset": "default",
               "description": "table2",
           },
       ],
       schema=schema.as_arrow(),
   )
   
   # Drop any table left over from a previous run so the setup starts from scratch.
   try:
       catalog.purge_table("default.some_table")
   except Exception:
       pass
   
   table = catalog.create_table("default.some_table", schema=schema)
   
   table.append(data)
   
   result = table.scan(
       selected_fields=(["*"]),
       row_filter=EqualTo("dataset", ""),
   )
   
   result.to_pandas()
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

