Samreay commented on issue #338:
URL: https://github.com/apache/iceberg-python/issues/338#issuecomment-2456444991

   Hey @sungwy, just thought I'd chase this as well. The PR you linked is 
merged and 0.7.1 is now out, so does that mean there is a new way of specifying 
a sort order that we can use with pyarrow schemas? I've been trying to do it the 
way recommended by the docs and I'm still running into a `ValueError`:
   
   ```python
   import polars as pl
   from pyiceberg.catalog.glue import GlueCatalog
   from pyiceberg.table.sorting import SortField, SortOrder
   from pyiceberg.transforms import IdentityTransform
   
   # Using the arrow schema as per https://py.iceberg.apache.org/#write-a-pyarrow-dataframe
   df = pl.DataFrame({"a": [1, 2, 3, 4, 5], "b": [5, 4, 3, 2, 1]}).to_arrow()
   
   glue_catalog = GlueCatalog(name="", properties={"write.parquet.compression-codec": "snappy"})
   table = glue_catalog.create_table(
       identifier="dev-cleaned.tmp_iceberg",
       schema=df.schema,
       location="s3://mybucket/tmp",
       sort_order=SortOrder(SortField(source_id=1, transform=IdentityTransform())),
   )
   ```
   
   Gives:
   
   ```
   Exception has occurred: ValueError
   Could not find in old schema: 1 ASC NULLS FIRST
     File "/home/sam/arenko/flows-datalake/tmp_arrow.py", line 10, in <module>
       table = glue_catalog.create_table(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
   ValueError: Could not find in old schema: 1 ASC NULLS FIRST
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

