Samreay commented on issue #338:
URL: https://github.com/apache/iceberg-python/issues/338#issuecomment-2456444991

Hey @sungwy, just thought I'd chase this as well. The PR you linked is merged and 0.7.1 is now out, so does that mean there is a new way of specifying sort order we can use with pyarrow schemas?

I've been trying to do it the way recommended by the doco and still running into `ValueError`

```python
import polars as pl
from pyiceberg.catalog.glue import GlueCatalog
from pyiceberg.table.sorting import SortField, SortOrder
from pyiceberg.transforms import IdentityTransform

# Using the arrow schema as per https://py.iceberg.apache.org/#write-a-pyarrow-dataframe
df = pl.DataFrame({"a": [1, 2, 3, 4, 5], "b": [5, 4, 3, 2, 1]}).to_arrow()
glue_catalog = GlueCatalog(name="", properties={"write.parquet.compression-codec": "snappy"})
table = glue_catalog.create_table(
    identifier="dev-cleaned.tmp_iceberg",
    schema=df.schema,
    location="s3:///mybucket/tmp",
    sort_order=SortOrder(SortField(source_id=1, transform=IdentityTransform())),
)
```

Gives:

```
Exception has occurred: ValueError
Could not find in old schema: 1 ASC NULLS FIRST
  File "/home/sam/arenko/flows-datalake/tmp_arrow.py", line 10, in <module>
    table = glue_catalog.create_table(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: Could not find in old schema: 1 ASC NULLS FIRST
```
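
For anyone else hitting this, here is one possible reading of the error as a toy model. It assumes, without confirmation in this thread, that columns coming from a pyarrow schema carry no Iceberg field IDs, so `source_id=1` has nothing to match before fresh IDs are assigned. `Field`, `PLACEHOLDER_ID`, and the helper functions are hypothetical names for illustration only, not pyiceberg APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    field_id: int
    name: str


# Assumption: columns from an ID-less Arrow schema get a placeholder ID.
PLACEHOLDER_ID = -1


def schema_from_names(names):
    """Model a schema converted from Arrow with no field IDs attached."""
    return [Field(PLACEHOLDER_ID, name) for name in names]


def assign_fresh_ids(fields):
    """Hand out sequential IDs starting at 1, keeping column order."""
    return [Field(i, f.name) for i, f in enumerate(fields, start=1)]


def remap_sort_source(source_id, old_fields, fresh_fields):
    """Resolve a sort field by ID in the old schema, then map it by name."""
    old_name = next((f.name for f in old_fields if f.field_id == source_id), None)
    if old_name is None:
        raise ValueError(f"Could not find in old schema: {source_id}")
    return next(f.field_id for f in fresh_fields if f.name == old_name)


# Case 1: ID-less schema, as in the reproduction above.
old = schema_from_names(["a", "b"])
fresh = assign_fresh_ids(old)
try:
    remap_sort_source(1, old, fresh)
    outcome = "resolved"
except ValueError as exc:
    outcome = str(exc)

# Case 2: schema declared with explicit IDs, so source_id=1 can be found.
explicit = [Field(1, "a"), Field(2, "b")]
resolved_id = remap_sort_source(1, explicit, assign_fresh_ids(explicit))
```

In this model the lookup fails only because the old schema has no field with ID 1. If the same holds in pyiceberg, declaring the schema with explicit field IDs might let `source_id=1` resolve, but that is unverified against 0.7.1.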