grobgl opened a new issue, #8551: URL: https://github.com/apache/iceberg/issues/8551
### Feature Request / Improvement

When querying an Iceberg table using PyIceberg for a specific set of fields, the order of fields in the `selected_fields` parameter is not preserved in the output. This is counter-intuitive, as query engines usually respect the order of columns as specified.

```python
from pyiceberg.catalog import load_catalog
from pyiceberg.expressions import GreaterThanOrEqual

catalog = load_catalog("default")
table = catalog.load_table("nyc.taxis")

scan = table.scan(
    row_filter=GreaterThanOrEqual("trip_distance", 10.0),
    selected_fields=("VendorID", "tpep_pickup_datetime", "tpep_dropoff_datetime"),
)

df = scan.to_pandas()
# column order is not guaranteed to equal order in selected_fields
```

AFAICT, this is because `Schema.select` converts the selected fields to a `set`.

### Query engine

Other
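
Until the order is preserved by the library, a caller-side workaround might look like the sketch below. It is only an illustration: `reorder_like` is a hypothetical helper, not part of PyIceberg, and it assumes the returned column names match the requested names exactly (same case, no renaming). It uses `dict.fromkeys` to de-duplicate the requested names, since a `dict` keeps insertion order where a `set` does not.

```python
# Hypothetical helper, not part of PyIceberg: restore the requested column order.
def reorder_like(selected_fields, actual_columns):
    """Return actual_columns arranged to follow the order of selected_fields."""
    present = set(actual_columns)
    # dict.fromkeys de-duplicates while keeping first-seen order (a set would not).
    ordered = [name for name in dict.fromkeys(selected_fields) if name in present]
    # Keep any columns that were not explicitly requested, in their existing order.
    requested = set(ordered)
    ordered += [name for name in actual_columns if name not in requested]
    return ordered


selected = ("VendorID", "tpep_pickup_datetime", "tpep_dropoff_datetime")
# Stand-in for list(df.columns) coming back in some other order.
returned = ["tpep_dropoff_datetime", "VendorID", "tpep_pickup_datetime"]
columns = reorder_like(selected, returned)
```

With the DataFrame from the example above, this would be applied as `df = df[reorder_like(selected, list(df.columns))]`.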