bwi-earth opened a new issue, #45855: URL: https://github.com/apache/arrow/issues/45855
### Describe the usage question you have. Please include as many useful details as possible.

I'm reading about the various methods to consume a `pyarrow.dataset.Dataset` in the case of a large dataset (so `.to_table` is excluded). It seems impossible to read a dataset chunk by chunk in an ordered manner: `to_batches` doesn't offer any guarantees about the order of the retrieved batches.

The best I've come up with is to list the fragments of the dataset and read each one individually, then sort the partial outputs. However, in that case I'm losing the benefit of pyarrow loading data in the background. (I'm using parquet stored in s3 as the backend, though that doesn't seem to be relevant.)

### Component(s)

Python