Feiyang472 opened a new issue, #43684:
URL: https://github.com/apache/arrow/issues/43684
### Describe the enhancement requested
Hi Arrow team,
We use pyarrow for dataset partitioning. Given a filter, we want to find the corresponding relative path on the filesystem for a given partitioning scheme and segment encoding. For example, given the filter `("key", "=", "value value")`:
- with hive partitioning, we would like `/key=value value/`
- with hive partitioning and URI segment encoding, we would like `/key=value%20value/`
- with directory partitioning, we would like `/value value/`

We are currently composing these paths by hand, but we would like to be resilient to changes in the Arrow implementation, including changes introduced through inheritance.
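For illustration, here is a minimal sketch of the kind of hand composition described above. The helper `compose_partition_path` and its `flavor` and `uri_encode` parameters are hypothetical names made up for this example, not part of pyarrow, and the percent-encoding via `urllib.parse.quote` is only an assumption about how URI segment encoding behaves for simple values.

```python
from urllib.parse import quote


def compose_partition_path(filters, flavor="hive", uri_encode=False):
    """Hand-rolled stand-in for the requested binding (hypothetical helper).

    filters:    list of (key, op, value) tuples; only "=" is handled.
    flavor:     "hive" gives key=value segments, "directory" gives bare values.
    uri_encode: percent-encode each value (assumed to approximate URI
                segment encoding for simple values).
    """
    segments = []
    for key, op, value in filters:
        if op != "=":
            raise ValueError(f"only equality filters map to a segment, got {op!r}")
        text = str(value)
        if uri_encode:
            # safe="" so that "/" inside a value is also escaped
            text = quote(text, safe="")
        if flavor == "hive":
            segments.append(f"{key}={text}")
        elif flavor == "directory":
            segments.append(text)
        else:
            raise ValueError(f"unknown flavor: {flavor!r}")
    return "/" + "/".join(segments) + "/"


flt = [("key", "=", "value value")]
hive_path = compose_partition_path(flt)
hive_uri_path = compose_partition_path(flt, uri_encode=True)
directory_path = compose_partition_path(flt, flavor="directory")
```

Code like this has to mirror Arrow's behaviour by guesswork (null handling, escaping rules, field ordering), which is the fragility we would like to avoid.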
To achieve this, we would really appreciate it if the C++ method
```
arrow::dataset::Partitioning::Format
```
could be exposed via Cython, in the same way as the `arrow::dataset::Partitioning::Parse` method:
https://github.com/apache/arrow/blob/712cfe6d84bd344cfe57a1e4c791f8a4d052c76d/python/pyarrow/includes/libarrow_dataset.pxd#L290
https://github.com/apache/arrow/blob/712cfe6d84bd344cfe57a1e4c791f8a4d052c76d/python/pyarrow/_dataset.pyx#L2492
Thanks in advance for any help or discussion!
### Component(s)
Integration, Parquet, Python