pratheekrebala opened a new pull request, #778:
URL: https://github.com/apache/sedona-db/pull/778
## Summary
- Enables passing pyogrio keyword arguments (e.g., `layer`, `where`, `sql`,
`max_features`) through `read_pyogrio(options=...)` by switching from the
internal `pyogrio.raw.ogr_open_arrow()` to the public
`pyogrio.raw.open_arrow()` API, which accepts them directly as keyword
arguments. This includes `layer=`, which is needed to read multi-layer files
(#772).
- Adds a `path_suffix` option to append a subpath to the resolved GDAL
source, enabling reading of formats like FileGeoDatabase (`.gdb`) stored inside
`.zip` archives (e.g., `options={"path_suffix": "subdir/test.gdb"}`). This is
equivalent to running `ogrinfo /vsizip/test.zip/subdir/test.gdb`.
### Motivation
Previously, `read_pyogrio()` had no way to pass options through to pyogrio,
making it impossible to select a specific layer in a multi-layer file or use
GDAL driver-specific open options. For example, reading a named layer from a
GeoPackage now works:
```python
con.read_pyogrio("file.gpkg", options={"layer": "my_layer"})
```
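As a rough illustration of how such an `options` dictionary might be handled, the sketch below separates the reader-level `path_suffix` key from the remaining keys, which would then be forwarded as keyword arguments. The helper name `split_reader_options` and the example `where` filter are hypothetical and are not taken from this PR's implementation.

```python
def split_reader_options(options=None):
    """Split a reader options dict into (path_suffix, pyogrio_kwargs).

    Hypothetical helper for illustration only. `path_suffix` is consumed by
    the reader itself; everything else is treated as a pyogrio keyword.
    """
    remaining = dict(options or {})  # copy so the caller's dict is not mutated
    path_suffix = remaining.pop("path_suffix", None)
    return path_suffix, remaining


path_suffix, pyogrio_kwargs = split_reader_options(
    {"layer": "my_layer", "where": "population > 1000", "max_features": 10}
)

# The remaining keys could then be forwarded along the lines of:
#     pyogrio.raw.open_arrow(source, **pyogrio_kwargs)
print(path_suffix)
print(sorted(pyogrio_kwargs))
```

Copying the dictionary before popping `path_suffix` keeps the caller's `options` unchanged, so the same dictionary can be reused across calls.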
For formats like FileGeoDatabase distributed inside `.zip` archives, GDAL
expects a path of the form `/vsizip/archive.zip/data.gdb`. Because the file
listing infrastructure resolves real filesystem paths before the Python reader
runs, passing a full `/vsizip/...` path directly fails. The `path_suffix` option
bridges this gap:
```python
con.read_pyogrio("path/to/file.zip", options={"path_suffix": "data.gdb"})
```
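The path composition described above can be sketched as follows. This is a minimal illustration that assumes the resolved archive path is prefixed with `/vsizip/` and the suffix is appended afterwards. The function name `build_gdal_source` is hypothetical, and the actual implementation in this PR may differ.

```python
def build_gdal_source(archive_path, path_suffix=None):
    """Compose a GDAL /vsizip/ source string from an archive path and a subpath.

    Hypothetical helper for illustration only.
    """
    source = f"/vsizip/{archive_path}"
    if path_suffix:
        # Avoid a doubled separator where the archive path and suffix join.
        source = f"{source.rstrip('/')}/{path_suffix.lstrip('/')}"
    return source


print(build_gdal_source("path/to/file.zip", "data.gdb"))
# prints: /vsizip/path/to/file.zip/data.gdb
```

With an absolute archive path such as `/data/file.zip`, the same concatenation yields `/vsizip//data/file.zip/data.gdb`, which is the form GDAL documents for absolute paths.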