james-willis opened a new pull request, #849: URL: https://github.com/apache/sedona-db/pull/849
## Summary

Make `BandRef::nd_buffer()`, `contiguous_data()`, and `data()` Just Work for OutDb bands (bands whose Arrow `data` column is empty and whose `outdb_uri` points elsewhere). Today those methods return `NotYetImplemented`, blocking every UDF and downstream consumer from reading OutDb pixel bytes. Without a working byte path, OutDb references serve no purpose as a schema feature.

Approach: a single statically-typed function-pointer hook in `sedona-raster`, populated by `sedona-raster-gdal` at session bootstrap. No trait, no HashMap-keyed registry, no async: "compiled in, not pluggable". The only abstraction introduced is the one the crate boundary forces (`sedona-raster-gdal` depends on `sedona-raster`, not the reverse, so a direct call from `BandRefImpl` to GDAL is impossible).

## Changes

- **New** `rust/sedona-raster/src/outdb_loader.rs`: `OutDbLoadRequest`, `OutDbBandLoader` fn-pointer type, `OnceLock`, `set_outdb_band_loader`, internal `load_outdb` dispatcher.
- **Modified** `rust/sedona-raster/src/array.rs`: `BandRefImpl` gains `outdb_loaded: OnceCell<Vec<u8>>` (lifetime anchor, not a cross-band cache). `nd_buffer()`, `contiguous_data()`, `data()` route through a new `source_bytes()` helper that zero-copies from the Arrow `data` column for schema-InDb bands and falls back to the loader hook (anchored in `outdb_loaded`) for schema-OutDb bands. The legacy `data()` accessor collapses errors to `&[]` to preserve the pre-N-D contract.
- **New** `rust/sedona-raster-gdal/src/outdb_loader.rs`: `gdal_load` impl using the existing `GDALDatasetCache::get_or_create_outdb_source` (thread-local LRU, VSI translation, `#band=N` fragment handling). `pub fn register_outdb_loader()` registers it into the hook.
- **Modified** `rust/sedona/src/context.rs`: calls `sedona_raster_gdal::register_outdb_loader()` from `SedonaContext::new_from_context()` next to the existing function-set registration.
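
For readers skimming the design, the hook and the per-band anchor described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the fields of `OutDbLoadRequest`, the `String` error type, the `bool` return of `set_outdb_band_loader`, the `Band` struct (standing in for `BandRefImpl`), and `mock_loader` (standing in for `gdal_load`) are all hypothetical.

```rust
use std::cell::OnceCell;
use std::sync::OnceLock;

/// Hypothetical request shape; the real `OutDbLoadRequest` fields may differ.
pub struct OutDbLoadRequest<'a> {
    pub uri: &'a str,
    pub expected_len: usize,
}

/// A plain function pointer: no trait object, no keyed registry.
pub type OutDbBandLoader = fn(&OutDbLoadRequest<'_>) -> Result<Vec<u8>, String>;

/// Process-wide hook, written at most once.
static LOADER: OnceLock<OutDbBandLoader> = OnceLock::new();

/// Registers the loader. Returning `false` when a loader is already set is
/// an assumption of this sketch.
pub fn set_outdb_band_loader(loader: OutDbBandLoader) -> bool {
    LOADER.set(loader).is_ok()
}

/// Dispatches to the registered loader and rejects undersized output.
fn load_outdb(req: &OutDbLoadRequest<'_>) -> Result<Vec<u8>, String> {
    let loader = *LOADER
        .get()
        .ok_or_else(|| "no OutDb loader registered".to_string())?;
    let bytes = loader(req)?;
    if bytes.len() < req.expected_len {
        return Err(format!(
            "loader returned {} bytes, expected at least {}",
            bytes.len(),
            req.expected_len
        ));
    }
    Ok(bytes)
}

/// Hypothetical stand-in for `BandRefImpl`.
pub struct Band<'a> {
    data: &'a [u8],                  // stand-in for the Arrow `data` column
    outdb_uri: Option<&'a str>,
    expected_len: usize,
    outdb_loaded: OnceCell<Vec<u8>>, // keeps loaded bytes alive for this band
}

impl<'a> Band<'a> {
    pub fn new(data: &'a [u8], outdb_uri: Option<&'a str>, expected_len: usize) -> Self {
        Band { data, outdb_uri, expected_len, outdb_loaded: OnceCell::new() }
    }

    /// InDb bands borrow the column directly; OutDb bands load once and reuse.
    pub fn source_bytes(&self) -> Result<&[u8], String> {
        if !self.data.is_empty() {
            return Ok(self.data);
        }
        if let Some(bytes) = self.outdb_loaded.get() {
            return Ok(bytes);
        }
        let uri = self
            .outdb_uri
            .ok_or_else(|| "OutDb band has no outdb_uri".to_string())?;
        let bytes = load_outdb(&OutDbLoadRequest { uri, expected_len: self.expected_len })?;
        Ok(self.outdb_loaded.get_or_init(|| bytes))
    }

    /// Legacy-style accessor: any error collapses to an empty slice.
    pub fn data(&self) -> &[u8] {
        self.source_bytes().unwrap_or(&[])
    }
}

/// Hypothetical loader used only for illustration; a real one would read the file.
pub fn mock_loader(req: &OutDbLoadRequest<'_>) -> Result<Vec<u8>, String> {
    Ok(vec![7u8; req.expected_len])
}
```

In this sketch the downstream crate would call `set_outdb_band_loader(mock_loader)` once at startup, after which `Band::source_bytes()` succeeds for OutDb bands. Because failed loads are not stored in the `OnceCell`, a band that errored before registration can still load afterwards.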
## Tests

`sedona-raster` (mock loader): five `outdb_loader` unit tests cover loader registration, returned-bytes round-trip, per-band caching, missing-uri error, undersized-loader-output error, and loader-failure propagation.

`sedona-raster-gdal` (real GDAL): three integration tests write tiny GeoTIFFs to a temp dir and verify the end-to-end byte path:

- `loads_outdb_band_bytes_from_geotiff`: 4×3 single-band tiff, `band.nd_buffer().buffer` matches the file bytes.
- `second_call_on_same_band_reuses_cache`: verifies `BandRefImpl.outdb_loaded` reuse on a second `nd_buffer()` call.
- `band_fragment_selects_correct_band`: two-band tiff with `#band=N` fragment selects the correct band.

## Relationship to PR #813 (view machinery)

This PR is **independent of PR-D** (#813: view machinery / `materialized` cell / strided walk) and based directly on `main`. The view-composition path remains rejected at construction in `RasterRefImpl::band()`, so the OutDb byte path here only needs to handle the identity-view case.

If PR-D lands first, this PR will need a small follow-up to integrate OutDb bytes with the strided walk in `data()` (the byte-access surface is the soft conflict, since both PRs rewrite `data()` / `nd_buffer()` / `contiguous_data()`). If this PR lands first, PR-D folds the second `OnceCell` into its existing materialization pattern. Either order is manageable; there is no hard duplication.

## Test plan

- [x] `cargo test -p sedona-raster --lib` (74 passing)
- [x] `cargo test -p sedona-raster-gdal --lib outdb_loader` (3 passing)
- [x] `cargo clippy --all-targets -p sedona-raster -p sedona-raster-gdal -p sedona -- -D warnings`
- [x] `cargo fmt --all -- --check`
- [ ] Smoke test: end-to-end SQL against an OutDb raster reading bytes through an `RS_*` kernel.
