prantogg opened a new pull request, #2956: URL: https://github.com/apache/sedona/pull/2956
## Summary

Add Python-side `serialize()` for `InDbSedonaRaster` and `with_bands()` for replacing pixel data, enabling Python UDFs to return raster objects directly. This replaces the lossy `.tolist()` + `RS_MakeRaster` workaround documented today.

### What changed

- **`raster_serde.serialize()`**: Writes `InDbSedonaRaster` to Sedona's binary format, byte-compatible with JVM `Serde.deserialize()`. Uses cache-and-replay for opaque Kryo blobs (categories, properties, colorModel).
- **`InDbSedonaRaster.with_bands()`**: Creates a new raster with replaced pixel data (NumPy array) but preserved spatial metadata. Band count and dtype may differ from the source.
- **`RasterType.serialize()`**: Delegates to `raster_serde.serialize()` instead of raising `NotImplementedError`.
- **`DeepCopiedRenderedImage.reconcileColorModel()`** (JVM): Fixes colorModel/sampleModel mismatches at deserialization time when Python UDFs change band count or dtype.

### Performance improvement

| Metric | Before (`.tolist()`) | After (`serialize()`) |
|--------|---------------------|-----------------------|
| Latency per raster | ~28-53 ms | ~2.3 ms |
| Memory per raster | ~25 MB (262K float objects) | ~266 KB (contiguous bytes) |
| Dtype fidelity | Forced Float64 | Native (uint8 through float64) |
| Metadata survival | Lost | Preserved |

### Tests

- 8 `with_bands()` tests (band count changes, dtype changes, metadata survival)
- 2 serialize round-trip tests
- 1 JVM serde test (colorModel mismatch handling)

### Docs

Updated the "Writing Python UDF" section in `docs/tutorial/raster.md` to show the new raster-to-raster UDF pattern using `with_bands()`.
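For readers who want intuition for the memory and dtype rows in the table, here is a minimal NumPy-only sketch. It does not use Sedona's serializer or reproduce the PR's benchmark; the 512x512 uint8 tile is an assumed example (chosen because it has 262,144 pixels), and the cast to float64 only mimics the "Forced Float64" behavior the table attributes to the old workaround.

```python
import numpy as np

# Assumed example tile: one 512x512 uint8 band (262,144 pixels).
band = (np.arange(512 * 512) % 256).astype(np.uint8).reshape(512, 512)

# Old-style path: convert to nested Python lists, one Python object per pixel.
as_list = band.tolist()
n_objects = sum(len(row) for row in as_list)

# Mimic a Float64-only rebuild: the original uint8 dtype is not carried along.
rebuilt = np.asarray(as_list, dtype=np.float64)

# Contiguous-bytes path: one buffer, one byte per uint8 pixel.
raw = band.tobytes()
restored = np.frombuffer(raw, dtype=band.dtype).reshape(band.shape)

print(n_objects, rebuilt.dtype, rebuilt.nbytes)  # pixel count, float64, 8 bytes per pixel
print(len(raw), restored.dtype)                  # 1 byte per pixel, uint8
print(np.array_equal(band, restored))            # bytes round-trip is lossless
```

The byte buffer is the same size as the pixel data and restores to the original dtype, whereas the list path creates one Python object per pixel and needs the dtype supplied again on rebuild.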
