jiayuasu opened a new pull request, #2896:
URL: https://github.com/apache/sedona/pull/2896

   ## Did you read the Contributor Guide?
   
   - Yes, I have read the [Contributor 
Rules](https://sedona.apache.org/latest/community/rule/) and [Contributor 
Development Guide](https://sedona.apache.org/latest/community/develop/)
   
   ## Is this PR related to a ticket?
   
   - Yes, and the PR name follows the format `[GH-XXX] my subject`. Closes part 
of #2700.
   
   ## What changes were proposed in this PR?
   
   Continues the docker-image notebook refresh series (issue #2700, milestone 
1.9.1). Adds the first raster-pipeline notebook in the series.
   
   `docs/usecases/02-vegetation-change.ipynb` answers:
   
   > **Between two satellite scenes a season apart, which farm parcels in this 
AOI greened up the most?**
   
   End-to-end on the Sedona 1.9 raster API:
   
   1. SedonaContext setup.
   2. Synthesize two 256×256 red+NIR GeoTIFFs in `/tmp/veg-change/` (uint16, 
EPSG:4326, tiled GeoTIFF). The "before" scene is mostly bare; the "after" scene 
has a circular field of vegetation in the south-west corner with elevated NIR. 
Written with `tiled=True, blockxsize=256, blockysize=256` because the Sedona 
raster reader rejects strip-based GeoTIFFs as "too thin".
   3. Load both with `sedona.read.format("raster")` — the new auto-tiling 
reader (GH-2672, 1.9.0).
   4. Single-raster `RS_MapAlgebra` to compute NDVI per scene.
   5. Two-raster `RS_MapAlgebra` to compute the per-pixel ΔNDVI.
   6. 4×4 synthetic parcel grid + `RS_ZonalStats(rast, geom, 'mean')` — the 
canonical raster→vector aggregation.
   7. `RS_Clip` on the top-ranked parcel for a close-up.
   8. `RS_AsCOG` (GH-2652, 1.9.0) round-trip through a Cloud-Optimized GeoTIFF; 
read back via the same `raster` reader to prove it's valid for cloud-hosted 
streaming.
   9. Four-panel matplotlib visualization (NDVI before, NDVI after, ΔNDVI with 
parcel grid, top-parcel close-up).
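
   Steps 4 to 6 can be sketched as Sedona SQL expressions. This is a minimal 
sketch, not the notebook's code: the helper names (`ndvi_expr`, `delta_expr`, 
`zonal_mean_expr`), the column names passed to them, and the band order (red as 
band 0, NIR as band 1 inside the Jiffle script) are assumptions.

```python
# Jiffle scripts for RS_MapAlgebra. Band indices are 0-based inside the script;
# red = band 0 and NIR = band 1 is an assumed ordering for the synthetic scenes.
NDVI_SCRIPT = "out = (rast[1] - rast[0]) / (rast[1] + rast[0]);"
DELTA_SCRIPT = "out = rast1[0] - rast0[0];"


def ndvi_expr(rast_col: str = "rast") -> str:
    """Single-raster map algebra: NDVI written as a double ('D') band."""
    return f"RS_MapAlgebra({rast_col}, 'D', '{NDVI_SCRIPT}')"


def delta_expr(before_col: str, after_col: str) -> str:
    """Two-raster map algebra: rast0 is the first argument, rast1 the second."""
    return f"RS_MapAlgebra({before_col}, {after_col}, 'D', '{DELTA_SCRIPT}')"


def zonal_mean_expr(rast_col: str, geom_col: str) -> str:
    """Mean of the raster pixels that fall inside the parcel geometry."""
    return f"RS_ZonalStats({rast_col}, {geom_col}, 'mean')"


print(ndvi_expr())
print(delta_expr("ndvi_before", "ndvi_after"))
print(zonal_mean_expr("delta", "geom"))
```

   With a DataFrame from the `raster` reader (called `scenes` here, a 
hypothetical name), `scenes.selectExpr("name", f"{ndvi_expr()} AS ndvi")` would 
apply the first expression.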
   
   The synthesized greening pattern places its peak in parcel **P10**, which is 
what the workflow ranks top — built-in ground truth for the answer.
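
   The ground-truth idea can be illustrated without Spark. The following is a 
pure-Python stand-in for the arithmetic, not the Sedona calls themselves, under 
loud assumptions: a toy 8×8 scene instead of 256×256, a square planted block 
instead of a circular field, row 0 treated as north, and parcels keyed by 
`(row, col)` rather than the notebook's `P` labels, whose numbering is not 
stated here.

```python
def ndvi(red, nir):
    """Per-pixel NDVI = (NIR - red) / (NIR + red); 0.0 where both bands are 0."""
    return [
        [(n - r) / (n + r) if (n + r) else 0.0 for r, n in zip(red_row, nir_row)]
        for red_row, nir_row in zip(red, nir)
    ]


def parcel_means(grid, parcels_per_side):
    """Mean of each square block of pixels, keyed by (parcel_row, parcel_col)."""
    size = len(grid) // parcels_per_side
    means = {}
    for pr in range(parcels_per_side):
        for pc in range(parcels_per_side):
            block = [
                grid[y][x]
                for y in range(pr * size, (pr + 1) * size)
                for x in range(pc * size, (pc + 1) * size)
            ]
            means[(pr, pc)] = sum(block) / len(block)
    return means


N = 8  # toy scene: 8x8 pixels, so each of the 4x4 parcels covers 2x2 pixels
red = [[1000] * N for _ in range(N)]
nir_before = [[1200] * N for _ in range(N)]
# Plant elevated NIR in the bottom-left 2x2 block (rows 6-7, columns 0-1).
nir_after = [
    [3000 if (y >= 6 and x < 2) else 1200 for x in range(N)] for y in range(N)
]

before, after = ndvi(red, nir_before), ndvi(red, nir_after)
delta = [[a - b for a, b in zip(a_row, b_row)] for a_row, b_row in zip(after, before)]

means = parcel_means(delta, parcels_per_side=4)
top = max(means, key=means.get)
print(top)  # (3, 0): the parcel that contains the planted block
```

   Because the greening is planted at a known location, the ranking can be 
asserted rather than eyeballed.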
   
   Notebook is structured as numbered markdown sections (`## 1.` through `## 
9.`), matching the convention from `01-mobility-pulse` and 
`05-geopandas-on-spark`. Notebook intro flags `**Requires Sedona ≥ 1.9.0.**` 
explicitly because the auto-tiling raster reader and `RS_AsCOG` are 1.9-only.
   
   No new data shipped. No network required.
   
   ## How was this patch tested?
   
   Ran end-to-end through the local mirror of `docker/test-notebooks.sh`, 
matching the Docker stack: Python 3.10, `pyspark==4.0.1`, 
`apache-sedona==1.9.0`, JDK 17, `local[*]`, `DRIVER_MEM=4g`, Sedona JAR via 
`PYSPARK_SUBMIT_ARGS` Maven coordinates.
   
   ```
   PASS  02-vegetation-change  13s elapsed
   ```
   
   Output sanity-checked: the top-greening parcel `P10` matches the synthesized 
field location; the COG round-trip reads back as 65×65 REAL_64BITS, as 
expected; all `RS_*` results have the right dimensions.
   
   Three real failure modes surfaced and were fixed during local verification 
before this commit:
   
   1. Unrelated files in macOS `/tmp` interfered with Spark's directory listing 
for the input glob; fixed by writing the synthetic rasters to a dedicated 
`/tmp/veg-change/` subdirectory.
   2. The `raster` data source schema is `[rast, x, y, name]` (not `path`); 
derive the scene label from `name`.
   3. Sedona's reader rejects strip-based GeoTIFFs as "too thin"; pass 
`tiled=True, blockxsize=256, blockysize=256` to `rasterio.open`.
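
   The third fix can be sketched as follows. The helper names (`tiled_profile`, 
`write_scene`) are hypothetical and the notebook's actual code may differ; the 
keyword arguments are the ones named above, passed through `rasterio.open` in 
write mode. `rasterio` is imported lazily so the profile helper runs on its own.

```python
def tiled_profile(width, height, bands, block=256):
    """Keyword arguments for rasterio.open(path, "w") requesting a tiled GeoTIFF."""
    return {
        "driver": "GTiff",
        "width": width,
        "height": height,
        "count": bands,
        "dtype": "uint16",
        "crs": "EPSG:4326",
        "tiled": True,          # internal tiles instead of strips
        "blockxsize": block,
        "blockysize": block,
    }


def write_scene(path, stack, transform):
    """Write a (bands, height, width) uint16 array as a tiled GeoTIFF."""
    import rasterio  # lazy import: only needed when a file is actually written

    bands, height, width = stack.shape
    profile = tiled_profile(width, height, bands)
    with rasterio.open(path, "w", transform=transform, **profile) as dst:
        dst.write(stack)


profile = tiled_profile(256, 256, bands=2)
print(profile["tiled"], profile["blockxsize"], profile["blockysize"])
```

   The `transform` argument would typically come from 
`rasterio.transform.from_origin`; the origin and pixel size are left to the 
caller because they are not given here.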
   
   The CI Docker-build workflow (path-filter widening landed in #2889) will run 
on this PR — the `apache/sedona:latest` matrix leg builds the image with this 
notebook bundled and runs `test-notebooks.sh` against it, so the in-container 
PASS line lands in CI.
   
   ## Did this PR include necessary documentation updates?
   
   - The notebook is itself the documentation; intro markdown calls out 
`**Requires Sedona ≥ 1.9.0.**` and lists the gotchas (tiled GeoTIFF 
requirement, `name` not `path` in the schema).
   - No new data shipped, so no `docs/usecases/data/README.md` updates.
   - No public API changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
