jiayuasu commented on PR #2879:
URL: https://github.com/apache/sedona/pull/2879#issuecomment-4358352934
Verified end-to-end against a local mirror of `docker/test-notebooks.sh`.
Stack matches the docker image (Python 3.10, pyspark==4.0.1,
apache-sedona==1.9.0, JDK 17, `--driver-memory 4g`, `local[*]`); Sedona JAR
pulled via `PYSPARK_SUBMIT_ARGS --packages
org.apache.sedona:sedona-spark-shaded-4.0_2.13:1.9.0,org.datasyslab:geotools-wrapper:1.9.0-33.5`.
| Notebook | Result | Wall-clock |
|---|---|---|
| `00-quickstart.ipynb` (already merged in #2876) | **PASS** | 14 s |
| `01-mobility-pulse.ipynb` (this PR) | **PASS** | 22 s with warm parquet
cache |
Output sanity-checks for 01-mobility-pulse:
- Bing-tile quadkeys produced (e.g. Williamsburg → `03201011013232`).
- Top OD pairs: Lenox Hill / Upper East Side dominates — correct for
affluent residential zones.
- Per-zone peak buckets read intuitively: Lower East Side `nightlife`, Kips
Bay `evening`, Bronx Park `midday`.
- GeoParquet 1.1 round-trip succeeds with covering-bbox metadata visible.
Added `docs/usecases/README.md` (`dd9ee5ad15`) codifying the contract every
shipped notebook must meet (default 4g driver, local[*], 900s timeout,
`requires-network: true` tag for offline skipping, the `data/...` and
`master("spark://...")` placeholders the harness rewrites). Future contributors
won't need to re-derive it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]