kevinjqliu commented on PR #3364: URL: https://github.com/apache/iceberg-python/pull/3364#issuecomment-4700045818
Follow-up items I think are worth tracking after this lands:

- Decide whether PyIceberg should support an unset start snapshot for incremental append scans. Java `IncrementalScan` allows this and scans from the oldest ancestor of the end snapshot. This PR intentionally requires `from_snapshot_id_exclusive`, closer to the Spark option surface. Either behavior is defensible, but the choice should be explicit in docs and tests.
- Add branch and ref convenience APIs for incremental scans. Regular PyIceberg scans have `use_ref(...)`, and Java incremental scans support ref overloads and branch selection. The current ID-only surface is fine for an initial version, but users will likely expect parity with normal scans.
- Consider adding `from_snapshot_inclusive(...)` as a convenience. Java exposes inclusive and exclusive start semantics. Python currently exposes only exclusive. Inclusive can be translated to the parent snapshot when the start snapshot is still present.
- Decide whether `IncrementalAppendScan` should expose `count()`. The current scan supports materialization paths such as `to_arrow`, `to_pandas`, and `to_polars`, but not the `DataScan.count()` convenience.
- Add documentation examples for the intended incremental append semantics: append snapshots only, delete/overwrite/replace snapshots ignored except for lineage validation, current-schema projection, and expired exclusive-start cursors.
- If REST/server-side planning grows support for incremental scans, wire this scan into that path or document that incremental append planning is local-only for now.
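To make the start-snapshot items above more concrete, here is a rough sketch of the semantics on a toy in-memory snapshot model. It does not use PyIceberg's real classes, and every name in it (`Snap`, `ancestors`, `appends_between`, `inclusive_to_exclusive`) is hypothetical. It assumes that an unset start means "scan from the oldest ancestor", that an inclusive start maps to the parent's ID, and that an expired exclusive start is still accepted when the oldest remaining ancestor names it as its parent. Those are illustrative assumptions, not a statement of what this PR implements.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional


@dataclass(frozen=True)
class Snap:
    """Toy stand-in for a table snapshot (hypothetical, not PyIceberg's Snapshot)."""
    snapshot_id: int
    parent_id: Optional[int]
    operation: str  # "append", "overwrite", "delete", or "replace"


def ancestors(snaps: Dict[int, Snap], end_id: int) -> Iterator[Snap]:
    """Yield the end snapshot and its retained ancestors, newest first."""
    current = snaps.get(end_id)
    while current is not None:
        yield current
        # Stop at the root, or when the parent has been expired from `snaps`.
        current = snaps.get(current.parent_id) if current.parent_id is not None else None


def appends_between(
    snaps: Dict[int, Snap], from_exclusive: Optional[int], to_inclusive: int
) -> List[int]:
    """Return append snapshot IDs in (from_exclusive, to_inclusive], oldest first.

    Assumption: from_exclusive=None means "start from the oldest ancestor".
    """
    found_start = from_exclusive is None
    selected: List[int] = []
    for snap in ancestors(snaps, to_inclusive):
        if snap.snapshot_id == from_exclusive:
            found_start = True
            break
        # Non-append snapshots are walked for lineage but contribute no data.
        if snap.operation == "append":
            selected.append(snap.snapshot_id)
        # Assumption: an expired start is still valid if it is this snapshot's parent.
        if from_exclusive is not None and snap.parent_id == from_exclusive:
            found_start = True
            break
    if not found_start:
        raise ValueError(
            f"snapshot {from_exclusive} is not an ancestor of {to_inclusive}"
        )
    return list(reversed(selected))


def inclusive_to_exclusive(snaps: Dict[int, Snap], from_inclusive: int) -> Optional[int]:
    """Translate an inclusive start into an exclusive one via the parent ID.

    Requires the start snapshot to still be present; returns None when it is
    the root, which this sketch treats as an unset start.
    """
    snap = snaps.get(from_inclusive)
    if snap is None:
        raise ValueError(f"snapshot {from_inclusive} is no longer present")
    return snap.parent_id
```

With a lineage of append, append, overwrite, append, an exclusive start at the first snapshot selects only the second and fourth. An inclusive start at the second snapshot selects the same pair after translation to its parent. Whichever behaviors we settle on, cases like these would make reasonable doc examples and test fixtures.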
