smaheshwar-pltr commented on code in PR #3364:
URL: https://github.com/apache/iceberg-python/pull/3364#discussion_r3260038059
##########
pyiceberg/table/__init__.py:
##########
@@ -1685,20 +1750,19 @@ def _parse_row_filter(expr: str | BooleanExpression) ->
BooleanExpression:
return parser.parse(expr) if isinstance(expr, str) else expr
-S = TypeVar("S", bound="TableScan", covariant=True)
+A = TypeVar("A", bound="BaseScan", covariant=True)
+
+class BaseScan(ABC):
Review Comment:
`BaseScan` is new; `TableScan` is unchanged in surface but now subclasses
it. Why split:
- This PR keeps `snapshot_id`, `catalog`, `table_identifier`, `use_ref`,
`snapshot()`, and abstract `count()` on `TableScan` to avoid the breaking
change [#533](https://github.com/apache/iceberg-python/pull/533) introduced
when it dropped these.
- That makes `TableScan` snapshot-specific, so it isn't a sensible base
class for incremental scans (which have *two* snapshot IDs, not one).
- `BaseScan` therefore holds the genuinely-shared surface (row filter,
projection, options, limit, chaining helpers, format-converter sinks built on
`to_arrow()`).
I don't love this — if breaking `TableScan` were acceptable we could
collapse the hierarchy like #533. See [prior
thinking](https://github.com/apache/iceberg-python/pull/2031#discussion_r2102683614)
and
[follow-up](https://github.com/apache/iceberg-python/pull/2031#discussion_r2102727321).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]