Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
ConeyLiu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2560708280 > I like that idea. Could you elaborate on that? Yes, the following is what I implemented in the internal repo: ```python def plan_scan_tasks( files: Iterable[Fi

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2560194912 > Another option would be to provide a plan_util to support plan tasks like the Java-side implementation. That's interesting, I like that too. "util" suggests that it's opti

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
Fokko commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2559871997 Just to add some context: > Currently, PyIceberg's read path assumes it runs on a single-node machine. This assumption is embedded in the way we plan and execute the read pa

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
ConeyLiu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2559498484 @kevinjqliu, thanks for the summary and the great proposal. Another option would be to provide a `plan_util` to support plan tasks like the Java-side implementation. -- This i

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-20 Thread via GitHub
kevinjqliu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2557356375 Thanks everyone for the great discussion here! To summarize the thread above, I think the main concern here is around exposing this functionality as part of PyIceberg's `DataSca

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-19 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1893572606 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-18 Thread via GitHub
corleyma commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1890655836 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-17 Thread via GitHub
samster25 commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1889650976 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-17 Thread via GitHub
Fokko commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1889190270 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-17 Thread via GitHub
samster25 commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1888474960 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1887950445 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
kevinjqliu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1887034502 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886487094 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
Fokko commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886483368 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2545047787 Thanks @kevinjqliu @corleyma for your review. Please take another look, thanks a lot.

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886465895 ## tests/integration/test_reads.py: ## @@ -873,3 +874,76 @@ def test_table_scan_empty_table(catalog: Catalog) -> None: result_table = tbl.scan().to_arrow()

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886464171 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886446918 ## pyiceberg/table/__init__.py: ## @@ -1253,6 +1265,22 @@ def __init__( self.start = start or 0 self.length = length or data_file.file_size_in

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886446277 ## pyiceberg/table/__init__.py: ## @@ -1229,7 +1240,8 @@ def with_case_sensitive(self: S, case_sensitive: bool = True) -> S: class ScanTask(ABC): -pas

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886445512 ## tests/integration/test_reads.py: ## @@ -873,3 +873,48 @@ def test_table_scan_empty_table(catalog: Catalog) -> None: result_table = tbl.scan().to_arrow()

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886444111 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886443611 ## pyiceberg/table/__init__.py: ## @@ -191,6 +193,15 @@ class TableProperties: DELETE_MODE_MERGE_ON_READ = "merge-on-read" DELETE_MODE_DEFAULT = DELET

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-16 Thread via GitHub
ConeyLiu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886441268 ## pyiceberg/manifest.py: ## @@ -105,6 +105,9 @@ def _missing_(cls, value: object) -> Union[None, str]: return member return None +

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-14 Thread via GitHub
kevinjqliu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1885389787 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-13 Thread via GitHub
corleyma commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884536195 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-13 Thread via GitHub
corleyma commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884533676 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-13 Thread via GitHub
corleyma commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884524371 ## pyiceberg/table/__init__.py: ## @@ -1253,6 +1265,22 @@ def __init__( self.start = start or 0 self.length = length or data_file.file_size_in

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-13 Thread via GitHub
corleyma commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884523394 ## pyiceberg/table/__init__.py: ## @@ -1229,7 +1240,8 @@ def with_case_sensitive(self: S, case_sensitive: bool = True) -> S: class ScanTask(ABC): -pas

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-13 Thread via GitHub
kevinjqliu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884198201 ## pyiceberg/table/__init__.py: ## @@ -191,6 +193,15 @@ class TableProperties: DELETE_MODE_MERGE_ON_READ = "merge-on-read" DELETE_MODE_DEFAULT = DEL