ConeyLiu commented on PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2560708280
> I like that idea. Could you elaborate on that?
Yes, the following is what I implemented in the internal repo:
```python
def plan_scan_tasks(
files: Iterable[Fi
kevinjqliu commented on PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2560194912
> Another option would be to provide a plan_util to support plan tasks like
the Java-side implementation.
That's interesting, I like that too. "util" suggests that it's opti
Fokko commented on PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2559871997
Just to add some context:
> Currently, PyIceberg's read path assumes it runs on a single-node
machine. This assumption is embedded in the way we plan and execute the read
pa
ConeyLiu commented on PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2559498484
@kevinjqliu, thanks for the summary and the great proposal. Another option
would be to provide a `plan_util` to support plan tasks like the Java-side
implementation.
--
This i
kevinjqliu commented on PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2557356375
Thanks everyone for the great discussion here! To summarize the thread
above, I think the main concern here is around exposing this functionality as
part of PyIceberg's `DataSca
ConeyLiu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1893572606
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
]
corleyma commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1890655836
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
]
samster25 commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1889650976
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
Fokko commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1889190270
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
]
samster25 commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1888474960
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
ConeyLiu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1887950445
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
]
kevinjqliu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1887034502
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
Fokko commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886487094
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
]
Fokko commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886483368
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
]
ConeyLiu commented on PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2545047787
Thanks @kevinjqliu @corleyma for your review. Please take another look, thanks
a lot.
--
This is an automated message from the Apache Git Service.
To respond to the message, please
ConeyLiu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886465895
##
tests/integration/test_reads.py:
##
@@ -873,3 +874,76 @@ def test_table_scan_empty_table(catalog: Catalog) -> None:
result_table = tbl.scan().to_arrow()
ConeyLiu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886464171
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
]
ConeyLiu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886446918
##
pyiceberg/table/__init__.py:
##
@@ -1253,6 +1265,22 @@ def __init__(
self.start = start or 0
self.length = length or data_file.file_size_in
ConeyLiu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886446277
##
pyiceberg/table/__init__.py:
##
@@ -1229,7 +1240,8 @@ def with_case_sensitive(self: S, case_sensitive: bool = True) -> S:
class ScanTask(ABC):
-pas
ConeyLiu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886445512
##
tests/integration/test_reads.py:
##
@@ -873,3 +873,48 @@ def test_table_scan_empty_table(catalog: Catalog) -> None:
result_table = tbl.scan().to_arrow()
ConeyLiu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886444111
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
]
ConeyLiu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886443611
##
pyiceberg/table/__init__.py:
##
@@ -191,6 +193,15 @@ class TableProperties:
DELETE_MODE_MERGE_ON_READ = "merge-on-read"
DELETE_MODE_DEFAULT = DELET
ConeyLiu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1886441268
##
pyiceberg/manifest.py:
##
@@ -105,6 +105,9 @@ def _missing_(cls, value: object) -> Union[None, str]:
return member
return None
+
kevinjqliu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1885389787
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
corleyma commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884536195
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
]
corleyma commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884533676
##
pyiceberg/table/__init__.py:
##
@@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]:
for data_entry in data_entries
]
corleyma commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884524371
##
pyiceberg/table/__init__.py:
##
@@ -1253,6 +1265,22 @@ def __init__(
self.start = start or 0
self.length = length or data_file.file_size_in
corleyma commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884523394
##
pyiceberg/table/__init__.py:
##
@@ -1229,7 +1240,8 @@ def with_case_sensitive(self: S, case_sensitive: bool = True) -> S:
class ScanTask(ABC):
-pas
kevinjqliu commented on code in PR #1427:
URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884198201
##
pyiceberg/table/__init__.py:
##
@@ -191,6 +193,15 @@ class TableProperties:
DELETE_MODE_MERGE_ON_READ = "merge-on-read"
DELETE_MODE_DEFAULT = DEL