goalzz85 commented on issue #548:
URL: https://github.com/apache/iceberg-python/issues/548#issuecomment-2428665724
```python
from pyiceberg.transforms import BucketTransform
from pyiceberg.types import IntegerType
id = 50
t = BucketTransform(50)
# transform() returns a callable that maps a value of the given type to its bucket index
bucket_int_func = t.transform(IntegerType())
```
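As a rough usage sketch (assuming `transform()` on a `BucketTransform` returns a callable, as in recent PyIceberg releases), the resulting function maps each value to its bucket index in `[0, 50)`:

```python
for some_id in (50, 51, 52):
    print(some_id, "->", bucket_int_func(some_id))  # prints the bucket index for each id
```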
goalzz85 commented on issue #548:
URL: https://github.com/apache/iceberg-python/issues/548#issuecomment-2426506481
```python
table = catalog.load_table(("default", "mq_log10"))
scan = table.scan()
file_scan_tasks = scan.plan_files()
partition_num = 20
new_file_scan_task
```
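A hedged sketch of one way this could continue: group the planned file scan tasks by the bucket value stored in each data file's partition record, so that each of the `partition_num` groups can be handed to a separate reader. The `task.file.partition[0]` access assumes a spec with a single bucket partition field; the names below are assumptions, not taken from the thread.

```python
from collections import defaultdict

tasks_by_group = defaultdict(list)
for task in file_scan_tasks:
    # For a spec partitioned by bucket(N, id), the file's partition record
    # holds the bucket number at position 0.
    bucket = task.file.partition[0]
    tasks_by_group[bucket % partition_num].append(task)
```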
github-actions[bot] commented on issue #548:
URL: https://github.com/apache/iceberg-python/issues/548#issuecomment-2367035408
This issue has been automatically marked as stale because it has been open
for 180 days with no activity. It will be closed in the next 14 days if no further
activity occurs.
frankliee commented on issue #548:
URL: https://github.com/apache/iceberg-python/issues/548#issuecomment-2019248895
I have considered this: `.plan_files()` will return all the files, but it cannot
distinguish which files belong to the same bucket.
For Iceberg-Spark, there is a helpful system function for this.
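For reference, a hypothetical PySpark illustration of the Spark-side bucket function this likely refers to (Iceberg's `system.bucket`); it assumes an Iceberg catalog named `my_catalog` is configured and that the table is bucketed by `bucket(50, id)`. All names here are illustrative, not taken from the thread.

```python
# `spark` is an existing SparkSession with the Iceberg runtime and an Iceberg
# catalog named `my_catalog` configured (all names here are illustrative).
spark.sql("USE my_catalog")
spark.sql(
    """
    SELECT *
    FROM db.mq_log10
    WHERE system.bucket(50, id) = 3
    """
).show()
```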
Fokko commented on issue #548:
URL: https://github.com/apache/iceberg-python/issues/548#issuecomment-2018364320
Ah, I misread what you're looking for. Have you considered the
`.plan_files()` API where you just get a list of tasks to read?
frankliee commented on issue #548:
URL: https://github.com/apache/iceberg-python/issues/548#issuecomment-2018198305
> @frankliee PyIceberg will do the filtering automatically, so when you
filter on the id column, it will automatically use the bucketing to filter down
to the correct bucket:
Fokko commented on issue #548:
URL: https://github.com/apache/iceberg-python/issues/548#issuecomment-2018051402
@frankliee PyIceberg will do the filtering automatically, so when you filter
on the id column, it will automatically use the bucketing to filter down to the
correct bucket:
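A minimal sketch of such a filter, assuming a table with an `id` column bucketed by `bucket(50, id)` (the table name and catalog are illustrative, reused from the earlier snippet):

```python
from pyiceberg.expressions import EqualTo

table = catalog.load_table("default.mq_log10")
# The equality predicate on the bucketed column lets the scan prune down to
# the files in the matching bucket before any data is read.
df = table.scan(row_filter=EqualTo("id", 50)).to_arrow()
```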