Re: [PR] Arrow: Avoid buffer-overflow by avoid doing a sort [iceberg-python]

2025-01-21 Thread via GitHub
bigluck commented on PR #1539: URL: https://github.com/apache/iceberg-python/pull/1539#issuecomment-2605167277 :'( -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubs

Re: [PR] Arrow: Avoid buffer-overflow by avoid doing a sort [iceberg-python]

2025-01-21 Thread via GitHub
Fokko commented on PR #1539: URL: https://github.com/apache/iceberg-python/pull/1539#issuecomment-2604855436 Ugh, accidentally pushed `main` 🤦

Re: [PR] Arrow: Avoid buffer-overflow by avoid doing a sort [iceberg-python]

2025-01-21 Thread via GitHub
Fokko closed pull request #1539: Arrow: Avoid buffer-overflow by avoid doing a sort URL: https://github.com/apache/iceberg-python/pull/1539

Re: [PR] Arrow: Avoid buffer-overflow by avoid doing a sort [iceberg-python]

2025-01-21 Thread via GitHub
Fokko commented on code in PR #1539: URL: https://github.com/apache/iceberg-python/pull/1539#discussion_r1923422826 ## pyiceberg/partitioning.py: ## @@ -413,7 +414,9 @@ def partition_record_value(partition_field: PartitionField, value: Any, schema: the final partition reco

Re: [PR] Arrow: Avoid buffer-overflow by avoid doing a sort [iceberg-python]

2025-01-20 Thread via GitHub
kevinjqliu commented on code in PR #1539: URL: https://github.com/apache/iceberg-python/pull/1539#discussion_r1922786157 ## pyiceberg/partitioning.py: ## @@ -413,7 +414,9 @@ def partition_record_value(partition_field: PartitionField, value: Any, schema: the final partition

Re: [PR] Arrow: Avoid buffer-overflow by avoid doing a sort [iceberg-python]

2025-01-20 Thread via GitHub
Fokko commented on code in PR #1539: URL: https://github.com/apache/iceberg-python/pull/1539#discussion_r1922771682 ## pyiceberg/io/pyarrow.py: ## Review Comment: Good one, updated and simplified!

Re: [PR] Arrow: Avoid buffer-overflow by avoid doing a sort [iceberg-python]

2025-01-20 Thread via GitHub
Fokko commented on code in PR #1539: URL: https://github.com/apache/iceberg-python/pull/1539#discussion_r1922769699 ## pyiceberg/partitioning.py: ## @@ -425,8 +426,13 @@ def _to_partition_representation(type: IcebergType, value: Any) -> Any: @_to_partition_representation.reg

Re: [PR] Arrow: Avoid buffer-overflow by avoid doing a sort [iceberg-python]

2025-01-20 Thread via GitHub
Fokko commented on code in PR #1539: URL: https://github.com/apache/iceberg-python/pull/1539#discussion_r1922765663 ## pyiceberg/io/pyarrow.py: ## @@ -2594,42 +2566,46 @@ def _determine_partitions(spec: PartitionSpec, schema: Schema, arrow_table: pa.T We then retrieve the

Re: [PR] Arrow: Avoid buffer-overflow by avoid doing a sort [iceberg-python]

2025-01-20 Thread via GitHub
Fokko commented on code in PR #1539: URL: https://github.com/apache/iceberg-python/pull/1539#discussion_r1922764937 ## pyiceberg/partitioning.py: ## @@ -425,8 +426,13 @@ def _to_partition_representation(type: IcebergType, value: Any) -> Any: @_to_partition_representation.reg

Re: [PR] Arrow: Avoid buffer-overflow by avoid doing a sort [iceberg-python]

2025-01-20 Thread via GitHub
kevinjqliu commented on code in PR #1539: URL: https://github.com/apache/iceberg-python/pull/1539#discussion_r1922707073 ## tests/benchmark/test_benchmark.py: ## @@ -0,0 +1,72 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

[PR] Arrow: Avoid buffer-overflow by avoid doing a sort [iceberg-python]

2025-01-20 Thread via GitHub
Fokko opened a new pull request, #1539: URL: https://github.com/apache/iceberg-python/pull/1539 This was already being discussed back here: https://github.com/apache/iceberg-python/issues/208#issuecomment-1889891973 This PR changes from doing a sort, and then a single pass over the ta
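
For readers unfamiliar with the pattern the PR description says it is moving away from (a sort, followed by a single pass over the data), here is a minimal pure-Python sketch of that general idea. It is an illustration only, not PyIceberg's code: `partition_by_sort`, the `region` key, and the sample rows are all hypothetical, and plain dictionaries stand in for an Arrow table.

```python
from itertools import groupby
from operator import itemgetter


def partition_by_sort(rows, key):
    """Group rows by a partition key: sort once, then make one linear pass.

    Hypothetical helper for illustration; not part of PyIceberg.
    """
    get_key = itemgetter(key)
    # Step 1: sort so that rows with the same key become adjacent.
    sorted_rows = sorted(rows, key=get_key)
    # Step 2: a single pass over the sorted rows collects each run of equal keys.
    return {k: list(group) for k, group in groupby(sorted_rows, key=get_key)}


# Hypothetical sample data standing in for table rows.
rows = [
    {"region": "eu", "n": 1},
    {"region": "us", "n": 2},
    {"region": "eu", "n": 3},
]
partitions = partition_by_sort(rows, "region")
```

The sort is the expensive step here, since it has to reorder the entire input before any group can be emitted; the approach the PR switches to is cut off in this snippet, so it is not sketched.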