Re: [I] Ignore downcasting of column types when "mergeSchema" is set. [iceberg]

2024-10-28 Thread via GitHub
rocco408 commented on issue #4849: URL: https://github.com/apache/iceberg/issues/4849#issuecomment-2443313921 I might have a fix in https://github.com/apache/iceberg/pull/11419. This is my first PR here, I'm open to any thoughts if folks have a minute. Cheers. -- This is an automated mess

[PR] AWS: Enable RetryMode for AWS KMS client [iceberg]

2024-10-28 Thread via GitHub
hsiang-c opened a new pull request, #11420: URL: https://github.com/apache/iceberg/pull/11420 * We're reaching the limit of AWS KMS requests per AWS account quota and we'd like to enable retry in KMS client to be more resilient to failures. * This PR is similar to https://github.com/apach

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-10-28 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2443300032 @bmorck Thanks for your interest! Currently, this PR only enables the CometBatchReader for batch reading; it does not yet turn on Comet's native operators. In the next step, I will make

Re: [PR] Spark 3.5: Fix NotSerializableException when migrating Spark tables [iceberg]

2024-10-28 Thread via GitHub
manuzhang commented on PR #11157: URL: https://github.com/apache/iceberg/pull/11157#issuecomment-2443260806 @RussellSpitzer I can revert to the previous commit, and this is Spark-specific, but can you elaborate on why `LazyExecutorService` is better than `SerializableSupplier`? I agree with you t

Re: [I] Why is the struct field not null when the data is empty? [iceberg-python]

2024-10-28 Thread via GitHub
kevinjqliu commented on issue #1255: URL: https://github.com/apache/iceberg-python/issues/1255#issuecomment-2443167064 Okay, I see what the problem is now. Thanks for the example! The value of struct type field (`struct_field_1`) changes from `None` to ``` {'string_nested_1': '

Re: [PR] Flink 1.20: Update Flink to use planned Avro reads [iceberg]

2024-10-28 Thread via GitHub
jbonofre commented on PR #11386: URL: https://github.com/apache/iceberg/pull/11386#issuecomment-2443246368 @RussellSpitzer the planned Avro reads have been added to Spark (for Iceberg 1.7.x). This one is not a blocker for 1.7.0 but a good-to-have to benefit from the same performance boost as Spar

Re: [PR] Core: log a warn message if MetricsReporter fails [iceberg]

2024-10-28 Thread via GitHub
amogh-jahagirdar commented on code in PR #11416: URL: https://github.com/apache/iceberg/pull/11416#discussion_r1820097463 ## core/src/main/java/org/apache/iceberg/SnapshotScan.java: ## @@ -154,7 +155,17 @@ public CloseableIterable planFiles() { .scanMetrics(S

Re: [PR] fix: do not sort indices for `ProjectionMask::leaves` [iceberg-rust]

2024-10-28 Thread via GitHub
wcy-fdu commented on PR #682: URL: https://github.com/apache/iceberg-rust/pull/682#issuecomment-2443054451 > cc @wcy-fdu Would you mind taking a look at the UT failure? Sure, will fix the UT outside of work time. -- This is an automated message from the Apache Git Service. To respond t

Re: [PR] fix: do not sort indices for `ProjectionMask::leaves` [iceberg-rust]

2024-10-28 Thread via GitHub
liurenjie1024 commented on PR #682: URL: https://github.com/apache/iceberg-rust/pull/682#issuecomment-2443036829 cc @wcy-fdu Would you mind taking a look at the UT failure?

Re: [PR] WIP: Testing out using Polaris Docker image in Integration Test Suite [iceberg-python]

2024-10-28 Thread via GitHub
sungwy commented on PR #1252: URL: https://github.com/apache/iceberg-python/pull/1252#issuecomment-2442998150 TODO: [ ] - Update to use S3COMPATIBLE storage type once https://github.com/apache/polaris/pull/389 is merged [ ] - Use a built Polaris image

Re: [I] Why is the struct field not null when the data is empty? [iceberg-python]

2024-10-28 Thread via GitHub
SGA-taichi-kato commented on issue #1255: URL: https://github.com/apache/iceberg-python/issues/1255#issuecomment-2443023759 Here is the parquet-tools result. ``` $ parquet-tools show 0-0-515d281f-afd7-4873-89c9-c5b1a992a822.parquet +--+---+

Re: [I] Why is the struct field not null when the data is empty? [iceberg-python]

2024-10-28 Thread via GitHub
SGA-taichi-kato commented on issue #1255: URL: https://github.com/apache/iceberg-python/issues/1255#issuecomment-2442934911 Hi @kevinjqliu Here we define the pyarrow schema. What I'm asking about is the struct field "struct_field_1". ```python from pyiceberg.catalog import load_cat

Re: [I] Why is the struct field not null when the data is empty? [iceberg-python]

2024-10-28 Thread via GitHub
SGA-taichi-kato commented on issue #1255: URL: https://github.com/apache/iceberg-python/issues/1255#issuecomment-2442998961 Hi @kevinjqliu > After appending to the table, is the record None when you read it back? table.scan().to_pandas() Here is the result. ```python ~

Re: [PR] Build: Enable errorprone PatternMatchingInstanceof [iceberg]

2024-10-28 Thread via GitHub
ebyhr closed pull request #11374: Build: Enable errorprone PatternMatchingInstanceof URL: https://github.com/apache/iceberg/pull/11374

Re: [I] Why is the struct field not null when the data is empty? [iceberg-python]

2024-10-28 Thread via GitHub
kevinjqliu commented on issue #1255: URL: https://github.com/apache/iceberg-python/issues/1255#issuecomment-2442984949 In your example, the second row is ``` { "string_field_1": "field_1_b", }, ``` which corresponds with `string_field_1` ``` p

Re: [I] Operations on partition columns in `WHERE` clause not used in pruning [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on issue #9678: URL: https://github.com/apache/iceberg/issues/9678#issuecomment-2442910420 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Why is the struct field not null when the data is empty? [iceberg-python]

2024-10-28 Thread via GitHub
kevinjqliu commented on issue #1255: URL: https://github.com/apache/iceberg-python/issues/1255#issuecomment-2442982970 If the value shows as `None` in the pyarrow table, I'm pretty sure it'll be written as that during `append`. It would be weird otherwise. Another thing you can check is

Re: [I] Why is the struct field not null when the data is empty? [iceberg-python]

2024-10-28 Thread via GitHub
kevinjqliu commented on issue #1255: URL: https://github.com/apache/iceberg-python/issues/1255#issuecomment-2442979795 > So I expect that the value of the second record other than "string_field_1" to be null when I insert these records into the iceberg table using pyiceberg. ``` impo

Re: [PR] feat: Add 'Create Namespace' command to CLI [iceberg-go]

2024-10-28 Thread via GitHub
alex-kar commented on code in PR #179: URL: https://github.com/apache/iceberg-go/pull/179#discussion_r1819949262 ## cmd/iceberg/main.go: ## @@ -70,6 +71,7 @@ type Config struct { Uuid bool `docopt:"uuid"` Location bool `docopt:"location"` Propsboo

Re: [PR] abort the whole table transaction if any updates in the transaction has failed [iceberg-python]

2024-10-28 Thread via GitHub
kevinjqliu commented on code in PR #1246: URL: https://github.com/apache/iceberg-python/pull/1246#discussion_r1819920681 ## tests/integration/test_writes/test_writes.py: ## @@ -1448,3 +1448,27 @@ def test_rewrite_manifest_after_partition_evolution(session_catalog: Catalog) ->

Re: [I] Operations on partition columns in `WHERE` clause not used in pruning [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] closed issue #9678: Operations on partition columns in `WHERE` clause not used in pruning URL: https://github.com/apache/iceberg/issues/9678

Re: [PR] feat: Safer PartitionSpec & SchemalessPartitionSpec [iceberg-rust]

2024-10-28 Thread via GitHub
c-thiel commented on PR #645: URL: https://github.com/apache/iceberg-rust/pull/645#issuecomment-2442922625 @Xuanwo, @liurenjie1024 ready for another round. Ditched both the trait and the enum. Bear with me: As already mentioned earlier, the enum or the trait were just a vehicle to

Re: [PR] abort the whole table transaction if any updates in the transaction has failed [iceberg-python]

2024-10-28 Thread via GitHub
stevie9868 commented on code in PR #1246: URL: https://github.com/apache/iceberg-python/pull/1246#discussion_r1819921287 ## tests/integration/test_writes/test_writes.py: ## @@ -1448,3 +1448,27 @@ def test_rewrite_manifest_after_partition_evolution(session_catalog: Catalog) ->

Re: [I] Support for snowflake catalog in apache iceberg [iceberg-python]

2024-10-28 Thread via GitHub
github-actions[bot] commented on issue #685: URL: https://github.com/apache/iceberg-python/issues/685#issuecomment-2442912961 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity oc

Re: [PR] abort the whole table transaction if any updates in the transaction has failed [iceberg-python]

2024-10-28 Thread via GitHub
kevinjqliu commented on code in PR #1246: URL: https://github.com/apache/iceberg-python/pull/1246#discussion_r1819918304 ## tests/integration/test_writes/test_writes.py: ## @@ -1448,3 +1448,27 @@ def test_rewrite_manifest_after_partition_evolution(session_catalog: Catalog) ->

Re: [PR] add notes in hive docs [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on PR #9864: URL: https://github.com/apache/iceberg/pull/9864#issuecomment-2442910627 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Core: Allow manifest file cache to be configurable [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on PR #10118: URL: https://github.com/apache/iceberg/pull/10118#issuecomment-2442910961 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Spark 3.5: Check table existence to determine which catalog for drop table [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on PR #10128: URL: https://github.com/apache/iceberg/pull/10128#issuecomment-2442911006 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Incorrect Metrics Calculation for Iceberg Table Due to Column Name Transformation with Special Characters [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on issue #10115: URL: https://github.com/apache/iceberg/issues/10115#issuecomment-2442910903 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Open-api: update prefix param description [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] closed pull request #9870: Open-api: update prefix param description URL: https://github.com/apache/iceberg/pull/9870

Re: [I] [feature request] Allow Java Iceberg library to write parquet files with special character column names [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on issue #10120: URL: https://github.com/apache/iceberg/issues/10120#issuecomment-2442910981 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Open-api: update prefix param description [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on PR #9870: URL: https://github.com/apache/iceberg/pull/9870#issuecomment-2442910653 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] add notes in hive docs [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] closed pull request #9864: add notes in hive docs URL: https://github.com/apache/iceberg/pull/9864

Re: [I] Fix broken doc links of released versions [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on issue #10116: URL: https://github.com/apache/iceberg/issues/10116#issuecomment-2442910923 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Update flink docs with alter column support [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] closed pull request #9756: Update flink docs with alter column support URL: https://github.com/apache/iceberg/pull/9756

Re: [I] Spark: readStream from Iceberg doesn't progress anymore after running Maintenance (rewrite_data_files and rewrite_manifests) [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on issue #10117: URL: https://github.com/apache/iceberg/issues/10117#issuecomment-2442910938 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Can we make commits inside compaction jobs with partial-progress.enabled sequential to avoid CommitFailedException? [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on issue #9687: URL: https://github.com/apache/iceberg/issues/9687#issuecomment-2442910442 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] How tracke authors of iceberg snapshots? [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on issue #9928: URL: https://github.com/apache/iceberg/issues/9928#issuecomment-2442910706 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] java.lang.IllegalArgumentException: requirement failed: length (-6235972) cannot be smaller than -1 [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on issue #9689: URL: https://github.com/apache/iceberg/issues/9689#issuecomment-2442910464 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Update flink docs with alter column support [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] commented on PR #9756: URL: https://github.com/apache/iceberg/pull/9756#issuecomment-2442910554 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [I] java.lang.IllegalArgumentException: requirement failed: length (-6235972) cannot be smaller than -1 [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] closed issue #9689: java.lang.IllegalArgumentException: requirement failed: length (-6235972) cannot be smaller than -1 URL: https://github.com/apache/iceberg/issues/9689

Re: [I] Can we make commits inside compaction jobs with partial-progress.enabled sequential to avoid CommitFailedException? [iceberg]

2024-10-28 Thread via GitHub
github-actions[bot] closed issue #9687: Can we make commits inside compaction jobs with partial-progress.enabled sequential to avoid CommitFailedException? URL: https://github.com/apache/iceberg/issues/9687

Re: [I] explore removing `numpy` as a dependency [iceberg-python]

2024-10-28 Thread via GitHub
kevinjqliu commented on issue #1259: URL: https://github.com/apache/iceberg-python/issues/1259#issuecomment-2442788004 This ChatGPT-generated function seems to be logically equivalent, and it passes integration tests, but we should double-check this ``` def _combine_positional_del

Re: [PR] abort the whole table transaction if any updates in the transaction has failed [iceberg-python]

2024-10-28 Thread via GitHub
stevie9868 commented on code in PR #1246: URL: https://github.com/apache/iceberg-python/pull/1246#discussion_r1819915765 ## tests/integration/test_writes/test_writes.py: ## @@ -1448,3 +1448,27 @@ def test_rewrite_manifest_after_partition_evolution(session_catalog: Catalog) ->

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-28 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1819915138 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-leve

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-28 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1819881889 ## format/spec.md: ## @@ -982,19 +998,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-level del

Re: [PR] abort the whole table transaction if any updates in the transaction has failed [iceberg-python]

2024-10-28 Thread via GitHub
kevinjqliu commented on code in PR #1246: URL: https://github.com/apache/iceberg-python/pull/1246#discussion_r1819911008 ## tests/integration/test_writes/test_writes.py: ## @@ -1448,3 +1448,27 @@ def test_rewrite_manifest_after_partition_evolution(session_catalog: Catalog) ->

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-28 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1819886247 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-level del

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-28 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1819906609 ## format/spec.md: ## @@ -1494,6 +1536,20 @@ Writing v1 or v2 metadata: * For a single-arg transform, `source-id` should be written; if `source-ids` is also

[PR] Core: log a warn message if MetricsReporter fails [iceberg]

2024-10-28 Thread via GitHub
sullis opened a new pull request, #11416: URL: https://github.com/apache/iceberg/pull/11416 (no comment)

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-28 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1819894602 ## format/puffin-spec.md: ## @@ -123,6 +123,57 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-28 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1819773532 ## format/spec.md: ## @@ -585,13 +589,19 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | _option

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-28 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1819894337 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-28 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1819890674 ## format/puffin-spec.md: ## @@ -123,6 +123,57 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-28 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1819892263 ## format/spec.md: ## @@ -585,13 +589,19 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | _optional_ |

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-28 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1819889757 ## format/puffin-spec.md: ## @@ -123,6 +123,57 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-28 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1819884060 ## open-api/rest-catalog-open-api.py: ## @@ -854,6 +854,16 @@ class ContentFile(BaseModel): class PositionDeleteFile(ContentFile): content: Literal['position-d

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-28 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1819884319 ## format/spec.md: ## @@ -982,19 +998,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-level del

Re: [PR] abort the whole table transaction if any updates in the transaction has failed [iceberg-python]

2024-10-28 Thread via GitHub
kevinjqliu commented on code in PR #1246: URL: https://github.com/apache/iceberg-python/pull/1246#discussion_r1819883002 ## tests/integration/test_writes/test_writes.py: ## @@ -1448,3 +1448,26 @@ def test_rewrite_manifest_after_partition_evolution(session_catalog: Catalog) ->

Re: [PR] Core: Add Variant implementation to read serialized objects [iceberg]

2024-10-28 Thread via GitHub
rdblue commented on PR #11415: URL: https://github.com/apache/iceberg/pull/11415#issuecomment-2442841569 @aihuaxu I cleaned up my implementation, added tests, and fixed quite a few bugs. Please take a look to help validate that it implements the spec correctly. Thanks!

[PR] Core: Add Variant implementation to read serialized objects [iceberg]

2024-10-28 Thread via GitHub
rdblue opened a new pull request, #11415: URL: https://github.com/apache/iceberg/pull/11415 This PR adds an implementation of the [Variant encoding spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md) that can read serialized Variant buffers. This implementation wa

Re: [PR] abort the whole table transaction if any updates in the transaction has failed [iceberg-python]

2024-10-28 Thread via GitHub
stevie9868 commented on PR #1246: URL: https://github.com/apache/iceberg-python/pull/1246#issuecomment-2442818473 I have decided to move the test under `integration/test_writes/test_writes.py` instead of `tests/table/test_init.py`, given that: 1. Many of the existing test fixtur

[PR] Bump psycopg2-binary from 2.9.9 to 2.9.10 [iceberg-python]

2024-10-28 Thread via GitHub
dependabot[bot] opened a new pull request, #1262: URL: https://github.com/apache/iceberg-python/pull/1262 Bumps [psycopg2-binary](https://github.com/psycopg/psycopg2) from 2.9.9 to 2.9.10. Changelog: sourced from https://github.com/psycopg/psycopg2/blob/master/NEWS: psycopg2-binary

[PR] Bump moto from 5.0.17 to 5.0.18 [iceberg-python]

2024-10-28 Thread via GitHub
dependabot[bot] opened a new pull request, #1261: URL: https://github.com/apache/iceberg-python/pull/1261 Bumps [moto](https://github.com/getmoto/moto) from 5.0.17 to 5.0.18. Changelog: sourced from [moto's changelog](https://github.com/getmoto/moto/blob/master/CHANGELOG.md).

[PR] Bump getdaft from 0.3.8 to 0.3.9 [iceberg-python]

2024-10-28 Thread via GitHub
dependabot[bot] opened a new pull request, #1260: URL: https://github.com/apache/iceberg-python/pull/1260 Bumps [getdaft](https://github.com/Eventual-Inc/Daft) from 0.3.8 to 0.3.9. Release notes: sourced from [getdaft's releases](https://github.com/Eventual-Inc/Daft/releases).

Re: [PR] Bump PyArrow to 18.0.0 [iceberg-python]

2024-10-28 Thread via GitHub
kevinjqliu commented on PR #1256: URL: https://github.com/apache/iceberg-python/pull/1256#issuecomment-2442785346 opened #1259 to continue the `numpy` deprecation conversation

Re: [I] add an option to automatically use the table name in the path [iceberg-python]

2024-10-28 Thread via GitHub
corleyma commented on issue #1254: URL: https://github.com/apache/iceberg-python/issues/1254#issuecomment-2442751487 In case it's relevant, there is [an issue](https://github.com/apache/iceberg-python/issues/861) re: adopting a pluggable LocationProvider interface akin to what exists in ic

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-28 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1819779135 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-leve

Re: [PR] AWS: Refresh vended credentials [iceberg]

2024-10-28 Thread via GitHub
ChaladiMohanVamsi commented on code in PR #11389: URL: https://github.com/apache/iceberg/pull/11389#discussion_r1819175720 ## aws/src/main/java/org/apache/iceberg/aws/s3/VendedCredentialsProvider.java: ## @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Add list_views for hive catalog [iceberg-python]

2024-10-28 Thread via GitHub
omkenge commented on PR #1251: URL: https://github.com/apache/iceberg-python/pull/1251#issuecomment-2442663561 Hi @kevinjqliu , ### Previous Situation In the existing implementation of the list_tables method within hive catalog, the functionality aimed to retrieve all tables under a sp

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-28 Thread via GitHub
aokolnychyi merged PR #11372: URL: https://github.com/apache/iceberg/pull/11372

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-28 Thread via GitHub
aokolnychyi commented on PR #11372: URL: https://github.com/apache/iceberg/pull/11372#issuecomment-2442626638 I am going to merge this one as it was thoroughly reviewed. There is no need to include it in 1.7, though (no harm either). Thanks for reviewing, @rdblue @RussellSpitzer @amogh-j

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-28 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1819756467 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-leve

Re: [PR] Fix ADLSLocation file parsing [iceberg]

2024-10-28 Thread via GitHub
bryanck commented on PR #11395: URL: https://github.com/apache/iceberg/pull/11395#issuecomment-2442586580 I originally stripped off the query params to be on the safe side, but given query params aren't specified in the URI format this change looks OK to me, +1 to Dan's comment about adding

Re: [PR] open-api: Build runtime jar for test fixture [iceberg]

2024-10-28 Thread via GitHub
Fokko commented on PR #11279: URL: https://github.com/apache/iceberg/pull/11279#issuecomment-2442559693 @ajantha-bhat First of all, thanks for working on this, and sorry for the late reply as I was on parental leave. Unfortunately, there are issues with this. I just ran the following to che

Re: [PR] REST: Docker file for Rest catalog adapter image [iceberg]

2024-10-28 Thread via GitHub
Fokko commented on code in PR #11283: URL: https://github.com/apache/iceberg/pull/11283#discussion_r1819691281 ## docker/iceberg-rest-adapter-image/Dockerfile: ## @@ -0,0 +1,44 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agr

Re: [I] Why is the struct field not null when the data is empty? [iceberg-python]

2024-10-28 Thread via GitHub
kevinjqliu commented on issue #1255: URL: https://github.com/apache/iceberg-python/issues/1255#issuecomment-2442502372 hi @SGA-taichi-kato thanks for reporting this issue. From the description, it is not clear to me what the problem is. Do you mind clarifying what you expect to see

Re: [I] [SPARK] Fix flakey test [iceberg]

2024-10-28 Thread via GitHub
Fokko closed issue #10569: [SPARK] Fix flakey test URL: https://github.com/apache/iceberg/issues/10569

Re: [PR] Spark 3.5: Fix flaky test due to deleting temp directory failure [iceberg]

2024-10-28 Thread via GitHub
Fokko merged PR #10811: URL: https://github.com/apache/iceberg/pull/10811

Re: [I] Flaky test due to failing to delete temp directory [iceberg]

2024-10-28 Thread via GitHub
Fokko closed issue #10480: Flaky test due to failing to delete temp directory URL: https://github.com/apache/iceberg/issues/10480

Re: [PR] Build: Bump com.azure:azure-sdk-bom from 1.2.25 to 1.2.28 [iceberg]

2024-10-28 Thread via GitHub
Fokko merged PR #11267: URL: https://github.com/apache/iceberg/pull/11267

Re: [PR] Flink: Add RowConverter for Iceberg Source [iceberg]

2024-10-28 Thread via GitHub
abharath9 commented on code in PR #11301: URL: https://github.com/apache/iceberg/pull/11301#discussion_r1819647849 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceBoundedRow.java: ## @@ -0,0 +1,208 @@ +/* + * Licensed to the Apache Software Fo

Re: [PR] AWS: Refresh vended credentials [iceberg]

2024-10-28 Thread via GitHub
jackye1995 commented on code in PR #11389: URL: https://github.com/apache/iceberg/pull/11389#discussion_r1819620667 ## aws/src/main/java/org/apache/iceberg/aws/AwsClientProperties.java: ## @@ -66,21 +67,39 @@ public class AwsClientProperties implements Serializable { */ p

Re: [PR] AWS: Refresh vended credentials [iceberg]

2024-10-28 Thread via GitHub
jackye1995 commented on code in PR #11389: URL: https://github.com/apache/iceberg/pull/11389#discussion_r1819619568 ## aws/src/main/java/org/apache/iceberg/aws/s3/VendedCredentialsProvider.java: ## @@ -0,0 +1,142 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] fix: list_tables method in glue catalog now only return tables. [iceberg-python]

2024-10-28 Thread via GitHub
omkenge commented on PR #1258: URL: https://github.com/apache/iceberg-python/pull/1258#issuecomment-2442388133 Hi Team, @kevinjqliu @sungwy While working, I faced this issue. ### Error Details: When views are included in the output of list_tables and subsequently processed, the ab

[PR] fix: list_tables method in glue catalog now only return tables. [iceberg-python]

2024-10-28 Thread via GitHub
omkenge opened a new pull request, #1258: URL: https://github.com/apache/iceberg-python/pull/1258 ### Description: This pull request addresses an issue where the list_tables function was returning views alongside Iceberg tables. Since views lack the table_type property or have it set di
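The filtering idea described in this PR summary can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: it assumes Glue table entries arrive as plain dicts with a `Name` key and an optional `Parameters` map, and that Iceberg tables are marked by a `table_type` parameter equal to `ICEBERG` (compared case-insensitively). The helper names `is_iceberg_table` and `iceberg_table_names` are hypothetical.

```python
from typing import Any, Dict, List

# Assumed marker value for Iceberg tables; compared case-insensitively.
ICEBERG_TABLE_TYPE = "iceberg"


def is_iceberg_table(glue_table: Dict[str, Any]) -> bool:
    """Return True only when the entry carries table_type == ICEBERG.

    Views (and other non-Iceberg entries) either have no `Parameters`
    map, no `table_type` key, or a different value, so they are rejected.
    """
    parameters = glue_table.get("Parameters") or {}
    return str(parameters.get("table_type", "")).lower() == ICEBERG_TABLE_TYPE


def iceberg_table_names(glue_tables: List[Dict[str, Any]]) -> List[str]:
    """Keep only the names of entries that look like Iceberg tables."""
    return [table["Name"] for table in glue_tables if is_iceberg_table(table)]


if __name__ == "__main__":
    # Hypothetical sample entries, shaped like a Glue table listing.
    sample = [
        {"Name": "orders", "Parameters": {"table_type": "ICEBERG"}},
        {"Name": "orders_view", "TableType": "VIRTUAL_VIEW", "Parameters": {}},
        {"Name": "legacy", "Parameters": {"table_type": "HIVE"}},
    ]
    print(iceberg_table_names(sample))
```

Filtering on the parameter rather than on the entry's name keeps the check independent of naming conventions; whether the real catalog code does exactly this is not confirmed by the truncated summary above.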

Re: [PR] Doc: Update rewrite data files spark procedure [iceberg]

2024-10-28 Thread via GitHub
dramaticlly commented on code in PR #11396: URL: https://github.com/apache/iceberg/pull/11396#discussion_r1819586127 ## docs/docs/spark-procedures.md: ## @@ -402,7 +403,8 @@ Iceberg can compact data files in parallel using Spark with the `rewriteDataFile | `rewrite-all` | fals

Re: [PR] Exclude views from list tables for AWS Glue Catalog [iceberg-python]

2024-10-28 Thread via GitHub
omkenge commented on PR #1257: URL: https://github.com/apache/iceberg-python/pull/1257#issuecomment-2442362373 closed due to mistake

Re: [PR] Exclude views from list tables for AWS Glue Catalog [iceberg-python]

2024-10-28 Thread via GitHub
omkenge closed pull request #1257: Exclude views from list tables for AWS Glue Catalog URL: https://github.com/apache/iceberg-python/pull/1257

Re: [PR] Data: Add partition stats writer and reader [iceberg]

2024-10-28 Thread via GitHub
RussellSpitzer commented on PR #11216: URL: https://github.com/apache/iceberg/pull/11216#issuecomment-2442317326 Moving out of 1.7.0 since we still have a bit of discussion here

Re: [PR] Add `all_manifests` metadata table with tests [iceberg-python]

2024-10-28 Thread via GitHub
soumya-ghosh commented on code in PR #1241: URL: https://github.com/apache/iceberg-python/pull/1241#discussion_r1819462207 ## pyiceberg/table/inspect.py: ## @@ -405,13 +419,19 @@ def _partition_summaries_to_rows( "partition_summaries": _partition_summaries_

[PR] Exclude views from list tables for AWS Glue Catalog [iceberg-python]

2024-10-28 Thread via GitHub
omkenge opened a new pull request, #1257: URL: https://github.com/apache/iceberg-python/pull/1257 ### Description: This pull request addresses an issue where the list_tables function was returning views alongside Iceberg tables. Since views lack the table_type property or have it set di

Re: [PR] Doc: Update rewrite data files spark procedure [iceberg]

2024-10-28 Thread via GitHub
dramaticlly commented on code in PR #11396: URL: https://github.com/apache/iceberg/pull/11396#discussion_r1819559913 ## docs/docs/spark-procedures.md: ## @@ -402,7 +403,8 @@ Iceberg can compact data files in parallel using Spark with the `rewriteDataFile | `rewrite-all` | fals

Re: [PR] Doc: Update rewrite data files spark procedure [iceberg]

2024-10-28 Thread via GitHub
dramaticlly commented on code in PR #11396: URL: https://github.com/apache/iceberg/pull/11396#discussion_r1819552738 ## docs/docs/spark-procedures.md: ## @@ -402,7 +403,8 @@ Iceberg can compact data files in parallel using Spark with the `rewriteDataFile | `rewrite-all` | fals

Re: [PR] AWS: Refresh vended credentials [iceberg]

2024-10-28 Thread via GitHub
ChaladiMohanVamsi commented on code in PR #11389: URL: https://github.com/apache/iceberg/pull/11389#discussion_r1819021581 ## aws/src/main/java/org/apache/iceberg/aws/AwsClientProperties.java: ## @@ -136,6 +156,12 @@ public void applyClientCredentialConfigurations(T b @Supp

Re: [PR] Doc: Update rewrite data files spark procedure [iceberg]

2024-10-28 Thread via GitHub
RussellSpitzer commented on code in PR #11396: URL: https://github.com/apache/iceberg/pull/11396#discussion_r1819529998 ## docs/docs/spark-procedures.md: ## @@ -402,7 +403,8 @@ Iceberg can compact data files in parallel using Spark with the `rewriteDataFile | `rewrite-all` | f

Re: [PR] Doc: Update rewrite data files spark procedure [iceberg]

2024-10-28 Thread via GitHub
RussellSpitzer commented on code in PR #11396: URL: https://github.com/apache/iceberg/pull/11396#discussion_r1819530496 ## docs/docs/spark-procedures.md: ## @@ -402,7 +403,8 @@ Iceberg can compact data files in parallel using Spark with the `rewriteDataFile | `rewrite-all` | f

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-28 Thread via GitHub
aokolnychyi commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1819529635 ## core/src/main/java/org/apache/iceberg/deletes/RoaringPositionBitmap.java: ## @@ -0,0 +1,317 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Doc: Update rewrite data files spark procedure [iceberg]

2024-10-28 Thread via GitHub
RussellSpitzer commented on code in PR #11396: URL: https://github.com/apache/iceberg/pull/11396#discussion_r1819529086 ## docs/docs/spark-procedures.md: ## @@ -402,7 +403,8 @@ Iceberg can compact data files in parallel using Spark with the `rewriteDataFile | `rewrite-all` | f
