Re: [PR] Hive catalog: Add retry logic for hive locking [iceberg-python]

2024-05-08 Thread via GitHub
frankliee commented on code in PR #701: URL: https://github.com/apache/iceberg-python/pull/701#discussion_r1594885592 ## pyiceberg/catalog/hive.py: ## @@ -111,6 +122,15 @@ HIVE2_COMPATIBLE = "hive.hive2-compatible" HIVE2_COMPATIBLE_DEFAULT = False +LOCK_CHECK_MIN_WAIT_TIME =

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-05-08 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1594829408 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/BatchReadConf.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

Re: [PR] Build: Bump getdaft from 0.2.23 to 0.2.24 [iceberg-python]

2024-05-08 Thread via GitHub
Fokko merged PR #721: URL: https://github.com/apache/iceberg-python/pull/721 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [PR] Build: Bump mkdocs-material from 9.5.20 to 9.5.21 [iceberg-python]

2024-05-08 Thread via GitHub
Fokko merged PR #719: URL: https://github.com/apache/iceberg-python/pull/719

[PR] Build: Bump getdaft from 0.2.23 to 0.2.24 [iceberg-python]

2024-05-08 Thread via GitHub
dependabot[bot] opened a new pull request, #721: URL: https://github.com/apache/iceberg-python/pull/721 Bumps [getdaft](https://github.com/Eventual-Inc/Daft) from 0.2.23 to 0.2.24. Release notes Sourced from [getdaft's releases](https://github.com/Eventual-Inc/Daft/releases).

[PR] Build: Bump ray from 2.9.2 to 2.21.0 [iceberg-python]

2024-05-08 Thread via GitHub
dependabot[bot] opened a new pull request, #720: URL: https://github.com/apache/iceberg-python/pull/720 Bumps [ray](https://github.com/ray-project/ray) from 2.9.2 to 2.21.0. Release notes Sourced from [ray's releases](https://github.com/ray-project/ray/releases). Ray-2.21.0

[PR] Build: Bump mkdocs-material from 9.5.20 to 9.5.21 [iceberg-python]

2024-05-08 Thread via GitHub
dependabot[bot] opened a new pull request, #719: URL: https://github.com/apache/iceberg-python/pull/719 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.20 to 9.5.21. Release notes Sourced from https://github.com/squidfunk/mkdocs-material/releases mk

Re: [PR] fix (manifest-list): added serde aliases to support both forms conventions [iceberg-rust]

2024-05-08 Thread via GitHub
Fokko commented on code in PR #365: URL: https://github.com/apache/iceberg-rust/pull/365#discussion_r1594775872 ## crates/iceberg/src/spec/manifest_list.rs: ## @@ -802,6 +802,9 @@ pub(super) mod _serde { pub key_metadata: Option, } +// Aliases were added to f

Re: [PR] fix (manifest-list): added serde aliases to support both forms conventions [iceberg-rust]

2024-05-08 Thread via GitHub
a-agmon commented on PR #365: URL: https://github.com/apache/iceberg-rust/pull/365#issuecomment-2101597092 Added test files and test. Thanks @sdd

Re: [PR] Support Appends with TimeTransform Partitions [iceberg-python]

2024-05-08 Thread via GitHub
syun64 commented on code in PR #703: URL: https://github.com/apache/iceberg-python/pull/703#discussion_r1594758142 ## pyiceberg/transforms.py: ## @@ -515,6 +583,19 @@ def __repr__(self) -> str: """Return the string representation of the HourTransform class."""

Re: [PR] Build: Bump flask-cors from 4.0.0 to 4.0.1 [iceberg-python]

2024-05-08 Thread via GitHub
Fokko merged PR #718: URL: https://github.com/apache/iceberg-python/pull/718

[PR] Update site to 1.5.2 docs [iceberg]

2024-05-08 Thread via GitHub
amogh-jahagirdar opened a new pull request, #10291: URL: https://github.com/apache/iceberg/pull/10291 (no comment)

Re: [PR] Support partial deletes [iceberg-python]

2024-05-08 Thread via GitHub
Fokko commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1594704486 ## pyiceberg/table/__init__.py: ## @@ -434,6 +458,9 @@ def overwrite( if table_arrow_schema != df.schema: df = df.cast(table_arrow_schema) +

Re: [PR] Support partial deletes [iceberg-python]

2024-05-08 Thread via GitHub
Fokko commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1594703680 ## pyiceberg/table/__init__.py: ## @@ -434,6 +458,9 @@ def overwrite( if table_arrow_schema != df.schema: df = df.cast(table_arrow_schema) +

[PR] Build: Bump flask-cors from 4.0.0 to 4.0.1 [iceberg-python]

2024-05-08 Thread via GitHub
dependabot[bot] opened a new pull request, #718: URL: https://github.com/apache/iceberg-python/pull/718 Bumps [flask-cors](https://github.com/corydolphin/flask-cors) from 4.0.0 to 4.0.1. Release notes Sourced from https://github.com/corydolphin/flask-cors/releases flask-cors's r

Re: [PR] Hive catalog: Add retry logic for hive locking [iceberg-python]

2024-05-08 Thread via GitHub
Fokko commented on code in PR #701: URL: https://github.com/apache/iceberg-python/pull/701#discussion_r1594699293 ## pyiceberg/catalog/hive.py: ## @@ -111,6 +122,15 @@ HIVE2_COMPATIBLE = "hive.hive2-compatible" HIVE2_COMPATIBLE_DEFAULT = False +LOCK_CHECK_MIN_WAIT_TIME = "lo

Re: [PR] Build: Bump sqlalchemy from 2.0.29 to 2.0.30 [iceberg-python]

2024-05-08 Thread via GitHub
Fokko merged PR #712: URL: https://github.com/apache/iceberg-python/pull/712

Re: [PR] Build: Bump coverage from 7.5.0 to 7.5.1 [iceberg-python]

2024-05-08 Thread via GitHub
Fokko merged PR #713: URL: https://github.com/apache/iceberg-python/pull/713

Re: [PR] Build: Bump mkdocstrings from 0.25.0 to 0.25.1 [iceberg-python]

2024-05-08 Thread via GitHub
Fokko merged PR #715: URL: https://github.com/apache/iceberg-python/pull/715

Re: [PR] Build: Bump tenacity from 8.2.3 to 8.3.0 [iceberg-python]

2024-05-08 Thread via GitHub
Fokko merged PR #714: URL: https://github.com/apache/iceberg-python/pull/714

Re: [PR] Support Appends with TimeTransform Partitions [iceberg-python]

2024-05-08 Thread via GitHub
benihildebrand commented on code in PR #703: URL: https://github.com/apache/iceberg-python/pull/703#discussion_r1594581596 ## pyiceberg/transforms.py: ## @@ -515,6 +583,19 @@ def __repr__(self) -> str: """Return the string representation of the HourTransform class."""

Re: [I] `parquet_path_to_id_mapping` generates incorrect path for List types [iceberg-python]

2024-05-08 Thread via GitHub
cgbur commented on issue #716: URL: https://github.com/apache/iceberg-python/issues/716#issuecomment-2101251170 Ah, confusingly there appears to be writer differences that cause the issue. My Rust pyarrow implementation matches when polars has `pyarrow=True`. ```python import polar

Re: [I] `parquet_path_to_id_mapping` generates incorrect path for List types [iceberg-python]

2024-05-08 Thread via GitHub
cgbur commented on issue #716: URL: https://github.com/apache/iceberg-python/issues/716#issuecomment-2101184351 Here is a complete example recreating the error. Here I am using polars to make the table which results in the same schema that I am producing with pyarrow. ```python i

[PR] Fix struct evolution default value rule list [iceberg]

2024-05-08 Thread via GitHub
sfc-gh-dmetzgar opened a new pull request, #10290: URL: https://github.com/apache/iceberg/pull/10290 The documentation site does not show the rules as a list because some markdown readers need an empty line between a paragraph and a list. The list on the docs site currently looks like

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-08 Thread via GitHub
rodmeneses commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1594360116 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/IcebergFlinkManifestUtil.java: ## @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Softw

Re: [PR] #10275 - fix NullPointerException [iceberg]

2024-05-08 Thread via GitHub
slessard commented on code in PR #10284: URL: https://github.com/apache/iceberg/pull/10284#discussion_r1594328291 ## build.gradle: ## @@ -840,6 +840,8 @@ project(':iceberg-arrow') { exclude group: 'org.codehaus.jackson' } +testImplementation 'org.apache.spark:s

Re: [PR] #10275 - fix NullPointerException [iceberg]

2024-05-08 Thread via GitHub
slessard commented on code in PR #10284: URL: https://github.com/apache/iceberg/pull/10284#discussion_r1594303461 ## arrow/src/test/java/org/apache/iceberg/arrow/vectorized/GenericArrowVectorAccessorFactoryTest.java: ## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Fo

Re: [PR] Core: Introduce AuthConfig [iceberg]

2024-05-08 Thread via GitHub
flyrain commented on code in PR #10161: URL: https://github.com/apache/iceberg/pull/10161#discussion_r1594289258 ## aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java: ## @@ -213,12 +214,13 @@ private AuthSession authSession() { e

Re: [PR] Add ManifestFile Stats in snapshot summary. [iceberg]

2024-05-08 Thread via GitHub
ajantha-bhat commented on code in PR #10246: URL: https://github.com/apache/iceberg/pull/10246#discussion_r1594265875 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -156,6 +156,8 @@ public List apply(TableMetadata base, Snapshot snapshot) { manifests.add

Re: [PR] feat: Convert predicate to arrow filter and push down to parquet reader [iceberg-rust]

2024-05-08 Thread via GitHub
viirya commented on code in PR #295: URL: https://github.com/apache/iceberg-rust/pull/295#discussion_r1594186293 ## crates/iceberg/src/scan.rs: ## @@ -689,4 +720,90 @@ mod tests { let int64_arr = col2.as_any().downcast_ref::().unwrap(); assert_eq!(int64_arr.val

Re: [PR] Add support to use regional endpoints for STS client while using assume role [iceberg]

2024-05-08 Thread via GitHub
munendrasn commented on PR #10289: URL: https://github.com/apache/iceberg/pull/10289#issuecomment-2100727870 @jackye1995 @amogh-jahagirdar @nastra Please review. For testing, couldn't find other way to validate, please suggest if the current approach doesn't work

[PR] Add support to use regional endpoints for STS client while using assume role [iceberg]

2024-05-08 Thread via GitHub
munendrasn opened a new pull request, #10289: URL: https://github.com/apache/iceberg/pull/10289 STS uses the global endpoint by default, but AWS recommends using regional [endpoints](https://aws.amazon.com/blogs/security/how-to-use-regional-aws-sts-endpoints/). The PR introduces new prope

Re: [I] "Iceberg.engine.hive.enabled" Conf is not honouring for HIVE CATALOG [iceberg]

2024-05-08 Thread via GitHub
shivjha30 commented on issue #10286: URL: https://github.com/apache/iceberg/issues/10286#issuecomment-2100717247 @pvary then, for a cluster level if we set iceberg.engine.hive.enabled as false and we don’t pass the table properties (hive.engine.enabled as false), even then we are creating the

Re: [PR] AWS: Retain Glue Catalog table description after updating Iceberg table [iceberg]

2024-05-08 Thread via GitHub
aajisaka commented on PR #10199: URL: https://github.com/apache/iceberg/pull/10199#issuecomment-2100651509 Hi @geruh @amogh-jahagirdar would you review the latest change?

Re: [PR] Support partial deletes [iceberg-python]

2024-05-08 Thread via GitHub
Fokko commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1594058803 ## tests/integration/test_deletes.py: ## @@ -0,0 +1,257 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [PR] Support partial deletes [iceberg-python]

2024-05-08 Thread via GitHub
Fokko commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1594057885 ## pyiceberg/table/__init__.py: ## @@ -2897,12 +2959,161 @@ def _commit(self) -> UpdatesAndRequirements: ), ( AssertTable

Re: [PR] Support partial deletes [iceberg-python]

2024-05-08 Thread via GitHub
Fokko commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1594055639 ## pyiceberg/table/__init__.py: ## @@ -443,6 +468,54 @@ def overwrite( for data_file in data_files: update_snapshot.append_data

Re: [PR] Support partial deletes [iceberg-python]

2024-05-08 Thread via GitHub
Fokko commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1594053750 ## pyiceberg/table/__init__.py: ## @@ -443,6 +468,54 @@ def overwrite( for data_file in data_files: update_snapshot.append_data

Re: [PR] Support partial deletes [iceberg-python]

2024-05-08 Thread via GitHub
Fokko commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1594048917 ## pyiceberg/table/__init__.py: ## @@ -292,7 +303,13 @@ def _apply(self, updates: Tuple[TableUpdate, ...], requirements: Tuple[TableRequ requirement.va

Re: [PR] #10275 - fix NullPointerException [iceberg]

2024-05-08 Thread via GitHub
nastra commented on code in PR #10284: URL: https://github.com/apache/iceberg/pull/10284#discussion_r1594003789 ## arrow/src/test/java/org/apache/iceberg/arrow/vectorized/GenericArrowVectorAccessorFactoryTest.java: ## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] #10275 - fix NullPointerException [iceberg]

2024-05-08 Thread via GitHub
nastra commented on code in PR #10284: URL: https://github.com/apache/iceberg/pull/10284#discussion_r1594002633 ## build.gradle: ## @@ -840,6 +840,8 @@ project(':iceberg-arrow') { exclude group: 'org.codehaus.jackson' } +testImplementation 'org.apache.spark:spa

Re: [PR] Support special chars in S3URI [iceberg]

2024-05-08 Thread via GitHub
dimas-b commented on PR #10283: URL: https://github.com/apache/iceberg/pull/10283#issuecomment-2100455393 @jackye1995 : Would you think about special chars in S3 URIs? I'd appreciate your insight. Thanks!

Re: [I] `pylintrc` still needed? [iceberg-python]

2024-05-08 Thread via GitHub
Fokko commented on issue #666: URL: https://github.com/apache/iceberg-python/issues/666#issuecomment-2100411301 No, the `pylintrc` can go 👍

Re: [I] `parquet_path_to_id_mapping` generates incorrect path for List types [iceberg-python]

2024-05-08 Thread via GitHub
Fokko commented on issue #716: URL: https://github.com/apache/iceberg-python/issues/716#issuecomment-2100409045 @cgbur Thanks for raising this. Could you share the stack trace that you're seeing? I tried to reproduce it, but it works on my end: ```python def test_data_file_s

Re: [I] Spark rewrite Files Action OOM [iceberg]

2024-05-08 Thread via GitHub
manuzhang commented on issue #10054: URL: https://github.com/apache/iceberg/issues/10054#issuecomment-2100175845 > I have implemented a disk-based map to solve this problem. Is this what Iceberg expects? If so, I will submit the code. @Zhanxiao-Ma I think it will be valuable to the commun

Re: [PR] fix (manifest-list): added serde aliases to support both forms conventions [iceberg-rust]

2024-05-08 Thread via GitHub
a-agmon commented on PR #365: URL: https://github.com/apache/iceberg-rust/pull/365#issuecomment-2100100759 > Looks good, at least until we move to field IDs. But could we add a test data file to check the aliases work? 👍🏼 Absolutely. Thanks

Re: [PR] fix (manifest-list): added serde aliases to support both forms conventions [iceberg-rust]

2024-05-08 Thread via GitHub
sdd commented on PR #365: URL: https://github.com/apache/iceberg-rust/pull/365#issuecomment-2099979115 Looks good, at least until we move to field IDs. But could we add a test data file to check the aliases work? 👍🏼

Re: [PR] [Spec] Add Iceberg Materialized View Spec [iceberg]

2024-05-08 Thread via GitHub
wmoustafa commented on code in PR #10280: URL: https://github.com/apache/iceberg/pull/10280#discussion_r1593549087 ## format/materialized-view-spec.md: ## @@ -0,0 +1,131 @@ + + +# Iceberg Materialized View Spec + +## Background and Motivation +Iceberg views are a powerful tool t

Re: [PR] [Spec] Add Iceberg Materialized View Spec [iceberg]

2024-05-08 Thread via GitHub
wmoustafa commented on code in PR #10280: URL: https://github.com/apache/iceberg/pull/10280#discussion_r1593545018 ## format/materialized-view-spec.md: ## @@ -0,0 +1,131 @@ + + +# Iceberg Materialized View Spec + +## Background and Motivation +Iceberg views are a powerful tool t

[PR] Add manifests metadata table [iceberg-python]

2024-05-08 Thread via GitHub
geruh opened a new pull request, #717: URL: https://github.com/apache/iceberg-python/pull/717 This PR adds the manifests metadata table to the existing inspect logic for Iceberg tables as listed in #511. The manifests metadata table in Iceberg shows the current file manifests for a given table

Re: [PR] [Spec] Add Iceberg Materialized View Spec [iceberg]

2024-05-08 Thread via GitHub
ajantha-bhat commented on code in PR #10280: URL: https://github.com/apache/iceberg/pull/10280#discussion_r1593522217 ## format/materialized-view-spec.md: ## @@ -0,0 +1,131 @@ + + +# Iceberg Materialized View Spec + +## Background and Motivation +Iceberg views are a powerful too

Re: [PR] Spark Action to Analyze table [iceberg]

2024-05-08 Thread via GitHub
ajantha-bhat commented on code in PR #10288: URL: https://github.com/apache/iceberg/pull/10288#discussion_r1593501067 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/AnalyzeTableSparkAction.java: ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] Iceberg.engine.hive.enabled Conf is not honouring for HIVE CATALOG #10286 [iceberg]

2024-05-08 Thread via GitHub
pvary commented on PR #10287: URL: https://github.com/apache/iceberg/pull/10287#issuecomment-2099893087 This is the expected behaviour. See: https://github.com/apache/iceberg/pull/1495#discussion_r1582544488 Could you please describe your use-case?

Re: [I] "Iceberg.engine.hive.enabled" Conf is not honouring for HIVE CATALOG [iceberg]

2024-05-08 Thread via GitHub
pvary commented on issue #10286: URL: https://github.com/apache/iceberg/issues/10286#issuecomment-2099890479 This is intentional. If you are using [HiveIcebergMetaHook](https://github.com/apache/iceberg/blob/ed0959257cba02f378f7097d81cecaaaef9fa43f/mr/src/main/java/org/apache/iceberg/mr/h