Re: [PR] Add Python Release Action to publish `pyiceberg_core` dist to Pypi [iceberg-rust]

2024-11-22 Thread via GitHub
Xuanwo commented on code in PR #705: URL: https://github.com/apache/iceberg-rust/pull/705#discussion_r1855140601 ## .github/workflows/release_python.yml: ## @@ -0,0 +1,137 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [PR] Spark 3.3: IcebergSource extends SessionConfigSupport [iceberg]

2024-11-22 Thread via GitHub
szehon-ho commented on PR #11625: URL: https://github.com/apache/iceberg/pull/11625#issuecomment-2495374411 Merged thanks @pan3793 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

Re: [PR] Spark 3.3: IcebergSource extends SessionConfigSupport [iceberg]

2024-11-22 Thread via GitHub
szehon-ho merged PR #11625: URL: https://github.com/apache/iceberg/pull/11625

Re: [PR] Spark 3.5: IcebergSource extends SessionConfigSupport [iceberg]

2024-11-22 Thread via GitHub
szehon-ho commented on PR #11624: URL: https://github.com/apache/iceberg/pull/11624#issuecomment-2495373788 Merged, thanks @pan3793 and @nastra for review

Re: [PR] Spark 3.5: IcebergSource extends SessionConfigSupport [iceberg]

2024-11-22 Thread via GitHub
szehon-ho merged PR #11624: URL: https://github.com/apache/iceberg/pull/11624

Re: [PR] Spark 3.4: IcebergSource extends SessionConfigSupport [iceberg]

2024-11-22 Thread via GitHub
szehon-ho merged PR #7732: URL: https://github.com/apache/iceberg/pull/7732

Re: [PR] Spark 3.4: IcebergSource extends SessionConfigSupport [iceberg]

2024-11-22 Thread via GitHub
szehon-ho commented on PR #7732: URL: https://github.com/apache/iceberg/pull/7732#issuecomment-2495373599 Merged, thanks @pan3793

Re: [PR] Test both "new" Flink Avro planned reader and "deprecated" Avro reader [iceberg]

2024-11-22 Thread via GitHub
jbonofre commented on code in PR #11430: URL: https://github.com/apache/iceberg/pull/11430#discussion_r1855135884 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/data/AbstractTestFlinkAvroReaderWriter.java: ## @@ -80,20 +77,18 @@ private void writeAndValidate(Schema

Re: [PR] Document procedure for stats collection [iceberg]

2024-11-22 Thread via GitHub
szehon-ho commented on code in PR #11606: URL: https://github.com/apache/iceberg/pull/11606#discussion_r1855135154 ## docs/docs/spark-procedures.md: ## @@ -936,3 +936,40 @@ as an `UPDATE_AFTER` image, resulting in the following pre/post update images: |-||-

Re: [PR] Docs: Mention look-free requires HIVE-28121 for MySQL/MariaDB-based HMS [iceberg]

2024-11-22 Thread via GitHub
pvary commented on PR #11631: URL: https://github.com/apache/iceberg/pull/11631#issuecomment-2495365499 Thanks for the clarification @pan3793!

Re: [PR] Docs: Mention look-free requires HIVE-28121 for MySQL/MariaDB-based HMS [iceberg]

2024-11-22 Thread via GitHub
pvary merged PR #11631: URL: https://github.com/apache/iceberg/pull/11631

Re: [PR] Test both "new" Flink Avro planned reader and "deprecated" Avro reader [iceberg]

2024-11-22 Thread via GitHub
pvary commented on code in PR #11430: URL: https://github.com/apache/iceberg/pull/11430#discussion_r1855131415 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/data/AbstractTestFlinkAvroReaderWriter.java: ## @@ -80,20 +77,18 @@ private void writeAndValidate(Schema sch

Re: [PR] Spark : Derive Stats From Manifest on the Fly [iceberg]

2024-11-22 Thread via GitHub
saitharun15 commented on code in PR #11615: URL: https://github.com/apache/iceberg/pull/11615#discussion_r1855092790 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -194,10 +205,40 @@ protected Statistics estimateStatistics(Snapshot snaps

Re: [PR] Spark : Derive Stats From Manifest on the Fly [iceberg]

2024-11-22 Thread via GitHub
saitharun15 commented on code in PR #11615: URL: https://github.com/apache/iceberg/pull/11615#discussion_r1854320423 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -194,10 +205,40 @@ protected Statistics estimateStatistics(Snapshot snaps

Re: [I] `catalog.table_exists()` returns 'False' when table exists in Polaris catalog [iceberg-python]

2024-11-22 Thread via GitHub
JasperHG90 commented on issue #1363: URL: https://github.com/apache/iceberg-python/issues/1363#issuecomment-2495318937 I've annotated this method with what I see in the debugger ```python @retry(**_RETRY_ARGS) def table_exists(self, identifier: Union[str, Identifier]) -> boo

Re: [PR] fix `KeyError` raised by `add_files` when parquet file doe not have column stats [iceberg-python]

2024-11-22 Thread via GitHub
binayakd commented on PR #1354: URL: https://github.com/apache/iceberg-python/pull/1354#issuecomment-2495238970 Pushed a change to fix the python 3.9 compatibility and updated the test based on the comment, @kevinjqliu. Thanks!

Re: [PR] fix `KeyError` raised by `add_files` when parquet file doe not have column stats [iceberg-python]

2024-11-22 Thread via GitHub
binayakd commented on code in PR #1354: URL: https://github.com/apache/iceberg-python/pull/1354#discussion_r1855024655 ## tests/io/test_pyarrow_stats.py: ## @@ -681,6 +685,39 @@ def test_stats_types(table_schema_nested: Schema) -> None: ] +def test_read_missing_statisti

Re: [PR] fix `KeyError` raised by `add_files` when parquet file doe not have column stats [iceberg-python]

2024-11-22 Thread via GitHub
binayakd commented on code in PR #1354: URL: https://github.com/apache/iceberg-python/pull/1354#discussion_r1855021256 ## tests/io/test_pyarrow_stats.py: ## @@ -681,6 +685,39 @@ def test_stats_types(table_schema_nested: Schema) -> None: ] +def test_read_missing_statisti

Re: [PR] fix `KeyError` raised by `add_files` when parquet file doe not have column stats [iceberg-python]

2024-11-22 Thread via GitHub
binayakd commented on code in PR #1354: URL: https://github.com/apache/iceberg-python/pull/1354#discussion_r1855019099 ## tests/io/test_pyarrow_stats.py: ## @@ -681,6 +685,39 @@ def test_stats_types(table_schema_nested: Schema) -> None: ] +def test_read_missing_statisti

Re: [PR] Iceberg Kafka Connect :: Writer Per Topic Partition Design [iceberg]

2024-11-22 Thread via GitHub
github-actions[bot] closed pull request #11290: Iceberg Kafka Connect :: Writer Per Topic Partition Design URL: https://github.com/apache/iceberg/pull/11290

Re: [PR] Iceberg Kafka Connect :: Writer Per Topic Partition Design [iceberg]

2024-11-22 Thread via GitHub
github-actions[bot] commented on PR #11290: URL: https://github.com/apache/iceberg/pull/11290#issuecomment-2495133613 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] OpenAPI: Define REST Catalog models for Snapshot Production [iceberg]

2024-11-22 Thread via GitHub
github-actions[bot] commented on PR #11287: URL: https://github.com/apache/iceberg/pull/11287#issuecomment-2495133597 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] More accurate estimate on parquet row groups size [iceberg]

2024-11-22 Thread via GitHub
github-actions[bot] commented on PR #11258: URL: https://github.com/apache/iceberg/pull/11258#issuecomment-2495133577 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Spark : Derive Stats From Manifest on the Fly [iceberg]

2024-11-22 Thread via GitHub
guykhazma commented on code in PR #11615: URL: https://github.com/apache/iceberg/pull/11615#discussion_r1854912088 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -248,6 +296,88 @@ protected Statistics estimateStatistics(Snapshot snapshot)

Re: [PR] Spec: add variant type [iceberg]

2024-11-22 Thread via GitHub
emkornfield commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1854866077 ## format/spec.md: ## @@ -1154,6 +1169,7 @@ Maps with non-string keys must use an array representation with the `map` logica |**`struct`**|`record`|| |**`list`

Re: [PR] Document procedure for stats collection [iceberg]

2024-11-22 Thread via GitHub
karuppayya commented on code in PR #11606: URL: https://github.com/apache/iceberg/pull/11606#discussion_r1854799006 ## docs/docs/spark-procedures.md: ## @@ -936,3 +936,40 @@ as an `UPDATE_AFTER` image, resulting in the following pre/post update images: |-||

[PR] Bump pydantic from 2.10.0 to 2.10.1 [iceberg-python]

2024-11-22 Thread via GitHub
dependabot[bot] opened a new pull request, #1364: URL: https://github.com/apache/iceberg-python/pull/1364 Bumps [pydantic](https://github.com/pydantic/pydantic) from 2.10.0 to 2.10.1. Release notes sourced from pydantic's releases (https://github.com/pydantic/pydantic/releases).

Re: [I] `catalog.load_table` raises Invalid JSON error [iceberg-python]

2024-11-22 Thread via GitHub
kevinjqliu commented on issue #1328: URL: https://github.com/apache/iceberg-python/issues/1328#issuecomment-2494993070 if it's empty on read, it's most likely related to a permission issue. Here's something you can run to debug. ``` metadata_location = "s3://" io = cat

Re: [I] `catalog.table_exists()` returns 'False' when table exists in Polaris catalog [iceberg-python]

2024-11-22 Thread via GitHub
kevinjqliu commented on issue #1363: URL: https://github.com/apache/iceberg-python/issues/1363#issuecomment-2494980924 Thank you for bringing this up. It's been on my todo list to investigate this as part of https://github.com/apache/iceberg-python/issues/1018#issuecomment-2471827257

[I] java.lang.IllegalStateException: Connection pool shut down in Spark [iceberg]

2024-11-22 Thread via GitHub
davseitsev opened a new issue, #11633: URL: https://github.com/apache/iceberg/issues/11633 ### Apache Iceberg version 1.7.0 (latest release) ### Query engine Spark ### Please describe the bug 🐞 We have a maintenance job which run all necessary SparkActions o

[I] `table_exists()` method does not work properly with Polaris catalog [iceberg-python]

2024-11-22 Thread via GitHub
JasperHG90 opened a new issue, #1363: URL: https://github.com/apache/iceberg-python/issues/1363 ### Apache Iceberg version 0.8.0 (latest release) ### Please describe the bug 🐞 Hi 👋 I see quite a few issues/PRs centered around the `table_exists()` method when usin

Re: [PR] feat: support append data file and add e2e test [iceberg-rust]

2024-11-22 Thread via GitHub
ZENOTME commented on code in PR #349: URL: https://github.com/apache/iceberg-rust/pull/349#discussion_r1854528111 ## crates/iceberg/src/spec/manifest_list.rs: ## @@ -106,34 +106,38 @@ impl std::fmt::Debug for ManifestListWriter { impl ManifestListWriter { /// Construct a

Re: [PR] Spec: add variant type [iceberg]

2024-11-22 Thread via GitHub
aihuaxu commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1854522786 ## format/spec.md: ## @@ -1436,6 +1457,7 @@ This serialization scheme is for storing single values as individual binary valu | **`struct`** | Not su

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2024-11-22 Thread via GitHub
mun1r0b0t commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2494482108 Sorry about that. It was my overzealous auto-formatter. Reverted any changes that are not relevant to the PR.

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2024-11-22 Thread via GitHub
bryanck commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2494489489 I feel like we can take a much simpler approach for this. For more complex routing needs, an SMT makes more sense to me.

Re: [PR] Spark : Derive Stats From Manifest on the Fly [iceberg]

2024-11-22 Thread via GitHub
RussellSpitzer commented on code in PR #11615: URL: https://github.com/apache/iceberg/pull/11615#discussion_r1854422282 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -194,10 +205,40 @@ protected Statistics estimateStatistics(Snapshot sn

Re: [PR] Spec: add variant type [iceberg]

2024-11-22 Thread via GitHub
aihuaxu commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1854383036 ## format/spec.md: ## @@ -444,7 +459,7 @@ Partition field IDs must be reused if an existing partition spec contains an equ | Transform name| Description

Re: [PR] Create publish-docker.yml [iceberg]

2024-11-22 Thread via GitHub
kevinjqliu commented on PR #11632: URL: https://github.com/apache/iceberg/pull/11632#issuecomment-2494376136 in addition to pushing the latest, it would be great to publish images tagged with specific releases (1.7/1.8/etc) also, I found the GitHub action that generates the hive docker im

Re: [PR] Spark: Write DVs for V3 MoR tables [iceberg]

2024-11-22 Thread via GitHub
amogh-jahagirdar commented on code in PR #11561: URL: https://github.com/apache/iceberg/pull/11561#discussion_r1854358716 ## core/src/main/java/org/apache/iceberg/io/PartitioningDVWriter.java: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Create publish-docker.yml [iceberg]

2024-11-22 Thread via GitHub
sungwy commented on PR #11632: URL: https://github.com/apache/iceberg/pull/11632#issuecomment-2494376574 > I was waiting for the account creation. But thanks for working on it. I think many of us are eager to get the docker image in the hub to improve our integration tests in the sub

Re: [I] [Bug] Iceberg tables break when they're named any of the metadata table names (e.g. `files`, `history`, `manifests`) [iceberg]

2024-11-22 Thread via GitHub
blakelivingston commented on issue #10550: URL: https://github.com/apache/iceberg/issues/10550#issuecomment-2494335597 Just chiming in that I am also experiencing this same bug using JDBC catalog and Minio. `org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.6.1`. Either 'files' or 'Files'

Re: [I] `catalog.load_table` raises Invalid JSON error [iceberg-python]

2024-11-22 Thread via GitHub
sandcobainer commented on issue #1328: URL: https://github.com/apache/iceberg-python/issues/1328#issuecomment-2494328682 @Fokko I've tried to see if my s3 credentials are the issue, but that doesn't seem to be the issue. Here's the metadata file that i downloaded directly from the S3 bucke

Re: [PR] Spark : Derive Stats From Manifest on the Fly [iceberg]

2024-11-22 Thread via GitHub
saitharun15 commented on code in PR #11615: URL: https://github.com/apache/iceberg/pull/11615#discussion_r1854296151 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -194,10 +205,40 @@ protected Statistics estimateStatistics(Snapshot snaps

Re: [PR] Create publish-docker.yml [iceberg]

2024-11-22 Thread via GitHub
ajantha-bhat commented on PR #11632: URL: https://github.com/apache/iceberg/pull/11632#issuecomment-2494321802 I was waiting for the account creation. But thanks for working on it.

Re: [PR] 1.7.x cherry pick #11526 [iceberg]

2024-11-22 Thread via GitHub
bryanck merged PR #11629: URL: https://github.com/apache/iceberg/pull/11629

Re: [PR] 1.7.x cherry pick #11526 [iceberg]

2024-11-22 Thread via GitHub
bryanck commented on PR #11629: URL: https://github.com/apache/iceberg/pull/11629#issuecomment-2494301022 Thanks for the reviews @nastra @amogh-jahagirdar and @kevinjqliu!

Re: [PR] Added support for lowercase FileFormat for Issue #1340 [iceberg-python]

2024-11-22 Thread via GitHub
kevinjqliu commented on PR #1362: URL: https://github.com/apache/iceberg-python/pull/1362#issuecomment-2494228038 thanks for the contribution! do you mind adding a few test cases to validate the new behavior? in the original issue, the Datafile has to set `file_format` to uppercase.

Re: [PR] Improve documentation for "how to release" [iceberg-python]

2024-11-22 Thread via GitHub
kevinjqliu commented on code in PR #1359: URL: https://github.com/apache/iceberg-python/pull/1359#discussion_r1854262402 ## mkdocs/docs/how-to-release.md: ## @@ -253,19 +332,19 @@ This Python release can be downloaded from: https://pypi.org/project/pyiceberg/< Thanks to everyo

Re: [PR] check mkdocs build strict in CI [iceberg-python]

2024-11-22 Thread via GitHub
kevinjqliu merged PR #1360: URL: https://github.com/apache/iceberg-python/pull/1360

Re: [PR] fix `KeyError` raised by `add_files` when parquet file doe not have column stats [iceberg-python]

2024-11-22 Thread via GitHub
kevinjqliu commented on code in PR #1354: URL: https://github.com/apache/iceberg-python/pull/1354#discussion_r1854238935 ## tests/io/test_pyarrow_stats.py: ## @@ -681,6 +685,39 @@ def test_stats_types(table_schema_nested: Schema) -> None: ] +def test_read_missing_statis

Re: [PR] Create publish-docker.yml [iceberg]

2024-11-22 Thread via GitHub
sungwy commented on code in PR #11632: URL: https://github.com/apache/iceberg/pull/11632#discussion_r1854251098 ## .github/workflows/publish-docker.yml: ## @@ -0,0 +1,38 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements

Re: [PR] Add `list_views` for hive catalog [iceberg-python]

2024-11-22 Thread via GitHub
omkenge closed pull request #1251: Add `list_views` for hive catalog URL: https://github.com/apache/iceberg-python/pull/1251

Re: [PR] Remove Python 3.13 upper bound restriction [iceberg-python]

2024-11-22 Thread via GitHub
kevinjqliu commented on PR #1355: URL: https://github.com/apache/iceberg-python/pull/1355#issuecomment-2494156220 LGTM! thanks for getting to the bottom of the numpy issue

Re: [I] Remove upper Python version constraint to enable early testing [iceberg-python]

2024-11-22 Thread via GitHub
kevinjqliu closed issue #1348: Remove upper Python version constraint to enable early testing URL: https://github.com/apache/iceberg-python/issues/1348

Re: [PR] Remove Python 3.13 upper bound restriction [iceberg-python]

2024-11-22 Thread via GitHub
kevinjqliu merged PR #1355: URL: https://github.com/apache/iceberg-python/pull/1355

Re: [PR] check mkdocs build strict in CI [iceberg-python]

2024-11-22 Thread via GitHub
kevinjqliu closed pull request #1360: check mkdocs build strict in CI URL: https://github.com/apache/iceberg-python/pull/1360

Re: [I] [DISCUSSION] Project Goal [iceberg-cpp]

2024-11-22 Thread via GitHub
zeroshade commented on issue #2: URL: https://github.com/apache/iceberg-cpp/issues/2#issuecomment-2494138383 The biggest drawback to just using the Arrow C++ type system directly is that the mappings aren't perfect for iceberg. Iceberg only has Int32 and Int64 while Arrow has Int 8/16

Re: [I] [DISCUSSION] Project Goal [iceberg-cpp]

2024-11-22 Thread via GitHub
wgtmac commented on issue #2: URL: https://github.com/apache/iceberg-cpp/issues/2#issuecomment-2494091818 I have made a bold suggestion that the type system directly leverage Arrow C++ to avoid reinventing the wheel and to benefit from RecordBatch, Expression and other stuff. I saw that iceb

[I] [DISCUSSION] Project Goal [iceberg-cpp]

2024-11-22 Thread via GitHub
wgtmac opened a new issue, #2: URL: https://github.com/apache/iceberg-cpp/issues/2 I'd like to create this very first issue to collect ideas from people who have an interest. Below are what's in my mind: - Platform: Linux, MacOS, Windows. - Compilers: Clang, GCC, MSVC. - Build:

Re: [PR] Create publish-docker.yml [iceberg]

2024-11-22 Thread via GitHub
Fokko commented on code in PR #11632: URL: https://github.com/apache/iceberg/pull/11632#discussion_r1854121432 ## .github/workflows/publish-docker.yml: ## @@ -0,0 +1,38 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [PR] Add ASF yaml [iceberg-cpp]

2024-11-22 Thread via GitHub
wgtmac commented on PR #1: URL: https://github.com/apache/iceberg-cpp/pull/1#issuecomment-2493976961 Thanks @Fokko and @Xuanwo!

Re: [PR] Add TableMetadataBuilder::assign_current_snapshot_id [iceberg-rust]

2024-11-22 Thread via GitHub
SergeiPatiakin commented on PR #713: URL: https://github.com/apache/iceberg-rust/pull/713#issuecomment-2493974787 Thanks! You are right, I think I'll be able to use `TableMetadataBuilder::set_ref` from https://github.com/apache/iceberg-rust/pull/587 . Closing this PR

Re: [PR] Add TableMetadataBuilder::assign_current_snapshot_id [iceberg-rust]

2024-11-22 Thread via GitHub
SergeiPatiakin closed pull request #713: Add TableMetadataBuilder::assign_current_snapshot_id URL: https://github.com/apache/iceberg-rust/pull/713

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2024-11-22 Thread via GitHub
amogh-jahagirdar commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1854001589 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/AzureSasCredentialRefresher.java: ## @@ -0,0 +1,69 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Docs: Mention look-free requires HIVE-28121 for MySQL/MariaDB-based HMS [iceberg]

2024-11-22 Thread via GitHub
pan3793 commented on PR #11631: URL: https://github.com/apache/iceberg/pull/11631#issuecomment-2493962087 cc @pvary

Re: [PR] Test both "new" Flink Avro planned reader and "deprecated" Avro reader [iceberg]

2024-11-22 Thread via GitHub
jbonofre commented on code in PR #11430: URL: https://github.com/apache/iceberg/pull/11430#discussion_r1854047740 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/data/TestRowProjection.java: ## @@ -61,52 +63,63 @@ private RowData writeAndRead(String desc, Schema wri

Re: [PR] Kafka Connect: Add config to prefix the control consumer group [iceberg]

2024-11-22 Thread via GitHub
hugofriant commented on PR #11599: URL: https://github.com/apache/iceberg/pull/11599#issuecomment-2493945187 I had to fix the format @bryanck. Could you relaunch the workflows? Thx

Re: [PR] Test both "new" Flink Avro planned reader and "deprecated" Avro reader [iceberg]

2024-11-22 Thread via GitHub
jbonofre commented on code in PR #11430: URL: https://github.com/apache/iceberg/pull/11430#discussion_r1854035536 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/data/TestRowProjection.java: ## @@ -61,52 +63,63 @@ private RowData writeAndRead(String desc, Schema wri

Re: [PR] Test both "new" Flink Avro planned reader and "deprecated" Avro reader [iceberg]

2024-11-22 Thread via GitHub
nastra commented on code in PR #11430: URL: https://github.com/apache/iceberg/pull/11430#discussion_r1854033609 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/data/TestRowProjection.java: ## @@ -61,52 +63,63 @@ private RowData writeAndRead(String desc, Schema write

Re: [PR] Docs: Add new blog post to Iceberg Blogs [iceberg]

2024-11-22 Thread via GitHub
nastra merged PR #11627: URL: https://github.com/apache/iceberg/pull/11627

Re: [PR] Test both "new" Flink Avro planned reader and "deprecated" Avro reader [iceberg]

2024-11-22 Thread via GitHub
jbonofre commented on PR #11430: URL: https://github.com/apache/iceberg/pull/11430#issuecomment-2493928687 @nastra if you have time to take a look, thanks! 😄

Re: [PR] Kafka Connect: Add config to prefix the control consumer group [iceberg]

2024-11-22 Thread via GitHub
bryanck commented on PR #11599: URL: https://github.com/apache/iceberg/pull/11599#issuecomment-2493915018 Looks good to me, thanks @hugofriant for the contribution, and @jbonofre for the review!

Re: [PR] Spark 3.4: Correct the two-stage parsing strategy of antlr parser [iceberg]

2024-11-22 Thread via GitHub
pan3793 commented on PR #7734: URL: https://github.com/apache/iceberg/pull/7734#issuecomment-2493910476 @nastra I opened https://github.com/apache/iceberg/pull/11630, pls take a look

Re: [PR] Add TableMetadataBuilder::assign_current_snapshot_id [iceberg-rust]

2024-11-22 Thread via GitHub
c-thiel commented on PR #713: URL: https://github.com/apache/iceberg-rust/pull/713#issuecomment-2493900023 Thanks @SergeiPatiakin for your PR! Assigning the current snapshot id is a bit more complex than what you have in your PR. Here is the extract from the spec: https://iceberg.

Re: [PR] Spark 3.4: Correct the two-stage parsing strategy of antlr parser [iceberg]

2024-11-22 Thread via GitHub
nastra commented on PR #7734: URL: https://github.com/apache/iceberg/pull/7734#issuecomment-2493888048 @pan3793 can you also please backport this to Spark 3.3?

Re: [PR] Spark 3.4: Correct the two-stage parsing strategy of antlr parser [iceberg]

2024-11-22 Thread via GitHub
nastra merged PR #7734: URL: https://github.com/apache/iceberg/pull/7734

Re: [PR] Spark 3.5: Correct the two-stage parsing strategy of antlr parser [iceberg]

2024-11-22 Thread via GitHub
nastra merged PR #11628: URL: https://github.com/apache/iceberg/pull/11628

Re: [PR] Spark: Write DVs for V3 MoR tables [iceberg]

2024-11-22 Thread via GitHub
amogh-jahagirdar commented on code in PR #11561: URL: https://github.com/apache/iceberg/pull/11561#discussion_r1853989223 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestDelete.java: ## @@ -521,7 +528,10 @@ public void deleteSingleRecordProdu

Re: [I] Allow `file_format` to be lower-case [iceberg-python]

2024-11-22 Thread via GitHub
hgollakota commented on issue #1340: URL: https://github.com/apache/iceberg-python/issues/1340#issuecomment-2493851999 Hey, gonna submit a pull-request - here's the solution I'm proposing: ``` class FileFormat(str, Enum): AVRO = "AVRO", "avro" PARQUET = "PARQUET", "pa

[PR] Add TableMetadataBuilder::assign_current_snapshot_id [iceberg-rust]

2024-11-22 Thread via GitHub
SergeiPatiakin opened a new pull request, #713: URL: https://github.com/apache/iceberg-rust/pull/713 `TableMetadataBuilder::from_table_creation` initializes `current_snapshot_id` to `None`. This PR adds a builder method that allows different values.

Re: [PR] Extend Storage support to relative local FS paths [iceberg-rust]

2024-11-22 Thread via GitHub
Fokko commented on PR #712: URL: https://github.com/apache/iceberg-rust/pull/712#issuecomment-2493744800 @gruuya No worries, thanks for your understanding.

Re: [I] Support relative paths in `Storage::LocalFs` [iceberg-rust]

2024-11-22 Thread via GitHub
gruuya closed issue #711: Support relative paths in `Storage::LocalFs` URL: https://github.com/apache/iceberg-rust/issues/711

Re: [I] Support relative paths in `Storage::LocalFs` [iceberg-rust]

2024-11-22 Thread via GitHub
gruuya commented on issue #711: URL: https://github.com/apache/iceberg-rust/issues/711#issuecomment-2493735580 As per https://github.com/apache/iceberg-rust/pull/712#issuecomment-2493686502 this isn't a viable ask, so I'm closing the issue.

Re: [PR] Extend Storage support to relative local FS paths [iceberg-rust]

2024-11-22 Thread via GitHub
gruuya closed pull request #712: Extend Storage support to relative local FS paths URL: https://github.com/apache/iceberg-rust/pull/712

Re: [PR] Extend Storage support to relative local FS paths [iceberg-rust]

2024-11-22 Thread via GitHub
gruuya commented on PR #712: URL: https://github.com/apache/iceberg-rust/pull/712#issuecomment-2493733117 Oh I see, didn't realize it was prohibited at the spec level. Thanks, closing.

Re: [PR] Test both "new" Flink Avro planned reader and "deprecated" Avro reader [iceberg]

2024-11-22 Thread via GitHub
jbonofre commented on PR #11430: URL: https://github.com/apache/iceberg/pull/11430#issuecomment-2493711948 @Fokko @pvary @RussellSpitzer I updated the tests according to Peter's comment. I'm just not convinced that using `@TestTemplate` is much better than `@ParameterizedTest` 😄

Re: [PR] Extend Storage support to relative local FS paths [iceberg-rust]

2024-11-22 Thread via GitHub
Fokko commented on PR #712: URL: https://github.com/apache/iceberg-rust/pull/712#issuecomment-2493686502 Hey @gruuya Thanks for creating this PR. Unfortunately, Iceberg does not support relative paths. There is a long open issue https://github.com/apache/iceberg/issues/1617 but it is pretty

[I] Support relative paths in `Storage::LocalFs` [iceberg-rust]

2024-11-22 Thread via GitHub
gruuya opened a new issue, #711: URL: https://github.com/apache/iceberg-rust/issues/711 Hi, I'd like to ask for extending `iceberg::io::storage::Storage` to support relative paths for local file system storage. I think this isn't too invasive to the present implementation, and can be

Re: [PR] Spark 3.4: Correct the two-stage parsing strategy of antlr parser [iceberg]

2024-11-22 Thread via GitHub
pan3793 commented on PR #7734: URL: https://github.com/apache/iceberg/pull/7734#issuecomment-2493632737 @nastra I opened https://github.com/apache/iceberg/pull/11628

Re: [PR] Spark 3.4: Correct the two-stage parsing strategy of antlr parser [iceberg]

2024-11-22 Thread via GitHub
nastra commented on PR #7734: URL: https://github.com/apache/iceberg/pull/7734#issuecomment-2493611043 @pan3793 can you please open the same PR against Spark 3.5? We should first get the Spark 3.5 changes in before this PR

[PR] fix: expand arrow to iceberg schema to handle nanosecond timestamp [iceberg-rust]

2024-11-22 Thread via GitHub
jdockerty opened a new pull request, #710: URL: https://github.com/apache/iceberg-rust/pull/710 Closes https://github.com/apache/iceberg-rust/issues/709 Nanosecond precision can be written by `parquet-rs` and [`iceberg-rust` has some support for this recently](https://github.com/apac

Re: [PR] Spark 3.4: Correct the two-stage parsing strategy of antlr parser [iceberg]

2024-11-22 Thread via GitHub
pan3793 commented on PR #7734: URL: https://github.com/apache/iceberg/pull/7734#issuecomment-2493538799 @Fokko thanks for reopening this PR (please also help to remove the `stale` tag, otherwise it will be closed by the GitHub bot again soon). I rebased the PR and fixed the conflicts.

Re: [I] Variant Data Type Support [iceberg]

2024-11-22 Thread via GitHub
tmnd1991 commented on issue #10392: URL: https://github.com/apache/iceberg/issues/10392#issuecomment-2493465035 +1
