Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465973319 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckViews.scala: ## @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] feat: add support for catalogs with glue implementation to start [iceberg-go]

2024-01-24 Thread via GitHub
wolfeidau commented on code in PR #51: URL: https://github.com/apache/iceberg-go/pull/51#discussion_r1465966365 ## catalog/glue.go: ## @@ -0,0 +1,168 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE fil

Re: [PR] feat: init file writer interface [iceberg-rust]

2024-01-24 Thread via GitHub
Fokko merged PR #168: URL: https://github.com/apache/iceberg-rust/pull/168 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] fix: Manifest parsing should consider schema evolution. [iceberg-rust]

2024-01-24 Thread via GitHub
liurenjie1024 commented on PR #171: URL: https://github.com/apache/iceberg-rust/pull/171#issuecomment-1909540730 cc @Xuanwo @ZENOTME @Fokko PTAL

[PR] fix: Manifest parsing should consider schema evolution. [iceberg-rust]

2024-01-24 Thread via GitHub
liurenjie1024 opened a new pull request, #171: URL: https://github.com/apache/iceberg-rust/pull/171 Related to #165. This PR tries to resolve the second bug, where the `field_id` may be missing due to schema evolution.

Re: [PR] docs: Add release guide for iceberg-rust [iceberg-rust]

2024-01-24 Thread via GitHub
Fokko commented on code in PR #147: URL: https://github.com/apache/iceberg-rust/pull/147#discussion_r1465949315 ## website/src/release.md: ## @@ -0,0 +1,383 @@ + + +This document mainly introduces how the release manager releases a new version in accordance with the Apache requ

Re: [PR] chore(deps): Update env_logger requirement from 0.10.0 to 0.11.0 [iceberg-rust]

2024-01-24 Thread via GitHub
Fokko merged PR #170: URL: https://github.com/apache/iceberg-rust/pull/170

Re: [I] two bugs [iceberg-rust]

2024-01-24 Thread via GitHub
liurenjie1024 commented on issue #165: URL: https://github.com/apache/iceberg-rust/issues/165#issuecomment-1909480901 Hi @Samrose-Ahmed Sorry for late reply. > One is this line ( > > [iceberg-rust/crates/iceberg/src/spec/manifest.rs](https://github.com/apache/iceberg-rust/blob

Re: [PR] feat: add support for catalogs with glue implementation to start [iceberg-go]

2024-01-24 Thread via GitHub
HonahX commented on code in PR #51: URL: https://github.com/apache/iceberg-go/pull/51#discussion_r1462781472 ## catalog/glue.go: ## @@ -0,0 +1,168 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +

Re: [PR] AWS: Support setting description for Glue table [iceberg]

2024-01-24 Thread via GitHub
lkokhreidze commented on PR #9530: URL: https://github.com/apache/iceberg/pull/9530#issuecomment-1909448745 Thanks @amogh-jahagirdar and apologies, should have done it beforehand. Pushed the changes

Re: [PR] Build: Bump pyspark from 3.4.2 to 3.5.0 [iceberg-python]

2024-01-24 Thread via GitHub
HonahX commented on PR #283: URL: https://github.com/apache/iceberg-python/pull/283#issuecomment-1909445168 @Fokko I opened a new PR to manually update the Pyspark version and use `importlib` to fetch version: https://github.com/apache/iceberg-python/pull/303
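A minimal sketch of fetching an installed package's version with `importlib`, assuming the comment refers to the standard-library `importlib.metadata` module. The helper name `installed_version` is hypothetical and is not taken from the PR.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    # Prints the PySpark version if PySpark is installed, otherwise None.
    print(installed_version("pyspark"))
```

Returning `None` rather than raising keeps the lookup usable in environments where the optional dependency is not installed.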

Re: [PR] Add UnionByName functionality [iceberg-python]

2024-01-24 Thread via GitHub
HonahX commented on code in PR #296: URL: https://github.com/apache/iceberg-python/pull/296#discussion_r1465867474 ## pyiceberg/table/__init__.py: ## @@ -1995,6 +2020,156 @@ def primitive(self, primitive: PrimitiveType) -> Optional[IcebergType]: return primitive +c

Re: [PR] Spark 3.4: Fix writing of default values in CoW for rows with NULL columns which are unmatched [iceberg]

2024-01-24 Thread via GitHub
amogh-jahagirdar commented on PR #9556: URL: https://github.com/apache/iceberg/pull/9556#issuecomment-1909402282 There's another approach that I'm thinking about, I'm not sure yet how reasonable it is but in the Spark plan, if there's no "when not matched" case could we implicitly just add

[PR] Add/Update Snowflake docs to new docs site [iceberg]

2024-01-24 Thread via GitHub
scottteal opened a new pull request, #9557: URL: https://github.com/apache/iceberg/pull/9557 @bitsondatadev I just saw your update in the community sync on the new docs. I hope I'm making updates in the right place and don't miss the 1.5.0 release! As quite a bit has changed on Snowflake's

Re: [PR] Partition Evolution [iceberg-python]

2024-01-24 Thread via GitHub
amogh-jahagirdar commented on code in PR #245: URL: https://github.com/apache/iceberg-python/pull/245#discussion_r1465798502 ## pyiceberg/partitioning.py: ## @@ -85,6 +91,20 @@ def __str__(self) -> str: """Return the string representation of the PartitionField class."""

Re: [PR] feat: add support for catalogs with glue implementation to start [iceberg-go]

2024-01-24 Thread via GitHub
wolfeidau commented on code in PR #51: URL: https://github.com/apache/iceberg-go/pull/51#discussion_r1465798254 ## catalog/glue.go: ## @@ -0,0 +1,168 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE fil

Re: [PR] Write support [iceberg-python]

2024-01-24 Thread via GitHub
sebpretzer commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1465791258 ## tests/integration/test_writes.py: ## @@ -0,0 +1,387 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [PR] Spark: Fix SparkTable to use name and effective snapshotID for comparing [iceberg]

2024-01-24 Thread via GitHub
wooyeong commented on PR #9455: URL: https://github.com/apache/iceberg/pull/9455#issuecomment-1909253042 @nastra @ajantha-bhat Perhaps it could be minor but it is obvious that we have a bug with the time travel feature. What you pointed out is beyond my knowledge, so could you let me know w

Re: [PR] Build: Bump ray from 2.7.1 to 2.9.1 [iceberg-python]

2024-01-24 Thread via GitHub
dependabot[bot] commented on PR #287: URL: https://github.com/apache/iceberg-python/pull/287#issuecomment-1909250298 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version

Re: [PR] Build: Bump ray from 2.7.1 to 2.9.1 [iceberg-python]

2024-01-24 Thread via GitHub
HonahX closed pull request #287: Build: Bump ray from 2.7.1 to 2.9.1 URL: https://github.com/apache/iceberg-python/pull/287

Re: [PR] Build: Bump mkdocs-material from 9.5.4 to 9.5.5 [iceberg-python]

2024-01-24 Thread via GitHub
HonahX merged PR #302: URL: https://github.com/apache/iceberg-python/pull/302

Re: [PR] core: initial support of multi-arg bucket [iceberg]

2024-01-24 Thread via GitHub
advancedxy commented on PR #8259: URL: https://github.com/apache/iceberg/pull/8259#issuecomment-1909241699 > @advancedxy do you mind taking the generic changes here (not including bucketv2) and splitting into another pr? (Just like we did for the doc changes). I feel we can parallelize more

Re: [PR] Build: Bump mkdocs-material from 9.5.4 to 9.5.5 [iceberg-python]

2024-01-24 Thread via GitHub
HonahX commented on PR #302: URL: https://github.com/apache/iceberg-python/pull/302#issuecomment-1909236593 @dependabot rebase

Re: [PR] AWS: Support setting description for Glue table [iceberg]

2024-01-24 Thread via GitHub
amogh-jahagirdar commented on PR #9530: URL: https://github.com/apache/iceberg/pull/9530#issuecomment-1909230880 @lkokhreidze Could you run spotless `./gradlew spotlessApply` and push again? The spotless checks are failing.

Re: [PR] Remove redundant API call to Glue [iceberg-python]

2024-01-24 Thread via GitHub
HonahX merged PR #300: URL: https://github.com/apache/iceberg-python/pull/300

Re: [PR] Set Glue Table Information when creating/updating tables [iceberg-python]

2024-01-24 Thread via GitHub
HonahX merged PR #288: URL: https://github.com/apache/iceberg-python/pull/288

Re: [I] GlueCatalog: Set Glue table input information based on Iceberg table metadata [iceberg-python]

2024-01-24 Thread via GitHub
HonahX closed issue #216: GlueCatalog: Set Glue table input information based on Iceberg table metadata URL: https://github.com/apache/iceberg-python/issues/216

Re: [I] refactor: Remove support of manifest list format as a list of file paths. [iceberg-rust]

2024-01-24 Thread via GitHub
liurenjie1024 commented on issue #158: URL: https://github.com/apache/iceberg-rust/issues/158#issuecomment-1909181691 See this: https://github.com/apache/iceberg-rust/blob/c91aeaec2aa713a1efdc513e1769220dd53cf443/crates/iceberg/src/spec/snapshot.rs#L98 You can remove the enum, but it

Re: [PR] Fix writing to local filesystem [iceberg-python]

2024-01-24 Thread via GitHub
kevinjqliu commented on code in PR #301: URL: https://github.com/apache/iceberg-python/pull/301#discussion_r1465718521 ## pyiceberg/io/pyarrow.py: ## @@ -288,6 +288,8 @@ def create(self, overwrite: bool = False) -> OutputStream: try: if not overwrite and se

Re: [I] refactor: Remove support of manifest list format as a list of file paths. [iceberg-rust]

2024-01-24 Thread via GitHub
hiirrxnn commented on issue #158: URL: https://github.com/apache/iceberg-rust/issues/158#issuecomment-1909158933 Which file/files should I make this change in?

Re: [I] `schema_id` not incremented during schema evolution [iceberg-python]

2024-01-24 Thread via GitHub
kevinjqliu commented on issue #290: URL: https://github.com/apache/iceberg-python/issues/290#issuecomment-1909157759 Option (3) makes sense, I'll look for places where the `Schema` constructor sets the `schema_id` field. I'll include the changes in a separate PR since #289 is already

Re: [I] Write stream of unordered rows into partitioned table causes "Already closed files for partition" [iceberg]

2024-01-24 Thread via GitHub
github-actions[bot] commented on issue #717: URL: https://github.com/apache/iceberg/issues/717#issuecomment-1909132721 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. T

Re: [I] While alter table drop partition column, the error message is not friendly. [iceberg]

2024-01-24 Thread via GitHub
github-actions[bot] commented on issue #714: URL: https://github.com/apache/iceberg/issues/714#issuecomment-1909132699 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. T

Re: [I] [Spark-3] Add missing abortStagedChanges support for StagedSparkTable [iceberg]

2024-01-24 Thread via GitHub
github-actions[bot] commented on issue #698: URL: https://github.com/apache/iceberg/issues/698#issuecomment-1909132672 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. T

Re: [I] Support for Apache Beam I/O [iceberg]

2024-01-24 Thread via GitHub
github-actions[bot] commented on issue #693: URL: https://github.com/apache/iceberg/issues/693#issuecomment-1909132642 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. T

Re: [I] Partition Spec Performance [iceberg]

2024-01-24 Thread via GitHub
github-actions[bot] commented on issue #692: URL: https://github.com/apache/iceberg/issues/692#issuecomment-1909132623 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. T

Re: [PR] Spark 3.4: Fix writing of default values in CoW for rows with NULL columns which are unmatched [iceberg]

2024-01-24 Thread via GitHub
amogh-jahagirdar commented on code in PR #9556: URL: https://github.com/apache/iceberg/pull/9556#discussion_r1465683526 ## spark/v3.4/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMerge.java: ## @@ -2144,6 +2144,26 @@ public void testMergeWithTableWithNo

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on PR #9487: URL: https://github.com/apache/iceberg/pull/9487#issuecomment-1909106993 @jbonofre, thanks for updating this. It's looking really good now! I think the remaining issue is what @danielcweeks brought up in the last sync: do we know how this affects existing

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1465669874 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java: ## @@ -217,10 +226,11 @@ public boolean dropTable(TableIdentifier identifier, boolean purge) { in

Re: [I] Purge support for Iceberg view [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on issue #9433: URL: https://github.com/apache/iceberg/issues/9433#issuecomment-1909049191 @ajantha-bhat, dropping old metadata files (if we track them) isn't the same thing as a `PURGE` option when dropping a view. For a view, there isn't anything to purge and I would not

Re: [I] Query optimization fails after upgrading to 1.4.0+ with nullif in predicate [iceberg]

2024-01-24 Thread via GitHub
jakelong95 closed issue #9518: Query optimization fails after upgrading to 1.4.0+ with nullif in predicate URL: https://github.com/apache/iceberg/issues/9518

Re: [I] Query optimization fails after upgrading to 1.4.0+ with nullif in predicate [iceberg]

2024-01-24 Thread via GitHub
jakelong95 commented on issue #9518: URL: https://github.com/apache/iceberg/issues/9518#issuecomment-1909044218 https://issues.apache.org/jira/browse/SPARK-46847

[PR] Build: Bump mkdocs-material from 9.5.4 to 9.5.5 [iceberg-python]

2024-01-24 Thread via GitHub
dependabot[bot] opened a new pull request, #302: URL: https://github.com/apache/iceberg-python/pull/302 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.4 to 9.5.5. Release notes: sourced from https://github.com/squidfunk/mkdocs-material/releases

Re: [I] Query optimization fails after upgrading to 1.4.0+ with nullif in predicate [iceberg]

2024-01-24 Thread via GitHub
jakelong95 commented on issue #9518: URL: https://github.com/apache/iceberg/issues/9518#issuecomment-1909008476 Gotcha, that makes sense. I can go ahead and open a ticket with Spark and close this out. Thanks for taking a look!

Re: [I] Query optimization fails after upgrading to 1.4.0+ with nullif in predicate [iceberg]

2024-01-24 Thread via GitHub
singhpk234 commented on issue #9518: URL: https://github.com/apache/iceberg/issues/9518#issuecomment-1909005110 ok so looks like a bug in spark 3.5 introduced via : https://github.com/apache/spark/commit/53df45650af1e48e01e392caed6c1f83c2e9e9f1#diff-b2edb35eb7c49f1f3c1fd074927328ac103e9a2cd0

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465573215 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckViews.scala: ## @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Found

Re: [I] Support partitioned writes [iceberg-python]

2024-01-24 Thread via GitHub
asheeshgarg commented on issue #208: URL: https://github.com/apache/iceberg-python/issues/208#issuecomment-1908955580 @Fokko @jqin61 Today I tried basic example on partition write from pyiceberg.io.pyarrow import schema_to_pyarrow import pyarrow as pa from pyarrow import parquet

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465565118 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckViews.scala: ## @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465564320 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckViews.scala: ## @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465562137 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestViews.java: ## @@ -350,39 +350,21 @@ public void readFromViewReferencingAnotherView(

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465559938 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala: ## @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache So

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465558301 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveViews.scala: ## @@ -151,7 +179,7 @@ case class ResolveViews(spark: SparkSessi

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465557799 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveViews.scala: ## @@ -59,6 +61,32 @@ case class ResolveViews(spark: SparkSessio

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465553150 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveViews.scala: ## @@ -59,6 +61,32 @@ case class ResolveViews(spark: SparkSessio

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465547862 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckViews.scala: ## @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465544090 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckViews.scala: ## @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1465543334 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckViews.scala: ## @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Write support [iceberg-python]

2024-01-24 Thread via GitHub
asheeshgarg commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1465532105 ## tests/integration/test_writes.py: ## @@ -0,0 +1,387 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements

Re: [I] Query optimization fails after upgrading to 1.4.0+ with nullif in predicate [iceberg]

2024-01-24 Thread via GitHub
singhpk234 commented on issue #9518: URL: https://github.com/apache/iceberg/issues/9518#issuecomment-1908896193 was able to replicate : ``` @TestTemplate public void testProjectionBug() { List expected = ImmutableList.of(row(1L), row(2L), row(3L)); assertEquals

Re: [PR] Arrow: Don't copy the list when not needed [iceberg-python]

2024-01-24 Thread via GitHub
Fokko commented on code in PR #252: URL: https://github.com/apache/iceberg-python/pull/252#discussion_r1465502918 ## pyiceberg/io/pyarrow.py: ## @@ -1152,24 +1163,31 @@ def field(self, field: NestedField, _: Optional[pa.Array], field_array: Optional return field_array

Re: [PR] Fix writing to local filesystem [iceberg-python]

2024-01-24 Thread via GitHub
Fokko commented on code in PR #301: URL: https://github.com/apache/iceberg-python/pull/301#discussion_r1465458297 ## pyiceberg/io/pyarrow.py: ## @@ -288,6 +288,8 @@ def create(self, overwrite: bool = False) -> OutputStream: try: if not overwrite and self.ex

Re: [PR] core: initial support of multi-arg bucket [iceberg]

2024-01-24 Thread via GitHub
szehon-ho commented on PR #8259: URL: https://github.com/apache/iceberg/pull/8259#issuecomment-1908721572 @advancedxy do you mind taking the generic changes here (not including bucketv2) and splitting into another pr? (Just like we did for the doc changes). I feel we can parallelize more

Re: [PR] Flink: Adds the ability to read from a branch on the Flink Iceberg Source [iceberg]

2024-01-24 Thread via GitHub
rodmeneses commented on code in PR #9547: URL: https://github.com/apache/iceberg/pull/9547#discussion_r1465368549 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/source/StreamingMonitorFunction.java: ## @@ -195,7 +192,10 @@ void monitorAndForwardSplits() { // Re

[PR] Remove redundant API call to Glue [iceberg-python]

2024-01-24 Thread via GitHub
geruh opened a new pull request, #300: URL: https://github.com/apache/iceberg-python/pull/300 While reviewing the GlueCatalog implementation in Python, I noticed the listing methods for tables and namespaces are making redundant calls to AWS Glue. This PR removes the extra call and ensures

[I] Cannot write to local filesystem [iceberg-python]

2024-01-24 Thread via GitHub
kevinjqliu opened a new issue, #299: URL: https://github.com/apache/iceberg-python/issues/299 ### Apache Iceberg version main (development) ### Please describe the bug 🐞 Pulling this issue out of #289 In #289, it was discovered that the library cannot write to the

[I] Spark 3.4 MERGE INTO for CoW replacing NULL unmatched records with default values [iceberg]

2024-01-24 Thread via GitHub
amogh-jahagirdar opened a new issue, #9555: URL: https://github.com/apache/iceberg/issues/9555 ### Apache Iceberg version 1.4.3 (latest release) ### Query engine Spark ### Please describe the bug 🐞 Reproduction: Here's a simple unit test (can copy/pas

Re: [I] Apache Iceberg - Branch cannot be merged using the fast_forward procedure [iceberg]

2024-01-24 Thread via GitHub
Ashwin07 commented on issue #9553: URL: https://github.com/apache/iceberg/issues/9553#issuecomment-1908585973 Here is my pyspark session command pyspark3 --conf spark.sql.catalog.gold_layer.uri=X --conf spark.sql.catalog.gold_layer.ref=main --conf spark.sql.catalog.gold_laye

Re: [I] Apache Iceberg - Branch cannot be merged using the fast_forward procedure [iceberg]

2024-01-24 Thread via GitHub
nastra commented on issue #9553: URL: https://github.com/apache/iceberg/issues/9553#issuecomment-1908564907 @Ashwin07 can you please share your full catalog configuration? It seems you might be missing https://iceberg.apache.org/docs/latest/spark-configuration/#sql-extensions

Re: [PR] feat: add support for catalogs with glue implementation to start [iceberg-go]

2024-01-24 Thread via GitHub
zeroshade commented on code in PR #51: URL: https://github.com/apache/iceberg-go/pull/51#discussion_r1465278034 ## catalog/glue.go: ## @@ -0,0 +1,168 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE fil

Re: [PR] Add missing new docs to nav [iceberg]

2024-01-24 Thread via GitHub
nastra merged PR #9554: URL: https://github.com/apache/iceberg/pull/9554

[I] Apache Iceberg - Branch cannot be merged using the fast_forward procedure [iceberg]

2024-01-24 Thread via GitHub
Ashwin07 opened a new issue, #9553: URL: https://github.com/apache/iceberg/issues/9553 I have been trying to test the branching feature in Apache Iceberg 1.4.3, but I am facing the issue below with the `fast_forward` procedure call. Can you please let me know if this is an existing limitation or something

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-01-24 Thread via GitHub
pvary commented on PR #9464: URL: https://github.com/apache/iceberg/pull/9464#issuecomment-1908484159 Also, here is the Flink PR: https://github.com/apache/flink/pull/24191 If the PR has been accepted there, then the serialization format is finalized. Since it takes time for all th

Re: [PR] Update blogs.md [iceberg]

2024-01-24 Thread via GitHub
ajantha-bhat commented on PR #9552: URL: https://github.com/apache/iceberg/pull/9552#issuecomment-1908434800 Thanks for the info.

Re: [PR] Update blogs.md [iceberg]

2024-01-24 Thread via GitHub
bitsondatadev commented on PR #9552: URL: https://github.com/apache/iceberg/pull/9552#issuecomment-1908432764 @ajantha-bhat, thanks! If you add it here it won't show up on iceberg.apache.org as soon as we merge this PR. The swap will happen with #9520 once we get some consensus on the maili

Re: [PR] Update blogs.md [iceberg]

2024-01-24 Thread via GitHub
ajantha-bhat commented on PR #9552: URL: https://github.com/apache/iceberg/pull/9552#issuecomment-1908420270 @bitsondatadev : we are using this repo for the site now after https://github.com/apache/iceberg/pull/8919 ? So, I raised PR here instead of Iceberg-docs repo. Let me know if

Re: [PR] Core: Add a util to read and write partition stats [iceberg]

2024-01-24 Thread via GitHub
ajantha-bhat commented on PR #9170: URL: https://github.com/apache/iceberg/pull/9170#issuecomment-1908389405 @aokolnychyi: Can this PR be reviewed? I know, I need to rework or analyze about the final spark action to collect the partition stats. But this PR is independent of that

Re: [I] Purge support for Iceberg view [iceberg]

2024-01-24 Thread via GitHub
ajantha-bhat commented on issue #9433: URL: https://github.com/apache/iceberg/issues/9433#issuecomment-1908357700 Sorry. I am not very clear on this. I do see that `CatalogUtil.dropTableData` is cleaning up the table metadata files (`metadata.metadataFileLocation()`, `metadata.previo

Re: [PR] Spark 3.5: Fix testDeleteFileThenMetadataDelete failure due to table not refreshed [iceberg]

2024-01-24 Thread via GitHub
manuzhang commented on code in PR #9551: URL: https://github.com/apache/iceberg/pull/9551#discussion_r1465086400 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/SparkRowLevelOperationsTestBase.java: ## @@ -166,6 +166,28 @@ public static Object[][

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-01-24 Thread via GitHub
pvary commented on PR #9464: URL: https://github.com/apache/iceberg/pull/9464#issuecomment-1908265266 Created the FLINK jira: https://issues.apache.org/jira/browse/FLINK-34228

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-24 Thread via GitHub
jbonofre commented on PR #9487: URL: https://github.com/apache/iceberg/pull/9487#issuecomment-1908261665 @ajantha-bhat @nastra @rdblue I refactored the PR to use a single SQL table for both tables and views, using `type` column to distinguish. The pros for this new approach: 1. the code

Re: [I] How do I debug iceberg source code using idea [iceberg]

2024-01-24 Thread via GitHub
ajantha-bhat closed issue #9549: How do I debug iceberg source code using idea URL: https://github.com/apache/iceberg/issues/9549

Re: [I] selects a specific table snapshot [iceberg]

2024-01-24 Thread via GitHub
ajantha-bhat commented on issue #9550: URL: https://github.com/apache/iceberg/issues/9550#issuecomment-1908016718 it is not `snapshots-id`, it is `snapshot-id` without the s
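A minimal sketch of the corrected read option, assuming a PySpark session with an Iceberg catalog already configured. The helper `read_at_snapshot`, the table name, and the snapshot ID in the usage comment are hypothetical; only the option key `snapshot-id` comes from the comment above.

```python
def read_at_snapshot(spark, table: str, snapshot_id: int):
    """Load `table` as of `snapshot_id` via the `snapshot-id` read option."""
    return (
        spark.read
        .format("iceberg")
        .option("snapshot-id", snapshot_id)  # singular key, not "snapshots-id"
        .load(table)
    )


# Usage (hypothetical table name and snapshot ID, requires a live session):
# df = read_at_snapshot(spark, "db.events", 10963874102873)
```

Taking the session as a parameter keeps the helper free of a hard PySpark import, so it can be exercised with a stub in tests.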

Re: [I] selects a specific table snapshot [iceberg]

2024-01-24 Thread via GitHub
ajantha-bhat closed issue #9550: selects a specific table snapshot URL: https://github.com/apache/iceberg/issues/9550

Re: [PR] AWS: Support setting description for Glue table [iceberg]

2024-01-24 Thread via GitHub
lkokhreidze commented on PR #9530: URL: https://github.com/apache/iceberg/pull/9530#issuecomment-1908004129 Thanks for the review @singhpk234! I've addressed your comments.

Re: [PR] AWS: Support setting description for Glue table [iceberg]

2024-01-24 Thread via GitHub
lkokhreidze commented on code in PR #9530: URL: https://github.com/apache/iceberg/pull/9530#discussion_r1464823134 ## aws/src/main/java/org/apache/iceberg/aws/glue/IcebergToGlueConverter.java: ## @@ -59,7 +60,7 @@ private IcebergToGlueConverter() {} private static final Patte

[I] selects a specific table snapshot [iceberg]

2024-01-24 Thread via GitHub
lpy148145 opened a new issue, #9550: URL: https://github.com/apache/iceberg/issues/9550 ### Apache Iceberg version 1.2.1 ### Query engine Spark ### Please describe the bug 🐞 selects a specific table snapshot no effect val frame: DataFrame = spark

Re: [I] How do I debug iceberg source code using idea [iceberg]

2024-01-24 Thread via GitHub
ajantha-bhat commented on issue #9549: URL: https://github.com/apache/iceberg/issues/9549#issuecomment-1907948061 Locally you can execute the unit test cases and add breakpoints to the classes (observe the logs to find the class names). Or if you have Iceberg running as a process with

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1464767785 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala: ## @@ -0,0 +1,144 @@ +/* + * Licensed to the Apache S

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1464764583 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala: ## @@ -0,0 +1,144 @@ +/* + * Licensed to the Apache S

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1464764253 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/views/CreateIcebergView.scala: ## @@ -0,0 +1,41 @@ +/* + * Licensed to the Apac

Re: [I] Hive Catalog: Implement `_commit_table` [iceberg-python]

2024-01-24 Thread via GitHub
Fokko closed issue #275: Hive Catalog: Implement `_commit_table` URL: https://github.com/apache/iceberg-python/issues/275

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-24 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1464613485 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveViews.scala: ## @@ -59,6 +62,10 @@ case class ResolveViews(spark: SparkSessio

Re: [PR] Arrow: Don't copy the list when not needed [iceberg-python]

2024-01-24 Thread via GitHub
HonahX commented on code in PR #252: URL: https://github.com/apache/iceberg-python/pull/252#discussion_r1464527694 ## pyiceberg/io/pyarrow.py: ## @@ -1152,24 +1163,31 @@ def field(self, field: NestedField, _: Optional[pa.Array], field_array: Optional return field_array

Re: [PR] Arrow: Don't copy the list when not needed [iceberg-python]

2024-01-24 Thread via GitHub
HonahX commented on code in PR #252: URL: https://github.com/apache/iceberg-python/pull/252#discussion_r1464482344 ## pyiceberg/io/pyarrow.py: ## @@ -1152,24 +1163,31 @@ def field(self, field: NestedField, _: Optional[pa.Array], field_array: Optional return field_array

Re: [PR] Lock Ray on <2.8.0 [iceberg-python]

2024-01-24 Thread via GitHub
Fokko commented on PR #298: URL: https://github.com/apache/iceberg-python/pull/298#issuecomment-1907632768 Fixed the conflicts, thanks @HonahX

Re: [PR] Lock Ray on <2.8.0 [iceberg-python]

2024-01-24 Thread via GitHub
Fokko merged PR #298: URL: https://github.com/apache/iceberg-python/pull/298

Re: [PR] Adding Snowflake's public documentation [iceberg-docs]

2024-01-24 Thread via GitHub
Fokko commented on PR #297: URL: https://github.com/apache/iceberg-docs/pull/297#issuecomment-1907630971 Sorry for the long wait @scottteal