Re: [I] kerberos beeline insert iceberg fail error: Job commit failed: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore [iceberg]

2024-01-15 Thread via GitHub
nastra commented on issue #9475: URL: https://github.com/apache/iceberg/issues/9475#issuecomment-1891525594 @xiaolan-bit I don't think this is an Iceberg issue because it seems to be related to Kerberos itself. Can you share your catalog configuration please? You might also need to double-c

Re: [I] access failed from host to iceberg container [iceberg]

2024-01-15 Thread via GitHub
nastra commented on issue #9465: URL: https://github.com/apache/iceberg/issues/9465#issuecomment-1891533320 @vagetablechicken can you check whether the `nyc` schema actually exists before querying? -- This is an automated message from the Apache Git Service. To respond to the message, ple

Re: [PR] Build: Bump actions/checkout from 3 to 4 [iceberg]

2024-01-15 Thread via GitHub
nastra merged PR #9474: URL: https://github.com/apache/iceberg/pull/9474

Re: [PR] Build: Bump actions/setup-python from 4 to 5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on PR #9473: URL: https://github.com/apache/iceberg/pull/9473#issuecomment-189153 @dependabot rebase

Re: [PR] Build: Bump software.amazon.awssdk:bom from 2.22.12 to 2.23.2 [iceberg]

2024-01-15 Thread via GitHub
nastra merged PR #9471: URL: https://github.com/apache/iceberg/pull/9471

Re: [PR] Build: Bump nessie from 0.76.0 to 0.76.2 [iceberg]

2024-01-15 Thread via GitHub
nastra merged PR #9467: URL: https://github.com/apache/iceberg/pull/9467

Re: [PR] Spark 3.5: Migrate DeleteReadTests and its subclasses to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on code in PR #9382: URL: https://github.com/apache/iceberg/pull/9382#discussion_r1452047874 ## flink/v1.18/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceReaderDeletes.java: ## @@ -23,47 +23,35 @@ import java.util.Map; import org.apache.

Re: [PR] Spark 3.5: Migrate DeleteReadTests and its subclasses to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on code in PR #9382: URL: https://github.com/apache/iceberg/pull/9382#discussion_r1452048194 ## flink/v1.18/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceReaderDeletes.java: ## @@ -23,47 +23,35 @@ import java.util.Map; import org.apache.

Re: [PR] Spark 3.5: Migrate DeleteReadTests and its subclasses to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
chinmay-bhat commented on PR #9382: URL: https://github.com/apache/iceberg/pull/9382#issuecomment-1891564299 Hi @nastra, for the failing CI, I need some help. If you see the CI errors, older Spark versions are complaining about using `@TempDir` in `TestSparkReaderDeletes` parent `Dele

Re: [PR] Spark 3.5: Migrate DeleteReadTests and its subclasses to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on PR #9382: URL: https://github.com/apache/iceberg/pull/9382#issuecomment-1891590783 Can you move `TestSparkReaderDeletes` to JUnit5 across all Spark versions? I think that would be the best option here

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452093279 ## pyiceberg/table/__init__.py: ## @@ -797,6 +850,9 @@ def location(self) -> str: def last_sequence_number(self) -> int: return self.metadata.last_sequen

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452105652 ## pyiceberg/table/__init__.py: ## @@ -831,6 +887,46 @@ def history(self) -> List[SnapshotLogEntry]: def update_schema(self, allow_incompatible_changes: bool = Fa

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452106699 ## pyiceberg/io/pyarrow.py: ## @@ -1565,13 +1565,56 @@ def fill_parquet_file_metadata( del upper_bounds[field_id] del null_value_counts[field_id] -

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452113793 ## pyiceberg/table/__init__.py: ## @@ -831,6 +887,46 @@ def history(self) -> List[SnapshotLogEntry]: def update_schema(self, allow_incompatible_changes: bool = Fa

Re: [I] Support partitioned writes [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on issue #208: URL: https://github.com/apache/iceberg-python/issues/208#issuecomment-1891664926 @jqin61 I did some more thinking over the weekend, and I think that the approach that you suggested is the most flexible. I forgot about the sort-order that we also want to add at

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452132430 ## pyiceberg/table/__init__.py: ## @@ -1910,3 +2006,137 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -1

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452137292 ## pyiceberg/table/__init__.py: ## @@ -1910,3 +2006,137 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -1

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on code in PR #9416: URL: https://github.com/apache/iceberg/pull/9416#discussion_r1452151938 ## core/src/test/java/org/apache/iceberg/TestBase.java: ## @@ -173,7 +173,7 @@ public class TestBase { public TestTables.TestTable table = null; @Parameters(name

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452163414 ## pyiceberg/io/pyarrow.py: ## @@ -1565,13 +1564,54 @@ def fill_parquet_file_metadata( del upper_bounds[field_id] del null_value_counts[field_id] -

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452170208 ## pyiceberg/table/__init__.py: ## @@ -1910,3 +2006,137 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -1

Re: [PR] Add `iceberg-bom` artifact [iceberg]

2024-01-15 Thread via GitHub
snazy commented on PR #8065: URL: https://github.com/apache/iceberg/pull/8065#issuecomment-1891799423 > I agree that we should get this into 1.5.0. @snazy could you rebase this please? done

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452174491 ## pyiceberg/table/__init__.py: ## @@ -1910,3 +2006,137 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -1

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452175092 ## pyiceberg/table/__init__.py: ## @@ -1910,3 +2006,137 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -1

Re: [PR] Add `iceberg-bom` artifact [iceberg]

2024-01-15 Thread via GitHub
snazy commented on PR #8065: URL: https://github.com/apache/iceberg/pull/8065#issuecomment-1891801124 > I just ran the target to build the pom and still see spark and flink included, which I believe should be excluded. I saw comments about adding Scala versions but I don't think that should

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452180757 ## pyiceberg/table/__init__.py: ## @@ -1910,3 +2006,137 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -1

[PR] Nessie: Add table() and view() API to NessieIcebergClient [iceberg]

2024-01-15 Thread via GitHub
ajantha-bhat opened a new pull request, #9477: URL: https://github.com/apache/iceberg/pull/9477 `table()` in `NessieIcebergClient` was renamed to `fetchContent()` while adding the view support for NessieCatalog. Nessie doesn't have to maintain compatibility as per https://iceberg.apache

Re: [PR] Nessie: Add table() and view() API to NessieIcebergClient [iceberg]

2024-01-15 Thread via GitHub
nastra commented on code in PR #9477: URL: https://github.com/apache/iceberg/pull/9477#discussion_r1452221527 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -195,6 +195,14 @@ private TableIdentifier toIdentifier(EntriesResponse.Entry entry) {

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
chinmay-bhat commented on code in PR #9416: URL: https://github.com/apache/iceberg/pull/9416#discussion_r145035 ## core/src/test/java/org/apache/iceberg/TestBase.java: ## @@ -173,7 +173,7 @@ public class TestBase { public TestTables.TestTable table = null; @Parameter

Re: [PR] Build: Bump actions/setup-python from 4 to 5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on PR #9473: URL: https://github.com/apache/iceberg/pull/9473#issuecomment-1891879686 @dependabot rebase

Re: [PR] Build: Bump actions/setup-python from 4 to 5 [iceberg]

2024-01-15 Thread via GitHub
dependabot[bot] commented on PR #9473: URL: https://github.com/apache/iceberg/pull/9473#issuecomment-1891879879 Looks like this PR is already up-to-date with main! If you'd still like to recreate it from scratch, overwriting any edits, you can request `@dependabot recreate`.

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on code in PR #9416: URL: https://github.com/apache/iceberg/pull/9416#discussion_r1452232974 ## core/src/test/java/org/apache/iceberg/TestBase.java: ## @@ -173,7 +173,7 @@ public class TestBase { public TestTables.TestTable table = null; @Parameters(name

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on code in PR #9416: URL: https://github.com/apache/iceberg/pull/9416#discussion_r1452233736 ## core/src/test/java/org/apache/iceberg/ScanPlanningAndReportingTestBase.java: ## @@ -34,21 +35,24 @@ import org.apache.iceberg.metrics.ScanReport; import org.apache.

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on code in PR #9416: URL: https://github.com/apache/iceberg/pull/9416#discussion_r1452234249 ## core/src/test/java/org/apache/iceberg/TestBase.java: ## @@ -173,7 +173,7 @@ public class TestBase { public TestTables.TestTable table = null; @Parameters(name

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on code in PR #9416: URL: https://github.com/apache/iceberg/pull/9416#discussion_r1452234786 ## core/src/test/java/org/apache/iceberg/TestLocalFilterFiles.java: ## @@ -18,20 +18,17 @@ */ package org.apache.iceberg; -import org.junit.runner.RunWith; -import

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on code in PR #9416: URL: https://github.com/apache/iceberg/pull/9416#discussion_r1452236196 ## core/src/test/java/org/apache/iceberg/DeleteFileIndexTestBase.java: ## @@ -35,15 +36,17 @@ import org.apache.iceberg.relocated.com.google.common.collect.Lists; impo

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
chinmay-bhat commented on code in PR #9416: URL: https://github.com/apache/iceberg/pull/9416#discussion_r1452239144 ## core/src/test/java/org/apache/iceberg/TestBase.java: ## @@ -173,7 +173,7 @@ public class TestBase { public TestTables.TestTable table = null; @Parameter

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
chinmay-bhat commented on code in PR #9416: URL: https://github.com/apache/iceberg/pull/9416#discussion_r1452242388 ## core/src/test/java/org/apache/iceberg/TestBase.java: ## @@ -173,7 +173,7 @@ public class TestBase { public TestTables.TestTable table = null; @Parameter

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
chinmay-bhat commented on code in PR #9416: URL: https://github.com/apache/iceberg/pull/9416#discussion_r1452252544 ## core/src/test/java/org/apache/iceberg/TestBase.java: ## @@ -173,7 +173,7 @@ public class TestBase { public TestTables.TestTable table = null; @Parameter

Re: [PR] Spark: Fix SparkTable to use name and effective snapshotID for comparing [iceberg]

2024-01-15 Thread via GitHub
nastra commented on code in PR #9455: URL: https://github.com/apache/iceberg/pull/9455#discussion_r1452261664 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java: ## @@ -122,6 +124,7 @@ public class SparkTable private final Set capabilities; p

Re: [PR] Spark 3.5: Migrate DeleteReadTests and its subclasses to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
chinmay-bhat commented on PR #9382: URL: https://github.com/apache/iceberg/pull/9382#issuecomment-1891992449 Based on CI errors for Flink, we would have to update files in Flink 1.16, 1.17 to JUnit5 too

Re: [PR] Spark: Fix SparkTable to use name and effective snapshotID for comparing [iceberg]

2024-01-15 Thread via GitHub
nastra commented on PR #9455: URL: https://github.com/apache/iceberg/pull/9455#issuecomment-1891994205 the only thing I'm a little concerned about is the `CachingCatalog`, since it uses only the `TableIdentifier` as the cache key and doesn't consider the snapshot id of a table. That means y

Re: [PR] Build: Bump actions/setup-python from 4 to 5 [iceberg]

2024-01-15 Thread via GitHub
nastra closed pull request #9473: Build: Bump actions/setup-python from 4 to 5 URL: https://github.com/apache/iceberg/pull/9473

Re: [PR] Build: Bump actions/setup-python from 4 to 5 [iceberg]

2024-01-15 Thread via GitHub
dependabot[bot] commented on PR #9473: URL: https://github.com/apache/iceberg/pull/9473#issuecomment-1892011013 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let

Re: [PR] support python 3.12 [iceberg-python]

2024-01-15 Thread via GitHub
cclauss commented on code in PR #254: URL: https://github.com/apache/iceberg-python/pull/254#discussion_r1452285282 ## pyproject.toml: ## @@ -71,8 +71,8 @@ adlfs = { version = ">=2023.1.0,<2024.1.0", optional = true } gcsfs = { version = ">=2023.1.0,<2024.1.0", optional = true

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
chinmay-bhat commented on PR #9416: URL: https://github.com/apache/iceberg/pull/9416#issuecomment-1892034824 We have files that inherit from `TestBase` used in older Spark versions that are causing CI failure. Should I move them to JUnit5 like we did in #9382 ?

Re: [PR] Nessie: Add table() and view() API to NessieIcebergClient [iceberg]

2024-01-15 Thread via GitHub
ajantha-bhat commented on code in PR #9477: URL: https://github.com/apache/iceberg/pull/9477#discussion_r1452300295 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -195,6 +195,14 @@ private TableIdentifier toIdentifier(EntriesResponse.Entry ent

Re: [PR] Write support [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1452340468 ## pyiceberg/table/__init__.py: ## @@ -1910,3 +2006,137 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -1

Re: [PR] Flink 1.17: Support specifying equality columns with write options [iceberg]

2024-01-15 Thread via GitHub
manuzhang commented on PR #8195: URL: https://github.com/apache/iceberg/pull/8195#issuecomment-1892165875 Closing this since Spark table can support Flink SQL upsert operation with identifier fields.

Re: [PR] Flink 1.17: Support specifying equality columns with write options [iceberg]

2024-01-15 Thread via GitHub
manuzhang closed pull request #8195: Flink 1.17: Support specifying equality columns with write options URL: https://github.com/apache/iceberg/pull/8195

Re: [PR] Nessie: Add table() and view() API to NessieIcebergClient [iceberg]

2024-01-15 Thread via GitHub
nastra merged PR #9477: URL: https://github.com/apache/iceberg/pull/9477

Re: [PR] Spark, Flink: Migrate DeleteReadTests and its subclasses to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra merged PR #9382: URL: https://github.com/apache/iceberg/pull/9382

Re: [PR] Spark, Flink: Migrate DeleteReadTests and its subclasses to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on PR #9382: URL: https://github.com/apache/iceberg/pull/9382#issuecomment-1892215667 thanks @chinmay-bhat for getting this done!

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
nastra commented on PR #9416: URL: https://github.com/apache/iceberg/pull/9416#issuecomment-1892216694 > We have files that inherit from `TestBase` used in older Spark versions that are causing CI failure. Should I move them to JUnit5 like we did in #9382 ? Yes that would probably be

Re: [PR] Spark 3.5: Migrate tests that depend on SparkDistributedDataScanTestBase to JUnit5 [iceberg]

2024-01-15 Thread via GitHub
chinmay-bhat commented on PR #9416: URL: https://github.com/apache/iceberg/pull/9416#issuecomment-1892455277 not sure why the one CI test is failing, all tests pass locally for me. Can you rerun the CI?

Re: [PR] Spark 3.5: Add Spark application id to summary of RewriteDataFilesSparkAction [iceberg]

2024-01-15 Thread via GitHub
nastra commented on code in PR #9273: URL: https://github.com/apache/iceberg/pull/9273#discussion_r1452605769 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewritePositionDeleteFilesSparkAction.java: ## @@ -215,7 +215,9 @@ private ExecutorService rewriteServ

Re: [PR] Apply Name mapping, new_schema_for_table [iceberg-python]

2024-01-15 Thread via GitHub
Fokko commented on code in PR #219: URL: https://github.com/apache/iceberg-python/pull/219#discussion_r1452677656 ## pyiceberg/table/__init__.py: ## @@ -831,6 +832,13 @@ def history(self) -> List[SnapshotLogEntry]: def update_schema(self, allow_incompatible_changes: bool =

Re: [PR] Add `iceberg-bom` artifact [iceberg]

2024-01-15 Thread via GitHub
danielcweeks commented on PR #8065: URL: https://github.com/apache/iceberg/pull/8065#issuecomment-1892670054 > The _bom_ does not enforce any Scala, Spark or Flink version, it is meant to align the versions of Iceberg artifacts. Some of the iceberg artifacts are tied to specific Spark

Re: [PR] Add `iceberg-bom` artifact [iceberg]

2024-01-15 Thread via GitHub
danielcweeks commented on PR #8065: URL: https://github.com/apache/iceberg/pull/8065#issuecomment-1892699646 ok, I think part of the confusion was that I wasn't including all the versions when building the pom file. Once I included them all, I saw the other version dependencies. LGT

[PR] Build: Bump mkdocs-material from 9.5.3 to 9.5.4 [iceberg-python]

2024-01-15 Thread via GitHub
dependabot[bot] opened a new pull request, #267: URL: https://github.com/apache/iceberg-python/pull/267 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.3 to 9.5.4. Release notes Sourced from https://github.com/squidfunk/mkdocs-material/releases mkdo

Re: [I] Documentation doesn't tell you how to do a REST catalog [iceberg]

2024-01-15 Thread via GitHub
github-actions[bot] commented on issue #7614: URL: https://github.com/apache/iceberg/issues/7614#issuecomment-1892897510 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] stream api how to only update a column ? [iceberg]

2024-01-15 Thread via GitHub
github-actions[bot] closed issue #6901: stream api how to only update a column ? URL: https://github.com/apache/iceberg/issues/6901

Re: [I] REST-Catalog: missing conflict-checks for `dropTable` and `updateTable` [iceberg]

2024-01-15 Thread via GitHub
github-actions[bot] commented on issue #6710: URL: https://github.com/apache/iceberg/issues/6710#issuecomment-1892897551 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Documentation doesn't tell you how to do a REST catalog [iceberg]

2024-01-15 Thread via GitHub
github-actions[bot] closed issue #7614: Documentation doesn't tell you how to do a REST catalog URL: https://github.com/apache/iceberg/issues/7614

Re: [I] stream api how to only update a column ? [iceberg]

2024-01-15 Thread via GitHub
github-actions[bot] commented on issue #6901: URL: https://github.com/apache/iceberg/issues/6901#issuecomment-1892897523 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] REST-Catalog: missing conflict-checks for `dropTable` and `updateTable` [iceberg]

2024-01-15 Thread via GitHub
github-actions[bot] closed issue #6710: REST-Catalog: missing conflict-checks for `dropTable` and `updateTable` URL: https://github.com/apache/iceberg/issues/6710

Re: [PR] Spark: Support dropping views [iceberg]

2024-01-15 Thread via GitHub
rdblue commented on code in PR #9421: URL: https://github.com/apache/iceberg/pull/9421#discussion_r1452824768 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/HijackViewCommands.scala: ## @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Softwar

Re: [PR] Spark: Support dropping views [iceberg]

2024-01-15 Thread via GitHub
rdblue commented on code in PR #9421: URL: https://github.com/apache/iceberg/pull/9421#discussion_r1452824967 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/HijackViewCommands.scala: ## @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Softwar

Re: [PR] Spark: Support dropping views [iceberg]

2024-01-15 Thread via GitHub
rdblue commented on code in PR #9421: URL: https://github.com/apache/iceberg/pull/9421#discussion_r1452825545 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ExtendedDataSourceV2Strategy.scala: ## @@ -90,6 +93,9 @@ case class ExtendedD

Re: [PR] Spark: Support dropping views [iceberg]

2024-01-15 Thread via GitHub
rdblue commented on PR #9421: URL: https://github.com/apache/iceberg/pull/9421#issuecomment-1892917215 I think once this includes the changes from https://github.com/nastra/iceberg/pull/138, I'm +1.

Re: [PR] Spark: Support renaming views [iceberg]

2024-01-15 Thread via GitHub
rdblue commented on code in PR #9343: URL: https://github.com/apache/iceberg/pull/9343#discussion_r1452826314 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveViews.scala: ## @@ -53,6 +54,11 @@ case class ResolveViews(spark: SparkSessio

Re: [PR] Spark: Support renaming views [iceberg]

2024-01-15 Thread via GitHub
rdblue commented on code in PR #9343: URL: https://github.com/apache/iceberg/pull/9343#discussion_r1452826597 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ExtendedDataSourceV2Strategy.scala: ## @@ -90,9 +94,20 @@ case class Extended

Re: [PR] Spark: Support renaming views [iceberg]

2024-01-15 Thread via GitHub
rdblue commented on code in PR #9343: URL: https://github.com/apache/iceberg/pull/9343#discussion_r1452827023 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestViews.java: ## @@ -635,6 +633,118 @@ private Catalog tableCatalog() { return Sp

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2024-01-15 Thread via GitHub
aokolnychyi closed pull request #8755: API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors URL: https://github.com/apache/iceberg/pull/8755

[PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2024-01-15 Thread via GitHub
aokolnychyi opened a new pull request, #8755: URL: https://github.com/apache/iceberg/pull/8755 This PR has code to parallelize reading of deletes and enable caching them on executors. I also have a follow-up change to assign tasks for one partition to the same executor, similar to `K

Re: [I] TableCommit builder's build method should be `pub` [iceberg-rust]

2024-01-15 Thread via GitHub
liurenjie1024 commented on issue #164: URL: https://github.com/apache/iceberg-rust/issues/164#issuecomment-1892953624 > Hi, > > Since `Catalog` trait is `pub`, any `struct` used in the `fn`s in the trait should be publicly buildable. But `TableCommit`'s builder's builder method's s

Re: [PR] Spark 3.5: Add Spark application id to summary of RewriteDataFilesSparkAction [iceberg]

2024-01-15 Thread via GitHub
ajantha-bhat commented on code in PR #9273: URL: https://github.com/apache/iceberg/pull/9273#discussion_r1452857004 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java: ## @@ -848,6 +849,16 @@ public void testRewri

Re: [PR] Spark: Support renaming views [iceberg]

2024-01-15 Thread via GitHub
ajantha-bhat commented on code in PR #9343: URL: https://github.com/apache/iceberg/pull/9343#discussion_r1452861042 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestViews.java: ## @@ -635,6 +633,118 @@ private Catalog tableCatalog() { ret

Re: [I] TableCommit builder's build method should be `pub` [iceberg-rust]

2024-01-15 Thread via GitHub
zeodtr commented on issue #164: URL: https://github.com/apache/iceberg-rust/issues/164#issuecomment-1892976037 Hi, @liurenjie1024, I see. Thank you!

Re: [I] TableCommit builder's build method should be `pub` [iceberg-rust]

2024-01-15 Thread via GitHub
liurenjie1024 closed issue #164: TableCommit builder's build method should be `pub` URL: https://github.com/apache/iceberg-rust/issues/164 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2024-01-15 Thread via GitHub
szehon-ho commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1452867560 ## api/src/main/java/org/apache/iceberg/types/TypeUtil.java: ## @@ -452,6 +454,68 @@ private static void checkSchemaCompatibility( } } + /** + * Estimate

Re: [I] TableCommit builder's build method should be `pub` [iceberg-rust]

2024-01-15 Thread via GitHub
zeodtr commented on issue #164: URL: https://github.com/apache/iceberg-rust/issues/164#issuecomment-1892983347 @liurenjie1024 Sorry, but there is another question. I want to add a `Snapshot` to a `Table` but `Transaction` does not provide a `fn` to do it (`append_updates` is private)

Re: [I] TableCommit builder's build method should be `pub` [iceberg-rust]

2024-01-15 Thread via GitHub
liurenjie1024 commented on issue #164: URL: https://github.com/apache/iceberg-rust/issues/164#issuecomment-1892984071 > @liurenjie1024 Sorry, but there is another question. > > I want to add a `Snapshot` to a `Table` but `Transaction` does not provide a `fn` to do it (`append_updates`

Re: [I] TableCommit builder's build method should be `pub` [iceberg-rust]

2024-01-15 Thread via GitHub
zeodtr commented on issue #164: URL: https://github.com/apache/iceberg-rust/issues/164#issuecomment-1892985986 @liurenjie1024 Thank you for your quick answer. I'm currently implementing my own `Catalog` to handle my own Iceberg-compatible storage. If adding a snapshot cannot be done

Re: [I] access failed from host to iceberg container [iceberg]

2024-01-15 Thread via GitHub
vagetablechicken commented on issue #9465: URL: https://github.com/apache/iceberg/issues/9465#issuecomment-1892990576 > @vagetablechicken can you check whether the `nyc` schema actually exists before querying? Thanks for the help. Yes, the table exists. It failed while reading from S3. ```

Re: [I] kerberos beeline insert iceberg fail error: Job commit failed: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore [iceberg]

2024-01-15 Thread via GitHub
xiaolan-bit commented on issue #9475: URL: https://github.com/apache/iceberg/issues/9475#issuecomment-1893006196 my kerberos settings in hive-site.xml: hive.server2.authentication.kerberos.principal hadoop/h...@hadoop.com hive.server2.authe

Re: [I] kerberos beeline insert iceberg fail error: Job commit failed: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore [iceberg]

2024-01-15 Thread via GitHub
pvary commented on issue #9475: URL: https://github.com/apache/iceberg/issues/9475#issuecomment-1893111098 It was a long time ago that I worked on this, but based on the logs, the commit failed. The commit happens on the ApplicationManager (?), and it needs access to HMS. I would

Re: [PR] Nessie: Infer default API version from URI [iceberg]

2024-01-15 Thread via GitHub
ajantha-bhat commented on code in PR #9459: URL: https://github.com/apache/iceberg/pull/9459#discussion_r1452954586 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNessieCatalog.java: ## @@ -122,9 +119,7 @@ private NessieCatalog initNessieCatalog(String ref) {

Re: [PR] Nessie: Infer default API version from URI [iceberg]

2024-01-15 Thread via GitHub
ajantha-bhat commented on PR #9459: URL: https://github.com/apache/iceberg/pull/9459#issuecomment-1893118053 > I think it would be good to add some tests around this (especially with an invalid URI). Additionally, it would be good to configure both the client-api-version and a URI with v1/v

Re: [I] kerberos beeline insert iceberg fail error: Job commit failed: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore [iceberg]

2024-01-15 Thread via GitHub
xiaolan-bit commented on issue #9475: URL: https://github.com/apache/iceberg/issues/9475#issuecomment-1893125437 I use two master nodes and three core nodes, all with the same Kerberos settings in "hive-site.xml". It happens as follows: ![image](https://github.com/apache/iceberg/assets/62

Re: [PR] Spec: add multi-arg transform support [iceberg]

2024-01-15 Thread via GitHub
szehon-ho commented on code in PR #8579: URL: https://github.com/apache/iceberg/pull/8579#discussion_r1452981468 ## format/spec.md: ## @@ -1119,21 +1156,30 @@ Partition specs are serialized as a JSON object with the following fields: Each partition field in the fields list i

Re: [PR] Core: Add reference snapshot ID/timestamps to AllEntriesTable and AllManifestsTable [iceberg]

2024-01-15 Thread via GitHub
bugorz commented on code in PR #9335: URL: https://github.com/apache/iceberg/pull/9335#discussion_r1452995549 ## core/src/main/java/org/apache/iceberg/AllEntriesTable.java: ## @@ -64,9 +84,18 @@ protected TableScan newRefinedScan(Table table, Schema schema, TableScanContext

Re: [PR] Core: Add reference snapshot ID/timestamps to AllEntriesTable and AllManifestsTable [iceberg]

2024-01-15 Thread via GitHub
hsiang-c commented on code in PR #9335: URL: https://github.com/apache/iceberg/pull/9335#discussion_r1453000460 ## core/src/main/java/org/apache/iceberg/AllEntriesTable.java: ## @@ -64,9 +84,18 @@ protected TableScan newRefinedScan(Table table, Schema schema, TableScanContext

Re: [PR] Spark: Fix SparkTable to use name and effective snapshotID for comparing [iceberg]

2024-01-15 Thread via GitHub
ajantha-bhat commented on PR #9455: URL: https://github.com/apache/iceberg/pull/9455#issuecomment-1893176190 > the only thing I'm a little concerned about is the CachingCatalog, since it uses only the TableIdentifier as the cache key and doesn't consider the snapshot id of a table. That mea
