Re: [PR] Reduce the number of equity-deletes using bloom filter [iceberg]

2023-12-04 Thread via GitHub
chenwyi2 commented on PR #5026: URL: https://github.com/apache/iceberg/pull/5026#issuecomment-1840199760 How's it going? I think this could be a good way to reduce equality delete files. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Gi
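
The idea in this PR's title can be pictured with a rough sketch. This is not the PR's code: `BloomFilter`, `upsert`, and the filter sizes are hypothetical names and values chosen for illustration. The assumption is that a writer tracks the keys it has already written in a Bloom filter and emits an equality delete only when the filter reports that the key may exist. Because a Bloom filter has no false negatives, a required delete is never skipped.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, a small false-positive rate."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting a SHA-256 digest per index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def upsert(keys, existing: BloomFilter):
    """Return the keys that need an equality delete before the new row is written."""
    deletes = []
    for key in keys:
        if existing.might_contain(key):
            deletes.append(key)  # key may already exist, so a delete is required
        existing.add(key)  # the new row is now part of the table
    return deletes
```

With an empty filter, first-time keys produce no delete record at all, which is where the reduction would come from; a false positive only costs one unnecessary delete.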

Re: [PR] Core: Schema for a branch should return table schema [iceberg]

2023-12-04 Thread via GitHub
nastra commented on code in PR #9131: URL: https://github.com/apache/iceberg/pull/9131#discussion_r1415055875 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSnapshotSelection.java: ## @@ -425,16 +426,56 @@ public void testSnapshotSelectionByBranchWithSche

Re: [I] An exception occurred while writing iceberg data through Spark: org. apache. iceberg. exceptions. CommitFailedException: metadata location has changed [iceberg]

2023-12-04 Thread via GitHub
Zhangg7723 commented on issue #9178: URL: https://github.com/apache/iceberg/issues/9178#issuecomment-1840183519 Maybe you can change the isolation level to snapshot.
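
A minimal sketch of what this suggestion could look like in practice, assuming the comment refers to Iceberg's `write.delete.isolation-level`, `write.update.isolation-level`, and `write.merge.isolation-level` table properties. The helper name `isolation_level_sql` and the table name in the test are hypothetical; the function only builds the Spark SQL text and does not run it.

```python
def isolation_level_sql(table: str, level: str = "snapshot") -> str:
    """Build a Spark SQL statement that sets Iceberg's row-level isolation properties."""
    if level not in ("serializable", "snapshot"):
        raise ValueError(f"unsupported isolation level: {level}")
    # One property per row-level operation type.
    props = ", ".join(
        f"'write.{op}.isolation-level'='{level}'"
        for op in ("delete", "update", "merge")
    )
    return f"ALTER TABLE {table} SET TBLPROPERTIES ({props})"
```

The resulting string could be passed to `spark.sql(...)`. Whether this resolves the reported exception depends on the workload; the comment offers it only as something to try.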

Re: [I] Delete a partition [iceberg]

2023-12-04 Thread via GitHub
lpy148145 commented on issue #9190: URL: https://github.com/apache/iceberg/issues/9190#issuecomment-1840176731 OK, thanks.

Re: [I] Delete a partition [iceberg]

2023-12-04 Thread via GitHub
lpy148145 closed issue #9190: Delete a partition URL: https://github.com/apache/iceberg/issues/9190

Re: [I] Delete a partition [iceberg]

2023-12-04 Thread via GitHub
Zhangg7723 commented on issue #9190: URL: https://github.com/apache/iceberg/issues/9190#issuecomment-1840151582 Expiring old snapshots will remove the deleted files in the partition.

Re: [I] flink programs sometimes fail to write to icebergTable. The.avro file in metadata cannot be found [iceberg]

2023-12-04 Thread via GitHub
Zhangg7723 commented on issue #9168: URL: https://github.com/apache/iceberg/issues/9168#issuecomment-1840137972 Same problem as https://github.com/apache/iceberg/issues/6066

Re: [I] Flink Rewrite Files Action OOM [iceberg]

2023-12-04 Thread via GitHub
Zhangg7723 commented on issue #9193: URL: https://github.com/apache/iceberg/issues/9193#issuecomment-1840130345 Upsert mode caused too many equality delete records in the table; these delete records are loaded into an in-memory hash set.
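
To make the memory argument in this comment concrete, here is a simplified sketch of how equality deletes are typically applied. It is an illustration, not Iceberg's or Flink's reader code: `apply_equality_deletes` and the row layout are hypothetical, and sequence-number checks are ignored. Every delete record contributes one key to a set that must stay in memory while data rows are filtered.

```python
from typing import Dict, Iterable, Iterator, Sequence

Row = Dict[str, object]


def apply_equality_deletes(
    rows: Iterable[Row], deletes: Iterable[Row], key_columns: Sequence[str]
) -> Iterator[Row]:
    """Yield the rows whose key does not match any equality delete record."""
    # The whole delete set is materialized first; its size grows with the
    # number of distinct deleted keys, which is the memory pressure described.
    deleted_keys = {tuple(d[c] for c in key_columns) for d in deletes}
    for row in rows:
        if tuple(row[c] for c in key_columns) not in deleted_keys:
            yield row
```

Streaming the data rows keeps their memory cost constant, but the delete set cannot be streamed in the same way, so a table with many equality deletes can exhaust the heap during a rewrite.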

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
nk1506 commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414960618 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -264,6 +269,162 @@ public void renameTable(TableIdentifier from, TableIdentifier origina

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
nk1506 commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414954046 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -234,15 +225,29 @@ public void renameTable(TableIdentifier from, TableIdentifier origina

Re: [PR] Use Pydantic's `model_copy` for model modification when updating table metadata [iceberg-python]

2023-12-04 Thread via GitHub
HonahX commented on code in PR #182: URL: https://github.com/apache/iceberg-python/pull/182#discussion_r1414899521 ## pyiceberg/table/__init__.py: ## @@ -533,6 +535,8 @@ def update_table_metadata(base_metadata: TableMetadata, updates: Tuple[TableUpda for update in updates:

[PR] Use Pydantic's `model_copy` for model modification when updating table metadata [iceberg-python]

2023-12-04 Thread via GitHub
HonahX opened a new pull request, #182: URL: https://github.com/apache/iceberg-python/pull/182 Fixes #179 This PR uses Pydantic's [`model_copy`](https://docs.pydantic.dev/latest/api/base_model/#pydantic.main.BaseModel.model_copy) to apply table updates to metadata. Specifically:
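
As a hedged illustration of the approach named in this PR description (not the PR's actual code), the sketch below applies an update through Pydantic v2's `BaseModel.model_copy(update=...)` instead of round-tripping through a dictionary. The `TableMetadata` class here is a hypothetical, heavily trimmed stand-in for PyIceberg's model, and `set_properties` is an invented helper.

```python
from typing import Dict

from pydantic import BaseModel


class TableMetadata(BaseModel):
    """Hypothetical stand-in with only two fields, for illustration."""

    location: str
    properties: Dict[str, str] = {}


def set_properties(base: TableMetadata, updates: Dict[str, str]) -> TableMetadata:
    # model_copy returns a new model instance; `update` overrides the listed
    # fields directly. Note that the updated values are not re-validated.
    merged = {**base.properties, **updates}
    return base.model_copy(update={"properties": merged})
```

The original model is left untouched, so each update yields a fresh metadata object without serializing and re-parsing the whole model.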

[I] How many concurrent operations can be supported at most when multiple Spark tasks write to iceberg same table? [iceberg]

2023-12-04 Thread via GitHub
AllenWee1106 opened a new issue, #9218: URL: https://github.com/apache/iceberg/issues/9218 ### Query engine spark iceberg ### Question Multiple Spark tasks cause concurrent operations, so Iceberg throws an exception: `org. Apache Iceberg Exceptions CommitFailedException:

[PR] feat: add a Junit5 version of TableTestBase [iceberg]

2023-12-04 Thread via GitHub
lisirrx opened a new pull request, #9217: URL: https://github.com/apache/iceberg/pull/9217 @nastra Hi, as we discussed in #9073, I created a copy of `TableTestBase` named `TestBase`, converted to JUnit5 + AssertJ style. Besides, I have tried the `ParameterizedTestExtension` in #9161, and it w

Re: [PR] iceberg-parquet: Switch tests to JUnit5 + AssertJ-style assertions [iceberg]

2023-12-04 Thread via GitHub
lisirrx commented on code in PR #9161: URL: https://github.com/apache/iceberg/pull/9161#discussion_r1414876251 ## parquet/src/test/java/org/apache/iceberg/parquet/TestDictionaryRowGroupFilter.java: ## @@ -79,16 +53,39 @@ import org.apache.parquet.hadoop.metadata.ColumnChunkMeta

Re: [I] show table extended not supported for v2 table. [iceberg]

2023-12-04 Thread via GitHub
tanweipeng commented on issue #5782: URL: https://github.com/apache/iceberg/issues/5782#issuecomment-1839959981 > Not sure if the error's stem is the same, but I'm facing the same behaviour when using AWS Glue as metastore Ya, I faced the same issue too. Does anyone know a workaround?

Re: [PR] feat: support UnboundPartitionSpec [iceberg-rust]

2023-12-04 Thread via GitHub
my-vegetable-has-exploded commented on code in PR #106: URL: https://github.com/apache/iceberg-rust/pull/106#discussion_r1414825717 ## crates/iceberg/src/spec/partition.rs: ## @@ -60,13 +60,51 @@ impl PartitionSpec { } } +/// Reference to [`UnboundPartitionSpec`]. +pub t

Re: [I] Make CachingCatalog support cache expiration after write [iceberg]

2023-12-04 Thread via GitHub
lirui-apache closed issue #7792: Make CachingCatalog support cache expiration after write URL: https://github.com/apache/iceberg/issues/7792

Re: [PR] feat: support UnboundPartitionSpec [iceberg-rust]

2023-12-04 Thread via GitHub
Xuanwo commented on code in PR #106: URL: https://github.com/apache/iceberg-rust/pull/106#discussion_r1414797588 ## crates/iceberg/src/spec/partition.rs: ## @@ -60,13 +60,51 @@ impl PartitionSpec { } } +/// Reference to [`UnboundPartitionSpec`]. +pub type UnboundPartitio

Re: [PR] feat: support UnboundPartitionSpec [iceberg-rust]

2023-12-04 Thread via GitHub
liurenjie1024 commented on PR #106: URL: https://github.com/apache/iceberg-rust/pull/106#issuecomment-1839889064 cc @Fokko PTAL, I think this is ready for review

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414729164 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalogUtil.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414722778 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -264,6 +269,162 @@ public void renameTable(TableIdentifier from, TableIdentifier origina

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414721934 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -264,6 +269,162 @@ public void renameTable(TableIdentifier from, TableIdentifier origina

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414721657 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -264,6 +269,162 @@ public void renameTable(TableIdentifier from, TableIdentifier origina

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414721354 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -234,15 +225,29 @@ public void renameTable(TableIdentifier from, TableIdentifier origina

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414720444 ## core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java: ## @@ -307,24 +308,48 @@ protected enum CommitStatus { * @param newMetadataLocation the p

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414717796 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -168,6 +154,11 @@ public String name() { return name; } + @Override + public

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414716052 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -168,6 +154,11 @@ public String name() { return name; } + @Override + public

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414715080 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -115,42 +126,17 @@ public void initialize(String inputName, Map properties) { @Overr

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414704629 ## api/src/main/java/org/apache/iceberg/exceptions/NoSuchIcebergViewException.java: ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

Re: [PR] Core: Add View support for REST catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #7913: URL: https://github.com/apache/iceberg/pull/7913#discussion_r1414697348 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -1534,4 +1535,90 @@ public void updateViewLocationConflict() { .isInstanceOf(NoSuchVi

Re: [PR] Flink: fix flaky test that might fail due to classloader check [iceberg]

2023-12-04 Thread via GitHub
stevenzwu commented on PR #9216: URL: https://github.com/apache/iceberg/pull/9216#issuecomment-1839798312 see an example of failed test here: https://github.com/apache/iceberg/actions/runs/7091452262/job/19300609355?pr=9212 ``` exception caught while trying to get the future re

Re: [PR] Core: Add View support for REST catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #7913: URL: https://github.com/apache/iceberg/pull/7913#discussion_r1414688045 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -1534,4 +1535,90 @@ public void updateViewLocationConflict() { .isInstanceOf(NoSuchVi

Re: [PR] Core: Add View support for REST catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #7913: URL: https://github.com/apache/iceberg/pull/7913#discussion_r1414687369 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -1534,4 +1535,90 @@ public void updateViewLocationConflict() { .isInstanceOf(NoSuchVi

Re: [PR] Core: Add View support for REST catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #7913: URL: https://github.com/apache/iceberg/pull/7913#discussion_r1414686074 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -247,8 +247,9 @@ public void createViewErrorCases() { .withQuery(trino.di

Re: [I] Make CachingCatalog support cache expiration after write [iceberg]

2023-12-04 Thread via GitHub
github-actions[bot] commented on issue #7792: URL: https://github.com/apache/iceberg/issues/7792#issuecomment-1839779395 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Core: Add View support for REST catalog [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #7913: URL: https://github.com/apache/iceberg/pull/7913#discussion_r1414683541 ## core/src/main/java/org/apache/iceberg/view/ViewMetadata.java: ## @@ -283,20 +292,25 @@ private int addVersionInternal(ViewVersion version) { } } -

[I] SparkCopyOnWriteScan and RowLevelCommandDynamicPruning do not support V2 filtering [iceberg]

2023-12-04 Thread via GitHub
tmnd1991 opened a new issue, #9215: URL: https://github.com/apache/iceberg/issues/9215 ### Query engine Spark 3.4 ### Question Is there any reason why these two classes were left behind when implementing V2 filter

Re: [I] BUG: bucket transform on integer 0 return NAN [iceberg-python]

2023-12-04 Thread via GitHub
puchengy commented on issue #173: URL: https://github.com/apache/iceberg-python/issues/173#issuecomment-1839694031 @Fokko Hi, unfortunately I don't have the bandwidth at the moment; I will leave it open for now in case anyone is interested.

Re: [PR] Core: Expired Snapshot files in a transaction should be deleted. [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #9183: URL: https://github.com/apache/iceberg/pull/9183#discussion_r1414618351 ## core/src/main/java/org/apache/iceberg/BaseTransaction.java: ## @@ -446,20 +446,16 @@ private void commitSimpleTransaction() { } Set committedFiles = c

Re: [I] View is no longer in sync with table after catalog cache entry expires [iceberg]

2023-12-04 Thread via GitHub
namrathamyske commented on issue #8977: URL: https://github.com/apache/iceberg/issues/8977#issuecomment-1839672939 @singhpk234 , `rdd` ( which view1 is created on) has a reference to the logical plan which has a reference to older versions of table. After cache expires, no one is refreshing

Re: [I] Core: Snapshot file are not correctly deleted when a snapshot is expired as part of a transaction [iceberg]

2023-12-04 Thread via GitHub
amogh-jahagirdar closed issue #9182: Core: Snapshot file are not correctly deleted when a snapshot is expired as part of a transaction URL: https://github.com/apache/iceberg/issues/9182

Re: [PR] Core: Expired Snapshot files in a transaction should be deleted. [iceberg]

2023-12-04 Thread via GitHub
amogh-jahagirdar merged PR #9183: URL: https://github.com/apache/iceberg/pull/9183

Re: [PR] Core: Expired Snapshot files in a transaction should be deleted. [iceberg]

2023-12-04 Thread via GitHub
amogh-jahagirdar commented on PR #9183: URL: https://github.com/apache/iceberg/pull/9183#issuecomment-1839651809 Thanks @bartash for the fix! Merging, @rdblue @jbonofre let me know if you have any concerns

Re: [PR] Core: Expired Snapshot files in a transaction should be deleted. [iceberg]

2023-12-04 Thread via GitHub
amogh-jahagirdar commented on code in PR #9183: URL: https://github.com/apache/iceberg/pull/9183#discussion_r1414595426 ## core/src/main/java/org/apache/iceberg/BaseTransaction.java: ## @@ -446,20 +446,16 @@ private void commitSimpleTransaction() { } Set committe

Re: [PR] Doc: Adding PuppyGraph to the vendor list [iceberg-docs]

2023-12-04 Thread via GitHub
danielcweeks merged PR #293: URL: https://github.com/apache/iceberg-docs/pull/293

Re: [PR] Build: Bump moto from 4.2.10 to 4.2.11 [iceberg-python]

2023-12-04 Thread via GitHub
Fokko merged PR #180: URL: https://github.com/apache/iceberg-python/pull/180

Re: [PR] Build: Bump software.amazon.awssdk:bom from 2.21.29 to 2.21.37 [iceberg]

2023-12-04 Thread via GitHub
dependabot[bot] commented on PR #9204: URL: https://github.com/apache/iceberg/pull/9204#issuecomment-1839578847 Superseded by #9214.

Re: [PR] Build: Bump software.amazon.awssdk:bom from 2.21.29 to 2.21.37 [iceberg]

2023-12-04 Thread via GitHub
dependabot[bot] closed pull request #9204: Build: Bump software.amazon.awssdk:bom from 2.21.29 to 2.21.37 URL: https://github.com/apache/iceberg/pull/9204

[PR] Build: Bump software.amazon.awssdk:bom from 2.21.29 to 2.21.38 [iceberg]

2023-12-04 Thread via GitHub
dependabot[bot] opened a new pull request, #9214: URL: https://github.com/apache/iceberg/pull/9214 Bumps software.amazon.awssdk:bom from 2.21.29 to 2.21.38. [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=softwa

Re: [PR] Build: Bump software.amazon.awssdk:bom from 2.21.29 to 2.21.37 [iceberg]

2023-12-04 Thread via GitHub
Fokko commented on PR #9204: URL: https://github.com/apache/iceberg/pull/9204#issuecomment-1839576421 @dependabot rebase

[PR] Build: Bump moto from 4.2.10 to 4.2.11 [iceberg-python]

2023-12-04 Thread via GitHub
dependabot[bot] opened a new pull request, #180: URL: https://github.com/apache/iceberg-python/pull/180 Bumps [moto](https://github.com/getmoto/moto) from 4.2.10 to 4.2.11. Changelog Sourced from [moto's changelog](https://github.com/getmoto/moto/blob/master/CHANGELOG.md). 4.

Re: [PR] iceberg-parquet: Switch tests to JUnit5 + AssertJ-style assertions [iceberg]

2023-12-04 Thread via GitHub
GianlucaPrincipini commented on code in PR #9161: URL: https://github.com/apache/iceberg/pull/9161#discussion_r1414562237 ## parquet/src/test/java/org/apache/iceberg/parquet/TestDictionaryRowGroupFilter.java: ## @@ -156,22 +156,22 @@ public class TestDictionaryRowGroupFilter {

Re: [PR] iceberg-parquet: Switch tests to JUnit5 + AssertJ-style assertions [iceberg]

2023-12-04 Thread via GitHub
GianlucaPrincipini commented on code in PR #9161: URL: https://github.com/apache/iceberg/pull/9161#discussion_r1414562089 ## parquet/src/test/java/org/apache/iceberg/parquet/TestDictionaryRowGroupFilter.java: ## @@ -18,37 +18,13 @@ */ package org.apache.iceberg.parquet; -im

Re: [I] BUG: bucket transform on integer 0 return NAN [iceberg-python]

2023-12-04 Thread via GitHub
Fokko commented on issue #173: URL: https://github.com/apache/iceberg-python/issues/173#issuecomment-1839514047 @puchengy This is a big one, thanks for catching it! Are you interested in filing a PR?

Re: [I] Ability to the write Metadata JSON [iceberg-python]

2023-12-04 Thread via GitHub
Fokko closed issue #22: Ability to the write Metadata JSON URL: https://github.com/apache/iceberg-python/issues/22

Re: [PR] Update table metadata [iceberg-python]

2023-12-04 Thread via GitHub
Fokko merged PR #139: URL: https://github.com/apache/iceberg-python/pull/139

Re: [PR] Update table metadata [iceberg-python]

2023-12-04 Thread via GitHub
Fokko commented on code in PR #139: URL: https://github.com/apache/iceberg-python/pull/139#discussion_r1414521771 ## pyiceberg/table/__init__.py: ## @@ -350,6 +357,241 @@ class RemovePropertiesUpdate(TableUpdate): removals: List[str] +class TableMetadataUpdateContext: +

[I] Avoid dictionary (de)serialization for model modification [iceberg-python]

2023-12-04 Thread via GitHub
Fokko opened a new issue, #179: URL: https://github.com/apache/iceberg-python/issues/179 ### Feature Request / Improvement Based on a comment in https://github.com/apache/iceberg-python/pull/139#discussion_r1398855771 Instead it is better to use [Pydantic's model_copy](https:/

Re: [PR] Add SQLite support [iceberg-python]

2023-12-04 Thread via GitHub
Fokko commented on PR #178: URL: https://github.com/apache/iceberg-python/pull/178#issuecomment-1839491704 @jayceslesar Thanks for chiming in here! If I'm not mistaken, PyIceberg could serve as a backend behind the Ibis front end. We currently use sqlalchemy as an ORM for different database

Re: [PR] Add SQLite support [iceberg-python]

2023-12-04 Thread via GitHub
jayceslesar commented on PR #178: URL: https://github.com/apache/iceberg-python/pull/178#issuecomment-1839457485 You might be able to support arbitrary SQL-like catalogs using https://github.com/ibis-project/ibis

[PR] Add SQLite support [iceberg-python]

2023-12-04 Thread via GitHub
Fokko opened a new pull request, #178: URL: https://github.com/apache/iceberg-python/pull/178 ![image](https://github.com/apache/iceberg-python/assets/1134248/60e7a061-9489-42a1-a19f-91f143105a37)

Re: [PR] Core: Schema for a branch should return table schema [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #9131: URL: https://github.com/apache/iceberg/pull/9131#discussion_r1414416904 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/sql/TestSelect.java: ## @@ -419,6 +420,12 @@ public void testInvalidTimeTravelBasedOnBothAsOfAndTableIdentifi

Re: [PR] Core: Schema for a branch should return table schema [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #9131: URL: https://github.com/apache/iceberg/pull/9131#discussion_r1414414566 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSnapshotSelection.java: ## @@ -425,16 +426,56 @@ public void testSnapshotSelectionByBranchWithSche

Re: [PR] Core: Schema for a branch should return table schema [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #9131: URL: https://github.com/apache/iceberg/pull/9131#discussion_r1414410572 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java: ## @@ -173,6 +173,10 @@ public Long snapshotId() { return snapshotId; } + p

Re: [PR] Spec: Clarify partition equality [iceberg]

2023-12-04 Thread via GitHub
rdblue merged PR #9125: URL: https://github.com/apache/iceberg/pull/9125

Re: [PR] Spec: Clarify partition equality [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on PR #9125: URL: https://github.com/apache/iceberg/pull/9125#issuecomment-1839350869 Thanks for the update, @emkornfield!

Re: [PR] Spec: Clarify partition equality [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #9125: URL: https://github.com/apache/iceberg/pull/9125#discussion_r1414398386 ## format/spec.md: ## @@ -607,6 +611,8 @@ Notes: 1. An alternative, *strict projection*, creates a partition predicate that will match a file if all of the rows in t

Re: [PR] Flink: Fix IcebergSource tableloader lifecycle management in batch mode [iceberg]

2023-12-04 Thread via GitHub
mas-chen commented on code in PR #9173: URL: https://github.com/apache/iceberg/pull/9173#discussion_r1414397623 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/source/IcebergSource.java: ## @@ -197,17 +203,21 @@ private SplitEnumerator createEnumer LOG.info(

Re: [PR] Spec: Clarify partition equality [iceberg]

2023-12-04 Thread via GitHub
rdblue commented on code in PR #9125: URL: https://github.com/apache/iceberg/pull/9125#discussion_r1414397271 ## format/spec.md: ## @@ -305,6 +305,10 @@ The source column, selected by id, must be a primitive type and cannot be contai Partition specs capture the transform fro

Re: [I] Apache Flink not committing new snapshots to Iceberg Table [iceberg]

2023-12-04 Thread via GitHub
FranMorilloAWS commented on issue #9089: URL: https://github.com/apache/iceberg/issues/9089#issuecomment-1839311467 Hi, any updates on this? I have seen that the issue happens if the table has more than 1000 snapshots. Could it be that, as it has that many snapshots, the commit process is ta

Re: [I] Add View Support to Spark [iceberg]

2023-12-04 Thread via GitHub
jzhuge commented on issue #7938: URL: https://github.com/apache/iceberg/issues/7938#issuecomment-1839245214 Spark umbrella JIRA: [SPARK-31357](https://issues.apache.org/jira/browse/SPARK-31357)

Re: [PR] Core: Expired Snapshot files in a transaction should be deleted. [iceberg]

2023-12-04 Thread via GitHub
bartash commented on PR #9183: URL: https://github.com/apache/iceberg/pull/9183#issuecomment-1839240076 Thanks @amogh-jahagirdar I pushed changes for the nits.

[PR] Adds support for 1.18 version [iceberg]

2023-12-04 Thread via GitHub
rodmeneses opened a new pull request, #9211: URL: https://github.com/apache/iceberg/pull/9211 (no comment)

Re: [I] Add View Support to Spark [iceberg]

2023-12-04 Thread via GitHub
jzhuge commented on issue #7938: URL: https://github.com/apache/iceberg/issues/7938#issuecomment-1839214706 https://github.com/apache/spark/pull/39796#issuecomment-1839214419

Re: [PR] chore: junit 5 migration for TestFlinkScan [iceberg]

2023-12-04 Thread via GitHub
pvary commented on code in PR #9185: URL: https://github.com/apache/iceberg/pull/9185#discussion_r1414267995 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkScan.java: ## @@ -49,37 +51,28 @@ import org.apache.iceberg.types.Types; import org.apache.i

Re: [PR] Core: Expired Snapshot files in a transaction should be deleted. [iceberg]

2023-12-04 Thread via GitHub
amogh-jahagirdar commented on code in PR #9183: URL: https://github.com/apache/iceberg/pull/9183#discussion_r1414245847 ## core/src/test/java/org/apache/iceberg/TestSequenceNumberForV2Table.java: ## @@ -309,6 +309,8 @@ public void testExpirationInTransaction() { V2Assert.as

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
nk1506 commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1414247701 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -162,8 +163,11 @@ protected void doRefresh() { Thread.currentThread().inte

Re: [I] How to connect apache iceberg to minio [iceberg]

2023-12-04 Thread via GitHub
ExplorData24 commented on issue #9205: URL: https://github.com/apache/iceberg/issues/9205#issuecomment-1839075388 @nastra Okay, thank you so much. I will try to test in parallel.

Re: [I] How to connect apache iceberg to minio [iceberg]

2023-12-04 Thread via GitHub
nastra commented on issue #9205: URL: https://github.com/apache/iceberg/issues/9205#issuecomment-1839006269 I would suggest trying the Quickstart example and then modifying it to your needs step by step to figure out exactly where it breaks

Re: [I] How to connect apache iceberg to minio [iceberg]

2023-12-04 Thread via GitHub
ExplorData24 commented on issue #9205: URL: https://github.com/apache/iceberg/issues/9205#issuecomment-1838940568 @nastra unfortunately it still brings up the same error: ![image](https://github.com/apache/iceberg/assets/149940691/fb056f8d-b310-4a08-9d10-4a22cd0bf79d) I don

Re: [I] How to connect apache iceberg to minio [iceberg]

2023-12-04 Thread via GitHub
ExplorData24 commented on issue #9205: URL: https://github.com/apache/iceberg/issues/9205#issuecomment-1838866556 @nastra thank you so much - I will try to test it with these versions right away. - I realized that this only happened to me when reading from a bucket in the

Re: [I] How to connect apache iceberg to minio [iceberg]

2023-12-04 Thread via GitHub
nastra commented on issue #9205: URL: https://github.com/apache/iceberg/issues/9205#issuecomment-1838842043 Please try and use `spark.jars.packages', 'org.apache.hadoop:hadoop-aws:3.3.2,org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.4.2,org.apache.iceberg:iceberg-aws-bundle:1.4.2'`. In
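A minimal sketch of how the suggestion above could be assembled in a PySpark setup. The three Maven coordinates are the ones quoted in the comment; the helper name `spark_packages_conf` and the list name `PACKAGES` are hypothetical and only for illustration. The block does not start Spark, it only builds the key/value pair.

```python
# Maven coordinates quoted in the comment above.
PACKAGES = [
    "org.apache.hadoop:hadoop-aws:3.3.2",
    "org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.4.2",
    "org.apache.iceberg:iceberg-aws-bundle:1.4.2",
]


def spark_packages_conf(packages):
    """Return the (key, value) pair for the spark.jars.packages setting.

    Hypothetical helper: the value is a single comma-separated string
    of Maven coordinates, with surrounding whitespace stripped.
    """
    return "spark.jars.packages", ",".join(p.strip() for p in packages)


key, value = spark_packages_conf(PACKAGES)
print(key)
print(value)
```

Assuming PySpark is installed, the pair could then be passed to `SparkSession.builder.config(key, value)` before calling `getOrCreate()`. Whether these versions resolve the error reported in this issue is not confirmed by the thread excerpt.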

Re: [I] How to connect apache iceberg to minio [iceberg]

2023-12-04 Thread via GitHub
nastra commented on issue #9205: URL: https://github.com/apache/iceberg/issues/9205#issuecomment-1838837878 Just curious, are you trying to use SigV4 signing or what exactly is -Dcom.amazonaws.services.s3.enableV4=true required for?

Re: [PR] Core: Avro writers use BlockingBinaryEncoder to enable array/map size calculations. [iceberg]

2023-12-04 Thread via GitHub
Fokko commented on PR #8625: URL: https://github.com/apache/iceberg/pull/8625#issuecomment-1838785955 @aokolnychyi I think we can start a release sometime soon, but I need to align this with the Avro community. I also wanted to include nanosecond timestamp in there.

Re: [I] How to connect apache iceberg to minio [iceberg]

2023-12-04 Thread via GitHub
ExplorData24 commented on issue #9205: URL: https://github.com/apache/iceberg/issues/9205#issuecomment-1838761423 @nastra Yes, I even tested with Iceberg 1.4.2: ![image](https://github.com/apache/iceberg/assets/149940691/33d504c1-90f9-4c33-a7ce-2a17912621ce) ![image](https://githu

Re: [I] How to connect apache iceberg to minio [iceberg]

2023-12-04 Thread via GitHub
nastra commented on issue #9205: URL: https://github.com/apache/iceberg/issues/9205#issuecomment-1838693595 I can't speak to https://github.com/developer-advocacy-dremio/quick-guides-from-dremio/blob/main/sparknotebook.md but the Quickstart example I mentioned further above is the official

Re: [I] Add JUnit5-equivalent of class-level parameterized tests [iceberg]

2023-12-04 Thread via GitHub
nastra commented on issue #9210: URL: https://github.com/apache/iceberg/issues/9210#issuecomment-1838689497 https://github.com/apache/iceberg/pull/9161

[I] Add JUnit5-equivalent of class-level parameterized tests [iceberg]

2023-12-04 Thread via GitHub
nastra opened a new issue, #9210: URL: https://github.com/apache/iceberg/issues/9210 ### Feature Request / Improvement JUnit5 doesn't support class-level parameterized testing that is equivalent to JUnit4. It might be worth exploring https://github.com/junit-team/junit5/issues/315

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-12-04 Thread via GitHub
ajantha-bhat commented on code in PR #8701: URL: https://github.com/apache/iceberg/pull/8701#discussion_r1413885283 ## core/src/main/java/org/apache/iceberg/avro/TypeToSchema.java: ## @@ -238,8 +247,51 @@ public Schema primitive(Type.PrimitiveType primitive) { throw new

Re: [I] Replace `Arrays.asList` with `Collections.singletonList` [iceberg]

2023-12-04 Thread via GitHub
yyy1000 commented on issue #9207: URL: https://github.com/apache/iceberg/issues/9207#issuecomment-1838660883 I'm interested in this one!

Re: [I] How to connect apache iceberg to minio [iceberg]

2023-12-04 Thread via GitHub
ExplorData24 commented on issue #9205: URL: https://github.com/apache/iceberg/issues/9205#issuecomment-1838577648 @nastra Thank you so much. - I actually tried following this: https://github.com/developer-advocacy-dremio/quick-guides-from-dremio/blob/main/sparknotebook.md - I s

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2023-12-04 Thread via GitHub
pvary commented on code in PR #8907: URL: https://github.com/apache/iceberg/pull/8907#discussion_r1413808983 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -162,8 +163,11 @@ protected void doRefresh() { Thread.currentThread().inter

Re: [I] Flink throw an exception while reading table data through icebergsource [iceberg]

2023-12-04 Thread via GitHub
pvary commented on issue #9188: URL: https://github.com/apache/iceberg/issues/9188#issuecomment-1838537826 This is the code which fails with `NullPointerException`: ``` public static void init() { EnvironmentContext.put(EnvironmentContext.ENGINE_NAME, "flink"); Environme

Re: [PR] iceberg-parquet: Switch tests to JUnit5 + AssertJ-style assertions [iceberg]

2023-12-04 Thread via GitHub
nastra commented on code in PR #9161: URL: https://github.com/apache/iceberg/pull/9161#discussion_r1413802787 ## parquet/src/test/java/org/apache/iceberg/parquet/TestDictionaryRowGroupFilter.java: ## @@ -156,22 +156,22 @@ public class TestDictionaryRowGroupFilter { private Me

Re: [PR] iceberg-parquet: Switch tests to JUnit5 + AssertJ-style assertions [iceberg]

2023-12-04 Thread via GitHub
nastra commented on code in PR #9161: URL: https://github.com/apache/iceberg/pull/9161#discussion_r1413802427 ## parquet/src/test/java/org/apache/iceberg/parquet/TestDictionaryRowGroupFilter.java: ## @@ -79,15 +58,36 @@ import org.apache.parquet.hadoop.metadata.ColumnChunkMetaD

Re: [PR] iceberg-parquet: Switch tests to JUnit5 + AssertJ-style assertions [iceberg]

2023-12-04 Thread via GitHub
nastra commented on code in PR #9161: URL: https://github.com/apache/iceberg/pull/9161#discussion_r1413801515 ## parquet/src/test/java/org/apache/iceberg/utils/ParameterizedTestExtension.java: ## @@ -0,0 +1,255 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) u
