Re: [PR] Spark 3.5: Fix incorrect catalog loaded in TestCreateActions [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10952: URL: https://github.com/apache/iceberg/pull/10952#discussion_r1726477211 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestCreateActions.java: ## @@ -186,6 +186,8 @@ public void before() { public void after() throws

Re: [PR] Add REST Compatibility Kit [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10908: URL: https://github.com/apache/iceberg/pull/10908#discussion_r1726507429 ## open-api/src/testFixtures/java/org/apache/iceberg/rest/RESTCatalogServer.java: ## @@ -0,0 +1,123 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Add REST Compatibility Kit [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10908: URL: https://github.com/apache/iceberg/pull/10908#discussion_r1726514883 ## open-api/src/test/java/org/apache/iceberg/rest/RESTCompatibilityKitViewCatalogTests.java: ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (AS

Re: [PR] Add REST Compatibility Kit [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10908: URL: https://github.com/apache/iceberg/pull/10908#discussion_r1726517038 ## open-api/src/test/java/org/apache/iceberg/rest/RESTCompatibilityKitCatalogTests.java: ## @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] Add REST Compatibility Kit [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10908: URL: https://github.com/apache/iceberg/pull/10908#discussion_r1726545200 ## open-api/src/testFixtures/java/org/apache/iceberg/rest/RESTCatalogServer.java: ## @@ -0,0 +1,123 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Flink: Maintenance - TriggerManager [iceberg]

2024-08-22 Thread via GitHub
pvary commented on code in PR #10484: URL: https://github.com/apache/iceberg/pull/10484#discussion_r1726594832 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/maintenance/operator/TestLockFactoryBase.java: ## @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software F

Re: [PR] Flink: Maintenance - TriggerManager [iceberg]

2024-08-22 Thread via GitHub
pvary commented on code in PR #10484: URL: https://github.com/apache/iceberg/pull/10484#discussion_r1726595300 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/JdbcLockFactory.java: ## @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Spark 3.5: Fix incorrect catalog loaded in TestCreateActions [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10952: URL: https://github.com/apache/iceberg/pull/10952#discussion_r1726607136 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestCreateActions.java: ## @@ -162,6 +162,9 @@ public void before() { this.tableLocation = tabl

Re: [PR] Flink: backport PR #10777 from 1.19 to 1.18 for sink test refactoring. [iceberg]

2024-08-22 Thread via GitHub
pvary commented on code in PR #10965: URL: https://github.com/apache/iceberg/pull/10965#discussion_r1726616704 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java: ## @@ -70,6 +70,7 @@ public class FlinkCatalogFactory implements CatalogFactory {

Re: [PR] fix (issue-1079): allow update_column to set doc as '' [iceberg-python]

2024-08-22 Thread via GitHub
TiansuYu commented on code in PR #1083: URL: https://github.com/apache/iceberg-python/pull/1083#discussion_r1726626714 ## pyiceberg/table/__init__.py: ## @@ -2492,21 +2492,22 @@ def update_column( except ResolveError as e: raise ValidationEr

[I] deadlock when spark call delete row postition [iceberg]

2024-08-22 Thread via GitHub
DapengShi opened a new issue, #10987: URL: https://github.com/apache/iceberg/issues/10987 ### Apache Iceberg version main (development) ### Query engine Spark ### Please describe the bug 🐞 we ran spark on k8s and this dead lock happened when we set spark exe

Re: [PR] Spark 3.5: Fix incorrect catalog loaded in TestCreateActions [iceberg]

2024-08-22 Thread via GitHub
manuzhang commented on code in PR #10952: URL: https://github.com/apache/iceberg/pull/10952#discussion_r1726746358 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestCreateActions.java: ## @@ -162,6 +162,9 @@ public void before() { this.tableLocation = t

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-22 Thread via GitHub
pvary commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1726782294 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -63,33 +60,43 @@ protected CloseableIterable doPlanFiles( return CloseableItera

[I] Nessie Iceberg REST catalog and writing to localstack raises `OSError: When initiating multiple part upload` [iceberg-python]

2024-08-22 Thread via GitHub
PetrasTYR opened a new issue, #1087: URL: https://github.com/apache/iceberg-python/issues/1087 ### Apache Iceberg version 0.7.1 (latest release) ### Please describe the bug 🐞 Hello! I am trying to use Nessie as an iceberg catalog, specifically the REST catalog tha

Re: [PR] DRAFT - Issue 10275 - Reward support for nulls [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1726800757 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/GenericArrowVectorAccessorFactory.java: ## @@ -220,8 +221,11 @@ private ArrowVectorAccessor getPlai }

Re: [PR] DRAFT - Issue 10275 - Reward support for nulls [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1726803463 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/GenericArrowVectorAccessorFactory.java: ## @@ -244,6 +248,18 @@ public final boolean getBoolean(int rowId) {

Re: [PR] DRAFT - Issue 10275 - Reward support for nulls [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1726810395 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorizedArrowReader.java: ## @@ -463,12 +460,16 @@ public static VectorizedArrowReader positionsWithSetAr

Re: [PR] Build: Bump orc from 1.9.3 to 1.9.4 (#10728) [iceberg]

2024-08-22 Thread via GitHub
nastra merged PR #10988: URL: https://github.com/apache/iceberg/pull/10988 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-22 Thread via GitHub
pvary commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1726833603 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -63,33 +60,43 @@ protected CloseableIterable doPlanFiles( return CloseableItera

[I] tbl.append(df): schema validation of tbl & df during compares the order & data types [iceberg-python]

2024-08-22 Thread via GitHub
sivaraman-ai opened a new issue, #1088: URL: https://github.com/apache/iceberg-python/issues/1088 ### Apache Iceberg version 0.6.1 ### Please describe the bug 🐞 while writing dataframe to iceberg through tbl.append(df), there happens to be a schema validation of table sc

Re: [I] tbl.append(df): schema validation of tbl & df during compares the order & data types [iceberg-python]

2024-08-22 Thread via GitHub
sivaraman-ai commented on issue #1088: URL: https://github.com/apache/iceberg-python/issues/1088#issuecomment-2304424901 when digging deeper, this condition compares the struct with order this condition checks the schema order & data types as struct `if table_schema.as_struct()

Re: [I] JdbcCatalog fails to initialize with MS SQL Server [iceberg]

2024-08-22 Thread via GitHub
PPattusamy commented on issue #10068: URL: https://github.com/apache/iceberg/issues/10068#issuecomment-2304463472 @jbonofre I am facing below issue while using oracle jdbc catalog with schema-version v1. Alter table iceberg_tables add Column icerberg_type(varchar6). Error: ORA-0090

Re: [PR] API, AWS: Add RetryableInputStream and use that in S3InputStream [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10433: URL: https://github.com/apache/iceberg/pull/10433#discussion_r1726966943 ## gradle/libs.versions.toml: ## @@ -37,6 +37,7 @@ delta-standalone = "3.1.0" delta-spark = "3.2.0" esotericsoftware-kryo = "4.0.3" errorprone-annotations = "2.27.0

Re: [PR] API, AWS: Add RetryableInputStream and use that in S3InputStream [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10433: URL: https://github.com/apache/iceberg/pull/10433#discussion_r1726971204 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3InputStream.java: ## @@ -92,13 +93,13 @@ public void seek(long newPos) { public int read() throws IOException {

Re: [PR] Spark 3.5: Fix incorrect catalog loaded in TestCreateActions [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10952: URL: https://github.com/apache/iceberg/pull/10952#discussion_r1726988968 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestCreateActions.java: ## @@ -162,6 +162,9 @@ public void before() { this.tableLocation = tabl

Re: [PR] Spark 3.5: Fix incorrect catalog loaded in TestCreateActions [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10952: URL: https://github.com/apache/iceberg/pull/10952#discussion_r1726989251 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestCreateActions.java: ## @@ -186,6 +189,12 @@ public void before() { public void after() throws

Re: [PR] Spark 3.5: Fix incorrect catalog loaded in TestCreateActions [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10952: URL: https://github.com/apache/iceberg/pull/10952#discussion_r1726990942 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestCreateActions.java: ## @@ -162,6 +162,9 @@ public void before() { this.tableLocation = tabl

Re: [PR] AWS: Include http-auth-aws-crt module into iceberg-aws-bundle [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10972: URL: https://github.com/apache/iceberg/pull/10972#discussion_r1726996810 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3FileIOIntegration.java: ## @@ -251,6 +272,24 @@ public void testNewOutputStreamWithCrossRegionAccessPoint(

Re: [PR] AWS: Include http-auth-aws-crt module into iceberg-aws-bundle [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10972: URL: https://github.com/apache/iceberg/pull/10972#discussion_r1727002025 ## aws-bundle/build.gradle: ## @@ -27,6 +27,7 @@ project(":iceberg-aws-bundle") { implementation platform(libs.awssdk.bom) implementation "software.amazon.aw

Re: [PR] AWS: Include http-auth-aws-crt module into iceberg-aws-bundle [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10972: URL: https://github.com/apache/iceberg/pull/10972#discussion_r1727014510 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3FileIOIntegration.java: ## @@ -202,6 +206,23 @@ public void testNewInputStreamWithCrossRegionAccessPoint()

Re: [PR] API,Core: Introduce metrics for data files by file format [iceberg]

2024-08-22 Thread via GitHub
findepi commented on PR #5837: URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2304699180 > new metrics for the number of data files broken down by file format. how common is it to have tables with mixed file formats? -- This is an automated message from the Apache Git

Re: [PR] Spark 3.5: Fix incorrect catalog loaded in TestCreateActions [iceberg]

2024-08-22 Thread via GitHub
nastra commented on PR #10952: URL: https://github.com/apache/iceberg/pull/10952#issuecomment-2304744807 thanks for fixing this @manuzhang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe

Re: [PR] Spark 3.5: Fix incorrect catalog loaded in TestCreateActions [iceberg]

2024-08-22 Thread via GitHub
nastra merged PR #10952: URL: https://github.com/apache/iceberg/pull/10952 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Procedure to compute table stats [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10986: URL: https://github.com/apache/iceberg/pull/10986#discussion_r1727119978 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestComputeTableStatsProcedure.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apac

Re: [PR] Procedure to compute table stats [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10986: URL: https://github.com/apache/iceberg/pull/10986#discussion_r1727121576 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestComputeTableStatsProcedure.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apac

Re: [PR] Procedure to compute table stats [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10986: URL: https://github.com/apache/iceberg/pull/10986#discussion_r1727122550 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestComputeTableStatsProcedure.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apac

Re: [PR] Procedure to compute table stats [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10986: URL: https://github.com/apache/iceberg/pull/10986#discussion_r1727128007 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestComputeTableStatsProcedure.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apac

Re: [PR] API,Core: Introduce metrics for data files by file format [iceberg]

2024-08-22 Thread via GitHub
gaborkaszab commented on PR #5837: URL: https://github.com/apache/iceberg/pull/5837#issuecomment-2304756917 Thanks for taking a look, @findepi ! I've seen users doing this. One of the motivation is that they gradually move away from one file format into another. What I've seen is that Imp

Re: [PR] Procedure to compute table stats [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10986: URL: https://github.com/apache/iceberg/pull/10986#discussion_r1727126121 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestComputeTableStatsProcedure.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apac

Re: [PR] Flink: Maintenance - TriggerManager [iceberg]

2024-08-22 Thread via GitHub
pvary merged PR #10484: URL: https://github.com/apache/iceberg/pull/10484 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apa

Re: [PR] Flink: Maintenance - TriggerManager [iceberg]

2024-08-22 Thread via GitHub
pvary commented on PR #10484: URL: https://github.com/apache/iceberg/pull/10484#issuecomment-2304799537 Merged to master. Thanks for the review @stevenzwu, @singhpk234 and @gyfora! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [PR] Flink: backport PR #10777 from 1.19 to 1.18 for sink test refactoring. [iceberg]

2024-08-22 Thread via GitHub
stevenzwu commented on code in PR #10965: URL: https://github.com/apache/iceberg/pull/10965#discussion_r1727184450 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java: ## @@ -70,6 +70,7 @@ public class FlinkCatalogFactory implements CatalogFactor

Re: [PR] Flink: backport PR #10956 for converter interface that deprecates ReaderFunction [iceberg]

2024-08-22 Thread via GitHub
stevenzwu merged PR #10985: URL: https://github.com/apache/iceberg/pull/10985 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] Increase the minimum required version of pyarrow. [iceberg-python]

2024-08-22 Thread via GitHub
sungwy commented on code in PR #1090: URL: https://github.com/apache/iceberg-python/pull/1090#discussion_r1727207820 ## pyproject.toml: ## @@ -94,7 +94,7 @@ pytest-mock = "3.14.0" pyspark = "3.5.2" cython = "3.0.11" deptry = ">=0.14,<0.20" -docutils = "!=0.21.post1" # http

Re: [I] Increase the minimal required pyarrow version to 14.0.0 [iceberg-python]

2024-08-22 Thread via GitHub
sungwy commented on issue #1089: URL: https://github.com/apache/iceberg-python/issues/1089#issuecomment-2304878483 Made a small tweak to the Issue title, as there is also ongoing work to raise the minimum required version to 17.0.0 -- This is an automated message from the Apache Git Serv

Re: [I] tbl.append(df): schema validation of tbl & df during compares the order & data types [iceberg-python]

2024-08-22 Thread via GitHub
sungwy commented on issue #1088: URL: https://github.com/apache/iceberg-python/issues/1088#issuecomment-2304881978 Hi @sivaraman-ai - this was fixed in 0.7.x. Could you try using a newer version of PyIceberg? https://github.com/apache/iceberg-python/pull/921 The latest release is 0.7

Re: [I] Sqlalchemy breaks projects that need to support Pandas < 2.0 [iceberg-python]

2024-08-22 Thread via GitHub
sungwy commented on issue #1085: URL: https://github.com/apache/iceberg-python/issues/1085#issuecomment-2304909347 Hi @aschreiber1 - sorry that you are running into this issue. Unfortunately with Python projects, dealing with the dependency hell and making decisions on what remains the min

Re: [PR] AWS: Include http-auth-aws-crt module into iceberg-aws-bundle [iceberg]

2024-08-22 Thread via GitHub
aajisaka commented on code in PR #10972: URL: https://github.com/apache/iceberg/pull/10972#discussion_r1727273820 ## aws-bundle/build.gradle: ## @@ -27,6 +27,7 @@ project(":iceberg-aws-bundle") { implementation platform(libs.awssdk.bom) implementation "software.amazon.

Re: [PR] fix (issue-1079): allow update_column to set doc as '' [iceberg-python]

2024-08-22 Thread via GitHub
sungwy commented on code in PR #1083: URL: https://github.com/apache/iceberg-python/pull/1083#discussion_r1727268386 ## pyproject.toml: ## @@ -16,7 +16,7 @@ # under the License. [tool.poetry] name = "pyiceberg" -version = "0.7.1" +version = "0.7.2" Review Comment: We will

Re: [PR] Increase the minimum required version of pyarrow. [iceberg-python]

2024-08-22 Thread via GitHub
ndrluis commented on code in PR #1090: URL: https://github.com/apache/iceberg-python/pull/1090#discussion_r1727293556 ## pyproject.toml: ## @@ -94,7 +94,7 @@ pytest-mock = "3.14.0" pyspark = "3.5.2" cython = "3.0.11" deptry = ">=0.14,<0.20" -docutils = "!=0.21.post1" # htt

Re: [PR] fix (issue-1079): allow update_column to set doc as '' [iceberg-python]

2024-08-22 Thread via GitHub
TiansuYu commented on code in PR #1083: URL: https://github.com/apache/iceberg-python/pull/1083#discussion_r1727383478 ## tests/table/test_init.py: ## @@ -512,6 +512,29 @@ def test_add_column(table_v2: Table) -> None: assert apply_schema.highest_field_id == 4 +def test_

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-22 Thread via GitHub
wypoon commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1727409290 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -63,33 +60,43 @@ protected CloseableIterable doPlanFiles( return CloseableIter

Re: [I] Sqlalchemy breaks projects that need to support Pandas < 2.0 [iceberg-python]

2024-08-22 Thread via GitHub
aschreiber1 commented on issue #1085: URL: https://github.com/apache/iceberg-python/issues/1085#issuecomment-2305115246 Yeah totally appreciate depedency hell... So if i just do a basic conda install (conda env create) with the above envrionment.yml file, it fails for me that error. If

Re: [I] [feat] Table maintenance tasks [iceberg-python]

2024-08-22 Thread via GitHub
sungwy commented on issue #1065: URL: https://github.com/apache/iceberg-python/issues/1065#issuecomment-2305384929 assigned "Compact data files" to myself -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] Core: Add list/map block sizes [iceberg]

2024-08-22 Thread via GitHub
rustyconover commented on PR #10973: URL: https://github.com/apache/iceberg/pull/10973#issuecomment-2305419762 Cool! I'm glad to see this coming along. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] Procedure to compute table stats [iceberg]

2024-08-22 Thread via GitHub
aokolnychyi commented on PR #10986: URL: https://github.com/apache/iceberg/pull/10986#issuecomment-2305512491 I will take a look at the partition stats PR first by @ajantha-bhat. I want to understand if we want a single analyze procedure or different procedures for table and partition stats

Re: [PR] Procedure to compute table stats [iceberg]

2024-08-22 Thread via GitHub
anton-db commented on PR #10986: URL: https://github.com/apache/iceberg/pull/10986#issuecomment-2305511357 I will take a look at the partition stats PR first by @ajantha-bhat. I want to understand if we want a single analyze procedure or different procedures for table and partition stats.

Re: [PR] Flink: backport PR #10777 from 1.19 to 1.18 for sink test refactoring. [iceberg]

2024-08-22 Thread via GitHub
stevenzwu merged PR #10965: URL: https://github.com/apache/iceberg/pull/10965 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] Spec: Minor modifications for v3 [iceberg]

2024-08-22 Thread via GitHub
rdblue merged PR #10948: URL: https://github.com/apache/iceberg/pull/10948 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

[PR] Flink: backport PR #10859 for range distribution [iceberg]

2024-08-22 Thread via GitHub
stevenzwu opened a new pull request, #10990: URL: https://github.com/apache/iceberg/pull/10990 it is a clean back port except that change in `TestFlinkTableSinkExtended` has been included in another back port PR #10965 -- This is an automated message from the Apache Git Service. To respo

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-22 Thread via GitHub
pvary commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1727796525 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -63,33 +60,43 @@ protected CloseableIterable doPlanFiles( return CloseableItera

Re: [PR] Flink: infer source parallelism for FLIP-27 source in batch execution mode [iceberg]

2024-08-22 Thread via GitHub
venkata91 commented on code in PR #10832: URL: https://github.com/apache/iceberg/pull/10832#discussion_r1727811554 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSpeculativeExecutionSupport.java: ## @@ -144,9 +151,9 @@ public void testSpeculativeEx

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-22 Thread via GitHub
pvary commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1727828125 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -133,51 +131,149 @@ private static Map computeSnapshotOrdinals(Deque snapsh retu

[PR] Core, API, Arrow: Type promotion for int/long to string for V3 tables [iceberg]

2024-08-22 Thread via GitHub
amogh-jahagirdar opened a new pull request, #10991: URL: https://github.com/apache/iceberg/pull/10991 Leaving in draft as we discuss as a community, but this change adds support for int/long -> string conversion without any additional metadata changes. InclusiveMetricsEvaluator is updated t

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-22 Thread via GitHub
pvary commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1727847483 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -133,51 +131,149 @@ private static Map computeSnapshotOrdinals(Deque snapsh retu

Re: [PR] Core, API, Arrow: Type promotion for int/long to string for V3 tables [iceberg]

2024-08-22 Thread via GitHub
amogh-jahagirdar commented on code in PR #10991: URL: https://github.com/apache/iceberg/pull/10991#discussion_r1727855740 ## core/src/main/java/org/apache/iceberg/SchemaUpdate.java: ## @@ -285,7 +285,7 @@ public UpdateSchema updateColumn(String name, Type.PrimitiveType newType)

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-08-22 Thread via GitHub
aokolnychyi commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1727881929 ## core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java: ## @@ -0,0 +1,213 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-08-22 Thread via GitHub
pvary commented on PR #10935: URL: https://github.com/apache/iceberg/pull/10935#issuecomment-2305694139 Could we add some tests for the `BaseIncrementalChangelogScan` directly? It would be good if we don't depend on Spark to test core functionality. Thanks, Peter -- This is an autom

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-08-22 Thread via GitHub
aokolnychyi commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1727901172 ## core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java: ## @@ -0,0 +1,213 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] Core, API, Arrow: Type promotion for int/long to string for V3 tables [iceberg]

2024-08-22 Thread via GitHub
amogh-jahagirdar commented on code in PR #10991: URL: https://github.com/apache/iceberg/pull/10991#discussion_r1727917346 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetMetricsRowGroupFilter.java: ## @@ -217,6 +219,10 @@ public Boolean lt(BoundReference ref, Literal

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-08-22 Thread via GitHub
aokolnychyi commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1727966586 ## core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java: ## @@ -0,0 +1,227 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] Flink: backport PR #10859 for range distribution [iceberg]

2024-08-22 Thread via GitHub
stevenzwu merged PR #10990: URL: https://github.com/apache/iceberg/pull/10990 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] Flink: infer source parallelism for FLIP-27 source in batch execution mode [iceberg]

2024-08-22 Thread via GitHub
stevenzwu commented on code in PR #10832: URL: https://github.com/apache/iceberg/pull/10832#discussion_r1727971926 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSpeculativeExecutionSupport.java: ## @@ -144,9 +151,9 @@ public void testSpeculativeEx

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-08-22 Thread via GitHub
aokolnychyi commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1727994442 ## core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java: ## @@ -0,0 +1,227 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-08-22 Thread via GitHub
aokolnychyi commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1728009901 ## data/src/jmh/java/org/apache/iceberg/PartitionStatsGeneratorBenchmark.java: ## @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[PR] Bump boto3 from 1.34.131 to 1.34.162 [iceberg-python]

2024-08-22 Thread via GitHub
dependabot[bot] opened a new pull request, #1095: URL: https://github.com/apache/iceberg-python/pull/1095 Bumps [boto3](https://github.com/boto/boto3) from 1.34.131 to 1.34.162. Commits https://github.com/boto/boto3/commit/59518e4776a764f6e0ab12cd2975d6ecedae6389 59518e4 Merg

Re: [I] Check if dependencies in libs.versions.toml are the latest supported ones for JDK11 [iceberg]

2024-08-22 Thread via GitHub
imneerajsharma commented on issue #10852: URL: https://github.com/apache/iceberg/issues/10852#issuecomment-2305913828 Hi Team, I’m working on automating compatibility checks for dependencies listed in libs.versions.toml with JDK 11. The main challenge I’m facing is identifying a reli

Re: [PR] Doc: add assume role session name doc and remove redundant spark shell examples [iceberg]

2024-08-22 Thread via GitHub
github-actions[bot] commented on PR #5994: URL: https://github.com/apache/iceberg/pull/5994#issuecomment-2305949133 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Data: Support reading default values from generic Avro readers [iceberg]

2024-08-22 Thread via GitHub
github-actions[bot] commented on PR #6004: URL: https://github.com/apache/iceberg/pull/6004#issuecomment-2305949160 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: Partial Update [iceberg]

2024-08-22 Thread via GitHub
github-actions[bot] commented on PR #6043: URL: https://github.com/apache/iceberg/pull/6043#issuecomment-2305949182 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Parquet: Remove the row position since parquet row group has it natively [iceberg]

2024-08-22 Thread via GitHub
github-actions[bot] commented on PR #6056: URL: https://github.com/apache/iceberg/pull/6056#issuecomment-2305949207 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Format: Geometry-support [iceberg]

2024-08-22 Thread via GitHub
github-actions[bot] commented on PR #6062: URL: https://github.com/apache/iceberg/pull/6062#issuecomment-2305949226 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: Add scan report for incremental Table scans [iceberg]

2024-08-22 Thread via GitHub
github-actions[bot] commented on PR #6072: URL: https://github.com/apache/iceberg/pull/6072#issuecomment-2305949244 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink 1.15: Support change log scan task [iceberg]

2024-08-22 Thread via GitHub
github-actions[bot] commented on PR #6075: URL: https://github.com/apache/iceberg/pull/6075#issuecomment-2305949259 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] HuaweiCloud: Introduce the iceberg-huaweicloud [iceberg]

2024-08-22 Thread via GitHub
github-actions[bot] commented on PR #6088: URL: https://github.com/apache/iceberg/pull/6088#issuecomment-2305949282 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Parquet, Core: Fix collection of Parquet metrics when column names co… [iceberg]

2024-08-22 Thread via GitHub
github-actions[bot] commented on PR #6118: URL: https://github.com/apache/iceberg/pull/6118#issuecomment-2305949302 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Procedure to compute table stats [iceberg]

2024-08-22 Thread via GitHub
nastra commented on code in PR #10986: URL: https://github.com/apache/iceberg/pull/10986#discussion_r1728365906 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestComputeTableStatsProcedure.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apac

Re: [I] Check if dependencies in libs.versions.toml are the latest supported ones for JDK11 [iceberg]

2024-08-22 Thread via GitHub
nastra commented on issue #10852: URL: https://github.com/apache/iceberg/issues/10852#issuecomment-2306300204 @imneerajsharma unfortunately I don't have a good suggestion in terms of tooling. -- This is an automated message from the Apache Git Service. To respond to the message, please lo

Re: [PR] arrow/schema:new func `convert_schema` for `ArrowSchemaConverter` [iceberg-rust]

2024-08-22 Thread via GitHub
AndreMouche commented on PR #539: URL: https://github.com/apache/iceberg-rust/pull/539#issuecomment-2306361982 friendly ping @Xuanwo @liurenjie1024 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t