[GitHub] [iceberg] jackye1995 commented on issue #6632: Bug with Branch Transactions

2023-01-20 Thread GitBox
jackye1995 commented on issue #6632: URL: https://github.com/apache/iceberg/issues/6632#issuecomment-1398734413 Sure, assigned! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6631: Flink: backport PR #6584 to 1.14 and 1.15 for Avro GenericRecord in FLIP-27 source

2023-01-20 Thread GitBox
stevenzwu commented on code in PR #6631: URL: https://github.com/apache/iceberg/pull/6631#discussion_r1082878466 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/source/TestRowDataToAvroGenericRecordConverter.java: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache So

[GitHub] [iceberg] amogh-jahagirdar commented on issue #6632: Bug with Branch Transactions

2023-01-20 Thread GitBox
amogh-jahagirdar commented on issue #6632: URL: https://github.com/apache/iceberg/issues/6632#issuecomment-1398725654 I'm working on a fix for this @jackye1995 could you assign this to me?

[GitHub] [iceberg] amogh-jahagirdar opened a new issue, #6632: Bug with Branch Transactions

2023-01-20 Thread GitBox
amogh-jahagirdar opened a new issue, #6632: URL: https://github.com/apache/iceberg/issues/6632 ### Apache Iceberg version 1.1.0 (latest release) ### Query engine None ### Please describe the bug 🐞 Creating this issue for awareness, was discussing with @rdblu

[GitHub] [iceberg] stevenzwu commented on pull request #6631: Flink: backport PR #6584 to 1.14 and 1.15 for Avro GenericRecord in FLIP-27 source

2023-01-20 Thread GitBox
stevenzwu commented on PR #6631: URL: https://github.com/apache/iceberg/pull/6631#issuecomment-1398722055 I checked the following diff and found nothing related to the classes touched by PR #6584 ``` git diff --no-index flink/v1.14/flink/src/ flink/v1.16/flink/src git diff -

[GitHub] [iceberg] stevenzwu opened a new pull request, #6631: Flink: backport PR #6584 to 1.14 and 1.15 for Avro GenericRecord in FLIP-27 source

2023-01-20 Thread GitBox
stevenzwu opened a new pull request, #6631: URL: https://github.com/apache/iceberg/pull/6631 I also piggybacked the fix of package name (a mishap from PR #6584). some classes should be in the `flink/source/reader` packages.

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6584: Flink: support reading as Avro GenericRecord for FLIP-27 IcebergSource

2023-01-20 Thread GitBox
stevenzwu commented on code in PR #6584: URL: https://github.com/apache/iceberg/pull/6584#discussion_r1082782121 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/reader/AvroGenericRecordReaderFunction.java: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache Sof

[GitHub] [iceberg] stevenzwu merged pull request #6584: Flink: support reading as Avro GenericRecord for FLIP-27 IcebergSource

2023-01-20 Thread GitBox
stevenzwu merged PR #6584: URL: https://github.com/apache/iceberg/pull/6584

[GitHub] [iceberg] jackye1995 commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
jackye1995 commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398630489 > should we maybe raise this discussion topic on the mailing list in order to increase visibility for people? Yes agree, let's do that so we can reach a consensus and procee

[GitHub] [iceberg] nastra commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
nastra commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398623955 I also like annotations like `@Nullable` to indicate that certain things in the API can be nullable as this makes it easier to consume that particular API and reason about it. May

[GitHub] [iceberg] jackye1995 commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
jackye1995 commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398577729 @nastra any thoughts?

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-20 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1398572156 I would +1 on storing in snapshot summary, because: 1. snapshot corresponds very well to MV refresh, there is a 1:1 relationship between them. 2. table properties is not vers

[GitHub] [iceberg] Fokko merged pull request #6628: Nessie: Bump to 0.47.0

2023-01-20 Thread GitBox
Fokko merged PR #6628: URL: https://github.com/apache/iceberg/pull/6628

[GitHub] [iceberg] gaborkaszab commented on issue #6257: Partitions metadata table shows old partitions

2023-01-20 Thread GitBox
gaborkaszab commented on issue #6257: URL: https://github.com/apache/iceberg/issues/6257#issuecomment-1398396353 > What would the algorithm be? If the partition has delete files, try to do a full MOR, and check if records are null? Personally, sounds a bit extreme, I would think a good firs

[GitHub] [iceberg] mriveraFacephi commented on issue #2040: Partial data ingestion to Iceberg in failing with Spark 3.0.x

2023-01-20 Thread GitBox
mriveraFacephi commented on issue #2040: URL: https://github.com/apache/iceberg/issues/2040#issuecomment-1398240262 Same problem here with Spark 3.1.1 and Iceberg 0.13.1. I'm trying to write dataframe by using the Spark v2 API command writeTo. Every column in my schema is nullable. In my

[GitHub] [iceberg] ajantha-bhat commented on pull request #6628: Nessie: Bump to 0.47.0

2023-01-20 Thread GitBox
ajantha-bhat commented on PR #6628: URL: https://github.com/apache/iceberg/pull/6628#issuecomment-1398211777 I think we can bump it to `0.47.1` now.

[GitHub] [iceberg] cgpoh opened a new issue, #6630: Purpose of MAX_CONTINUOUS_EMPTY_COMMITS in IcebergFilesCommitter

2023-01-20 Thread GitBox
cgpoh opened a new issue, #6630: URL: https://github.com/apache/iceberg/issues/6630 ### Query engine Flink ### Question I have a Flink job that uses side output to write to Iceberg table when there are errors in the main processing function. If there are no errors in the

[GitHub] [iceberg] kingeasternsun commented on pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-20 Thread GitBox
kingeasternsun commented on PR #6624: URL: https://github.com/apache/iceberg/pull/6624#issuecomment-1398154903 > Left a review, thanks for the contribution @kingeasternsun ! Also looks like spotless checks are failing which you can fix by running `./gradlew :iceberg-api:spotlessJavaCheck`

[GitHub] [iceberg] JanKaul commented on issue #6420: Iceberg Materialized View Spec

2023-01-20 Thread GitBox
JanKaul commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1398113386 Yes, I agree with the proposed design 1. I'm not entirely sure what @rdblue prefers. I will update the Google doc accordingly. The next question for me is where and how

[GitHub] [iceberg] findepi commented on pull request #6474: Make it explicit that metrics reporter is required

2023-01-20 Thread GitBox
findepi commented on PR #6474: URL: https://github.com/apache/iceberg/pull/6474#issuecomment-1398101458 thanks for the merge!

[GitHub] [iceberg] findepi commented on pull request #6474: Make it explicit that metrics reporter is required

2023-01-20 Thread GitBox
findepi commented on PR #6474: URL: https://github.com/apache/iceberg/pull/6474#issuecomment-1398101212 > Yes this is the consequence of different styles of the projects. That's a good point. I accept the inherent friction being result of that, but I do hope some of that friction i

[GitHub] [iceberg] findepi commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
findepi commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398096721 > Also there is little indication in the codebase of which field could potentially be null. This causes a lot of confusions for external engine integrations like Trino. I am h

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-20 Thread GitBox
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1082057002 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/MigrateTableProcedure.java: ## @@ -39,7 +39,8 @@ class MigrateTableProcedure extends BasePr

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6629: Build: Fix minor error-prone warnings

2023-01-19 Thread GitBox
ajantha-bhat commented on code in PR #6629: URL: https://github.com/apache/iceberg/pull/6629#discussion_r1082180504 ## core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java: ## @@ -329,14 +329,14 @@ static Long expiresAtMillis(String token) { return null; }

[GitHub] [iceberg] nastra commented on a diff in pull request #6629: Build: Fix minor error-prone warnings

2023-01-19 Thread GitBox
nastra commented on code in PR #6629: URL: https://github.com/apache/iceberg/pull/6629#discussion_r1082177976 ## core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java: ## @@ -329,14 +329,14 @@ static Long expiresAtMillis(String token) { return null; } -

[GitHub] [iceberg] nastra commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
nastra commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082174852 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -132,6 +133,29 @@ public void testCreateTableBadName() { TableI

[GitHub] [iceberg] nastra commented on pull request #6626: Core: Update error msg

2023-01-19 Thread GitBox
nastra commented on PR #6626: URL: https://github.com/apache/iceberg/pull/6626#issuecomment-1398007171 > > not sure, usually it's been called out on my own PRs to adjust the error msg to that particular format (hence the reason I mentioned it on the other PR), which is being used across oth

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
hililiwei commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082173399 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -267,6 +286,12 @@ class Iceb

[GitHub] [iceberg] nastra commented on a diff in pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
nastra commented on code in PR #6598: URL: https://github.com/apache/iceberg/pull/6598#discussion_r1082170758 ## core/src/main/java/org/apache/iceberg/view/SQLViewRepresentationParser.java: ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082172027 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -132,6 +133,29 @@ public void testCreateTableBadName() { Ta

[GitHub] [iceberg] nastra commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
nastra commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082168946 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -132,6 +133,29 @@ public void testCreateTableBadName() { TableI

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397987523 @JanKaul if you agree with the summarized consensus we have mostly reached there, for the sake of moving the progress of the discussion forward, could you update the Google doc wi

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082152137 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -132,6 +134,28 @@ public void testCreateTableBadName() { Ta

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082151825 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -177,13 +178,9 @@ void initialize( GlueClient client, LockManager lock,

[GitHub] [iceberg] aajisaka commented on a diff in pull request #6358: AWS: Print logs whether Glue optimistic locking is used or not

2023-01-19 Thread GitBox
aajisaka commented on code in PR #6358: URL: https://github.com/apache/iceberg/pull/6358#discussion_r1082137636 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -151,7 +151,12 @@ private LockManager initializeLockManager(Map properties) { if (propert

[GitHub] [iceberg] JanKaul commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
JanKaul commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397956873 > I don't know if it would work or too crazy, just to throw the idea out that I just came up with: > > We could potentially make MV a representation in view spec, in parallel t

[GitHub] [iceberg] aajisaka commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
aajisaka commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082123777 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -132,6 +134,28 @@ public void testCreateTableBadName() { Tabl

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082118919 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -267,6 +286,12 @@ class Ice

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082118486 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -168,34 +175,77 @@ fieldList ; n

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082118109 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -267,6 +286,12 @@ class Ice

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1082116187 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] [iceberg] JonasJ-ap commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-19 Thread GitBox
JonasJ-ap commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1082097202 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [iceberg] JonasJ-ap commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-19 Thread GitBox
JonasJ-ap commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1082104013 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [iceberg] JonasJ-ap commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-19 Thread GitBox
JonasJ-ap commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1082097985 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [iceberg] amogh-jahagirdar commented on issue #6619: Disaster Recovery Options for AWS Athena/Iceberg Integration

2023-01-19 Thread GitBox
amogh-jahagirdar commented on issue #6619: URL: https://github.com/apache/iceberg/issues/6619#issuecomment-139751 Thanks for creating this issue, @anthonysgro could you provide more details on how you're recreating the table and pointing the location and how AWS Backup fits in? A

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6588: Spark 3.3: Add Default Parallelism Level for All Spark Driver Based Deletes

2023-01-19 Thread GitBox
amogh-jahagirdar commented on PR #6588: URL: https://github.com/apache/iceberg/pull/6588#issuecomment-1397883260 Thanks for clarifying @RussellSpitzer I think it makes a ton of sense to leave the specifics of bulk vs parallel to the FileIO abstraction. In this case, we leverage bulk delete

[GitHub] [iceberg] ajantha-bhat commented on pull request #6629: Build: Fix minor error-prone warnings

2023-01-19 Thread GitBox
ajantha-bhat commented on PR #6629: URL: https://github.com/apache/iceberg/pull/6629#issuecomment-1397859049 cc: @nastra, @Fokko

[GitHub] [iceberg] ajantha-bhat opened a new pull request, #6629: Build: Fix minor error-prone warnings

2023-01-19 Thread GitBox
ajantha-bhat opened a new pull request, #6629: URL: https://github.com/apache/iceberg/pull/6629 I have observed that the build [`./gradlew clean build -x test`] has some warnings. So it is an effort to keep the build green. Before: https://user-images.githubusercontent.com/588

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-19 Thread GitBox
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1082061029 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/SnapshotTableProcedure.java: ## @@ -93,10 +94,20 @@ public InternalRow[] call(InternalRow a

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-19 Thread GitBox
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1082057002 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/MigrateTableProcedure.java: ## @@ -39,7 +39,8 @@ class MigrateTableProcedure extends BasePr

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
hililiwei commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082056581 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -267,6 +285,16 @@ class Iceb

[GitHub] [iceberg] cgpoh closed issue #6606: MinIO com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool

2023-01-19 Thread GitBox
cgpoh closed issue #6606: MinIO com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool URL: https://github.com/apache/iceberg/issues/6606

[GitHub] [iceberg] cgpoh commented on issue #6606: MinIO com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool

2023-01-19 Thread GitBox
cgpoh commented on issue #6606: URL: https://github.com/apache/iceberg/issues/6606#issuecomment-1397847153 After looking into the code, realised that instead of having s3.connection.maximum in flink configuration, I should set the values in Hadoop configuration and pass in the configuration

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
hililiwei commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082045421 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -168,34 +175,77 @@ fieldList ; no

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-19 Thread GitBox
szehon-ho commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1082037773 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

[GitHub] [iceberg] dmgcodevil commented on issue #6587: Wrong class, java.lang.Long, for object: 19367

2023-01-19 Thread GitBox
dmgcodevil commented on issue #6587: URL: https://github.com/apache/iceberg/issues/6587#issuecomment-1397819184 Delete orphan files action is also affected after the schema change: ``` java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Void at org.apac

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
ajantha-bhat commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1082026758 ## docs/spark-queries.md: ## @@ -95,21 +95,37 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and lat

[GitHub] [iceberg] jackye1995 commented on pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on PR #6617: URL: https://github.com/apache/iceberg/pull/6617#issuecomment-1397789606 Ping some people for thoughts around the syntax: @rdblue @RussellSpitzer @nastra

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082011726 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateBranchExec.scala: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apach

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082011227 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -267,6 +285,16 @@ class Ice

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082010680 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -168,34 +175,77 @@ fieldList ; n

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082010234 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -73,6 +73,13 @@ statement | ALTER T

[GitHub] [iceberg] github-actions[bot] commented on issue #5339: Adding the same file twice for the same table

2023-01-19 Thread GitBox
github-actions[bot] commented on issue #5339: URL: https://github.com/apache/iceberg/issues/5339#issuecomment-1397768506 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397757916 I am referring to the **view spec**, using the example here: https://iceberg.apache.org/view-spec/#appendix-a-an-example So in design 1 where we say we want to have a pointe

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
amogh-jahagirdar commented on PR #6598: URL: https://github.com/apache/iceberg/pull/6598#issuecomment-1397753619 We probably want to establish a standard in the community at this point on Immutable/Nullable or not. Right now we're in this partial state, where it's used in some cases but def

[GitHub] [iceberg] wmoustafa commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
wmoustafa commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397753231 To clarify, I was saying that multiple representations are outside the scope of MVs, and could be part of standard table spec. Not sure if the proposal above is along the same line

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-01-19 Thread GitBox
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1081982772 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,130 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1081982454 ## docs/spark-queries.md: ## @@ -95,21 +95,37 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and later

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1081982200 ## docs/spark-queries.md: ## @@ -95,21 +95,37 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and later

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397751094 I don't know if it would work or too crazy, just to throw the idea out that I just came up with: We could potentially make MV a representation in view spec, in parallel to

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1081978533 ## docs/spark-queries.md: ## @@ -95,21 +95,39 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397742071 > Generically speaking, a table (MV or not), identified by a UUID, could have multiple storage layouts, and execution engines can choose the best storage layout. That's cor

[GitHub] [iceberg] wmoustafa commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
wmoustafa commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397732221 > I am thinking about the case where based on the predicate operating on the view, we can choose intelligently what storage table to use. I think this is potentially a generi

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397711644 > Not sure if there is a strong use case for multiple tables for the same view version. I am thinking about the case where based on the predicate operating on the view, we

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1081947384 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,130 @@ public Filter[] pushedFilters() { return

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1081932750 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/SnapshotTableProcedure.java: ## @@ -93,10 +94,20 @@ public InternalRow[] call(InternalRow

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-19 Thread GitBox
amogh-jahagirdar commented on PR #6624: URL: https://github.com/apache/iceberg/pull/6624#issuecomment-1397701478 Left a review, thanks for the contribution @kingeasternsun !

[GitHub] [iceberg] flyrain commented on a diff in pull request #6582: Add a Spark procedure to collect NDV

2023-01-19 Thread GitBox
flyrain commented on code in PR #6582: URL: https://github.com/apache/iceberg/pull/6582#discussion_r1081900515 ## core/src/main/java/org/apache/iceberg/puffin/StandardBlobTypes.java: ## @@ -26,4 +26,6 @@ private StandardBlobTypes() {} * href="https://datasketches.apache.org/

[GitHub] [iceberg] wmoustafa commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
wmoustafa commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397595262 Agreed. Reverse pointer to the view will be hard to maintain, so I am inclined to not having it. I would say each view version could optionally map to a new storage table (s

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397589244 So just want to push the progress forward, I think we have some kind of loose consensus that: 1. view + storage table is likely the general approach to go 2. view stores poin

[GitHub] [iceberg] jackye1995 commented on pull request #6626: Core: Update error msg

2023-01-19 Thread GitBox
jackye1995 commented on PR #6626: URL: https://github.com/apache/iceberg/pull/6626#issuecomment-1397579647 > not sure, usually it's been called out on my own PRs to adjust the error msg to that particular format (hence the reason I mentioned it on the other PR), which is being used across o

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1081835641 ## docs/spark-queries.md: ## @@ -95,21 +95,39 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and later

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1081834635 ## docs/spark-queries.md: ## @@ -95,21 +95,39 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and later

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6570: Hive: Use EnvironmentContext instead of Hive Locks to provide transactional commits after HIVE-26882

2023-01-19 Thread GitBox
szehon-ho commented on code in PR #6570: URL: https://github.com/apache/iceberg/pull/6570#discussion_r1081787456 ## hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreLock.java: ## @@ -0,0 +1,533 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6570: Hive: Use EnvironmentContext instead of Hive Locks to provide transactional commits after HIVE-26882

2023-01-19 Thread GitBox
szehon-ho commented on code in PR #6570: URL: https://github.com/apache/iceberg/pull/6570#discussion_r1081768050 ## hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreUtil.java: ## @@ -53,9 +55,23 @@ private MetastoreUtil() {} */ public static void alterTable(

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6621: [HiveCatalog] Support Altering and Dropping Table Ownership

2023-01-19 Thread GitBox
szehon-ho commented on code in PR #6621: URL: https://github.com/apache/iceberg/pull/6621#discussion_r1081718764 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -328,6 +329,140 @@ public void testCreateTableCustomSortOrder() throws Exception

[GitHub] [iceberg] haizhou-zhao commented on a diff in pull request #6621: [HiveCatalog] Support Altering and Dropping Table Ownership

2023-01-19 Thread GitBox
haizhou-zhao commented on code in PR #6621: URL: https://github.com/apache/iceberg/pull/6621#discussion_r1081701820 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -328,6 +329,140 @@ public void testCreateTableCustomSortOrder() throws Excepti

[GitHub] [iceberg] haizhou-zhao commented on a diff in pull request #6621: [HiveCatalog] Support Altering and Dropping Table Ownership

2023-01-19 Thread GitBox
haizhou-zhao commented on code in PR #6621: URL: https://github.com/apache/iceberg/pull/6621#discussion_r1081700457 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -494,6 +494,17 @@ private void setHmsTableParameters( // remove any pr

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6598: URL: https://github.com/apache/iceberg/pull/6598#discussion_r1081678527 ## api/src/main/java/org/apache/iceberg/view/SQLViewRepresentation.java: ## @@ -36,17 +38,21 @@ default Type type() { String dialect(); /** The default

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6598: URL: https://github.com/apache/iceberg/pull/6598#discussion_r1081676028 ## api/src/main/java/org/apache/iceberg/view/SQLViewRepresentation.java: ## @@ -18,14 +18,17 @@ */ package org.apache.iceberg.view; +import edu.umd.cs.fin

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6598: URL: https://github.com/apache/iceberg/pull/6598#discussion_r1081672308 ## api/src/main/java/org/apache/iceberg/view/ViewRepresentation.java: ## @@ -18,21 +18,16 @@ */ package org.apache.iceberg.view; -import java.util.Locale;

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6598: URL: https://github.com/apache/iceberg/pull/6598#discussion_r1081669003 ## core/src/test/java/org/apache/iceberg/view/TestViewRepresentationParser.java: ## @@ -0,0 +1,163 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [iceberg] nastra commented on pull request #6626: Core: Update error msg

2023-01-19 Thread GitBox
nastra commented on PR #6626: URL: https://github.com/apache/iceberg/pull/6626#issuecomment-1397400315 > If we are checking non null, I think the current error message still makes more sense? not sure, usually it's been called out on my own PRs to adjust the error msg to that particu

[GitHub] [iceberg] jackye1995 commented on pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
jackye1995 commented on PR #6586: URL: https://github.com/apache/iceberg/pull/6586#issuecomment-1397395046 @aajisaka can you also take a look?

[GitHub] [iceberg] jackye1995 commented on pull request #6626: Core: Update error msg

2023-01-19 Thread GitBox
jackye1995 commented on PR #6626: URL: https://github.com/apache/iceberg/pull/6626#issuecomment-1397386343 If we are checking non null, I think the current error message still makes more sense?

[GitHub] [iceberg] nastra commented on a diff in pull request #6074: API,Core: SnapshotManager to be created through Transaction

2023-01-19 Thread GitBox
nastra commented on code in PR #6074: URL: https://github.com/apache/iceberg/pull/6074#discussion_r1081611444 ## api/src/main/java/org/apache/iceberg/Transaction.java: ## @@ -155,6 +155,13 @@ default UpdateStatistics updateStatistics() { */ ExpireSnapshots expireSnapshots
