[GitHub] [iceberg] kingeasternsun closed pull request #3973: :art: Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-19 Thread GitBox
kingeasternsun closed pull request #3973: :art: Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable. URL: https://github.com/apache/iceberg/pull/3973 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [iceberg] gaborkaszab commented on a diff in pull request #6074: API,Core: SnapshotManager to be created through Transaction

2023-01-19 Thread GitBox
gaborkaszab commented on code in PR #6074: URL: https://github.com/apache/iceberg/pull/6074#discussion_r1081433895 ## core/src/main/java/org/apache/iceberg/CommitCallbackTransaction.java: ## @@ -111,6 +111,12 @@ public ExpireSnapshots expireSnapshots() { return wrapped.expi

[GitHub] [iceberg] huaxingao commented on pull request #6622: push down min/max/count to iceberg

2023-01-19 Thread GitBox
huaxingao commented on PR #6622: URL: https://github.com/apache/iceberg/pull/6622#issuecomment-1397244506 @rdblue Could you please take a look when you have time? I am not so sure if I added you as co-author correctly. It looks suspicious. -- This is an automated message from the Apache G

[GitHub] [iceberg] RussellSpitzer commented on issue #6615: Merge into does not work with spark temp table

2023-01-19 Thread GitBox
RussellSpitzer commented on issue #6615: URL: https://github.com/apache/iceberg/issues/6615#issuecomment-1397252089 Looks like Python is swallowing the error there ... there may be an error in the JVM log or the SparkUI that actually has real info in it. -- This is an automated message fr

[GitHub] [iceberg] jackye1995 commented on pull request #6474: Make it explicit that metrics reporter is required

2023-01-19 Thread GitBox
jackye1995 commented on PR #6474: URL: https://github.com/apache/iceberg/pull/6474#issuecomment-1397260633 > We keep asking "can this be null" question in PR reviews Yes this is the consequence of different styles of the projects. I like Trino's approach of checking null at every cons

[GitHub] [iceberg] jackye1995 commented on pull request #6474: Make it explicit that metrics reporter is required

2023-01-19 Thread GitBox
jackye1995 commented on PR #6474: URL: https://github.com/apache/iceberg/pull/6474#issuecomment-1397262420 I will merge this PR as suggested, and will create a GitHub issue for further discussion. -- This is an automated message from the Apache Git Service. To respond to the message, plea

[GitHub] [iceberg] jackye1995 merged pull request #6474: Make it explicit that metrics reporter is required

2023-01-19 Thread GitBox
jackye1995 merged PR #6474: URL: https://github.com/apache/iceberg/pull/6474 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

[GitHub] [iceberg] jackye1995 opened a new issue, #6625: Improve nullability check in Iceberg codebase

2023-01-19 Thread GitBox
jackye1995 opened a new issue, #6625: URL: https://github.com/apache/iceberg/issues/6625 ### Feature Request / Improvement Based on the discussion in #6474, we should settle on one consistent way to check for nulls. Currently both `checkNotNull` and `checkArgument(xxx != null)` are us
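The practical difference between the two null-check styles named in this issue can be sketched in plain Java. This is a minimal illustration, not Iceberg code: the `checkNotNull` and `checkArgument` helpers below are hypothetical stand-ins written for this example (modeled on the Guava-style precondition methods of the same names), and the class name and error message are assumptions.

```java
import java.util.Objects;

public class NullCheckStyles {

  // Hypothetical stand-in for a checkNotNull-style helper.
  // A null value raises NullPointerException.
  public static <T> T checkNotNull(T value, String message) {
    return Objects.requireNonNull(value, message);
  }

  // Hypothetical stand-in for a checkArgument-style helper.
  // A false condition raises IllegalArgumentException.
  public static void checkArgument(boolean condition, String message) {
    if (!condition) {
      throw new IllegalArgumentException(message);
    }
  }

  // Runs a check and reports which exception type it raised, or "none".
  public static String failureOf(Runnable check) {
    try {
      check.run();
      return "none";
    } catch (RuntimeException e) {
      return e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    String reporter = null; // assumed example value

    // Style 1: dedicated null check.
    System.out.println(failureOf(() -> checkNotNull(reporter, "Invalid reporter: null")));

    // Style 2: generic argument check with an explicit != null condition.
    System.out.println(failureOf(() -> checkArgument(reporter != null, "Invalid reporter: null")));
  }
}
```

The two styles reject the same input but surface different exception types, which is one reason a code base may want to pick a single convention.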

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6571: Docs: java api doc add write data example

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6571: URL: https://github.com/apache/iceberg/pull/6571#discussion_r1081542409 ## docs/java-api.md: ## @@ -147,6 +147,53 @@ t.newAppend().appendFile(data).commit(); t.commitTransaction(); ``` +### WriteData + +The java api can write data in

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6571: Docs: java api doc add write data example

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6571: URL: https://github.com/apache/iceberg/pull/6571#discussion_r1081543402 ## docs/java-api.md: ## @@ -147,6 +147,53 @@ t.newAppend().appendFile(data).commit(); t.commitTransaction(); ``` +### WriteData + +The java api can write data in

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6571: Docs: java api doc add write data example

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6571: URL: https://github.com/apache/iceberg/pull/6571#discussion_r1081542733 ## docs/java-api.md: ## @@ -147,6 +147,53 @@ t.newAppend().appendFile(data).commit(); t.commitTransaction(); ``` +### WriteData + +The java api can write data in

[GitHub] [iceberg] danielcweeks merged pull request #6609: Core: Add test for token expiration during refresh

2023-01-19 Thread GitBox
danielcweeks merged PR #6609: URL: https://github.com/apache/iceberg/pull/6609 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceber

[GitHub] [iceberg] stevenzwu commented on pull request #6614: Flink:fix flink streaming query problem [ Cannot get a client from a closed pool]

2023-01-19 Thread GitBox
stevenzwu commented on PR #6614: URL: https://github.com/apache/iceberg/pull/6614#issuecomment-1397299644 > If a catalog is closed, do tables loaded with its internal objects need to be kept available? @hililiwei I think you raise a good question here. If the catalog is closed, should `Ta

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397302003 > while underlying table snapshot information are stored in storage table snapshot properties +1, I think we are on the same page on this. In my view it's still desi

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6584: Flink: support reading as Avro GenericRecord for FLIP-27 IcebergSource

2023-01-19 Thread GitBox
stevenzwu commented on code in PR #6584: URL: https://github.com/apache/iceberg/pull/6584#discussion_r1081567565 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/reader/AvroGenericRecordReaderFunction.java: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache Sof

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6582: Add a Spark procedure to collect NDV

2023-01-19 Thread GitBox
huaxingao commented on code in PR #6582: URL: https://github.com/apache/iceberg/pull/6582#discussion_r1081568857 ## core/src/main/java/org/apache/iceberg/puffin/StandardBlobTypes.java: ## @@ -26,4 +26,6 @@ private StandardBlobTypes() {} * href="https://datasketches.apache.or

[GitHub] [iceberg] amogh-jahagirdar opened a new pull request, #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
amogh-jahagirdar opened a new pull request, #6627: URL: https://github.com/apache/iceberg/pull/6627 Follow up to https://github.com/apache/iceberg/pull/6575/files, this change updates docs with examples of VERSION AS OF time travel for branches and tags, as well as some important notes.

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6358: AWS: Print logs whether Glue optimistic locking is used or not

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6358: URL: https://github.com/apache/iceberg/pull/6358#discussion_r1081579915 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -151,7 +151,12 @@ private LockManager initializeLockManager(Map properties) { if

[GitHub] [iceberg] szehon-ho commented on issue #6257: Partitions metadata table shows old partitions

2023-01-19 Thread GitBox
szehon-ho commented on issue #6257: URL: https://github.com/apache/iceberg/issues/6257#issuecomment-1397349062 What would the algorithm be? If the partition has delete files, try to do a read, and check if records are null? Personally, sounds a bit extreme, I would think a good first step

[GitHub] [iceberg] nastra commented on a diff in pull request #6074: API,Core: SnapshotManager to be created through Transaction

2023-01-19 Thread GitBox
nastra commented on code in PR #6074: URL: https://github.com/apache/iceberg/pull/6074#discussion_r1081611444 ## api/src/main/java/org/apache/iceberg/Transaction.java: ## @@ -155,6 +155,13 @@ default UpdateStatistics updateStatistics() { */ ExpireSnapshots expireSnapshots

[GitHub] [iceberg] jackye1995 commented on pull request #6626: Core: Update error msg

2023-01-19 Thread GitBox
jackye1995 commented on PR #6626: URL: https://github.com/apache/iceberg/pull/6626#issuecomment-1397386343 If we are checking non null, I think the current error message still makes more sense? -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [iceberg] jackye1995 commented on pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
jackye1995 commented on PR #6586: URL: https://github.com/apache/iceberg/pull/6586#issuecomment-1397395046 @aajisaka can you also take a look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [iceberg] nastra commented on pull request #6626: Core: Update error msg

2023-01-19 Thread GitBox
nastra commented on PR #6626: URL: https://github.com/apache/iceberg/pull/6626#issuecomment-1397400315 > If we are checking non null, I think the current error message still makes more sense? not sure, usually it's been called out on my own PRs to adjust the error msg to that particu

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6598: URL: https://github.com/apache/iceberg/pull/6598#discussion_r1081669003 ## core/src/test/java/org/apache/iceberg/view/TestViewRepresentationParser.java: ## @@ -0,0 +1,163 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6598: URL: https://github.com/apache/iceberg/pull/6598#discussion_r1081672308 ## api/src/main/java/org/apache/iceberg/view/ViewRepresentation.java: ## @@ -18,21 +18,16 @@ */ package org.apache.iceberg.view; -import java.util.Locale;

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6598: URL: https://github.com/apache/iceberg/pull/6598#discussion_r1081676028 ## api/src/main/java/org/apache/iceberg/view/SQLViewRepresentation.java: ## @@ -18,14 +18,17 @@ */ package org.apache.iceberg.view; +import edu.umd.cs.fin

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6598: URL: https://github.com/apache/iceberg/pull/6598#discussion_r1081678527 ## api/src/main/java/org/apache/iceberg/view/SQLViewRepresentation.java: ## @@ -36,17 +38,21 @@ default Type type() { String dialect(); /** The default

[GitHub] [iceberg] haizhou-zhao commented on a diff in pull request #6621: [HiveCatalog] Support Altering and Dropping Table Ownership

2023-01-19 Thread GitBox
haizhou-zhao commented on code in PR #6621: URL: https://github.com/apache/iceberg/pull/6621#discussion_r1081700457 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -494,6 +494,17 @@ private void setHmsTableParameters( // remove any pr

[GitHub] [iceberg] haizhou-zhao commented on a diff in pull request #6621: [HiveCatalog] Support Altering and Dropping Table Ownership

2023-01-19 Thread GitBox
haizhou-zhao commented on code in PR #6621: URL: https://github.com/apache/iceberg/pull/6621#discussion_r1081701820 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -328,6 +329,140 @@ public void testCreateTableCustomSortOrder() throws Excepti

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6621: [HiveCatalog] Support Altering and Dropping Table Ownership

2023-01-19 Thread GitBox
szehon-ho commented on code in PR #6621: URL: https://github.com/apache/iceberg/pull/6621#discussion_r1081718764 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -328,6 +329,140 @@ public void testCreateTableCustomSortOrder() throws Exception

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6570: Hive: Use EnvironmentContext instead of Hive Locks to provide transactional commits after HIVE-26882

2023-01-19 Thread GitBox
szehon-ho commented on code in PR #6570: URL: https://github.com/apache/iceberg/pull/6570#discussion_r1081768050 ## hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreUtil.java: ## @@ -53,9 +55,23 @@ private MetastoreUtil() {} */ public static void alterTable(

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6570: Hive: Use EnvironmentContext instead of Hive Locks to provide transactional commits after HIVE-26882

2023-01-19 Thread GitBox
szehon-ho commented on code in PR #6570: URL: https://github.com/apache/iceberg/pull/6570#discussion_r1081787456 ## hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreLock.java: ## @@ -0,0 +1,533 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1081834635 ## docs/spark-queries.md: ## @@ -95,21 +95,39 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and later

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1081835641 ## docs/spark-queries.md: ## @@ -95,21 +95,39 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and later

[GitHub] [iceberg] jackye1995 commented on pull request #6626: Core: Update error msg

2023-01-19 Thread GitBox
jackye1995 commented on PR #6626: URL: https://github.com/apache/iceberg/pull/6626#issuecomment-1397579647 > not sure, usually it's been called out on my own PRs to adjust the error msg to that particular format (hence the reason I mentioned it on the other PR), which is being used across o

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397589244 To push the progress forward, I think we have a loose consensus that: 1. view + storage table is likely the general approach to take 2. view stores poin

[GitHub] [iceberg] wmoustafa commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
wmoustafa commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397595262 Agreed. A reverse pointer to the view will be hard to maintain, so I am inclined not to have it. I would say each view version could optionally map to a new storage table (s

[GitHub] [iceberg] flyrain commented on a diff in pull request #6582: Add a Spark procedure to collect NDV

2023-01-19 Thread GitBox
flyrain commented on code in PR #6582: URL: https://github.com/apache/iceberg/pull/6582#discussion_r1081900515 ## core/src/main/java/org/apache/iceberg/puffin/StandardBlobTypes.java: ## @@ -26,4 +26,6 @@ private StandardBlobTypes() {} * href="https://datasketches.apache.org/

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-19 Thread GitBox
amogh-jahagirdar commented on PR #6624: URL: https://github.com/apache/iceberg/pull/6624#issuecomment-1397701478 Left a review, thanks for the contribution @kingeasternsun ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1081932750 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/SnapshotTableProcedure.java: ## @@ -93,10 +94,20 @@ public InternalRow[] call(InternalRow

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1081947384 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,130 @@ public Filter[] pushedFilters() { return

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397711644 > Not sure if there is a strong use case for multiple tables for the same view version. I am thinking about the case where based on the predicate operating on the view, we

[GitHub] [iceberg] wmoustafa commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
wmoustafa commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397732221 > I am thinking about the case where based on the predicate operating on the view, we can choose intelligently what storage table to use. I think this is potentially a generi

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397742071 > Generically speaking, a table (MV or not), identified by a UUID, could have multiple storage layouts, and execution engines can choose the best storage layout. That's cor

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
amogh-jahagirdar commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1081978533 ## docs/spark-queries.md: ## @@ -95,21 +95,39 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397751094 I don't know whether this would work or is too crazy, but here is an idea I just came up with: we could potentially make MV a representation in the view spec, in parallel to

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1081982200 ## docs/spark-queries.md: ## @@ -95,21 +95,37 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and later

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1081982454 ## docs/spark-queries.md: ## @@ -95,21 +95,37 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and later

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-01-19 Thread GitBox
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1081982772 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,130 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] wmoustafa commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
wmoustafa commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397753231 To clarify, I was saying that multiple representations are outside the scope of MVs, and could be part of standard table spec. Not sure if the proposal above is along the same line

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
amogh-jahagirdar commented on PR #6598: URL: https://github.com/apache/iceberg/pull/6598#issuecomment-1397753619 We probably want to establish a standard in the community at this point on Immutable/Nullable or not. Right now we're in this partial state, where it's used in some cases but def

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397757916 I am referring to the **view spec**, using the example here: https://iceberg.apache.org/view-spec/#appendix-a-an-example So in design 1 where we say we want to have a pointe

[GitHub] [iceberg] github-actions[bot] commented on issue #5339: Adding the same file twice for the same table

2023-01-19 Thread GitBox
github-actions[bot] commented on issue #5339: URL: https://github.com/apache/iceberg/issues/5339#issuecomment-1397768506 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082010234 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -73,6 +73,13 @@ statement | ALTER T

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082010680 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -168,34 +175,77 @@ fieldList ; n

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082011227 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -267,6 +285,16 @@ class Ice

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082011726 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateBranchExec.scala: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apach

[GitHub] [iceberg] jackye1995 commented on pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on PR #6617: URL: https://github.com/apache/iceberg/pull/6617#issuecomment-1397789606 Ping some people for thoughts around the syntax: @rdblue @RussellSpitzer @nastra -- This is an automated message from the Apache Git Service. To respond to the message, please log o

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-19 Thread GitBox
ajantha-bhat commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1082026758 ## docs/spark-queries.md: ## @@ -95,21 +95,37 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and lat

[GitHub] [iceberg] dmgcodevil commented on issue #6587: Wrong class, java.lang.Long, for object: 19367

2023-01-19 Thread GitBox
dmgcodevil commented on issue #6587: URL: https://github.com/apache/iceberg/issues/6587#issuecomment-1397819184 Delete orphan files action is also affected after the schema change: ``` java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Void at org.apac

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-19 Thread GitBox
szehon-ho commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1082037773 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
hililiwei commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082045421 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -168,34 +175,77 @@ fieldList ; no

[GitHub] [iceberg] cgpoh commented on issue #6606: MinIO com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool

2023-01-19 Thread GitBox
cgpoh commented on issue #6606: URL: https://github.com/apache/iceberg/issues/6606#issuecomment-1397847153 After looking into the code, I realised that instead of having s3.connection.maximum in the Flink configuration, I should set the values in the Hadoop configuration and pass in the configuration

[GitHub] [iceberg] cgpoh closed issue #6606: MinIO com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool

2023-01-19 Thread GitBox
cgpoh closed issue #6606: MinIO com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool URL: https://github.com/apache/iceberg/issues/6606 -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
hililiwei commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082056581 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -267,6 +285,16 @@ class Iceb

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-19 Thread GitBox
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1082057002 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/MigrateTableProcedure.java: ## @@ -39,7 +39,8 @@ class MigrateTableProcedure extends BasePr

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-19 Thread GitBox
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1082061029 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/SnapshotTableProcedure.java: ## @@ -93,10 +94,20 @@ public InternalRow[] call(InternalRow a

[GitHub] [iceberg] ajantha-bhat opened a new pull request, #6629: Build: Fix minor error-prone warnings

2023-01-19 Thread GitBox
ajantha-bhat opened a new pull request, #6629: URL: https://github.com/apache/iceberg/pull/6629 I have observed that the build [`./gradlew clean build -x test`] has some warnings, so this is an effort to keep the build green. Before: https://user-images.githubusercontent.com/588

[GitHub] [iceberg] ajantha-bhat commented on pull request #6629: Build: Fix minor error-prone warnings

2023-01-19 Thread GitBox
ajantha-bhat commented on PR #6629: URL: https://github.com/apache/iceberg/pull/6629#issuecomment-1397859049 cc: @nastra, @Fokko -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6588: Spark 3.3: Add Default Parallelism Level for All Spark Driver Based Deletes

2023-01-19 Thread GitBox
amogh-jahagirdar commented on PR #6588: URL: https://github.com/apache/iceberg/pull/6588#issuecomment-1397883260 Thanks for clarifying @RussellSpitzer I think it makes a ton of sense to leave the specifics of bulk vs parallel to the FileIO abstraction. In this case, we leverage bulk delete

[GitHub] [iceberg] amogh-jahagirdar commented on issue #6619: Disaster Recovery Options for AWS Athena/Iceberg Integration

2023-01-19 Thread GitBox
amogh-jahagirdar commented on issue #6619: URL: https://github.com/apache/iceberg/issues/6619#issuecomment-139751 Thanks for creating this issue, @anthonysgro could you provide more details on how you're recreating the table and pointing the location and how AWS Backup fits in? A

[GitHub] [iceberg] JonasJ-ap commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-19 Thread GitBox
JonasJ-ap commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1082097202 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [iceberg] JonasJ-ap commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-19 Thread GitBox
JonasJ-ap commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1082097985 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [iceberg] JonasJ-ap commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-19 Thread GitBox
JonasJ-ap commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1082104013 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1082116187 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082118109 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -267,6 +286,12 @@ class Ice

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082118486 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -168,34 +175,77 @@ fieldList ; n

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082118919 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -267,6 +286,12 @@ class Ice

[GitHub] [iceberg] aajisaka commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
aajisaka commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082123777 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -132,6 +134,28 @@ public void testCreateTableBadName() { Tabl

[GitHub] [iceberg] JanKaul commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
JanKaul commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397956873 > I don't know if it would work or too crazy, just to throw the idea out that I just came up with: > > We could potentially make MV a representation in view spec, in parallel t

[GitHub] [iceberg] aajisaka commented on a diff in pull request #6358: AWS: Print logs whether Glue optimistic locking is used or not

2023-01-19 Thread GitBox
aajisaka commented on code in PR #6358: URL: https://github.com/apache/iceberg/pull/6358#discussion_r1082137636 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -151,7 +151,12 @@ private LockManager initializeLockManager(Map properties) { if (propert

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082151825 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -177,13 +178,9 @@ void initialize( GlueClient client, LockManager lock,

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082152137 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -132,6 +134,28 @@ public void testCreateTableBadName() { Ta

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-19 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1397987523 @JanKaul if you agree with the summarized consensus we have mostly reached there, for the sake of moving the progress of the discussion forward, could you update the Google doc wi

[GitHub] [iceberg] nastra commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
nastra commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082168946 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -132,6 +133,29 @@ public void testCreateTableBadName() { TableI

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
jackye1995 commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082172027 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -132,6 +133,29 @@ public void testCreateTableBadName() { Ta

[GitHub] [iceberg] nastra commented on a diff in pull request #6598: Core: View representation core implementation

2023-01-19 Thread GitBox
nastra commented on code in PR #6598: URL: https://github.com/apache/iceberg/pull/6598#discussion_r1082170758 ## core/src/main/java/org/apache/iceberg/view/SQLViewRepresentationParser.java: ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-19 Thread GitBox
hililiwei commented on code in PR #6617: URL: https://github.com/apache/iceberg/pull/6617#discussion_r1082173399 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -267,6 +286,12 @@ class Iceb

[GitHub] [iceberg] nastra commented on pull request #6626: Core: Update error msg

2023-01-19 Thread GitBox
nastra commented on PR #6626: URL: https://github.com/apache/iceberg/pull/6626#issuecomment-1398007171 > > not sure, usually it's been called out on my own PRs to adjust the error msg to that particular format (hence the reason I mentioned it on the other PR), which is being used across oth

[GitHub] [iceberg] nastra commented on a diff in pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-19 Thread GitBox
nastra commented on code in PR #6586: URL: https://github.com/apache/iceberg/pull/6586#discussion_r1082174852 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -132,6 +133,29 @@ public void testCreateTableBadName() { TableI

[GitHub] [iceberg] nastra commented on a diff in pull request #6629: Build: Fix minor error-prone warnings

2023-01-19 Thread GitBox
nastra commented on code in PR #6629: URL: https://github.com/apache/iceberg/pull/6629#discussion_r1082177976 ## core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java: ## @@ -329,14 +329,14 @@ static Long expiresAtMillis(String token) { return null; } -

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6629: Build: Fix minor error-prone warnings

2023-01-19 Thread GitBox
ajantha-bhat commented on code in PR #6629: URL: https://github.com/apache/iceberg/pull/6629#discussion_r1082180504 ## core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Util.java: ## @@ -329,14 +329,14 @@ static Long expiresAtMillis(String token) { return null; }

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-20 Thread GitBox
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1082057002 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/MigrateTableProcedure.java: ## @@ -39,7 +39,8 @@ class MigrateTableProcedure extends BasePr

[GitHub] [iceberg] findepi commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
findepi commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398096721 > Also there is little indication in the codebase of which field could potentially be null. This causes a lot of confusion for external engine integrations like Trino. I am h

[GitHub] [iceberg] findepi commented on pull request #6474: Make it explicit that metrics reporter is required

2023-01-20 Thread GitBox
findepi commented on PR #6474: URL: https://github.com/apache/iceberg/pull/6474#issuecomment-1398101212 > Yes this is the consequence of different styles of the projects. That's a good point. I accept the inherent friction being a result of that, but I do hope some of that friction i
