[GitHub] [iceberg] jackye1995 commented on pull request #6637: Spark: Spark SQL Extensions for create tag

2023-01-20 Thread via GitHub
jackye1995 commented on PR #6637: URL: https://github.com/apache/iceberg/pull/6637#issuecomment-1399195757 Good point, +1 for only RETAIN because https://docs.databricks.com/sql/language-manual/delta-vacuum.html -- This is an automated message from the Apache Git Service. To respond to th

[GitHub] [iceberg] hililiwei commented on pull request #6637: Spark: Spark SQL Extensions for create tag

2023-01-20 Thread via GitHub
hililiwei commented on PR #6637: URL: https://github.com/apache/iceberg/pull/6637#issuecomment-1399194014 In the original proposal, it was "[RETAIN For interval {DAYS | HOURS | MINUTES}]", but in keeping with CREATE BRANCH, I removed the "For" keyword. What do you think about that? @jackye1995

[GitHub] [iceberg] hililiwei opened a new pull request, #6637: Spark: Spark SQL Extensions for create tag

2023-01-20 Thread via GitHub
hililiwei opened a new pull request, #6637: URL: https://github.com/apache/iceberg/pull/6637 Co-authored-by: Amogh Jahagirdar Co-authored-by: chidayong <247070...@qq.com> ## What is the purpose of the change Implement the syntax in the following documents: https://docs.googl

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
szehon-ho commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083250975 ## core/src/test/java/org/apache/iceberg/TestMetadataTableScans.java: ## @@ -1040,4 +1047,195 @@ public void testAllManifestsTableSnapshotNot() { expectedMan

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-20 Thread via GitHub
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1083237847 ## api/src/main/java/org/apache/iceberg/actions/MigrateTable.java: ## @@ -50,6 +50,15 @@ default MigrateTable dropBackup() { throw new UnsupportedOperationE

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
amogh-jahagirdar commented on PR #5234: URL: https://github.com/apache/iceberg/pull/5234#issuecomment-1399167754 Thanks for the reviews @rdblue ! @namrathamyske I raised a PR to your branch for deprecating the old validation methods and updating rev API https://github.com/namrathamyske/iceb

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
szehon-ho commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083230533 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -0,0 +1,221 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

[GitHub] [iceberg] rdblue commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
rdblue commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083195038 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -460,8 +471,8 @@ private void validateNoNewDeletesForDataFiles( * @param dataFilter an e

[GitHub] [iceberg] rdblue commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
rdblue commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083191910 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -397,8 +405,10 @@ protected void validateNoNewDeletesForDataFiles( TableMetadata base

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-20 Thread via GitHub
jackye1995 commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1083145300 ## api/src/main/java/org/apache/iceberg/actions/MigrateTable.java: ## @@ -50,6 +50,15 @@ default MigrateTable dropBackup() { throw new UnsupportedOperationExcep

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-20 Thread via GitHub
jackye1995 commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1083145098 ## api/src/main/java/org/apache/iceberg/actions/MigrateTable.java: ## @@ -50,6 +50,15 @@ default MigrateTable dropBackup() { throw new UnsupportedOperationExcep

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083118880 ## core/src/test/java/org/apache/iceberg/TestRowDelta.java: ## @@ -81,155 +95,171 @@ public void testAddDeleteFile() { @Test public void testValidateDa

[GitHub] [iceberg] rdblue commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
rdblue commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083117789 ## core/src/test/java/org/apache/iceberg/TestRowDelta.java: ## @@ -81,155 +95,171 @@ public void testAddDeleteFile() { @Test public void testValidateDataFilesExi

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
szehon-ho commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083117603 ## core/src/main/java/org/apache/iceberg/BasePositionDeletesScanTask.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
szehon-ho commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083117457 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -0,0 +1,221 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
szehon-ho commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083117287 ## core/src/main/java/org/apache/iceberg/SnapshotScan.java: ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contri

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
szehon-ho commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083117144 ## core/src/main/java/org/apache/iceberg/util/PartitionUtil.java: ## @@ -91,7 +91,7 @@ private PartitionUtil() {} } // adapts the provided partition data to m

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
szehon-ho commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083116981 ## core/src/main/java/org/apache/iceberg/SnapshotScan.java: ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contri

[GitHub] [iceberg] aokolnychyi commented on pull request #6012: Spark 3.3: Add a procedure to generate table changes

2023-01-20 Thread via GitHub
aokolnychyi commented on PR #6012: URL: https://github.com/apache/iceberg/pull/6012#issuecomment-1399044913 Getting to this PR soon.

[GitHub] [iceberg] rdblue commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
rdblue commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083113239 ## core/src/test/java/org/apache/iceberg/TestRowDelta.java: ## @@ -81,155 +95,171 @@ public void testAddDeleteFile() { @Test public void testValidateDataFilesExi

[GitHub] [iceberg] rdblue commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
rdblue commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083112330 ## core/src/test/java/org/apache/iceberg/TestRowDelta.java: ## @@ -39,18 +39,32 @@ import org.apache.iceberg.relocated.com.google.common.collect.Sets; import org.junit

[GitHub] [iceberg] rdblue commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
rdblue commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083109459 ## core/src/test/java/org/apache/iceberg/TestReplacePartitions.java: ## @@ -114,20 +122,22 @@ public void testReplaceAndMergeOnePartition() { // ensure the overwrit

[GitHub] [iceberg] rdblue commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
rdblue commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083103875 ## core/src/test/java/org/apache/iceberg/TestOverwrite.java: ## @@ -164,40 +173,43 @@ public void testOverwriteFailsDelete() { "Should reject commit with file n

[GitHub] [iceberg] jackye1995 opened a new issue, #6636: Unclear messaging about Glue catalog locking

2023-01-20 Thread via GitHub
jackye1995 opened a new issue, #6636: URL: https://github.com/apache/iceberg/issues/6636 ### Apache Iceberg version None ### Query engine None ### Please describe the bug 🐞 Based on reader feedback, the message presented in https://iceberg.apache.org/docs/l

[GitHub] [iceberg] rdblue commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
rdblue commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083103566 ## core/src/test/java/org/apache/iceberg/TestOverwrite.java: ## @@ -164,40 +173,43 @@ public void testOverwriteFailsDelete() { "Should reject commit with file n

[GitHub] [iceberg] rdblue commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
rdblue commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083101029 ## core/src/main/java/org/apache/iceberg/BaseRowDelta.java: ## @@ -96,23 +97,37 @@ public RowDelta validateNoConflictingDeleteFiles() { } @Override - protected

[GitHub] [iceberg] stevenzwu opened a new pull request, #6635: Flink: add table setter to FLIP-27 IcebergSource#Builder.

2023-01-20 Thread via GitHub
stevenzwu opened a new pull request, #6635: URL: https://github.com/apache/iceberg/pull/6635 This is to avoid double loading if table is already loaded before the builder. This is also the same pattern as the pre FLIP-27 FlinkSource#Builder.

[GitHub] [iceberg] rdblue commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
rdblue commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083099568 ## core/src/main/java/org/apache/iceberg/BaseRowDelta.java: ## @@ -96,23 +97,37 @@ public RowDelta validateNoConflictingDeleteFiles() { } @Override - protected

[GitHub] [iceberg] rdblue commented on a diff in pull request #5234: Core, API: BaseRowDelta, BaseOverwrite,BaseReplacePartitions, BaseRewrite to branch Impl

2023-01-20 Thread via GitHub
rdblue commented on code in PR #5234: URL: https://github.com/apache/iceberg/pull/5234#discussion_r1083099208 ## core/src/main/java/org/apache/iceberg/BaseReplacePartitions.java: ## @@ -79,23 +79,32 @@ public ReplacePartitions validateNoConflictingData() { return this; }

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-20 Thread via GitHub
amogh-jahagirdar commented on PR #6624: URL: https://github.com/apache/iceberg/pull/6624#issuecomment-1399009685 @kingeasternsun A maintainer should take a look when they get a chance. @RussellSpitzer @aokolnychyi @szehon-ho @jackye1995 when you get a chance could you take a look? Thanks!

[GitHub] [iceberg] jackye1995 commented on pull request #6617: Spark: Spark SQL Extensions for create branch

2023-01-20 Thread via GitHub
jackye1995 commented on PR #6617: URL: https://github.com/apache/iceberg/pull/6617#issuecomment-1399009614 I think this PR is mostly ready to go. I see there is a comment in design doc from @flyrain: ``` "VERSION" is used in Iceberg to indicate any table changes including table pr

[GitHub] [iceberg] stevenzwu merged pull request #6631: Flink: backport PR #6584 to 1.14 and 1.15 for Avro GenericRecord in FLIP-27 source

2023-01-20 Thread via GitHub
stevenzwu merged PR #6631: URL: https://github.com/apache/iceberg/pull/6631

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
aokolnychyi commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083088187 ## core/src/main/java/org/apache/iceberg/BaseMetadataTable.java: ## @@ -64,9 +64,12 @@ protected BaseMetadataTable(TableOperations ops, Table table, String name) {

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
aokolnychyi commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083087907 ## core/src/main/java/org/apache/iceberg/BaseMetadataTable.java: ## @@ -73,9 +73,12 @@ protected BaseMetadataTable(Table table, String name) { */ static Part

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
aokolnychyi commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083083501 ## core/src/main/java/org/apache/iceberg/BaseMetadataTable.java: ## @@ -64,9 +64,12 @@ protected BaseMetadataTable(TableOperations ops, Table table, String name) {

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
aokolnychyi commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083080152 ## core/src/main/java/org/apache/iceberg/SnapshotScan.java: ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more cont

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
aokolnychyi commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083068421 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -0,0 +1,221 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
aokolnychyi commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083065602 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -0,0 +1,372 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
aokolnychyi commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083065455 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
aokolnychyi commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083065272 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
aokolnychyi commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083065040 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -0,0 +1,396 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6365: Core: Add position deletes metadata table

2023-01-20 Thread via GitHub
aokolnychyi commented on code in PR #6365: URL: https://github.com/apache/iceberg/pull/6365#discussion_r1083026645 ## core/src/main/java/org/apache/iceberg/BasePositionDeletesScanTask.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1059510877 ## delta-lake/src/main/java/org/apache/iceberg/delta/SupportMigrationFromDeltaLake.java: ## @@ -0,0 +1,32 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6634: Core, API: Fix for tracking intermediate snapshots when a transaction spans multiple branches

2023-01-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #6634: URL: https://github.com/apache/iceberg/pull/6634#discussion_r1083023727 ## api/src/main/java/org/apache/iceberg/SnapshotUpdate.java: ## @@ -71,4 +71,8 @@ default ThisT toBranch(String branch) { "Cannot commit to branch

[GitHub] [iceberg] amogh-jahagirdar opened a new pull request, #6634: Core, API: Fix for tracking intermediate snapshots when a transaction spans multiple branches

2023-01-20 Thread via GitHub
amogh-jahagirdar opened a new pull request, #6634: URL: https://github.com/apache/iceberg/pull/6634 Fix for https://github.com/apache/iceberg/issues/6632 Moving to draft as I read more of the code to make sure this handles different failure cases properly and will add tests if determi

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6633: Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands

2023-01-20 Thread via GitHub
aokolnychyi commented on code in PR #6633: URL: https://github.com/apache/iceberg/pull/6633#discussion_r1082998249 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTable.scala: ## @@ -187,14 +187,12 @@ object RewriteMergeIntoTa

[GitHub] [iceberg] aokolnychyi opened a new pull request, #6633: Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands

2023-01-20 Thread via GitHub
aokolnychyi opened a new pull request, #6633: URL: https://github.com/apache/iceberg/pull/6633 This PR fixes predicate pushdown for copy-on-write MERGE commands, which was broken after #6534. This change contains a test that would previously fail and lead to a data correctness issue.

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-20 Thread via GitHub
jackye1995 commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1082987146 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-20 Thread via GitHub
singhpk234 commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1082969487 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6449: Delta: Support Snapshot Delta Lake Table to Iceberg Table

2023-01-20 Thread via GitHub
singhpk234 commented on code in PR #6449: URL: https://github.com/apache/iceberg/pull/6449#discussion_r1082969077 ## delta-lake/src/main/java/org/apache/iceberg/delta/BaseSnapshotDeltaLakeTableAction.java: ## @@ -0,0 +1,370 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] [iceberg] yyanyy commented on a diff in pull request #6598: Core: View representation core implementation

2023-01-20 Thread via GitHub
yyanyy commented on code in PR #6598: URL: https://github.com/apache/iceberg/pull/6598#discussion_r1082942731 ## core/src/main/java/org/apache/iceberg/view/SQLViewRepresentationParser.java: ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] szehon-ho commented on pull request #6410: Configurable metrics reporter by catalog properties

2023-01-20 Thread via GitHub
szehon-ho commented on PR #6410: URL: https://github.com/apache/iceberg/pull/6410#issuecomment-1398807289 I think it's fine with me if we can fix the failures.

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6410: Configurable metrics reporter by catalog properties

2023-01-20 Thread via GitHub
szehon-ho commented on code in PR #6410: URL: https://github.com/apache/iceberg/pull/6410#discussion_r1082942925 ## core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java: ## @@ -301,4 +305,16 @@ protected static String fullTableName(String catalogName, TableIdentifier

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6591: Core: Avoid creating new metadata file when `registerTable` API is used

2023-01-20 Thread via GitHub
szehon-ho commented on code in PR #6591: URL: https://github.com/apache/iceberg/pull/6591#discussion_r1082932748 ## core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java: ## @@ -154,6 +154,12 @@ protected void disableRefresh() { this.shouldRefresh = false;

[GitHub] [iceberg] yyanyy commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-20 Thread via GitHub
yyanyy commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1082937202 ## docs/spark-queries.md: ## @@ -95,21 +95,37 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and later sup

[GitHub] [iceberg] szehon-ho commented on issue #6257: Partitions metadata table shows old partitions

2023-01-20 Thread via GitHub
szehon-ho commented on issue #6257: URL: https://github.com/apache/iceberg/issues/6257#issuecomment-1398785459 Yeah, I admit that is annoying. Maybe just adding a delete_files column will help users see that the record_count may change? (As well as documenting it, of course.) But agr

[GitHub] [iceberg] jackye1995 merged pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-20 Thread via GitHub
jackye1995 merged PR #6586: URL: https://github.com/apache/iceberg/pull/6586

[GitHub] [iceberg] jackye1995 commented on pull request #6586: AWS: make warehouse path optional for read only catalog use cases

2023-01-20 Thread via GitHub
jackye1995 commented on PR #6586: URL: https://github.com/apache/iceberg/pull/6586#issuecomment-1398781852 Thanks for testing with Glue @aajisaka ! And thanks for the review @amogh-jahagirdar @nastra

[GitHub] [iceberg] jackye1995 merged pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-20 Thread via GitHub
jackye1995 merged PR #6627: URL: https://github.com/apache/iceberg/pull/6627

[GitHub] [iceberg] jackye1995 commented on pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-20 Thread via GitHub
jackye1995 commented on PR #6627: URL: https://github.com/apache/iceberg/pull/6627#issuecomment-1398780784 Thanks for the update and reviews!

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6627: Docs: Update spark SQL examples for time travel to branches and tags

2023-01-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #6627: URL: https://github.com/apache/iceberg/pull/6627#discussion_r1082918805 ## docs/spark-queries.md: ## @@ -95,21 +95,37 @@ The above list is in order of priority. For example: a matching catalog will tak SQL -Spark 3.3 and

[GitHub] [iceberg] jackye1995 commented on issue #6632: Bug with Branch Transactions

2023-01-20 Thread GitBox
jackye1995 commented on issue #6632: URL: https://github.com/apache/iceberg/issues/6632#issuecomment-1398734413 Sure, assigned!

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6631: Flink: backport PR #6584 to 1.14 and 1.15 for Avro GenericRecord in FLIP-27 source

2023-01-20 Thread GitBox
stevenzwu commented on code in PR #6631: URL: https://github.com/apache/iceberg/pull/6631#discussion_r1082878466 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/source/TestRowDataToAvroGenericRecordConverter.java: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache So

[GitHub] [iceberg] amogh-jahagirdar commented on issue #6632: Bug with Branch Transactions

2023-01-20 Thread GitBox
amogh-jahagirdar commented on issue #6632: URL: https://github.com/apache/iceberg/issues/6632#issuecomment-1398725654 I'm working on a fix for this. @jackye1995, could you assign this to me?

[GitHub] [iceberg] amogh-jahagirdar opened a new issue, #6632: Bug with Branch Transactions

2023-01-20 Thread GitBox
amogh-jahagirdar opened a new issue, #6632: URL: https://github.com/apache/iceberg/issues/6632 ### Apache Iceberg version 1.1.0 (latest release) ### Query engine None ### Please describe the bug 🐞 Creating this issue for awareness, was discussing with @rdblu

[GitHub] [iceberg] stevenzwu commented on pull request #6631: Flink: backport PR #6584 to 1.14 and 1.15 for Avro GenericRecord in FLIP-27 source

2023-01-20 Thread GitBox
stevenzwu commented on PR #6631: URL: https://github.com/apache/iceberg/pull/6631#issuecomment-1398722055 I checked the following diff and found nothing related to the classes touched by PR #6584 ``` git diff --no-index flink/v1.14/flink/src/ flink/v1.16/flink/src git diff -

[GitHub] [iceberg] stevenzwu opened a new pull request, #6631: Flink: backport PR #6584 to 1.14 and 1.15 for Avro GenericRecord in FLIP-27 source

2023-01-20 Thread GitBox
stevenzwu opened a new pull request, #6631: URL: https://github.com/apache/iceberg/pull/6631 I also piggybacked the fix of the package name (a mishap from PR #6584). Some classes should be in the `flink/source/reader` packages.

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6584: Flink: support reading as Avro GenericRecord for FLIP-27 IcebergSource

2023-01-20 Thread GitBox
stevenzwu commented on code in PR #6584: URL: https://github.com/apache/iceberg/pull/6584#discussion_r1082782121 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/reader/AvroGenericRecordReaderFunction.java: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache Sof

[GitHub] [iceberg] stevenzwu merged pull request #6584: Flink: support reading as Avro GenericRecord for FLIP-27 IcebergSource

2023-01-20 Thread GitBox
stevenzwu merged PR #6584: URL: https://github.com/apache/iceberg/pull/6584

[GitHub] [iceberg] jackye1995 commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
jackye1995 commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398630489 > should we maybe raise this discussion topic on the mailing list in order to increase visibility for people? Yes agree, let's do that so we can reach a consensus and procee

[GitHub] [iceberg] nastra commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
nastra commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398623955 I also like annotations like `@Nullable` to indicate that certain things in the API can be nullable as this makes it easier to consume that particular API and reason about it. May

[GitHub] [iceberg] jackye1995 commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
jackye1995 commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398577729 @nastra any thoughts?

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-20 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1398572156 I would +1 on storing in snapshot summary, because: 1. snapshot corresponds very well to MV refresh, there is a 1:1 relationship between them. 2. table properties is not vers

[GitHub] [iceberg] Fokko merged pull request #6628: Nessie: Bump to 0.47.0

2023-01-20 Thread GitBox
Fokko merged PR #6628: URL: https://github.com/apache/iceberg/pull/6628

[GitHub] [iceberg] gaborkaszab commented on issue #6257: Partitions metadata table shows old partitions

2023-01-20 Thread GitBox
gaborkaszab commented on issue #6257: URL: https://github.com/apache/iceberg/issues/6257#issuecomment-1398396353 > What would the algorithm be? If the partition has delete files, try to do a full MOR, and check if records are null? Personally, sounds a bit extreme, I would think a good firs

[GitHub] [iceberg] mriveraFacephi commented on issue #2040: Partial data ingestion to Iceberg in failing with Spark 3.0.x

2023-01-20 Thread GitBox
mriveraFacephi commented on issue #2040: URL: https://github.com/apache/iceberg/issues/2040#issuecomment-1398240262 Same problem here with Spark 3.1.1 and Iceberg 0.13.1. I'm trying to write a DataFrame using the Spark v2 API command `writeTo`. Every column in my schema is nullable. In my

[GitHub] [iceberg] ajantha-bhat commented on pull request #6628: Nessie: Bump to 0.47.0

2023-01-20 Thread GitBox
ajantha-bhat commented on PR #6628: URL: https://github.com/apache/iceberg/pull/6628#issuecomment-1398211777 I think we can bump it to `0.47.1` now.

[GitHub] [iceberg] cgpoh opened a new issue, #6630: Purpose of MAX_CONTINUOUS_EMPTY_COMMITS in IcebergFilesCommitter

2023-01-20 Thread GitBox
cgpoh opened a new issue, #6630: URL: https://github.com/apache/iceberg/issues/6630 ### Query engine Flink ### Question I have a Flink job that uses side output to write to Iceberg table when there are errors in the main processing function. If there are no errors in the

[GitHub] [iceberg] kingeasternsun commented on pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-20 Thread GitBox
kingeasternsun commented on PR #6624: URL: https://github.com/apache/iceberg/pull/6624#issuecomment-1398154903 > Left a review, thanks for the contribution @kingeasternsun ! Also looks like spotless checks are failing which you can fix by running `./gradlew :iceberg-api:spotlessJavaCheck`

[GitHub] [iceberg] JanKaul commented on issue #6420: Iceberg Materialized View Spec

2023-01-20 Thread GitBox
JanKaul commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1398113386 Yes, I agree with the proposed design 1. I'm not entirely sure what @rdblue prefers. I will update the Google doc accordingly. The next question for me is where and how

[GitHub] [iceberg] findepi commented on pull request #6474: Make it explicit that metrics reporter is required

2023-01-20 Thread GitBox
findepi commented on PR #6474: URL: https://github.com/apache/iceberg/pull/6474#issuecomment-1398101458 Thanks for the merge!

[GitHub] [iceberg] findepi commented on pull request #6474: Make it explicit that metrics reporter is required

2023-01-20 Thread GitBox
findepi commented on PR #6474: URL: https://github.com/apache/iceberg/pull/6474#issuecomment-1398101212 > Yes this is the consequence of different styles of the projects. That's a good point. I accept the inherent friction being a result of that, but I do hope some of that friction i

[GitHub] [iceberg] findepi commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
findepi commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398096721 > Also there is little indication in the codebase of which field could potentially be null. This causes a lot of confusions for external engine integrations like Trino. I am h

[GitHub] [iceberg] kingeasternsun commented on a diff in pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-20 Thread GitBox
kingeasternsun commented on code in PR #6624: URL: https://github.com/apache/iceberg/pull/6624#discussion_r1082057002 ## spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/procedures/MigrateTableProcedure.java: ## @@ -39,7 +39,8 @@ class MigrateTableProcedure extends BasePr