[GitHub] [iceberg] 0xffmeta commented on issue #6725: How to detect if the partition's data is ready to consume

2023-02-02 Thread via GitHub
0xffmeta commented on issue #6725: URL: https://github.com/apache/iceberg/issues/6725#issuecomment-1415254316 For our case, we just need to add success file for the first time to let the downstream know that the daily or hourly partition is ready to consume. On subsequent update, we leverag

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
hililiwei commented on code in PR #6637: URL: https://github.com/apache/iceberg/pull/6637#discussion_r1095442150 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -73,14 +73,22 @@ statement | ALTER T

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
hililiwei commented on code in PR #6637: URL: https://github.com/apache/iceberg/pull/6637#discussion_r1095440300 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala: ## @@ -206,8 +206,9 @@ class Iceb

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
hililiwei commented on code in PR #6637: URL: https://github.com/apache/iceberg/pull/6637#discussion_r1095439780 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -92,14 +94,17 @@ class Icebe

[GitHub] [iceberg] ajantha-bhat commented on pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-02-02 Thread via GitHub
ajantha-bhat commented on PR #6695: URL: https://github.com/apache/iceberg/pull/6695#issuecomment-1415230512 @aokolnychyi: back ported PR details. 3.2: https://github.com/apache/iceberg/pull/6730 3.1: https://github.com/apache/iceberg/pull/6731 2.4: https://github.com/apache/iceberg

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6733: Spark-2.4: Handle no-op for rewrite manifests action

2023-02-02 Thread via GitHub
ajantha-bhat commented on code in PR #6733: URL: https://github.com/apache/iceberg/pull/6733#discussion_r1095435003 ## spark/v2.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -478,6 +492,33 @@ public void testRewriteManifestsWithPre

[GitHub] [iceberg] ajantha-bhat opened a new pull request, #6733: Spark-2.4: Handle no-op for rewrite manifests action

2023-02-02 Thread via GitHub
ajantha-bhat opened a new pull request, #6733: URL: https://github.com/apache/iceberg/pull/6733 backport of https://github.com/apache/iceberg/pull/6695 Note that spark-2.4 doesn't have call procedures. Hence, only two file changes. Also, Added a No-op testcase newly for the spark

[GitHub] [iceberg] hililiwei opened a new pull request, #6732: Spark 3.1: Relocate all Netty classes

2023-02-02 Thread via GitHub
hililiwei opened a new pull request, #6732: URL: https://github.com/apache/iceberg/pull/6732 Spark 3.1: Relocate all Netty classes #6107 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6571: Data: java api add GenericTaskWriter and add write demo to Doc.

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6571: URL: https://github.com/apache/iceberg/pull/6571#discussion_r1095384483 ## docs/java-api.md: ## @@ -147,6 +147,69 @@ t.newAppend().appendFile(data).commit(); t.commitTransaction(); ``` +### WriteData Review Comment: WriteDa

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6728: Spark: Fix isIcebergCommand check for replace branch

2023-02-02 Thread via GitHub
hililiwei commented on code in PR #6728: URL: https://github.com/apache/iceberg/pull/6728#discussion_r1095393403 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala: ## @@ -206,9 +206,10 @@ class Ice

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
jackye1995 commented on code in PR #6637: URL: https://github.com/apache/iceberg/pull/6637#discussion_r1095391409 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -92,14 +94,17 @@ class Iceb

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
jackye1995 commented on code in PR #6637: URL: https://github.com/apache/iceberg/pull/6637#discussion_r1095386173 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -92,14 +94,17 @@ class Iceb

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
jackye1995 commented on code in PR #6637: URL: https://github.com/apache/iceberg/pull/6637#discussion_r1095385718 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala: ## @@ -206,8 +206,9 @@ class Ice

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
jackye1995 commented on code in PR #6637: URL: https://github.com/apache/iceberg/pull/6637#discussion_r1095385434 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -73,14 +73,22 @@ statement | ALTER

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
jackye1995 commented on code in PR #6637: URL: https://github.com/apache/iceberg/pull/6637#discussion_r1095385058 ## spark/v3.3/spark-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4: ## @@ -73,14 +73,22 @@ statement | ALTER

[GitHub] [iceberg] ajantha-bhat opened a new pull request, #6731: Spark-3.1: Handle no-op for rewrite manifests procedure/action

2023-02-02 Thread via GitHub
ajantha-bhat opened a new pull request, #6731: URL: https://github.com/apache/iceberg/pull/6731 backport of https://github.com/apache/iceberg/pull/6695 (no conflicts)

[GitHub] [iceberg] ajantha-bhat opened a new pull request, #6730: Spark-3.2: Handle no-op for rewrite manifests procedure/action

2023-02-02 Thread via GitHub
ajantha-bhat opened a new pull request, #6730: URL: https://github.com/apache/iceberg/pull/6730 clean backport of https://github.com/apache/iceberg/pull/6695

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-02 Thread via GitHub
jackye1995 commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1095368618 ## core/src/main/java/org/apache/iceberg/io/ResolvingFileIO.java: ## @@ -168,6 +168,10 @@ private static String implFromLocation(String location) { return SCHEM

[GitHub] [iceberg] amogh-jahagirdar closed pull request #6728: Spark: Fix isIcebergCommand check for replace branch

2023-02-02 Thread via GitHub
amogh-jahagirdar closed pull request #6728: Spark: Fix isIcebergCommand check for replace branch URL: https://github.com/apache/iceberg/pull/6728

[GitHub] [iceberg] RussellSpitzer commented on issue #6725: How to detect if the partition's data is ready to consume

2023-02-02 Thread via GitHub
RussellSpitzer commented on issue #6725: URL: https://github.com/apache/iceberg/issues/6725#issuecomment-1414972569 That doesn't make sense to me. Would we add success on just the first time? What happens on subsequent modifications? It's way simpler to just use the Java api(if you do

[GitHub] [iceberg] tomtongue opened a new pull request, #6729: Docs: Add the description of creating a table using DataFrameWriterV2 with a table location

2023-02-02 Thread via GitHub
tomtongue opened a new pull request, #6729: URL: https://github.com/apache/iceberg/pull/6729 This change adds how to run CTAS using DataFrame V2 API with specifying a table location. As a background, some iceberg users ask how to create an iceberg table by DataFrame V2 with specifying a tab

[GitHub] [iceberg] 0xffmeta commented on issue #6725: How to detect if the partition's data is ready to consume

2023-02-02 Thread via GitHub
0xffmeta commented on issue #6725: URL: https://github.com/apache/iceberg/issues/6725#issuecomment-1414874764 Do you think we can add one configuration to allow iceberg output the `_SUCCESS` file to external path? For most of the use cases in our company, we need to listen to the sensor fil

[GitHub] [iceberg] kmozaid commented on pull request #6410: Configurable metrics reporter by catalog properties

2023-02-02 Thread via GitHub
kmozaid commented on PR #6410: URL: https://github.com/apache/iceberg/pull/6410#issuecomment-1414849197 @gaborkaszab @szehon-ho Could you please trigger workflows?

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-02 Thread via GitHub
singhpk234 commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1095306379 ## core/src/main/java/org/apache/iceberg/hadoop/Util.java: ## @@ -84,10 +88,38 @@ public static String[] blockLocations(FileIO io, ScanTaskGroup taskGroup) { r

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-02 Thread via GitHub
singhpk234 commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1095304010 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java: ## @@ -67,14 +62,13 @@ public boolean caseSensitive() { } public boolean local

[GitHub] [iceberg] amogh-jahagirdar opened a new pull request, #6728: Spark: Fix isIcebergCommand check for replace branch

2023-02-02 Thread via GitHub
amogh-jahagirdar opened a new pull request, #6728: URL: https://github.com/apache/iceberg/pull/6728 @hililiwei @jackye1995 @flyrain @yyanyy When working on the drop branch implementation, I noticed that the "or" condition for replace branch wasn't placed right in https://github.com/apache

[GitHub] [iceberg] lurnagao commented on issue #3127: iceberg HiveCatalog insert exception of GSS initiate failed

2023-02-02 Thread via GitHub
lurnagao commented on issue #3127: URL: https://github.com/apache/iceberg/issues/3127#issuecomment-1414694746 the same problem by using hivecli(2.3.7) + mr + insert into iceberg_table(0.13.2)

[GitHub] [iceberg] ajantha-bhat commented on pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-02-02 Thread via GitHub
ajantha-bhat commented on PR #6695: URL: https://github.com/apache/iceberg/pull/6695#issuecomment-1414692489 > Thanks, @ajantha-bhat! I merged this. Would you mind following up with cherry-picks to other versions? Thanks for merging. Today I will work on backporting this PR to other s

[GitHub] [iceberg] jackye1995 commented on pull request #5029: Flink: Use Tag or Branch to scan data.

2023-02-02 Thread via GitHub
jackye1995 commented on PR #5029: URL: https://github.com/apache/iceberg/pull/5029#issuecomment-1414586011 @stevenzwu since you are reviewing #6660, could you also take a look at this?

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #5029: Flink: Use Tag or Branch to scan data.

2023-02-02 Thread via GitHub
jackye1995 commented on code in PR #5029: URL: https://github.com/apache/iceberg/pull/5029#discussion_r1095249353 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/StreamingMonitorFunction.java: ## @@ -124,11 +126,33 @@ public void initializeState(FunctionInitia

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #5029: Flink: Use Tag or Branch to scan data.

2023-02-02 Thread via GitHub
jackye1995 commented on code in PR #5029: URL: https://github.com/apache/iceberg/pull/5029#discussion_r1095248368 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/FlinkSplitPlanner.java: ## @@ -86,10 +86,18 @@ static CloseableIterable planTasks( Incremen

[GitHub] [iceberg] jackye1995 commented on pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
jackye1995 commented on PR #6637: URL: https://github.com/apache/iceberg/pull/6637#issuecomment-1414576132 > could this PR encapsulate create/replace? +1 Let me know when this is updated, I will take another look!

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095234465 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -80,9 +84,16 @@ public interface DeleteOrphanFiles extends Action> deleteFunc) { Re

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095236241 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -67,7 +67,11 @@ public interface DeleteOrphanFiles extends Action

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095231341 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -67,7 +67,11 @@ public interface DeleteOrphanFiles extends Action For example, if I h

[GitHub] [iceberg] aokolnychyi commented on pull request #6700: Snapshot ref type public

2023-02-02 Thread via GitHub
aokolnychyi commented on PR #6700: URL: https://github.com/apache/iceberg/pull/6700#issuecomment-1414548715 Sounds good, @snazy. Would you mind closing this one and re-opening if needed? Trying to reduce the number of open PRs against our repo.

[GitHub] [iceberg] aokolnychyi commented on pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-02-02 Thread via GitHub
aokolnychyi commented on PR #6695: URL: https://github.com/apache/iceberg/pull/6695#issuecomment-1414546166 Thanks, @ajantha-bhat! I merged this. Would you mind following up with cherry-picks to other versions?

[GitHub] [iceberg] aokolnychyi merged pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-02-02 Thread via GitHub
aokolnychyi merged PR #6695: URL: https://github.com/apache/iceberg/pull/6695

[GitHub] [iceberg] aokolnychyi commented on pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-02-02 Thread via GitHub
aokolnychyi commented on PR #6695: URL: https://github.com/apache/iceberg/pull/6695#issuecomment-1414545406 My bad, I overlooked the condition, @ajantha-bhat!

[GitHub] [iceberg] github-actions[bot] closed issue #4607: [Docs] Create an item list for re-organizing docs to the proposed layout

2023-02-02 Thread via GitHub
github-actions[bot] closed issue #4607: [Docs] Create an item list for re-organizing docs to the proposed layout URL: https://github.com/apache/iceberg/issues/4607

[GitHub] [iceberg] github-actions[bot] closed issue #5163: Support catalog method to set table metadata

2023-02-02 Thread via GitHub
github-actions[bot] closed issue #5163: Support catalog method to set table metadata URL: https://github.com/apache/iceberg/issues/5163

[GitHub] [iceberg] github-actions[bot] commented on issue #4607: [Docs] Create an item list for re-organizing docs to the proposed layout

2023-02-02 Thread via GitHub
github-actions[bot] commented on issue #4607: URL: https://github.com/apache/iceberg/issues/4607#issuecomment-1414537066 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] github-actions[bot] commented on issue #5163: Support catalog method to set table metadata

2023-02-02 Thread via GitHub
github-actions[bot] commented on issue #5163: URL: https://github.com/apache/iceberg/issues/5163#issuecomment-1414537028 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on PR #6638: URL: https://github.com/apache/iceberg/pull/6638#issuecomment-1414496739 Thanks for the reviews @flyrain @jackye1995 @yyanyy @hililiwei!

[GitHub] [iceberg] jackye1995 merged pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
jackye1995 merged PR #6638: URL: https://github.com/apache/iceberg/pull/6638

[GitHub] [iceberg] jackye1995 commented on pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
jackye1995 commented on PR #6638: URL: https://github.com/apache/iceberg/pull/6638#issuecomment-1414492620 Looks like we have enough votes and all comments are addressed. I will go ahead to merge this, and we can address further comments in subsequent PRs like #6637 Thanks @amogh-ja

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1095169075 ## core/src/main/java/org/apache/iceberg/hadoop/Util.java: ## @@ -84,10 +88,38 @@ public static String[] blockLocations(FileIO io, ScanTaskGroup taskGroup) {

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on PR #6637: URL: https://github.com/apache/iceberg/pull/6637#issuecomment-1414473094 @hililiwei similar to https://github.com/apache/iceberg/pull/6638 could this PR encapsulate create/replace? We came to the conclusion on the replace PR it made more sense to just

[GitHub] [iceberg] RussellSpitzer commented on issue #2040: Partial data ingestion to Iceberg in failing with Spark 3.0.x

2023-02-02 Thread via GitHub
RussellSpitzer commented on issue #2040: URL: https://github.com/apache/iceberg/issues/2040#issuecomment-1414466758 Both Spark and Iceberg have their own checks to determine whether an input schema is valid for writing to a given table. The Spark checks are first and require that all of the

[GitHub] [iceberg] haydenflinner commented on issue #2040: Partial data ingestion to Iceberg in failing with Spark 3.0.x

2023-02-02 Thread via GitHub
haydenflinner commented on issue #2040: URL: https://github.com/apache/iceberg/issues/2040#issuecomment-1414446899 Same thing here, happening whether I use INSERT INTO or the dataframe API. How annoying. Is there really no solution besides messing with the dataframe schema to ensure it has

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on PR #6638: URL: https://github.com/apache/iceberg/pull/6638#issuecomment-1414441586 Thanks for the review @flyrain really appreciate it! So there are a few operations: 1.) replaceBranch (this PR) -> Replace branch will change the snapshot that

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1095134625 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestReplaceBranch.java: ## @@ -0,0 +1,273 @@ +/* + * Licensed to the Apache So

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1095133181 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateOrReplaceBranchExec.scala: ## @@ -0,0 +1,82 @@ +/* + * Licens

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1095128415 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateOrReplaceBranchExec.scala: ## @@ -0,0 +1,82 @@ +/* + * Licens

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on PR #6648: URL: https://github.com/apache/iceberg/pull/6648#issuecomment-1414426100 Thanks for the detailed explanations @pvary I agree it does seem difficult to reconcile the two abstractions at this point. The only thing on my side is can we confirm if all the

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1095121483 ## hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreLock.java: ## @@ -0,0 +1,540 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1095120594 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveLock.java: ## @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1095119030 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveLock.java: ## @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-02 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1094108930 ## orc/src/main/java/org/apache/iceberg/orc/OrcIterable.java: ## @@ -84,15 +91,18 @@ public CloseableIterator iterator() { addCloseable(orcFileReader); Ty

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1095044926 ## api/src/main/java/org/apache/iceberg/expressions/BoundAggregate.java: ## @@ -44,4 +57,85 @@ public Type type() { return term().type(); } } + + publ

[GitHub] [iceberg] flyrain commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
flyrain commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1095013576 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateOrReplaceBranchExec.scala: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095013112 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/BaseSparkAction.java: ## @@ -253,6 +257,39 @@ protected DeleteSummary deleteFiles( return

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095010538 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/BaseSparkAction.java: ## @@ -85,6 +88,7 @@ private static final Logger LOG = LoggerFactory

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095009182 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -80,9 +84,16 @@ public interface DeleteOrphanFiles extends Action

[GitHub] [iceberg] flyrain commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
flyrain commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1095008197 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestReplaceBranch.java: ## @@ -0,0 +1,273 @@ +/* + * Licensed to the Apache Software Fo

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095006673 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -80,9 +84,16 @@ public interface DeleteOrphanFiles extends Action> deleteFunc) {

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095005897 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -67,7 +67,11 @@ public interface DeleteOrphanFiles extends Action

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1094981997 ## core/src/main/java/org/apache/iceberg/MetadataTable.java: ## @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more cont

[GitHub] [iceberg] snazy closed issue #6727: REST-Catalog: CreateTableRequest.stageCreate can be removed

2023-02-02 Thread via GitHub
snazy closed issue #6727: REST-Catalog: CreateTableRequest.stageCreate can be removed URL: https://github.com/apache/iceberg/issues/6727 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [iceberg] snazy commented on issue #6727: REST-Catalog: CreateTableRequest.stageCreate can be removed

2023-02-02 Thread via GitHub
snazy commented on issue #6727: URL: https://github.com/apache/iceberg/issues/6727#issuecomment-1414242918 (Sorry, my bad, seems that _both_ can happen.) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1094892920 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -75,16 +75,15 @@ public Schema schema() { return schema; } - private Schema cal

[GitHub] [iceberg] snazy commented on pull request #6700: Snapshot ref type public

2023-02-02 Thread via GitHub
snazy commented on PR #6700: URL: https://github.com/apache/iceberg/pull/6700#issuecomment-1414203692 I needed this, while I was trying (without success for many reasons) to generate JAX-RS code from the spec. Such an approach would need this change. I'm okay to leave it as it is. Mos

[GitHub] [iceberg] snazy commented on pull request #6701: Add missing `last-column-id` to spec

2023-02-02 Thread via GitHub
snazy commented on PR #6701: URL: https://github.com/apache/iceberg/pull/6701#issuecomment-1414201166 (Side note: I no longer use the rest-spec in my experiments - for various reasons - and use the JSON serialization from `iceberg-core`.) But IMO the spec should exactly reflect th

[GitHub] [iceberg] snazy opened a new issue, #6727: REST-Catalog: CreateTableRequest.stageCreate can be removed

2023-02-02 Thread via GitHub
snazy opened a new issue, #6727: URL: https://github.com/apache/iceberg/issues/6727 ### Apache Iceberg version None ### Query engine None ### Please describe the bug 🐞 The current REST client / OpenAPI spec defines the attribute `stageCreate`. The RES

[GitHub] [iceberg] rdblue commented on pull request #6720: Python: Publish the docs by hand

2023-02-02 Thread via GitHub
rdblue commented on PR #6720: URL: https://github.com/apache/iceberg/pull/6720#issuecomment-1414172576 Sounds reasonable to me. We could also version the docs eventually. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [iceberg] rdblue merged pull request #6720: Python: Publish the docs by hand

2023-02-02 Thread via GitHub
rdblue merged PR #6720: URL: https://github.com/apache/iceberg/pull/6720 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apac

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-02 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1094106234 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetMetricsRowGroupFilter.java: ## @@ -50,15 +51,22 @@ public class ParquetMetricsRowGroupFilter { private

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-02 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1093922634 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/PositionDeleteRowReader.java: ## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] [iceberg] dimas-b commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-02 Thread via GitHub
dimas-b commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1094829567 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNamespace.java: ## @@ -73,6 +77,48 @@ public void testListNamespaces() { Assertions.assertThat(namespaces).is

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094817232 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094814986 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094803125 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] RussellSpitzer commented on issue #6725: How to detect if the partition's data is ready to consume

2023-02-02 Thread via GitHub
RussellSpitzer commented on issue #6725: URL: https://github.com/apache/iceberg/issues/6725#issuecomment-1414053888 There is no such thing in Iceberg. If there is data in a partition, the commit has succeeded. So if for example you do a Spark Query and there is data in the partition, that mea

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094803125 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] jedrek-VL commented on issue #6713: PyIceberg fails when querying REST catalog

2023-02-02 Thread via GitHub
jedrek-VL commented on issue #6713: URL: https://github.com/apache/iceberg/issues/6713#issuecomment-1414051660 Right. I saw the new docker images and that's how I found out what I was missing :) -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094799318 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094798295 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094796934 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] Fokko merged pull request #6653: API: Fix Transform backward compatibility in PartitionSpec

2023-02-02 Thread via GitHub
Fokko merged PR #6653: URL: https://github.com/apache/iceberg/pull/6653 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] Fokko commented on pull request #6482: API: Fix inconsistent TimeTransform Type

2023-02-02 Thread via GitHub
Fokko commented on PR #6482: URL: https://github.com/apache/iceberg/pull/6482#issuecomment-1414044966 Thanks for letting me know and creating the PR in the first place, much appreciated 👍🏻 -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [iceberg] Fokko closed pull request #6482: API: Fix inconsistent TimeTransform Type

2023-02-02 Thread via GitHub
Fokko closed pull request #6482: API: Fix inconsistent TimeTransform Type URL: https://github.com/apache/iceberg/pull/6482 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un
