[GitHub] [iceberg] theoxu31 opened a new pull request, #6742: support registerTable in GlueCatalog

2023-02-03 Thread via GitHub
theoxu31 opened a new pull request, #6742: URL: https://github.com/apache/iceberg/pull/6742 Add support for registerTable in GlueCatalog. Customizations: - allowing GlueDataCatalog registerTable API for existing Table - remove the commit part in registerTable API to avoid creating new

[GitHub] [iceberg] aokolnychyi commented on pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-03 Thread via GitHub
aokolnychyi commented on PR #6655: URL: https://github.com/apache/iceberg/pull/6655#issuecomment-1416560469 @singhpk234, oh, now I got what you meant. The new behavior seems reasonable to me for two reasons: - It is only triggered if someone explicitly passes an option to enable locality

[GitHub] [iceberg] jackieo168 commented on a diff in pull request #6717: spark 3.3 read by snapshot ref schema

2023-02-03 Thread via GitHub
jackieo168 commented on code in PR #6717: URL: https://github.com/apache/iceberg/pull/6717#discussion_r1096420864 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkCatalog.java: ## @@ -105,6 +108,7 @@ public class SparkCatalog extends BaseCatalog { private stati

[GitHub] [iceberg] singhpk234 commented on pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-03 Thread via GitHub
singhpk234 commented on PR #6655: URL: https://github.com/apache/iceberg/pull/6655#issuecomment-1416573251 > Do you think that's reasonable, @singhpk234? +1, makes sense to me. Thanks @aokolnychyi ! -- This is an automated message from the Apache Git Service. To respond to the messa

[GitHub] [iceberg] jackye1995 commented on pull request #6742: support registerTable in GlueCatalog

2023-02-03 Thread via GitHub
jackye1995 commented on PR #6742: URL: https://github.com/apache/iceberg/pull/6742#issuecomment-1416573278 Thanks for picking this up from me, ping a few people for review @amogh-jahagirdar @singhpk234 @rajarshisarkar @aajisaka @JonasJ-ap

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6742: support registerTable in GlueCatalog

2023-02-03 Thread via GitHub
jackye1995 commented on code in PR #6742: URL: https://github.com/apache/iceberg/pull/6742#discussion_r1096433760 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -437,6 +439,44 @@ public void renameTable(TableIdentifier from, TableIdentifier to) {

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6742: support registerTable in GlueCatalog

2023-02-03 Thread via GitHub
jackye1995 commented on code in PR #6742: URL: https://github.com/apache/iceberg/pull/6742#discussion_r1096434383 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -437,6 +439,44 @@ public void renameTable(TableIdentifier from, TableIdentifier to) {

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6742: support registerTable in GlueCatalog

2023-02-03 Thread via GitHub
jackye1995 commented on code in PR #6742: URL: https://github.com/apache/iceberg/pull/6742#discussion_r1096435346 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -437,6 +439,44 @@ public void renameTable(TableIdentifier from, TableIdentifier to) {

[GitHub] [iceberg] JunchengMa commented on issue #6741: Support write distribution mode as a Spark SqlConf option in Iceberg

2023-02-03 Thread via GitHub
JunchengMa commented on issue #6741: URL: https://github.com/apache/iceberg/issues/6741#issuecomment-1416594916 Thanks @dramaticlly for opening this feature request In addition to the GDPR deletion use case, the column redaction use case would also benefit from setting `write.update.distr

[GitHub] [iceberg] JunchengMa commented on issue #6679: Change Default Write Distribution Mode

2023-02-03 Thread via GitHub
JunchengMa commented on issue #6679: URL: https://github.com/apache/iceberg/issues/6679#issuecomment-1416599385 > +1 on @dramaticlly 's comment, changing the write distribution mode affects Spark job performance (causes heavy shuffle) when using Spark SQL like ``` DELETE FROM d

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-03 Thread via GitHub
aokolnychyi commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096417802 ## api/src/main/java/org/apache/iceberg/expressions/AggregateEvaluator.java: ## @@ -0,0 +1,118 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [iceberg] aokolnychyi merged pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-03 Thread via GitHub
aokolnychyi merged PR #6655: URL: https://github.com/apache/iceberg/pull/6655

[GitHub] [iceberg] aokolnychyi commented on pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-03 Thread via GitHub
aokolnychyi commented on PR #6655: URL: https://github.com/apache/iceberg/pull/6655#issuecomment-1416609858 Thanks for this change, @singhpk234! Thanks for reviewing, @jackye1995 @amogh-jahagirdar! @singhpk234, would you be interested to cherry-pick this change to other query engine

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-03 Thread via GitHub
aokolnychyi commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096460758 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkAggregates.java: ## @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [iceberg] aokolnychyi commented on issue #2221: Spark: Extend expire_snapshots procedure with an optional arg for snapshot ids

2023-02-03 Thread via GitHub
aokolnychyi commented on issue #2221: URL: https://github.com/apache/iceberg/issues/2221#issuecomment-1416622132 @Neuw84, I think we switched the approach used in the table API to leverage a reachability set. I assume it should be safe.

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-03 Thread via GitHub
aokolnychyi commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096462757 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,150 @@ public Filter[] pushedFilters() { return push

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-03 Thread via GitHub
aokolnychyi commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096463211 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,150 @@ public Filter[] pushedFilters() { return push

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-03 Thread via GitHub
aokolnychyi commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096469739 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return push

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-03 Thread via GitHub
aokolnychyi commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096470135 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,150 @@ public Filter[] pushedFilters() { return push

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-03 Thread via GitHub
aokolnychyi commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096470489 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestAggregatePushDown.java: ## @@ -0,0 +1,467 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] [iceberg] aokolnychyi commented on pull request #6622: push down min/max/count to iceberg

2023-02-03 Thread via GitHub
aokolnychyi commented on PR #6622: URL: https://github.com/apache/iceberg/pull/6622#issuecomment-1416642361 Great work, @huaxingao! I am looking forward to this being merged.

[GitHub] [iceberg] aokolnychyi commented on pull request #6651: Spark 3.3 write to branch snapshot

2023-02-03 Thread via GitHub
aokolnychyi commented on PR #6651: URL: https://github.com/apache/iceberg/pull/6651#issuecomment-1416642954 I'd be interested to review this one on Monday.

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-03 Thread via GitHub
aokolnychyi commented on code in PR #6637: URL: https://github.com/apache/iceberg/pull/6637#discussion_r1096471169 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsAstBuilder.scala: ## @@ -128,6 +133,36 @@ class Ic

[GitHub] [iceberg] aokolnychyi commented on pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-03 Thread via GitHub
aokolnychyi commented on PR #6637: URL: https://github.com/apache/iceberg/pull/6637#issuecomment-1416643854 cc @RussellSpitzer @szehon-ho @flyrain @karuppayya

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-03 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096473431 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on pull request #6622: push down min/max/count to iceberg

2023-02-03 Thread via GitHub
huaxingao commented on PR #6622: URL: https://github.com/apache/iceberg/pull/6622#issuecomment-1416647671 @aokolnychyi @rdblue Thank you very much for your review! I have addressed most of the comments. Will finish the rest at a later time.

[GitHub] [iceberg] ajantha-bhat commented on pull request #6742: support registerTable in GlueCatalog

2023-02-03 Thread via GitHub
ajantha-bhat commented on PR #6742: URL: https://github.com/apache/iceberg/pull/6742#issuecomment-1416679180 > remove the commit part in registerTable API to avoid creating new metadata file https://github.com/apache/iceberg/pull/6591 already fixed this part for Glue too right?

[GitHub] [iceberg] singhpk234 opened a new pull request, #6743: Flink: Backport handling ResolvingFileIO in determining locality - PR 6655

2023-02-04 Thread via GitHub
singhpk234 opened a new pull request, #6743: URL: https://github.com/apache/iceberg/pull/6743 Backports https://github.com/apache/iceberg/pull/6655 to Flink 1.14, 1.15

[GitHub] [iceberg] jackye1995 commented on pull request #6742: support registerTable in GlueCatalog

2023-02-04 Thread via GitHub
jackye1995 commented on PR #6742: URL: https://github.com/apache/iceberg/pull/6742#issuecomment-1416693030 > https://github.com/apache/iceberg/pull/6591 already fixed this part for Glue too right? great, did not notice that one! In that case I think the only missing feature is

[GitHub] [iceberg] singhpk234 opened a new pull request, #6744: Spark: Backport handling ResolvingFileIO in determining locality - PR-6655

2023-02-04 Thread via GitHub
singhpk234 opened a new pull request, #6744: URL: https://github.com/apache/iceberg/pull/6744 Backports https://github.com/apache/iceberg/pull/6655 to Spark 2.4, 3.1, 3.2

[GitHub] [iceberg] singhpk234 commented on pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-04 Thread via GitHub
singhpk234 commented on PR #6655: URL: https://github.com/apache/iceberg/pull/6655#issuecomment-1416694194 Thanks @jackye1995 @aokolnychyi @amogh-jahagirdar for the reviews ! > @singhpk234, would you be interested to cherry-pick this change to other query engine versions? sure

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-04 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1096506568 ## core/src/main/java/org/apache/iceberg/MetadataTable.java: ## @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contri

[GitHub] [iceberg] ajantha-bhat commented on pull request #6742: support registerTable in GlueCatalog

2023-02-04 Thread via GitHub
ajantha-bhat commented on PR #6742: URL: https://github.com/apache/iceberg/pull/6742#issuecomment-1416714745 - allowing GlueDataCatalog registerTable API for existing Table There is also related ongoing work https://github.com/apache/iceberg/pull/5327

[GitHub] [iceberg] Fokko commented on issue #6475: Python: Improve PyArrow performance

2023-02-04 Thread via GitHub
Fokko commented on issue #6475: URL: https://github.com/apache/iceberg/issues/6475#issuecomment-1416747817 https://github.com/apache/arrow/pull/34015 has just been merged. This will reduce the IO overhead by removing an unnecessary call.

[GitHub] [iceberg] jackye1995 commented on pull request #6742: support registerTable in GlueCatalog

2023-02-04 Thread via GitHub
jackye1995 commented on PR #6742: URL: https://github.com/apache/iceberg/pull/6742#issuecomment-1416812241 Nice, I anticipated similar concerns as in that thread, that's why I'd like to just put it up and see how the community reacts to this. I think the conversation there was around

[GitHub] [iceberg] srilman opened a new pull request, #6745: Python: Use Version Ranges for Various Dependencies

2023-02-04 Thread via GitHub
srilman opened a new pull request, #6745: URL: https://github.com/apache/iceberg/pull/6745 As discussed in #6620, this PR uses version ranges for some dependencies. This will unfix the versions for certain dependencies and allow users to use older versions when installing PyIceberg to use as a

[GitHub] [iceberg] JonasJ-ap opened a new pull request, #6746: AWS: Load HttpClientBuilder dynamically to avoid runtime deps of both urlconnection and apache client

2023-02-04 Thread via GitHub
JonasJ-ap opened a new pull request, #6746: URL: https://github.com/apache/iceberg/pull/6746 ## Problem Addressed This PR fixes the problem described in issue #6715 by using reflection to instantiate the httpclient configuration impl class to avoid runtime deps of both `url-connection-clie

[GitHub] [iceberg] github-actions[bot] commented on issue #5453: Issue after migrating to Spark 3.3.0 and Iceberg 14.0

2023-02-04 Thread via GitHub
github-actions[bot] commented on issue #5453: URL: https://github.com/apache/iceberg/issues/5453#issuecomment-1416882230 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6746: AWS: Load HttpClientBuilder dynamically to avoid runtime deps of both urlconnection and apache client

2023-02-04 Thread via GitHub
jackye1995 commented on code in PR #6746: URL: https://github.com/apache/iceberg/pull/6746#discussion_r1096607340 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -366,36 +362,48 @@ public class AwsProperties implements Serializable { */ public static

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6746: AWS: Load HttpClientBuilder dynamically to avoid runtime deps of both urlconnection and apache client

2023-02-04 Thread via GitHub
jackye1995 commented on code in PR #6746: URL: https://github.com/apache/iceberg/pull/6746#discussion_r1096607921 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -366,36 +362,48 @@ public class AwsProperties implements Serializable { */ public static

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #6746: AWS: Load HttpClientBuilder dynamically to avoid runtime deps of both urlconnection and apache client

2023-02-04 Thread via GitHub
jackye1995 commented on code in PR #6746: URL: https://github.com/apache/iceberg/pull/6746#discussion_r1096608183 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -1314,55 +1294,27 @@ private void configureEndpoint(T builder, String en } } - @Vi

[GitHub] [iceberg] yabola commented on pull request #6742: support registerTable in GlueCatalog

2023-02-04 Thread via GitHub
yabola commented on PR #6742: URL: https://github.com/apache/iceberg/pull/6742#issuecomment-1416903235 @jackye1995 Thanks for pinging me. I agree with you: there is no requirement in a recovery use case, but this can be a requirement in automatic switch of metadata location. But I am not

[GitHub] [iceberg] dependabot[bot] opened a new pull request, #6747: Build: Bump moto from 4.1.0 to 4.1.2 in /python

2023-02-04 Thread via GitHub
dependabot[bot] opened a new pull request, #6747: URL: https://github.com/apache/iceberg/pull/6747 Bumps [moto](https://github.com/getmoto/moto) from 4.1.0 to 4.1.2. Changelog Sourced from moto's changelog (https://github.com/getmoto/moto/blob/master/CHANGELOG.md). 4.1.2

[GitHub] [iceberg] dependabot[bot] opened a new pull request, #6748: Build: Bump pre-commit from 3.0.1 to 3.0.4 in /python

2023-02-04 Thread via GitHub
dependabot[bot] opened a new pull request, #6748: URL: https://github.com/apache/iceberg/pull/6748 Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.0.1 to 3.0.4. Release notes Sourced from pre-commit's releases (https://github.com/pre-commit/pre-commit/releases)

[GitHub] [iceberg] Fokko merged pull request #6748: Build: Bump pre-commit from 3.0.1 to 3.0.4 in /python

2023-02-04 Thread via GitHub
Fokko merged PR #6748: URL: https://github.com/apache/iceberg/pull/6748

[GitHub] [iceberg] danielcweeks opened a new pull request, #6749: Prevent RESTCatalog AuthSession from expiring

2023-02-04 Thread via GitHub
danielcweeks opened a new pull request, #6749: URL: https://github.com/apache/iceberg/pull/6749 When using OAuth with RESTCatalog, the catalog's auth session is returned by `newSession` if credentials are not provided. This results in the main auth session being cached and eventually expir

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096626306 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,130 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096626413 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096626529 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkReadOptions.java: ## @@ -90,4 +90,6 @@ private SparkReadOptions() {} public static final String VER

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096626563 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -47,4 +47,8 @@ private SparkSQLProperties() {} public static final String

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096626615 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkLocalScan.java: ## @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096626648 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096626715 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096626750 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627039 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627078 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627129 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627160 ## api/src/main/java/org/apache/iceberg/expressions/AggregateEvaluator.java: ## @@ -0,0 +1,118 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627231 ## api/src/main/java/org/apache/iceberg/expressions/AggregateEvaluator.java: ## @@ -0,0 +1,118 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627249 ## api/src/main/java/org/apache/iceberg/expressions/AggregateEvaluator.java: ## @@ -0,0 +1,118 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627266 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java: ## @@ -243,4 +243,15 @@ public boolean preserveDataGrouping() { .defaultValue(

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627279 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java: ## @@ -243,4 +243,15 @@ public boolean preserveDataGrouping() { .defaultValue(

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627292 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,150 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627307 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,150 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627360 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,150 @@ public Filter[] pushedFilters() { return pushed

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096627383 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestAggregatePushDown.java: ## @@ -0,0 +1,467 @@ +/* + * Licensed to the Apache Software Foundation (AS

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-04 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1096628616 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/spark/sql/TestAggregatePushDown.java: ## @@ -459,9 +492,157 @@ public void testAggregatePushDownForTimeTravel() {

[GitHub] [iceberg] Fokko merged pull request #6747: Build: Bump moto from 4.1.0 to 4.1.2 in /python

2023-02-04 Thread via GitHub
Fokko merged PR #6747: URL: https://github.com/apache/iceberg/pull/6747

[GitHub] [iceberg] danielcweeks merged pull request #6674: Add support for special characters in snowflake identifiers for Snowflake Catalog

2023-02-04 Thread via GitHub
danielcweeks merged PR #6674: URL: https://github.com/apache/iceberg/pull/6674

[GitHub] [iceberg] nastra commented on pull request #6410: Configurable metrics reporter by catalog properties

2023-02-05 Thread via GitHub
nastra commented on PR #6410: URL: https://github.com/apache/iceberg/pull/6410#issuecomment-1417157234 @kmozaid could you rebase the PR please? @danielcweeks could you review this please when you get a chance (and also trigger CI)?

[GitHub] [iceberg] nastra commented on a diff in pull request #6740: Add application identifier for Snowflake JDBC driver

2023-02-05 Thread via GitHub
nastra commented on code in PR #6740: URL: https://github.com/apache/iceberg/pull/6740#discussion_r109180 ## snowflake/src/main/java/org/apache/iceberg/snowflake/SnowflakeCatalog.java: ## @@ -109,6 +110,10 @@ public void initialize(String name, Map properties) {

[GitHub] [iceberg] nastra commented on pull request #6696: Build: Bump Arrow from 10.0.1 to 11.0.0

2023-02-05 Thread via GitHub
nastra commented on PR #6696: URL: https://github.com/apache/iceberg/pull/6696#issuecomment-1417512864 @ajantha-bhat could you please copy/paste the actual benchmark numbers from the linked benchmark runs to have an easier comparison of the numbers here? -- This is an automated message fr

[GitHub] [iceberg] ajantha-bhat commented on pull request #6742: support registerTable in GlueCatalog

2023-02-05 Thread via GitHub
ajantha-bhat commented on PR #6742: URL: https://github.com/apache/iceberg/pull/6742#issuecomment-1418182510 > I think it has benefit in OSS, as people naturally think of using registerTable when talking about this use case, and as we have so many catalog offerings, it's worth supporting cr

[GitHub] [iceberg] ajantha-bhat commented on pull request #6090: Core: Handle statistics file clean up from expireSnapshots

2023-02-05 Thread via GitHub
ajantha-bhat commented on PR #6090: URL: https://github.com/apache/iceberg/pull/6090#issuecomment-1418186286 If the changes are ok, please merge this PR. So that I can rebase #6091 and make it ready for review. -- This is an automated message from the Apache Git Service. To respond to th

[GitHub] [iceberg] jackye1995 commented on pull request #6742: support registerTable in GlueCatalog

2023-02-05 Thread via GitHub
jackye1995 commented on PR #6742: URL: https://github.com/apache/iceberg/pull/6742#issuecomment-1418207531 I briefly read the project you linked, that's very cool CLI! But we are not really trying to build a new migration project out of it, the ask is much simpler. What I want to ge

[GitHub] [iceberg] github-actions[bot] commented on issue #5461: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

2023-02-05 Thread via GitHub
github-actions[bot] commented on issue #5461: URL: https://github.com/apache/iceberg/issues/5461#issuecomment-1418308455 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] github-actions[bot] commented on issue #5477: Docs: Add build of particular compute engine versions

2023-02-05 Thread via GitHub
github-actions[bot] commented on issue #5477: URL: https://github.com/apache/iceberg/issues/5477#issuecomment-1418308439 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] github-actions[bot] commented on issue #5381: Problem with assume-role

2023-02-05 Thread via GitHub
github-actions[bot] commented on issue #5381: URL: https://github.com/apache/iceberg/issues/5381#issuecomment-1418308479 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6746: AWS: Load HttpClientBuilder dynamically to avoid runtime deps of both urlconnection and apache client

2023-02-05 Thread via GitHub
stevenzwu commented on code in PR #6746: URL: https://github.com/apache/iceberg/pull/6746#discussion_r1096886387 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -366,36 +362,48 @@ public class AwsProperties implements Serializable { */ public static

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6746: AWS: Load HttpClientBuilder dynamically to avoid runtime deps of both urlconnection and apache client

2023-02-05 Thread via GitHub
stevenzwu commented on code in PR #6746: URL: https://github.com/apache/iceberg/pull/6746#discussion_r1096892590 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -1314,55 +1294,27 @@ private void configureEndpoint(T builder, String en } } - @Vis

[GitHub] [iceberg] stevenzwu merged pull request #6743: Flink: Backport handling ResolvingFileIO in determining locality - PR 6655

2023-02-05 Thread via GitHub
stevenzwu merged PR #6743: URL: https://github.com/apache/iceberg/pull/6743 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

[GitHub] [iceberg] stevenzwu commented on pull request #6743: Flink: Backport handling ResolvingFileIO in determining locality - PR 6655

2023-02-05 Thread via GitHub
stevenzwu commented on PR #6743: URL: https://github.com/apache/iceberg/pull/6743#issuecomment-1418419988 thanks @singhpk234 for the backport and @jackye1995 for the review -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

[GitHub] [iceberg] bluzy opened a new issue, #6750: Failed to get table info from metastore using impersonation

2023-02-05 Thread via GitHub
bluzy opened a new issue, #6750: URL: https://github.com/apache/iceberg/issues/6750 ### Apache Iceberg version 1.1.0 (latest release) ### Query engine Hive ### Please describe the bug 🐞 We provide hiveserver for query to iceberg tables. Impersonation is e

[GitHub] [iceberg] eric666666 commented on issue #5043: Flink import debezium cdc record(delete type) to iceberg(0.13.2+) got IndexOutOfBoundsException

2023-02-05 Thread via GitHub
eric666666 commented on issue #5043: URL: https://github.com/apache/iceberg/issues/5043#issuecomment-1418446822 > I have solved this. Here is my code ` case DELETE: writer.deleteKey(keyProjection.wrap(row), schema); break; `

[GitHub] [iceberg] ajantha-bhat commented on pull request #6696: Build: Bump Arrow from 10.0.1 to 11.0.0

2023-02-05 Thread via GitHub
ajantha-bhat commented on PR #6696: URL: https://github.com/apache/iceberg/pull/6696#issuecomment-1418451836 [With Arrow 10.0.1] run-benchmark (SparkParquetWritersNestedDataBenchmark) 5m31s run-benchmark (SparkParquetWritersFlatDataBenchmark) 5m13s run-benchmark (SparkParquetReadersN

[GitHub] [iceberg] jfly0902 opened a new issue, #6751: iceberg roadmap

2023-02-05 Thread via GitHub
jfly0902 opened a new issue, #6751: URL: https://github.com/apache/iceberg/issues/6751 ### Query engine _No response_ ### Question Hudi has added many new features (secondary index,sparkSQL(ETL) increment read data),iceberg roadmap? -- This is an automated message fr

[GitHub] [iceberg] kmozaid commented on pull request #6410: Configurable metrics reporter by catalog properties

2023-02-05 Thread via GitHub
kmozaid commented on PR #6410: URL: https://github.com/apache/iceberg/pull/6410#issuecomment-1418463948 @nastra I have rebased PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

[GitHub] [iceberg] amogh-jahagirdar opened a new pull request, #6752: Spark: DROP BRANCH SQL implementation

2023-02-05 Thread via GitHub
amogh-jahagirdar opened a new pull request, #6752: URL: https://github.com/apache/iceberg/pull/6752 This is an implementation of DROP BRANCH Spark SQL Co-authored-by: liliwei hilili...@gmail.com Co-authored-by: xuwei xuwei...@huawei.com Co-authored-by: chidayong chidayo...@h-part

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6752: Spark: DROP BRANCH SQL implementation

2023-02-05 Thread via GitHub
amogh-jahagirdar commented on code in PR #6752: URL: https://github.com/apache/iceberg/pull/6752#discussion_r1096922524 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestDropBranch.java: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache Softwa

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6752: Spark: DROP BRANCH SQL implementation

2023-02-05 Thread via GitHub
amogh-jahagirdar commented on code in PR #6752: URL: https://github.com/apache/iceberg/pull/6752#discussion_r1096922590 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropBranchExec.scala: ## @@ -0,0 +1,51 @@ +/* + * Licensed to the A

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6752: Spark: DROP BRANCH SQL implementation

2023-02-05 Thread via GitHub
amogh-jahagirdar commented on code in PR #6752: URL: https://github.com/apache/iceberg/pull/6752#discussion_r1096922657 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropBranchExec.scala: ## @@ -0,0 +1,51 @@ +/* + * Licensed to the A

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6752: Spark: DROP BRANCH SQL implementation

2023-02-05 Thread via GitHub
amogh-jahagirdar commented on code in PR #6752: URL: https://github.com/apache/iceberg/pull/6752#discussion_r1096923356 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestDropBranch.java: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache Softwa

[GitHub] [iceberg] JonasJ-ap commented on a diff in pull request #6746: AWS: Load HttpClientBuilder dynamically to avoid runtime deps of both urlconnection and apache client

2023-02-05 Thread via GitHub
JonasJ-ap commented on code in PR #6746: URL: https://github.com/apache/iceberg/pull/6746#discussion_r1096940112 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -366,36 +362,48 @@ public class AwsProperties implements Serializable { */ public static

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6746: AWS: Load HttpClientBuilder dynamically to avoid runtime deps of both urlconnection and apache client

2023-02-05 Thread via GitHub
stevenzwu commented on code in PR #6746: URL: https://github.com/apache/iceberg/pull/6746#discussion_r1096946728 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -1314,55 +1281,27 @@ private void configureEndpoint(T builder, String en } } - @Vis

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6746: AWS: Load HttpClientBuilder dynamically to avoid runtime deps of both urlconnection and apache client

2023-02-05 Thread via GitHub
stevenzwu commented on code in PR #6746: URL: https://github.com/apache/iceberg/pull/6746#discussion_r1096949257 ## aws/src/test/java/org/apache/iceberg/aws/TestHttpClientConfigurations.java: ## @@ -0,0 +1,404 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one
