[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-02 Thread via GitHub
singhpk234 commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1094215115 ## core/src/main/java/org/apache/iceberg/hadoop/Util.java: ## @@ -84,10 +84,16 @@ public static String[] blockLocations(FileIO io, ScanTaskGroup taskGroup) { r

[GitHub] [iceberg] rbalamohan opened a new issue, #6726: Dynamic partition pruning filters should be applied before invoking Table::planTasks in IcebergInputFormat

2023-02-02 Thread via GitHub
rbalamohan opened a new issue, #6726: URL: https://github.com/apache/iceberg/issues/6726 ### Apache Iceberg version 1.0.0 ### Query engine Hive ### Please describe the bug 🐞 Context: == Split computation in iceberg takes 3-5 seconds (in certain qu

[GitHub] [iceberg] boroknagyz commented on issue #6709: BasicStats: TOTAL_RECORDS_PROP does not update after deletes

2023-02-02 Thread via GitHub
boroknagyz commented on issue #6709: URL: https://github.com/apache/iceberg/issues/6709#issuecomment-1413400733 I think we cannot assume that numRows(table) is equal to numRows(data files) - numRows(position delete files). Because - Concurrent deletes might create delete files t

[GitHub] [iceberg] jedrek-VL commented on issue #6713: PyIceberg fails when querying REST catalog

2023-02-02 Thread via GitHub
jedrek-VL commented on issue #6713: URL: https://github.com/apache/iceberg/issues/6713#issuecomment-1413425607 Ok, I managed to make it work by replacing the `load_catalog` code by the following: ``` catalog = load_catalog('default', **{ 'uri': 'http://localhost:8181',

[GitHub] [iceberg] jedrek-VL closed issue #6713: PyIceberg fails when querying REST catalog

2023-02-02 Thread via GitHub
jedrek-VL closed issue #6713: PyIceberg fails when querying REST catalog URL: https://github.com/apache/iceberg/issues/6713 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

[GitHub] [iceberg] pvary commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
pvary commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1094266424 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveLock.java: ## @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

[GitHub] [iceberg] pvary commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
pvary commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1094267066 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -135,13 +92,6 @@ public class HiveTableOperations extends BaseMetastoreTableOpera

[GitHub] [iceberg] pvary commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
pvary commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1094267540 ## hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreLock.java: ## @@ -0,0 +1,538 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

[GitHub] [iceberg] pvary commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
pvary commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1094267902 ## hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreLock.java: ## @@ -0,0 +1,538 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

[GitHub] [iceberg] pvary commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
pvary commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1094268240 ## hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreLock.java: ## @@ -0,0 +1,538 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

[GitHub] [iceberg] pvary commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
pvary commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1094268929 ## hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreLock.java: ## @@ -0,0 +1,538 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

[GitHub] [iceberg] ajantha-bhat commented on pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-02 Thread via GitHub
ajantha-bhat commented on PR #6712: URL: https://github.com/apache/iceberg/pull/6712#issuecomment-1413808807 Thanks @dimas-b for the review. @snazy: Do you have any suggestions for this PR?

[GitHub] [iceberg] Fokko commented on issue #6713: PyIceberg fails when querying REST catalog

2023-02-02 Thread via GitHub
Fokko commented on issue #6713: URL: https://github.com/apache/iceberg/issues/6713#issuecomment-1413913229 Thanks @jedrek-VL for the comprehensive write-up. I'm just seeing this now. Yesterday the docker image was updated with a PyIceberg notebook, I would recommend pulling the latest conta

[GitHub] [iceberg] deniskuzZ commented on pull request #6653: API: Fix Transform backward compatibility in PartitionSpec

2023-02-02 Thread via GitHub
deniskuzZ commented on PR #6653: URL: https://github.com/apache/iceberg/pull/6653#issuecomment-1413922192 hi @pvary, could you please help?

[GitHub] [iceberg] nastra commented on a diff in pull request #6723: Docs: Separate page for Branching and Tagging

2023-02-02 Thread via GitHub
nastra commented on code in PR #6723: URL: https://github.com/apache/iceberg/pull/6723#discussion_r1094663480 ## docs/branching-and-tagging.md: ## @@ -0,0 +1,218 @@ +--- +title: "Branching and Tagging" +url: configuration +aliases: +- "tables/branching" +menu: +main: +

[GitHub] [iceberg] Fokko commented on a diff in pull request #6482: API: Fix inconsistent TimeTransform Type

2023-02-02 Thread via GitHub
Fokko commented on code in PR #6482: URL: https://github.com/apache/iceberg/pull/6482#discussion_r1094693759 ## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ## @@ -440,7 +440,7 @@ public Builder year(String sourceName, String targetName) { sourceColumn

[GitHub] [iceberg] nastra commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-02 Thread via GitHub
nastra commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1094703306 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java: ## @@ -88,11 +89,22 @@ public void initialize(String name, Map options) { options.get(rem

[GitHub] [iceberg] deniskuzZ commented on pull request #6653: API: Fix Transform backward compatibility in PartitionSpec

2023-02-02 Thread via GitHub
deniskuzZ commented on PR #6653: URL: https://github.com/apache/iceberg/pull/6653#issuecomment-1413952926 > Sorry for the late reply, I was traveling the last few days. I like this solution. It is until Iceberg 2.0.0 that we have to keep the lazy and non-lazy versions of initializing a Part

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-02 Thread via GitHub
ajantha-bhat commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1094732867 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNamespace.java: ## @@ -73,6 +77,48 @@ public void testListNamespaces() { Assertions.assertThat(namespace

[GitHub] [iceberg] nastra commented on pull request #6674: Add support for special characters in snowflake identifiers for Snowflake Catalog

2023-02-02 Thread via GitHub
nastra commented on PR #6674: URL: https://github.com/apache/iceberg/pull/6674#issuecomment-1413975497 would be good to also get feedback from @danielcweeks

[GitHub] [iceberg] zhangbutao commented on pull request #6482: API: Fix inconsistent TimeTransform Type

2023-02-02 Thread via GitHub
zhangbutao commented on PR #6482: URL: https://github.com/apache/iceberg/pull/6482#issuecomment-1413976249 > Sorry for being late to the party here, I was traveling the last few days. I would be in favor of #6653 if that also solves your problem. Never mind :). Yes, https://github.com

[GitHub] [iceberg] nastra commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-02 Thread via GitHub
nastra commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1094773774 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java: ## @@ -103,7 +103,9 @@ public void initialize(String name, Map options) { api = nessieClien

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094775999 ## api/src/main/java/org/apache/iceberg/expressions/BoundAggregate.java: ## @@ -44,4 +57,85 @@ public Type type() { return term().type(); } } + + public

[GitHub] [iceberg] nastra commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-02 Thread via GitHub
nastra commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1094775416 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java: ## @@ -103,7 +103,9 @@ public void initialize(String name, Map options) { api = nessieClien

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094779610 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkReadOptions.java: ## @@ -90,4 +90,6 @@ private SparkReadOptions() {} public static final String VERSIO

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094780732 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -47,4 +47,8 @@ private SparkSQLProperties() {} public static final String PR

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094789846 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkLocalScan.java: ## @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [iceberg] Fokko commented on issue #6620: Python: More Flexible Dependency Requirements, especially for Optional Deps

2023-02-02 Thread via GitHub
Fokko commented on issue #6620: URL: https://github.com/apache/iceberg/issues/6620#issuecomment-1414042971 Thanks for pinging me, @srilman. Most libraries can vary the version, and some are fixed. For example, `aiobotocore` is linked to `boto3`. For Arrow, I think you're okay using

[GitHub] [iceberg] Fokko commented on pull request #6482: API: Fix inconsistent TimeTransform Type

2023-02-02 Thread via GitHub
Fokko commented on PR #6482: URL: https://github.com/apache/iceberg/pull/6482#issuecomment-1414044966 Thanks for letting me know and creating the PR in the first place, much appreciated 👍🏻

[GitHub] [iceberg] Fokko closed pull request #6482: API: Fix inconsistent TimeTransform Type

2023-02-02 Thread via GitHub
Fokko closed pull request #6482: API: Fix inconsistent TimeTransform Type URL: https://github.com/apache/iceberg/pull/6482

[GitHub] [iceberg] Fokko merged pull request #6653: API: Fix Transform backward compatibility in PartitionSpec

2023-02-02 Thread via GitHub
Fokko merged PR #6653: URL: https://github.com/apache/iceberg/pull/6653

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094796934 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094798295 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094799318 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] jedrek-VL commented on issue #6713: PyIceberg fails when querying REST catalog

2023-02-02 Thread via GitHub
jedrek-VL commented on issue #6713: URL: https://github.com/apache/iceberg/issues/6713#issuecomment-1414051660 Right. I saw the new docker images and that's how I found out what I was missing :)

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094803125 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] RussellSpitzer commented on issue #6725: How to detect if the partition's data is ready to consume

2023-02-02 Thread via GitHub
RussellSpitzer commented on issue #6725: URL: https://github.com/apache/iceberg/issues/6725#issuecomment-1414053888 There is no such thing in Iceberg. If there is data in a partition, the commit has succeeded. So if, for example, you do a Spark query and there is data in the partition, that mea

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094814986 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] rdblue commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
rdblue commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1094817232 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java: ## @@ -158,6 +182,141 @@ public Filter[] pushedFilters() { return pushedFil

[GitHub] [iceberg] dimas-b commented on a diff in pull request #6712: Nessie: Support ApiV2 for Nessie client

2023-02-02 Thread via GitHub
dimas-b commented on code in PR #6712: URL: https://github.com/apache/iceberg/pull/6712#discussion_r1094829567 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNamespace.java: ## @@ -73,6 +77,48 @@ public void testListNamespaces() { Assertions.assertThat(namespaces).is

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-02 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1093922634 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/PositionDeleteRowReader.java: ## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-02 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1094106234 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetMetricsRowGroupFilter.java: ## @@ -50,15 +51,22 @@ public class ParquetMetricsRowGroupFilter { private

[GitHub] [iceberg] rdblue merged pull request #6720: Python: Publish the docs by hand

2023-02-02 Thread via GitHub
rdblue merged PR #6720: URL: https://github.com/apache/iceberg/pull/6720

[GitHub] [iceberg] rdblue commented on pull request #6720: Python: Publish the docs by hand

2023-02-02 Thread via GitHub
rdblue commented on PR #6720: URL: https://github.com/apache/iceberg/pull/6720#issuecomment-1414172576 Sounds reasonable to me. We could also version the docs eventually.

[GitHub] [iceberg] snazy opened a new issue, #6727: REST-Catalog: CreateTableRequest.stageCreate can be removed

2023-02-02 Thread via GitHub
snazy opened a new issue, #6727: URL: https://github.com/apache/iceberg/issues/6727 ### Apache Iceberg version None ### Query engine None ### Please describe the bug 🐞 The current REST client / OpenAPI spec defines the attribute `stageCreate`. The RES

[GitHub] [iceberg] snazy commented on pull request #6701: Add missing `last-column-id` to spec

2023-02-02 Thread via GitHub
snazy commented on PR #6701: URL: https://github.com/apache/iceberg/pull/6701#issuecomment-1414201166 (Side note: I no longer use the rest-spec in my experiments - for various reasons - and use the JSON serialization from `iceberg-core`.) But IMO the spec should exactly reflect th

[GitHub] [iceberg] snazy commented on pull request #6700: Snapshot ref type public

2023-02-02 Thread via GitHub
snazy commented on PR #6700: URL: https://github.com/apache/iceberg/pull/6700#issuecomment-1414203692 I needed this, while I was trying (without success for many reasons) to generate JAX-RS code from the spec. Such an approach would need this change. I'm okay to leave it as it is. Mos

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1094892920 ## core/src/main/java/org/apache/iceberg/PositionDeletesTable.java: ## @@ -75,16 +75,15 @@ public Schema schema() { return schema; } - private Schema cal

[GitHub] [iceberg] snazy commented on issue #6727: REST-Catalog: CreateTableRequest.stageCreate can be removed

2023-02-02 Thread via GitHub
snazy commented on issue #6727: URL: https://github.com/apache/iceberg/issues/6727#issuecomment-1414242918 (Sorry, my bad, seems that _both_ can happen.)

[GitHub] [iceberg] snazy closed issue #6727: REST-Catalog: CreateTableRequest.stageCreate can be removed

2023-02-02 Thread via GitHub
snazy closed issue #6727: REST-Catalog: CreateTableRequest.stageCreate can be removed URL: https://github.com/apache/iceberg/issues/6727

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1094981997 ## core/src/main/java/org/apache/iceberg/MetadataTable.java: ## @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more cont

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095005897 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -67,7 +67,11 @@ public interface DeleteOrphanFiles extends Action

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095006673 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -80,9 +84,16 @@ public interface DeleteOrphanFiles extends Action> deleteFunc) {

[GitHub] [iceberg] flyrain commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
flyrain commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1095008197 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestReplaceBranch.java: ## @@ -0,0 +1,273 @@ +/* + * Licensed to the Apache Software Fo

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095009182 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -80,9 +84,16 @@ public interface DeleteOrphanFiles extends Action

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095010538 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/BaseSparkAction.java: ## @@ -85,6 +88,7 @@ private static final Logger LOG = LoggerFactory

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
RussellSpitzer commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095013112 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/BaseSparkAction.java: ## @@ -253,6 +257,39 @@ protected DeleteSummary deleteFiles( return

[GitHub] [iceberg] flyrain commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
flyrain commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1095013576 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateOrReplaceBranchExec.scala: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6622: push down min/max/count to iceberg

2023-02-02 Thread via GitHub
huaxingao commented on code in PR #6622: URL: https://github.com/apache/iceberg/pull/6622#discussion_r1095044926 ## api/src/main/java/org/apache/iceberg/expressions/BoundAggregate.java: ## @@ -44,4 +57,85 @@ public Type type() { return term().type(); } } + + publ

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6716: Spark 3.3: Implement Position Deletes Table

2023-02-02 Thread via GitHub
szehon-ho commented on code in PR #6716: URL: https://github.com/apache/iceberg/pull/6716#discussion_r1094108930 ## orc/src/main/java/org/apache/iceberg/orc/OrcIterable.java: ## @@ -84,15 +91,18 @@ public CloseableIterator iterator() { addCloseable(orcFileReader); Ty

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1095119030 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveLock.java: ## @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1095120594 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveLock.java: ## @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6648: URL: https://github.com/apache/iceberg/pull/6648#discussion_r1095121483 ## hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreLock.java: ## @@ -0,0 +1,540 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6648: Hive: Refactor commit lock mechanism from HiveTableOperations

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on PR #6648: URL: https://github.com/apache/iceberg/pull/6648#issuecomment-1414426100 Thanks for the detailed explanations, @pvary. I agree it does seem difficult to reconcile the two abstractions at this point. The only thing on my side is: can we confirm if all the

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1095128415 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateOrReplaceBranchExec.scala: ## @@ -0,0 +1,82 @@ +/* + * Licens

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1095133181 ## spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateOrReplaceBranchExec.scala: ## @@ -0,0 +1,82 @@ +/* + * Licens

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #6638: URL: https://github.com/apache/iceberg/pull/6638#discussion_r1095134625 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestReplaceBranch.java: ## @@ -0,0 +1,273 @@ +/* + * Licensed to the Apache So

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on PR #6638: URL: https://github.com/apache/iceberg/pull/6638#issuecomment-1414441586 Thanks for the review @flyrain really appreciate it! So there are a few operations: 1.) replaceBranch (this PR) -> Replace branch will change the snapshot that

[GitHub] [iceberg] haydenflinner commented on issue #2040: Partial data ingestion to Iceberg in failing with Spark 3.0.x

2023-02-02 Thread via GitHub
haydenflinner commented on issue #2040: URL: https://github.com/apache/iceberg/issues/2040#issuecomment-1414446899 Same thing here, happening whether I use INSERT INTO or the dataframe API. How annoying. Is there really no solution besides messing with the dataframe schema to ensure it has

[GitHub] [iceberg] RussellSpitzer commented on issue #2040: Partial data ingestion to Iceberg in failing with Spark 3.0.x

2023-02-02 Thread via GitHub
RussellSpitzer commented on issue #2040: URL: https://github.com/apache/iceberg/issues/2040#issuecomment-1414466758 Both Spark and Iceberg have their own checks to determine whether an input schema is valid for writing to a given table. The Spark checks are first and require that all of the

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on PR #6637: URL: https://github.com/apache/iceberg/pull/6637#issuecomment-1414473094 @hililiwei similar to https://github.com/apache/iceberg/pull/6638 could this PR encapsulate create/replace? We came to the conclusion on the replace PR it made more sense to just

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6655: Spark: Handle ResolvingFileIO while determining LocalityPreference

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6655: URL: https://github.com/apache/iceberg/pull/6655#discussion_r1095169075 ## core/src/main/java/org/apache/iceberg/hadoop/Util.java: ## @@ -84,10 +88,38 @@ public static String[] blockLocations(FileIO io, ScanTaskGroup taskGroup) {

[GitHub] [iceberg] jackye1995 commented on pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
jackye1995 commented on PR #6638: URL: https://github.com/apache/iceberg/pull/6638#issuecomment-1414492620 Looks like we have enough votes and all comments are addressed. I will go ahead to merge this, and we can address further comments in subsequent PRs like #6637 Thanks @amogh-ja

[GitHub] [iceberg] jackye1995 merged pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
jackye1995 merged PR #6638: URL: https://github.com/apache/iceberg/pull/6638

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6638: Spark: REPLACE BRANCH SQL implementation

2023-02-02 Thread via GitHub
amogh-jahagirdar commented on PR #6638: URL: https://github.com/apache/iceberg/pull/6638#issuecomment-1414496739 Thanks for the reviews @flyrain @jackye1995 @yyanyy @hililiwei!

[GitHub] [iceberg] github-actions[bot] closed issue #5163: Support catalog method to set table metadata

2023-02-02 Thread via GitHub
github-actions[bot] closed issue #5163: Support catalog method to set table metadata URL: https://github.com/apache/iceberg/issues/5163

[GitHub] [iceberg] github-actions[bot] commented on issue #4607: [Docs] Create an item list for re-organizing docs to the proposed layout

2023-02-02 Thread via GitHub
github-actions[bot] commented on issue #4607: URL: https://github.com/apache/iceberg/issues/4607#issuecomment-1414537066 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] github-actions[bot] commented on issue #5163: Support catalog method to set table metadata

2023-02-02 Thread via GitHub
github-actions[bot] commented on issue #5163: URL: https://github.com/apache/iceberg/issues/5163#issuecomment-1414537028 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] github-actions[bot] closed issue #4607: [Docs] Create an item list for re-organizing docs to the proposed layout

2023-02-02 Thread via GitHub
github-actions[bot] closed issue #4607: [Docs] Create an item list for re-organizing docs to the proposed layout URL: https://github.com/apache/iceberg/issues/4607

[GitHub] [iceberg] aokolnychyi commented on pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-02-02 Thread via GitHub
aokolnychyi commented on PR #6695: URL: https://github.com/apache/iceberg/pull/6695#issuecomment-1414545406 My bad, I overlooked the condition, @ajantha-bhat!

[GitHub] [iceberg] aokolnychyi merged pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-02-02 Thread via GitHub
aokolnychyi merged PR #6695: URL: https://github.com/apache/iceberg/pull/6695

[GitHub] [iceberg] aokolnychyi commented on pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-02-02 Thread via GitHub
aokolnychyi commented on PR #6695: URL: https://github.com/apache/iceberg/pull/6695#issuecomment-1414546166 Thanks, @ajantha-bhat! I merged this. Would you mind following up with cherry-picks to other versions?

[GitHub] [iceberg] aokolnychyi commented on pull request #6700: Snapshot ref type public

2023-02-02 Thread via GitHub
aokolnychyi commented on PR #6700: URL: https://github.com/apache/iceberg/pull/6700#issuecomment-1414548715 Sounds good, @snazy. Would you mind closing this one and re-opening if needed? Trying to reduce the number of open PRs against our repo.

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095231341 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -67,7 +67,11 @@ public interface DeleteOrphanFiles extends Action For example, if I h

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095234465 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -80,9 +84,16 @@ public interface DeleteOrphanFiles extends Action> deleteFunc) { Re

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6682: Bulk delete

2023-02-02 Thread via GitHub
aokolnychyi commented on code in PR #6682: URL: https://github.com/apache/iceberg/pull/6682#discussion_r1095236241 ## api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java: ## @@ -67,7 +67,11 @@ public interface DeleteOrphanFiles extends Action

[GitHub] [iceberg] jackye1995 commented on pull request #6637: Spark: Spark SQL Extensions for create tag

2023-02-02 Thread via GitHub
jackye1995 commented on PR #6637: URL: https://github.com/apache/iceberg/pull/6637#issuecomment-1414576132 > could this PR encapsulate create/replace? +1 Let me know when this is updated, I will take another look!

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #5029: Flink: Use Tag or Branch to scan data.

2023-02-02 Thread via GitHub
jackye1995 commented on code in PR #5029: URL: https://github.com/apache/iceberg/pull/5029#discussion_r1095248368 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/FlinkSplitPlanner.java: ## @@ -86,10 +86,18 @@ static CloseableIterable planTasks( Incremen

[GitHub] [iceberg] jackye1995 commented on a diff in pull request #5029: Flink: Use Tag or Branch to scan data.

2023-02-02 Thread via GitHub
jackye1995 commented on code in PR #5029: URL: https://github.com/apache/iceberg/pull/5029#discussion_r1095249353 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/StreamingMonitorFunction.java: ## @@ -124,11 +126,33 @@ public void initializeState(FunctionInitia

[GitHub] [iceberg] jackye1995 commented on pull request #5029: Flink: Use Tag or Branch to scan data.

2023-02-02 Thread via GitHub
jackye1995 commented on PR #5029: URL: https://github.com/apache/iceberg/pull/5029#issuecomment-1414586011 @stevenzwu since you are reviewing #6660, could you also take a look at this?

[GitHub] [iceberg] ajantha-bhat commented on pull request #6695: Spark-3.3: Handle no-op for rewrite manifests procedure/action

2023-02-02 Thread via GitHub
ajantha-bhat commented on PR #6695: URL: https://github.com/apache/iceberg/pull/6695#issuecomment-1414692489 > Thanks, @ajantha-bhat! I merged this. Would you mind following up with cherry-picks to other versions? Thanks for merging. Today I will work on backporting this PR to other s

[GitHub] [iceberg] lurnagao commented on issue #3127: iceberg HiveCatalog insert exception of GSS initiate failed

2023-02-02 Thread via GitHub
lurnagao commented on issue #3127: URL: https://github.com/apache/iceberg/issues/3127#issuecomment-1414694746 the same problem by using hivecli(2.3.7) + mr + insert into iceberg_table(0.13.2)
