Re: [PR] Hive: Use EnvironmentContext instead of Hive Locks to provide transactional commits after HIVE-26882 [iceberg]

2024-10-11 Thread via GitHub
chenwyi2 commented on PR #6570: URL: https://github.com/apache/iceberg/pull/6570#issuecomment-2408332791 "Minimally Hive 2 HMS client is needed to use HIVE-26882 based locking" Why do we have to check for Hive 2? Suppose I am on Hive 1 and cherry-pick HIVE-26882; wouldn't this check then be incorrect?

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
nastra commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1796516710 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java: ## @@ -740,6 +743,10 @@ private boolean partitionMatch(Record

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
nastra commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1796516348 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/sql/TestAlterTable.java: ## @@ -275,6 +278,9 @@ public void testAlterColumnPositionFirst() { @TestTempl

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
nastra commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1796527051 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +71,55 @@ protected static Object[][] parameters() { } @Befor

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
nastra commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1796527938 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +71,55 @@ protected static Object[][] parameters() { } @Befor

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
nastra commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1796534509 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +71,55 @@ protected static Object[][] parameters() { } @Befor

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rahil-c commented on PR #11180: URL: https://github.com/apache/iceberg/pull/11180#issuecomment-2406717923 @rdblue @danielcweeks @amogh-jahagirdar @nastra @jackye1995 @singhpk234 Added a new commit `Add support for scan planning apis in REST Catalog` which invokes the new apis and int

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
nastra commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1796534994 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +71,55 @@ protected static Object[][] parameters() { } @Befor

Re: [PR] feat: Derive PartialEq for FileScanTask [iceberg-rust]

2024-10-11 Thread via GitHub
Xuanwo commented on PR #660: URL: https://github.com/apache/iceberg-rust/pull/660#issuecomment-2406717591 Thank you, @sdd, for the reminder. It's weird that GitHub doesn't provide an `update branches` button for me. -- This is an automated message from the Apache Git Service. To respond t

Re: [PR] RecordBatchTransformer: Handle schema migration and column re-ordering in table scans [iceberg-rust]

2024-10-11 Thread via GitHub
Xuanwo merged PR #602: URL: https://github.com/apache/iceberg-rust/pull/602

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-10-11 Thread via GitHub
Xuanwo commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1796543659 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3103,95 @@ components: uuid: type: string +ADLSCredential: + type: object +

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
nastra commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1796544204 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -89,13 +138,37 @@ public static void dropWarehouse() throws IOException {

Re: [PR] OpenAPI: Add planning-mode to loadTable response [iceberg]

2024-10-11 Thread via GitHub
rahil-c commented on PR #11156: URL: https://github.com/apache/iceberg/pull/11156#issuecomment-2406727113 @amogh-jahagirdar I was wondering if we should tag this on the milestone board for 1.7.0 as it relates to the impl pr https://github.com/apache/iceberg/pull/11180

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
nastra commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1796545689 ## open-api/src/testFixtures/java/org/apache/iceberg/rest/RESTCatalogServer.java: ## @@ -64,7 +65,9 @@ public Map configuration() { private CatalogContext initial

Re: [PR] OpenAPI: Add endpoint for refreshing vended credentials [iceberg]

2024-10-11 Thread via GitHub
nastra commented on code in PR #11281: URL: https://github.com/apache/iceberg/pull/11281#discussion_r1796635546 ## open-api/rest-catalog-open-api.yaml: ## @@ -3142,6 +3211,10 @@ components: type: object additionalProperties: type: string +

Re: [PR] OpenAPI: Add endpoint for refreshing vended credentials [iceberg]

2024-10-11 Thread via GitHub
nastra commented on code in PR #11281: URL: https://github.com/apache/iceberg/pull/11281#discussion_r1796630239 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3141,32 @@ components: uuid: type: string +Credential: + type: object + requ

Re: [PR] OpenAPI: Add endpoint for refreshing vended credentials [iceberg]

2024-10-11 Thread via GitHub
nastra commented on code in PR #11281: URL: https://github.com/apache/iceberg/pull/11281#discussion_r1796632766 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3141,32 @@ components: uuid: type: string +Credential: + type: object + requ

Re: [PR] AWS: Introduce opt-in S3LocationProvider which is optimized for S3 performance [iceberg]

2024-10-11 Thread via GitHub
ookumuso commented on PR #2: URL: https://github.com/apache/iceberg/pull/2#issuecomment-2408212343 @danielcweeks @jackye1995 Updated the change to divide entropy into dirs so we follow the following format now: partitioned-path=true: /data/0100/0110/0010/10101001/key=val/
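
The directory layout quoted above (three 4-bit directories followed by one 8-bit directory, placed ahead of the partition path) can be illustrated with a minimal sketch. This is not the PR's implementation: the hash function (SHA-256 here, purely as a stand-in), the function names, and the parameter defaults are all assumptions chosen only to reproduce the `/data/0100/0110/0010/10101001/key=val/` shape.

```python
import hashlib


def entropy_dirs(file_name: str, depth: int = 3, dir_bits: int = 4,
                 total_bits: int = 20) -> str:
    """Return a binary-string entropy prefix split into directories.

    Hypothetical helper: SHA-256 is a stand-in hash, not the one used
    by the PR under discussion.
    """
    digest = hashlib.sha256(file_name.encode("utf-8")).digest()
    # Keep the top `total_bits` bits of the first 32 bits of the digest.
    value = int.from_bytes(digest[:4], "big") >> (32 - total_bits)
    bits = format(value, f"0{total_bits}b")
    cut = depth * dir_bits
    # `depth` directories of `dir_bits` bits each, then the remaining bits.
    parts = [bits[i:i + dir_bits] for i in range(0, cut, dir_bits)]
    parts.append(bits[cut:])
    return "/".join(parts)


def data_path(prefix: str, partition: str, file_name: str) -> str:
    """Place the entropy directories before the partition path."""
    return f"{prefix}/{entropy_dirs(file_name)}/{partition}/{file_name}"


print(data_path("/data", "key=val", "00000-0-abc.parquet"))
```

Spreading the hash bits over several short directory levels keeps each level's fan-out small (16 entries for a 4-bit level) while still distributing object keys across many prefixes.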

Re: [PR] [Core][Spark] Improve DeleteOrphanFiles action to return additional details of deleted orphan files [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] commented on PR #7127: URL: https://github.com/apache/iceberg/pull/7127#issuecomment-2408257986 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Core: Rollback compaction on conflicts [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] commented on PR #5888: URL: https://github.com/apache/iceberg/pull/5888#issuecomment-2408257978 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] [Core][Spark] Improve DeleteOrphanFiles action to return additional details of deleted orphan files [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] closed pull request #7127: [Core][Spark] Improve DeleteOrphanFiles action to return additional details of deleted orphan files URL: https://github.com/apache/iceberg/pull/7127

Re: [PR] API: Add ParquetUtils.getSplitOffsets that takes an InputFile [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] commented on PR #7267: URL: https://github.com/apache/iceberg/pull/7267#issuecomment-2408258000 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] API: Add ParquetUtils.getSplitOffsets that takes an InputFile [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] closed pull request #7267: API: Add ParquetUtils.getSplitOffsets that takes an InputFile URL: https://github.com/apache/iceberg/pull/7267

Re: [PR] Core: Fix retry behavior for Jdbc Client [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] closed pull request #7561: Core: Fix retry behavior for Jdbc Client URL: https://github.com/apache/iceberg/pull/7561

Re: [I] Slow RewriteManifests due to Validation of Manifest Entries [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] closed issue #8932: Slow RewriteManifests due to Validation of Manifest Entries URL: https://github.com/apache/iceberg/issues/8932

Re: [I] Slow RewriteManifests due to Validation of Manifest Entries [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] commented on issue #8932: URL: https://github.com/apache/iceberg/issues/8932#issuecomment-2408258032 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Core: Rollback compaction on conflicts [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] closed pull request #5888: Core: Rollback compaction on conflicts URL: https://github.com/apache/iceberg/pull/5888

Re: [PR] Core: Fix retry behavior for Jdbc Client [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] commented on PR #7561: URL: https://github.com/apache/iceberg/pull/7561#issuecomment-2408258007 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Use SupportsPrefixOperations for Remove OrphanFile Procedure [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] commented on PR #7914: URL: https://github.com/apache/iceberg/pull/7914#issuecomment-2408258018 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Use SupportsPrefixOperations for Remove OrphanFile Procedure [iceberg]

2024-10-11 Thread via GitHub
github-actions[bot] closed pull request #7914: Use SupportsPrefixOperations for Remove OrphanFile Procedure URL: https://github.com/apache/iceberg/pull/7914

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-11 Thread via GitHub
stevenzwu commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1797505656 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the fo

Re: [I] Implement rolling manifest-writers [iceberg-python]

2024-10-11 Thread via GitHub
github-actions[bot] commented on issue #596: URL: https://github.com/apache/iceberg-python/issues/596#issuecomment-2408259229 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity oc

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-11 Thread via GitHub
stevenzwu commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1797498850 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the fo

Re: [PR] Core: Map methods should return immutable collections [iceberg]

2024-10-11 Thread via GitHub
anuragmantri commented on PR #11304: URL: https://github.com/apache/iceberg/pull/11304#issuecomment-2408184398 StructLikeSet has an overridden `equals()` method which compared the classes. This will [fail when we compare against](https://github.com/apache/iceberg/blob/67dc9e58cd57d953726677

Re: [PR] Arrow: Fix indexing in Parquet dictionary encoded values readers [iceberg]

2024-10-11 Thread via GitHub
wypoon commented on code in PR #11247: URL: https://github.com/apache/iceberg/pull/11247#discussion_r1797485367 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/parquet/vectorized/TestParquetDictionaryEncodedVectorizedReads.java: ## @@ -93,4 +125,64 @@ public void

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rahil-c commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797486598 ## core/src/main/java/org/apache/iceberg/rest/RESTFileScanTaskParser.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Arrow: Remove unused fixed width binary reader classes [iceberg]

2024-10-11 Thread via GitHub
wypoon commented on code in PR #11292: URL: https://github.com/apache/iceberg/pull/11292#discussion_r1797487178 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/parquet/VectorizedColumnIterator.java: ## @@ -214,20 +214,6 @@ protected int nextBatchOf( } } - p

Re: [PR] Core: Store schema and spec in TaskContext to avoid unnecessary deserialization (#11235) [iceberg]

2024-10-11 Thread via GitHub
gitzwz commented on PR #11280: URL: https://github.com/apache/iceberg/pull/11280#issuecomment-2406797716 This works well when the table's schema has over 1k columns and when table specs and schema need to be read after the table scan. In our case (2k columns, 25.9 TB, 460,000 files), this can reduce the

Re: [PR] OpenAPI: Add endpoint for refreshing vended credentials [iceberg]

2024-10-11 Thread via GitHub
snazy commented on code in PR #11281: URL: https://github.com/apache/iceberg/pull/11281#discussion_r1796573772 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3141,32 @@ components: uuid: type: string +Credential: + type: object + requi

[PR] Bump getdaft from 0.3.2 to 0.3.8 [iceberg-python]

2024-10-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1228: URL: https://github.com/apache/iceberg-python/pull/1228 Bumps [getdaft](https://github.com/Eventual-Inc/Daft) from 0.3.2 to 0.3.8. Release notes Sourced from [getdaft's releases](https://github.com/Eventual-Inc/Daft/releases).

Re: [PR] Bump getdaft from 0.3.2 to 0.3.6 [iceberg-python]

2024-10-11 Thread via GitHub
dependabot[bot] closed pull request #1225: Bump getdaft from 0.3.2 to 0.3.6 URL: https://github.com/apache/iceberg-python/pull/1225

Re: [PR] Bump getdaft from 0.3.2 to 0.3.6 [iceberg-python]

2024-10-11 Thread via GitHub
dependabot[bot] commented on PR #1225: URL: https://github.com/apache/iceberg-python/pull/1225#issuecomment-2408219432 Superseded by #1228.

Re: [PR] Handling NO Coordinator Scenario and Data Loss in the current Design [iceberg]

2024-10-11 Thread via GitHub
kumarpritam863 commented on PR #11298: URL: https://github.com/apache/iceberg/pull/11298#issuecomment-2408288420 @bryanck Sir can we please review this one.

Re: [PR] Api, Spark: Make StrictMetricsEvaluator not fail on nested column predicates [iceberg]

2024-10-11 Thread via GitHub
amogh-jahagirdar commented on code in PR #11261: URL: https://github.com/apache/iceberg/pull/11261#discussion_r1797529748 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestDelete.java: ## @@ -1401,6 +1401,22 @@ public void testDeleteToCustomWa

[I] [Spark] Identity partition on required column generates nullable partition tuple in manifest file [iceberg]

2024-10-11 Thread via GitHub
mosenberg opened a new issue, #11300: URL: https://github.com/apache/iceberg/issues/11300 ### Apache Iceberg version None ### Query engine Spark ### Please describe the bug 🐞 The issue repros using the following SQL: ```sql CREATE TABLE iceberg.Nullabi

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
singhpk234 commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797074028 ## core/src/main/java/org/apache/iceberg/rest/RESTFileScanTaskParser.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
singhpk234 commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797072760 ## core/src/main/java/org/apache/iceberg/rest/RESTFileScanTaskParser.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Core: Make namespace separator configurable [iceberg]

2024-10-11 Thread via GitHub
cwsteinbach commented on code in PR #10877: URL: https://github.com/apache/iceberg/pull/10877#discussion_r1797223415 ## open-api/rest-catalog-open-api.yaml: ## @@ -1747,7 +1749,9 @@ components: required: true description: A namespace identifier as a single

Re: [PR] Core: Make namespace separator configurable [iceberg]

2024-10-11 Thread via GitHub
cwsteinbach commented on code in PR #10877: URL: https://github.com/apache/iceberg/pull/10877#discussion_r1797223185 ## open-api/rest-catalog-open-api.yaml: ## @@ -261,7 +261,9 @@ paths: description: An optional namespace, underneath which to list namespa

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rdblue commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797228734 ## .palantir/revapi.yml: ## @@ -1058,6 +1058,11 @@ acceptedBreaks: new: "method void org.apache.iceberg.encryption.PlaintextEncryptionManager::<init>()" justi

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rdblue commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797229861 ## api/src/main/java/org/apache/iceberg/expressions/ResidualEvaluator.java: ## @@ -89,6 +89,12 @@ public static ResidualEvaluator of(PartitionSpec spec, Expression ex

[PR] Flink: Add RowConverter for Iceberg Source [iceberg]

2024-10-11 Thread via GitHub
abharath9 opened a new pull request, #11301: URL: https://github.com/apache/iceberg/pull/11301 Currently we can't create views on top of IcebergSource DataStreams directly. We need to convert the RowData to Row explicitly using a map function. I thought creating a RowConverter to convert RowD

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-10-11 Thread via GitHub
aokolnychyi commented on PR #10935: URL: https://github.com/apache/iceberg/pull/10935#issuecomment-2407845436 Will start looking into this PR today and should be able to finish over the weekend.

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rahil-c commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797310997 ## core/src/main/java/org/apache/iceberg/rest/requests/PlanTableScanRequest.java: ## @@ -0,0 +1,162 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rahil-c commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797309737 ## core/src/main/java/org/apache/iceberg/rest/responses/FetchScanTasksResponse.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rahil-c commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797310350 ## core/src/main/java/org/apache/iceberg/rest/responses/PlanTableScanResponse.java: ## @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rahil-c commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797312608 ## core/src/main/java/org/apache/iceberg/rest/RESTFileScanTaskParser.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rahil-c commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797313224 ## core/src/main/java/org/apache/iceberg/rest/requests/FetchScanTasksRequest.java: ## @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rahil-c commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797315352 ## core/src/main/java/org/apache/iceberg/RESTPlanningMode.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more con

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rahil-c commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797317407 ## core/src/main/java/org/apache/iceberg/ScanTasksIterable.java: ## @@ -0,0 +1,129 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more c

Re: [PR] Core: Map methods should return immutable collections [iceberg]

2024-10-11 Thread via GitHub
anuragmantri commented on PR #11304: URL: https://github.com/apache/iceberg/pull/11304#issuecomment-2407956628 `TestStructLikeMap#testKeyAndEntrySetEquality()` is failing. Investigating if `equals()` method needs some changes.

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-11 Thread via GitHub
rahil-c commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1797318171 ## core/src/main/java/org/apache/iceberg/rest/requests/PlanTableScanRequest.java: ## @@ -0,0 +1,162 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1797318587 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +71,47 @@ protected static Object[][] parameters() { }

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-11 Thread via GitHub
mrcnc commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1797360429 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -53,19 +63,17 @@ class ADLSLocation { ValidationException.check(matcher.matches()

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-11 Thread via GitHub
mrcnc commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1797363531 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -53,19 +63,17 @@ class ADLSLocation { ValidationException.check(matcher.matches()

Re: [PR] Spark 3.5: Update Spark to use planned Avro reads [iceberg]

2024-10-11 Thread via GitHub
rdblue commented on PR #11299: URL: https://github.com/apache/iceberg/pull/11299#issuecomment-2408058064 Here are the benchmark results: ``` ## main Benchmark Mode Cnt Score Error Units IcebergSourceFlatAvroDataRead

Re: [PR] Flink: Tests alignment for the Flink Sink v2-based implemenation (IcebergSink) [iceberg]

2024-10-11 Thread via GitHub
arkadius commented on PR #11219: URL: https://github.com/apache/iceberg/pull/11219#issuecomment-2408058098 > > > Hi @arkadius I have started working in backporting the RANGE distribution to the IcebergSink. The unit tests in my code will benefit from the new marker interface you are introdu

Re: [I] Rest Catalog and Writing data to Minio Raises `OSError: When initiating multiple part upload` [iceberg-python]

2024-10-11 Thread via GitHub
allilou commented on issue #974: URL: https://github.com/apache/iceberg-python/issues/974#issuecomment-2408060903 > As a workaround, you can define the S3 configurations (endpoint, secret access key, and access key ID) in the load_rest method. > > ```python > catalog = load_rest(
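
The workaround quoted above (passing the S3 endpoint and credentials along with the REST catalog settings) might look roughly like the sketch below. The property keys `uri`, `s3.endpoint`, `s3.access-key-id`, and `s3.secret-access-key` are the ones PyIceberg documents for its FileIO configuration; the helper name and every value shown are hypothetical placeholders, and the `load_rest` call is left commented out because it needs a running catalog.

```python
def rest_catalog_properties(uri: str, s3_endpoint: str,
                            access_key_id: str,
                            secret_access_key: str) -> dict:
    """Build the properties dict handed to the REST catalog loader.

    Hypothetical helper; only the dictionary keys come from PyIceberg.
    """
    return {
        "uri": uri,
        "s3.endpoint": s3_endpoint,
        "s3.access-key-id": access_key_id,
        "s3.secret-access-key": secret_access_key,
    }


# Placeholder values for a local MinIO setup (assumptions, not from the thread).
props = rest_catalog_properties(
    uri="http://localhost:8181",
    s3_endpoint="http://localhost:9000",
    access_key_id="admin",
    secret_access_key="password",
)

# With PyIceberg installed and a catalog running, the dict would be used as:
# from pyiceberg.catalog import load_rest
# catalog = load_rest("default", props)
print(sorted(props))
```

Setting the S3 properties on the client side means the writer does not depend on the catalog server returning storage configuration, which appears to be the situation the workaround addresses.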

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-11 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1797420599 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-11 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1797421408 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-11 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1797423412 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1797427317 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -105,6 +166,10 @@ public void before() { spark.conf().set("spar

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1797263955 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +71,55 @@ protected static Object[][] parameters() { }

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-11 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1797391875 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [I] Implement Remaining Catalog operations for REST catalog [iceberg-go]

2024-10-11 Thread via GitHub
jhump commented on issue #63: URL: https://github.com/apache/iceberg-go/issues/63#issuecomment-2408081566 @zeroshade, I've got a fork where I've implemented DropTable and added CreateTable and UpdateTable, which have the most complicated API. Did you already have ideas on what the API might

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-11 Thread via GitHub
rodmeneses commented on code in PR #11305: URL: https://github.com/apache/iceberg/pull/11305#discussion_r1797396711 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSinkBuilder.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [I] Implement Remaining Catalog operations for REST catalog [iceberg-go]

2024-10-11 Thread via GitHub
zeroshade commented on issue #63: URL: https://github.com/apache/iceberg-go/issues/63#issuecomment-2408083564 Go ahead and put the PR up and we'll discuss and iterate! Thanks!

Re: [PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-11 Thread via GitHub
rodmeneses commented on code in PR #11305: URL: https://github.com/apache/iceberg/pull/11305#discussion_r1797397663 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSinkBuilder.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-11 Thread via GitHub
mrcnc commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1797403156 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -18,24 +18,34 @@ */ package org.apache.iceberg.azure.adlsv2; +import java.net.URI;

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-11 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1797354699 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [I] Nessie Iceberg REST catalog and writing to localstack raises `OSError: When initiating multiple part upload` [iceberg-python]

2024-10-11 Thread via GitHub
allilou commented on issue #1087: URL: https://github.com/apache/iceberg-python/issues/1087#issuecomment-2408049850 > I updated my docker-compose.yaml to use extra_hosts and it worked. Closing this issue. I'm facing the same error. Can you please give a snippet of how you added the extra

[PR] Flink: Add IcebergSinkBuilder interface allowed unification of most of operations on FlinkSink and IcebergSink Builders [iceberg]

2024-10-11 Thread via GitHub
arkadius opened a new pull request, #11305: URL: https://github.com/apache/iceberg/pull/11305 This PR extracts the `IcebergSinkBuilder` interface from #11219. This interface will be used to avoid code duplication in tests and to keep the interface of both implementations of si

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-11 Thread via GitHub
mrcnc commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1797405095 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -53,19 +63,17 @@ class ADLSLocation { ValidationException.check(matcher.matches()

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-11 Thread via GitHub
mrcnc commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1797404275 ## azure/src/test/java/org/apache/iceberg/azure/adlsv2/ADLSLocationTest.java: ## @@ -38,11 +38,26 @@ public void testLocationParsing(String scheme) { assertThat(lo

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-11 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1797405966 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-11 Thread via GitHub
emkornfield commented on PR #11238: URL: https://github.com/apache/iceberg/pull/11238#issuecomment-2408091808 > I’m not sure it’s worth drawing a line in the sand over this particular issue and I’d like to talk about it a bit more as a community before we merge this. I don’t want to set a p

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-11 Thread via GitHub
mrcnc commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1797409933 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -53,19 +63,17 @@ class ADLSLocation { ValidationException.check(matcher.matches()

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1797438018 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/sql/TestAlterTable.java: ## @@ -275,6 +278,11 @@ public void testAlterColumnPositionFirst() { @Te

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1797438855 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +70,47 @@ protected static Object[][] parameters() { }

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1797439446 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java: ## @@ -18,8 +18,11 @@ */ package org.apache.iceberg.

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1797439934 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +71,55 @@ protected static Object[][] parameters() { }

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1797439776 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java: ## @@ -740,6 +743,10 @@ private boolean partitionMatch(

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1797439642 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/sql/TestAlterTable.java: ## @@ -275,6 +278,9 @@ public void testAlterColumnPositionFirst() { @Tes

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-11 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1797440072 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +71,55 @@ protected static Object[][] parameters() { }
