Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-10 Thread via GitHub
nastra commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1796507653 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java: ## @@ -18,8 +18,11 @@ */ package org.apache.iceberg.spark.

Re: [PR] RecordBatchTransformer: Handle schema migration and column re-ordering in table scans [iceberg-rust]

2024-10-10 Thread via GitHub
sdd commented on PR #602: URL: https://github.com/apache/iceberg-rust/pull/602#issuecomment-2406669061 @Xuanwo PTAL - you approved an earlier version but there are some small additional changes since then. I added: * a performance improvement for a particular scenario; * a chan

Re: [PR] feat: Derive PartialEq for FileScanTask [iceberg-rust]

2024-10-10 Thread via GitHub
sdd commented on PR #660: URL: https://github.com/apache/iceberg-rust/pull/660#issuecomment-2406660124 Looks like you just need a rebase on this to get your Arrow 53.1 fix in :-) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-10-10 Thread via GitHub
nastra commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1796490776 ## open-api/rest-catalog-open-api.yaml: ## @@ -3142,6 +3163,10 @@ components: type: object additionalProperties: type: string +

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
stevenzwu commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796422065 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct valu

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796441921 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796436581 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Api, Spark: Make StrictMetricsEvaluator not fail on nested column predicates [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11261: URL: https://github.com/apache/iceberg/pull/11261#discussion_r1796430454 ## api/src/main/java/org/apache/iceberg/expressions/StrictMetricsEvaluator.java: ## @@ -60,7 +57,6 @@ public StrictMetricsEvaluator(Schema schema, Expression

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796431209 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796430758 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796429496 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-leve

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796427704 ## format/spec.md: ## @@ -619,19 +627,25 @@ Data files that match the query filter must be read by the scan. Note that for any snapshot, all file paths marked w

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796427242 ## format/spec.md: ## @@ -619,19 +627,25 @@ Data files that match the query filter must be read by the scan. Note that for any snapshot, all file paths marked w

Re: [I] chore(deps): Bump crate-ci/typos from 1.24.6 to 1.25.0 [iceberg-rust]

2024-10-10 Thread via GitHub
liurenjie1024 closed issue #659: chore(deps): Bump crate-ci/typos from 1.24.6 to 1.25.0 URL: https://github.com/apache/iceberg-rust/issues/659 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796426122 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796424825 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796424254 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796423298 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796422257 ## format/spec.md: ## @@ -51,6 +51,7 @@ Version 3 of the Iceberg spec extends data types and existing metadata structure * New data types: nanosecond timestamp(

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on PR #11238: URL: https://github.com/apache/iceberg/pull/11238#issuecomment-2406552680 I am going to share a PR with some basic implementation that follows this spec. We can use it as an example that will hopefully clarify some questions. Thanks for putting this toget

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796421177 ## core/src/main/java/org/apache/iceberg/rest/RESTFileScanTaskParser.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796421105 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796415699 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796419760 ## format/puffin-spec.md: ## @@ -123,6 +123,44 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796420278 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796419515 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796419309 ## .palantir/revapi.yml: ## @@ -1058,6 +1058,11 @@ acceptedBreaks: new: "method void org.apache.iceberg.encryption.PlaintextEncryptionManager::()"

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796419039 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796418361 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796418244 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] open-api: Build runtime jar for test fixture [iceberg]

2024-10-10 Thread via GitHub
ajantha-bhat commented on code in PR #11279: URL: https://github.com/apache/iceberg/pull/11279#discussion_r1796417822 ## build.gradle: ## @@ -1006,6 +1009,37 @@ project(':iceberg-open-api') { recommend.set(true) } check.dependsOn('validateRESTCatalogSpec') + + // Cre

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796416800 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796416775 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796415699 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796412573 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796415097 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-leve

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796409299 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796407251 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796405024 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796385502 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [I] Before expiring snapshots is there need to provide history snapshot file statistics [iceberg]

2024-10-10 Thread via GitHub
MichaelDeSteven commented on issue #11213: URL: https://github.com/apache/iceberg/issues/11213#issuecomment-2406495376 > It might be valuable to add a `dry_run` option to `expire_snapshots` like `remove_orphan_files`. good idea! When `dry_run` is true, don't actually remove files but r

Re: [PR] chore(deps): bump typos crate to 1.25.0 [iceberg-rust]

2024-10-10 Thread via GitHub
Xuanwo merged PR #662: URL: https://github.com/apache/iceberg-rust/pull/662 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796287385 ## core/src/main/java/org/apache/iceberg/rest/responses/PlanTableScanResponse.java: ## @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [I] Iceberg streaming using checkpoint does not ignore the stream-from-timestamp option [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] commented on issue #8921: URL: https://github.com/apache/iceberg/issues/8921#issuecomment-2406263894 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] Iceberg streaming using checkpoint does not ignore the stream-from-timestamp option [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] closed issue #8921: Iceberg streaming using checkpoint does not ignore the stream-from-timestamp option URL: https://github.com/apache/iceberg/issues/8921 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and us

Re: [I] Missing serialVersionUID in Serializable implementation [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] closed issue #8929: Missing serialVersionUID in Serializable implementation URL: https://github.com/apache/iceberg/issues/8929 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [I] DELETE fails with "java.lang.IllegalArgumentException: info must be ExtendedLogicalWriteInfo" [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] commented on issue #8926: URL: https://github.com/apache/iceberg/issues/8926#issuecomment-2406263920 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] Spark write abort result in table miss metadata location file [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] commented on issue #8927: URL: https://github.com/apache/iceberg/issues/8927#issuecomment-2406263940 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] Spark write abort result in table miss metadata location file [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] closed issue #8927: Spark write abort result in table miss metadata location file URL: https://github.com/apache/iceberg/issues/8927 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [I] Missing serialVersionUID in Serializable implementation [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] commented on issue #8929: URL: https://github.com/apache/iceberg/issues/8929#issuecomment-2406263958 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] DELETE fails with "java.lang.IllegalArgumentException: info must be ExtendedLogicalWriteInfo" [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] closed issue #8926: DELETE fails with "java.lang.IllegalArgumentException: info must be ExtendedLogicalWriteInfo" URL: https://github.com/apache/iceberg/issues/8926 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

Re: [I] Vulnerabilities found on latest version - jackson, avro, openssl [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] commented on issue #8923: URL: https://github.com/apache/iceberg/issues/8923#issuecomment-2406263904 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] Vulnerabilities found on latest version - jackson, avro, openssl [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] closed issue #8923: Vulnerabilities found on latest version - jackson, avro, openssl URL: https://github.com/apache/iceberg/issues/8923 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796223977 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796222680 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796221840 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796220782 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796202036 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796216340 ## format/puffin-spec.md: ## @@ -123,6 +123,44 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796200502 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796210980 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796204011 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-10 Thread via GitHub
stevenzwu commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1796168929 ## format/spec.md: ## @@ -298,16 +298,101 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is:

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796159961 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796158658 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
RussellSpitzer commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796153032 ## format/puffin-spec.md: ## @@ -123,6 +123,49 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796139073 ## format/spec.md: ## @@ -604,7 +612,7 @@ Scans are planned by reading the manifest files for the current snapshot. Delete Manifests that contain no matching files

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796135851 ## format/spec.md: ## @@ -619,19 +627,25 @@ Data files that match the query filter must be read by the scan. Note that for any snapshot, all file paths marked with "

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796127653 ## core/src/main/java/org/apache/iceberg/GenericDataFile.java: ## @@ -26,7 +26,7 @@ import org.apache.iceberg.relocated.com.google.common.collect.ImmutableM

Re: [PR] AWS: Introduce opt-in S3LocationProvider which is optimized for S3 performance [iceberg]

2024-10-10 Thread via GitHub
ookumuso commented on PR #2: URL: https://github.com/apache/iceberg/pull/2#issuecomment-2406086135 Sounds good @danielcweeks, I will work on this change and update the PR! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796124558 ## format/puffin-spec.md: ## @@ -123,6 +123,49 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-10 Thread via GitHub
ashvina commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1796121693 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -53,19 +63,17 @@ class ADLSLocation { ValidationException.check(matcher.matches

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796120834 ## core/src/main/java/org/apache/iceberg/rest/requests/PlanTableScanRequest.java: ## @@ -0,0 +1,162 @@ +/* + * Licensed to the Apache Software Foundation (AS

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796117618 ## core/src/main/java/org/apache/iceberg/rest/responses/FetchPlanningResultResponse.java: ## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796116464 ## core/src/main/java/org/apache/iceberg/rest/responses/PlanTableScanResponse.java: ## @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796094857 ## core/src/main/java/org/apache/iceberg/rest/responses/FetchScanTasksResponse.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796109183 ## format/puffin-spec.md: ## @@ -123,6 +123,44 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
rdblue commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796106043 ## format/puffin-spec.md: ## @@ -123,6 +123,44 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct values,

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796092928 ## core/src/main/java/org/apache/iceberg/rest/responses/FetchScanTasksResponse.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-10-10 Thread via GitHub
jackye1995 commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1796090787 ## open-api/rest-catalog-open-api.yaml: ## @@ -3142,6 +3163,10 @@ components: type: object additionalProperties: type: string +

Re: [PR] Docs: Add Bigquery Iceberg documentation, Update MRAP endpoint and add more docs [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11159: URL: https://github.com/apache/iceberg/pull/11159#discussion_r1796046533 ## docs/docs/gcp-bigquery.md: ## @@ -0,0 +1,257 @@ +--- Review Comment: @eder001 I think this might contain too much marketing, rather than specific instruct

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-10 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1796004376 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -91,10 +132,30 @@ public static void dropWarehouse() throws IOExceptio

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
RussellSpitzer commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r179544 ## format/puffin-spec.md: ## @@ -123,6 +123,49 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-10 Thread via GitHub
RussellSpitzer commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1795992469 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -53,19 +63,17 @@ class ADLSLocation { ValidationException.check(matcher.

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-10 Thread via GitHub
RussellSpitzer commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1795988544 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -53,19 +63,17 @@ class ADLSLocation { ValidationException.check(matcher.

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-10 Thread via GitHub
RussellSpitzer commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1795984898 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -18,24 +18,34 @@ */ package org.apache.iceberg.azure.adlsv2; +import java.

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-10 Thread via GitHub
RussellSpitzer commented on PR #11294: URL: https://github.com/apache/iceberg/pull/11294#issuecomment-2405881669 @ashivina Can you please review as well? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-10 Thread via GitHub
RussellSpitzer commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1795979196 ## azure/src/test/java/org/apache/iceberg/azure/adlsv2/ADLSLocationTest.java: ## @@ -38,11 +38,26 @@ public void testLocationParsing(String scheme) { asse
