Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796405024 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796407251 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796409299 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796412573 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796415097 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-leve

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796412573 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796415699 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796415699 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796416800 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796416775 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] open-api: Build runtime jar for test fixture [iceberg]

2024-10-10 Thread via GitHub
ajantha-bhat commented on code in PR #11279: URL: https://github.com/apache/iceberg/pull/11279#discussion_r1796417822 ## build.gradle: ## @@ -1006,6 +1009,37 @@ project(':iceberg-open-api') { recommend.set(true) } check.dependsOn('validateRESTCatalogSpec') + + // Cre

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796418244 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796418361 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796418361 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796418361 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796419039 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796419309 ## .palantir/revapi.yml: ## @@ -1058,6 +1058,11 @@ acceptedBreaks: new: "method void org.apache.iceberg.encryption.PlaintextEncryptionManager::()"

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796419515 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796420278 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796419760 ## format/puffin-spec.md: ## @@ -123,6 +123,44 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796421177 ## core/src/main/java/org/apache/iceberg/rest/RESTFileScanTaskParser.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796421105 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796415699 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796427242 ## format/spec.md: ## @@ -619,19 +627,25 @@ Data files that match the query filter must be read by the scan. Note that for any snapshot, all file paths marked w

Re: [I] chore(deps): Bump crate-ci/typos from 1.24.6 to 1.25.0 [iceberg-rust]

2024-10-10 Thread via GitHub
liurenjie1024 closed issue #659: chore(deps): Bump crate-ci/typos from 1.24.6 to 1.25.0 URL: https://github.com/apache/iceberg-rust/issues/659 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796429496 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-leve

Re: [PR] Api, Spark: Make StrictMetricsEvaluator not fail on nested column predicates [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11261: URL: https://github.com/apache/iceberg/pull/11261#discussion_r1796430454 ## api/src/main/java/org/apache/iceberg/expressions/StrictMetricsEvaluator.java: ## @@ -60,7 +57,6 @@ public StrictMetricsEvaluator(Schema schema, Expression

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796427704 ## format/spec.md: ## @@ -619,19 +627,25 @@ Data files that match the query filter must be read by the scan. Note that for any snapshot, all file paths marked w

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796430758 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796431209 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796436581 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796441921 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
stevenzwu commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796422065 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct valu

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796135851 ## format/spec.md: ## @@ -619,19 +627,25 @@ Data files that match the query filter must be read by the scan. Note that for any snapshot, all file paths marked with "

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796139073 ## format/spec.md: ## @@ -604,7 +612,7 @@ Scans are planned by reading the manifest files for the current snapshot. Delete Manifests that contain no matching files

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-10 Thread via GitHub
stevenzwu commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1796168929 ## format/spec.md: ## @@ -298,16 +298,101 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is:

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796222680 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796222680 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796223977 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-10-10 Thread via GitHub
nastra commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1796490776 ## open-api/rest-catalog-open-api.yaml: ## @@ -3142,6 +3163,10 @@ components: type: object additionalProperties: type: string +

Re: [PR] feat: Derive PartialEq for FileScanTask [iceberg-rust]

2024-10-10 Thread via GitHub
sdd commented on PR #660: URL: https://github.com/apache/iceberg-rust/pull/660#issuecomment-2406660124 Looks like you just need a rebase on this to get your Arrow 53.1 fix in :-)

Re: [PR] RecordBatchTransformer: Handle schema migration and column re-ordering in table scans [iceberg-rust]

2024-10-10 Thread via GitHub
sdd commented on PR #602: URL: https://github.com/apache/iceberg-rust/pull/602#issuecomment-2406669061 @Xuanwo PTAL - you approved an earlier version but there are some small additional changes since then. I added: * a performance improvement for a particular scenario; * a chan

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-10 Thread via GitHub
nastra commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1796507653 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java: ## @@ -18,8 +18,11 @@ */ package org.apache.iceberg.spark.

Re: [I] Before expiring snapshots is there need to provide history snapshot file statistics [iceberg]

2024-10-10 Thread via GitHub
MichaelDeSteven commented on issue #11213: URL: https://github.com/apache/iceberg/issues/11213#issuecomment-2406495376 > It might be valuable to add a `dry_run` option to `expire_snapshots` like `remove_orphan_files`. good idea! When `dry_run` is true, don't actually remove files but r

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796385502 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
RussellSpitzer commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796153032 ## format/puffin-spec.md: ## @@ -123,6 +123,49 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct

Re: [PR] chore(deps): bump typos crate to 1.25.0 [iceberg-rust]

2024-10-10 Thread via GitHub
Xuanwo merged PR #662: URL: https://github.com/apache/iceberg-rust/pull/662

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796159961 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796204011 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796200502 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796216340 ## format/puffin-spec.md: ## @@ -123,6 +123,44 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796202036 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796220782 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796221840 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [I] Iceberg streaming using checkpoint does not ignore the stream-from-timestamp option [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] closed issue #8921: Iceberg streaming using checkpoint does not ignore the stream-from-timestamp option URL: https://github.com/apache/iceberg/issues/8921

Re: [I] Iceberg streaming using checkpoint does not ignore the stream-from-timestamp option [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] commented on issue #8921: URL: https://github.com/apache/iceberg/issues/8921#issuecomment-2406263894 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] DELETE fails with "java.lang.IllegalArgumentException: info must be ExtendedLogicalWriteInfo" [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] commented on issue #8926: URL: https://github.com/apache/iceberg/issues/8926#issuecomment-2406263920 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Spark write abort result in table miss metadata location file [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] commented on issue #8927: URL: https://github.com/apache/iceberg/issues/8927#issuecomment-2406263940 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Spark write abort result in table miss metadata location file [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] closed issue #8927: Spark write abort result in table miss metadata location file URL: https://github.com/apache/iceberg/issues/8927

Re: [I] Missing serialVersionUID in Serializable implementation [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] commented on issue #8929: URL: https://github.com/apache/iceberg/issues/8929#issuecomment-2406263958 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Missing serialVersionUID in Serializable implementation [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] closed issue #8929: Missing serialVersionUID in Serializable implementation URL: https://github.com/apache/iceberg/issues/8929

Re: [I] Vulnerabilities found on latest version - jackson, avro, openssl [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] closed issue #8923: Vulnerabilities found on latest version - jackson, avro, openssl URL: https://github.com/apache/iceberg/issues/8923

Re: [I] DELETE fails with "java.lang.IllegalArgumentException: info must be ExtendedLogicalWriteInfo" [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] closed issue #8926: DELETE fails with "java.lang.IllegalArgumentException: info must be ExtendedLogicalWriteInfo" URL: https://github.com/apache/iceberg/issues/8926

Re: [I] Vulnerabilities found on latest version - jackson, avro, openssl [iceberg]

2024-10-10 Thread via GitHub
github-actions[bot] commented on issue #8923: URL: https://github.com/apache/iceberg/issues/8923#issuecomment-2406263904 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Feature: S3 Remote Signing [iceberg-rust]

2024-10-10 Thread via GitHub
Xuanwo commented on issue #506: URL: https://github.com/apache/iceberg-rust/issues/506#issuecomment-2404943472 Hi, @flaneur2020. Thank you for your work, and sorry for the delay with reqsign and opendal. I hope we can integrate them in a more organized way, which will lead to a significant

Re: [PR] Spark: Add RewriteTablePath action interface [iceberg]

2024-10-10 Thread via GitHub
laithalzyoud commented on PR #10920: URL: https://github.com/apache/iceberg/pull/10920#issuecomment-2404869540 @szehon-ho PTAL

[I] Table Not Found While reading IcebergTable from Spark SQL [iceberg]

2024-10-10 Thread via GitHub
AwasthiSomesh opened a new issue, #11297: URL: https://github.com/apache/iceberg/issues/11297 ### Apache Iceberg version 1.6.1 (latest release) ### Query engine Spark ### Please describe the bug 🐞 Hi Team, I have done setup for hive4 docker images but

Re: [I] Support commit retries [iceberg-python]

2024-10-10 Thread via GitHub
mark-major commented on issue #269: URL: https://github.com/apache/iceberg-python/issues/269#issuecomment-2405076085 @maxlucuta Yes, that's what I have been using. It would be nice if there would be an internal retry for the commit so the client application doesn't have to be polluted with

Re: [PR] Spec: Adds Row Lineage [iceberg]

2024-10-10 Thread via GitHub
nastra commented on code in PR #11130: URL: https://github.com/apache/iceberg/pull/11130#discussion_r1795454308 ## format/spec.md: ## @@ -298,16 +298,101 @@ Iceberg tables must not use field ids greater than 2147483447 (`Integer.MAX_VALU The set of metadata columns is: -|

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796158658 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796202036 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796200502 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
emkornfield commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796210980 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796287385 ## core/src/main/java/org/apache/iceberg/rest/responses/PlanTableScanResponse.java: ## @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1796421177 ## core/src/main/java/org/apache/iceberg/rest/RESTFileScanTaskParser.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796422257 ## format/spec.md: ## @@ -51,6 +51,7 @@ Version 3 of the Iceberg spec extends data types and existing metadata structure * New data types: nanosecond timestamp(

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on PR #11238: URL: https://github.com/apache/iceberg/pull/11238#issuecomment-2406552680 I am going to share a PR with some basic implementation that follows this spec. We can use it as an example that will hopefully clarify some questions. Thanks for putting this toget

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796423298 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796423298 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1796424254 ## format/puffin-spec.md: ## @@ -123,6 +123,54 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796424825 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-10 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1796426122 ## format/spec.md: ## @@ -454,35 +457,40 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo `data_file` is a struct with the

Re: [I] Feature: S3 Remote Signing [iceberg-rust]

2024-10-10 Thread via GitHub
flaneur2020 commented on issue #506: URL: https://github.com/apache/iceberg-rust/issues/506#issuecomment-2404718870 the pr which allows passing a `Sign` trait has already been merged into reqsign. however, there still needs some work to let it integrated into opendal. it may need some

Re: [PR] Flink: Tests alignment for the Flink Sink v2-based implemenation (IcebergSink) [iceberg]

2024-10-10 Thread via GitHub
arkadius commented on PR #11219: URL: https://github.com/apache/iceberg/pull/11219#issuecomment-2404641938 > Hi @arkadius I have started working in backporting the RANGE distribution to the IcebergSink. The unit tests in my code will benefit from the new marker interface you are introducing

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-10-10 Thread via GitHub
nastra commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1795062381 ## open-api/rest-catalog-open-api.yaml: ## @@ -3103,6 +3103,81 @@ components: uuid: type: string +ADLSCredentials: + type: object +

Re: [PR] Spec: Add expiry time config to REST table load [iceberg]

2024-10-10 Thread via GitHub
nastra commented on code in PR #10873: URL: https://github.com/apache/iceberg/pull/10873#discussion_r1795610952 ## open-api/rest-catalog-open-api.py: ## @@ -1103,6 +1103,7 @@ class LoadTableResult(BaseModel): ## General Configurations - `token`: Authorization bearer

Re: [PR] Spec: add variant type [iceberg]

2024-10-10 Thread via GitHub
sfc-gh-aixu commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1795791601 ## format/spec.md: ## @@ -178,6 +178,8 @@ A **`list`** is a collection of values with some element type. The element field A **`map`** is a collection of key-

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-10 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1795891887 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +71,47 @@ protected static Object[][] parameters() { }

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-10 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1795890640 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java: ## @@ -740,6 +743,11 @@ private boolean partitionMatch(

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-10 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1795881502 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java: ## @@ -521,7 +524,7 @@ public void testFilesTableTimeTr

[PR] Handling NO Coordinator Scenario and Data Loss in the current Design [iceberg]

2024-10-10 Thread via GitHub
kumarpritam863 opened a new pull request, #11298: URL: https://github.com/apache/iceberg/pull/11298 This PR combines the issues I have already handled in these two PRs: 1. **No Coordinator Scenario in ICR Mode -> https://github.com/apache/iceberg/pull/11288** 2. **Data Loss in ICR

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-10 Thread via GitHub
haizhou-zhao commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1795896887 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +71,47 @@ protected static Object[][] parameters() { }

Re: [PR] Handling NO Coordinator Scenario and Data Loss in the current Design [iceberg]

2024-10-10 Thread via GitHub
kumarpritam863 commented on PR #11298: URL: https://github.com/apache/iceberg/pull/11298#issuecomment-2405762502 @bryanck can you please review this? This PR handles most of the ICR issues within the current design itself, with slight modifications. We can discuss more on this [PR](https://gi

Re: [PR] API, Core: Add scan planning api models and parsers [iceberg]

2024-10-10 Thread via GitHub
singhpk234 commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1795908515 ## core/src/main/java/org/apache/iceberg/rest/RESTContentFileParser.java: ## @@ -0,0 +1,250 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] Table Scan Delete File Handling: Positional Delete Support [iceberg-rust]

2024-10-10 Thread via GitHub
sdd commented on PR #652: URL: https://github.com/apache/iceberg-rust/pull/652#issuecomment-2405778274 @Xuanwo, @liurenjie1024: This is now ready for review, PTAL when you get a chance. Looking forward to your feedback 😁

Re: [PR] Arrow: Fix indexing in Parquet dictionary encoded values readers [iceberg]

2024-10-10 Thread via GitHub
wypoon commented on code in PR #11247: URL: https://github.com/apache/iceberg/pull/11247#discussion_r1795958912 ## spark/v3.5/spark/src/test/resources/decimal_dict_and_plain_encoding.parquet: ## Review Comment: I also tried ``` try (FileAppender writer =

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-10 Thread via GitHub
RussellSpitzer commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1795970799 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -18,24 +18,34 @@ */ package org.apache.iceberg.azure.adlsv2; +import java.

Re: [PR] Support wasb[s] paths in ADLSFileIO [iceberg]

2024-10-10 Thread via GitHub
RussellSpitzer commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1795977105 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ## @@ -53,19 +63,17 @@ class ADLSLocation { ValidationException.check(matcher.

Re: [I] Spark, Flink: Test failures after updating JUnit 5.10 to 5.11 [iceberg]

2024-10-10 Thread via GitHub
tomtongue commented on issue #11296: URL: https://github.com/apache/iceberg/issues/11296#issuecomment-2404426076 `iceberg-core` and `iceberg-mr` also have this issue; however, I created the PR (https://github.com/apache/iceberg/pull/11262) to fix this. For Spark and Flink, there are a lot o
