Re: [PR] AWS: Refresh vended credentials [iceberg]

2024-10-24 Thread via GitHub
singhpk234 commented on code in PR #11389: URL: https://github.com/apache/iceberg/pull/11389#discussion_r1815553135 ## aws/src/main/java/org/apache/iceberg/aws/s3/VendedCredentialsProvider.java: ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Flink 1.20: Update Flink to use planned Avro reads [iceberg]

2024-10-24 Thread via GitHub
jbonofre commented on code in PR #11386: URL: https://github.com/apache/iceberg/pull/11386#discussion_r1816086840 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/data/FlinkPlannedAvroReader.java: ## @@ -0,0 +1,191 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1816032805 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-leve

Re: [PR] Flink 1.20: Update Flink to use planned Avro reads [iceberg]

2024-10-24 Thread via GitHub
pvary commented on code in PR #11386: URL: https://github.com/apache/iceberg/pull/11386#discussion_r1816085856 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/data/FlinkPlannedAvroReader.java: ## @@ -0,0 +1,191 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-10-24 Thread via GitHub
pvary commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1816081581 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/MaintenanceTaskBuilder.java: ## @@ -0,0 +1,227 @@ +/* + * Licensed to the Apache Software Fo

Re: [PR] GCS: Refresh vended credentials [iceberg]

2024-10-24 Thread via GitHub
nastra commented on code in PR #11282: URL: https://github.com/apache/iceberg/pull/11282#discussion_r1816081112 ## gcp/src/main/java/org/apache/iceberg/gcp/gcs/OAuth2RefreshCredentialsHandler.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Exclude reading pos_ column if it's not in the scan list [iceberg]

2024-10-24 Thread via GitHub
huaxingao commented on PR #11390: URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2436567828 also cc @flyrain -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1816020709 ## format/spec.md: ## @@ -619,19 +627,25 @@ Data files that match the query filter must be read by the scan. Note that for any snapshot, all file paths marked w

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
aihuaxu commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1816076539 ## format/spec.md: ## @@ -1025,28 +1033,29 @@ Values should be stored in Parquet using the types and logical type annotations Lists must use the [3-level represe

Re: [I] Parquet bloom filter doesn't work with nested fields [iceberg]

2024-10-24 Thread via GitHub
mdub commented on issue #9898: URL: https://github.com/apache/iceberg/issues/9898#issuecomment-2436981886 I've also been experimenting with Bloom filters, and managed to get it working fairly easily with a nested field: ``` ALTER TABLE glue_catalog.kafka_archive.test_topic SET T

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1815755187 ## format/spec.md: ## @@ -1025,28 +1033,29 @@ Values should be stored in Parquet using the types and logical type annotations Lists must use the [3-level represen

Re: [PR] AWS: Use testcontainers-minio instead of S3Mock [iceberg]

2024-10-24 Thread via GitHub
nastra commented on PR #11349: URL: https://github.com/apache/iceberg/pull/11349#issuecomment-2435503742 thanks @sullis, this LGTM

Re: [PR] AWS: Refresh vended credentials [iceberg]

2024-10-24 Thread via GitHub
danielcweeks commented on code in PR #11389: URL: https://github.com/apache/iceberg/pull/11389#discussion_r1815439191 ## aws/src/main/java/org/apache/iceberg/aws/s3/VendedCredentialsProvider.java: ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Spec: Fix table of content generation [iceberg]

2024-10-24 Thread via GitHub
ajantha-bhat commented on PR #11067: URL: https://github.com/apache/iceberg/pull/11067#issuecomment-2436946836 @danielcweeks: Thanks for the review. Conflict was due to recent row-lineage merge. I have resolved it now. PR is ready.

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1816028652 ## format/spec.md: ## @@ -982,19 +998,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-leve

Re: [PR] Deprecate iceberg-pig [iceberg]

2024-10-24 Thread via GitHub
manuzhang commented on code in PR #11379: URL: https://github.com/apache/iceberg/pull/11379#discussion_r1814618649 ## pig/src/main/java/org/apache/iceberg/pig/IcebergPigInputFormat.java: ## @@ -68,6 +68,7 @@ public class IcebergPigInputFormat extends InputFormat { private Li

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-24 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1815725679 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -185,6 +200,13 @@ List filterManifests(Schema tableSchema, List manife ret

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1815699418 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-level del

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-24 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1815717333 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -323,11 +345,15 @@ private ManifestFile filterManifest(Schema tableSchema, Manife

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1816030939 ## format/spec.md: ## @@ -982,19 +998,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-leve

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1816028948 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-leve

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1816033846 ## open-api/rest-catalog-open-api.yaml: ## @@ -4202,6 +4203,14 @@ components: content: type: string enum: [ "position-deletes" ] +

Re: [PR] GCS: Refresh vended credentials [iceberg]

2024-10-24 Thread via GitHub
danielcweeks commented on code in PR #11282: URL: https://github.com/apache/iceberg/pull/11282#discussion_r1815557030 ## gcp/src/main/java/org/apache/iceberg/gcp/gcs/OAuth2RefreshCredentialsHandler.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (AS

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1816024185 ## open-api/rest-catalog-open-api.py: ## @@ -854,6 +854,16 @@ class ContentFile(BaseModel): class PositionDeleteFile(ContentFile): content: Literal['posit

Re: [PR] Parquet: Make row-group filters cooperate to filter [iceberg]

2024-10-24 Thread via GitHub
zhongyujiang commented on PR #10090: URL: https://github.com/apache/iceberg/pull/10090#issuecomment-2436873425 Hey @amogh-jahagirdar @Fokko @nastra @danielcweeks, can you help review this when you have time? Thanks

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
aihuaxu commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1816014740 ## format/spec.md: ## @@ -444,6 +449,9 @@ Sorting floating-point numbers should produce the following behavior: `-NaN` < ` A data or delete file is associated wit

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
aihuaxu commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1816004362 ## format/spec.md: ## @@ -444,6 +449,9 @@ Sorting floating-point numbers should produce the following behavior: `-NaN` < ` A data or delete file is associated wit

Re: [I] How does client use hadoopcatlog to read the iceberg table writen by hivecatalog? [iceberg]

2024-10-24 Thread via GitHub
manuzhang commented on issue #11375: URL: https://github.com/apache/iceberg/issues/11375#issuecomment-2436850492 Then it will read the metadata file with max version.

Re: [PR] fix: do not sort indices for `ProjectionMask::leaves` [iceberg-rust]

2024-10-24 Thread via GitHub
wcy-fdu commented on PR #682: URL: https://github.com/apache/iceberg-rust/pull/682#issuecomment-2436806123 cc @liurenjie1024 @Xuanwo for awareness.

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1815764535 ## format/spec.md: ## @@ -1297,54 +1308,56 @@ Example This serialization scheme is for storing single values as individual binary values in the lower and upper bou

Re: [PR] AWS: Refresh vended credentials [iceberg]

2024-10-24 Thread via GitHub
danielcweeks commented on code in PR #11389: URL: https://github.com/apache/iceberg/pull/11389#discussion_r1815549411 ## aws/src/main/java/org/apache/iceberg/aws/s3/VendedCredentialsProvider.java: ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Puffin: Add delete-vector-v1 blob type [iceberg]

2024-10-24 Thread via GitHub
aokolnychyi commented on code in PR #11238: URL: https://github.com/apache/iceberg/pull/11238#discussion_r1815638886 ## format/puffin-spec.md: ## @@ -123,6 +123,57 @@ The blob metadata for this blob may include following properties: - `ndv`: estimate of number of distinct va

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-24 Thread via GitHub
RussellSpitzer commented on code in PR #11346: URL: https://github.com/apache/iceberg/pull/11346#discussion_r1815721238 ## kafka-connect/kafka-connect/src/test/java/org/apache/iceberg/connect/data/RecordConverterTest.java: ## @@ -921,11 +948,26 @@ private void assertRecordValues

Re: [PR] Spark 3.5: Don't change table distribution when only altering local order [iceberg]

2024-10-24 Thread via GitHub
manuzhang commented on PR #10774: URL: https://github.com/apache/iceberg/pull/10774#issuecomment-2436665300 @RussellSpitzer Could you please take another look? It might be valuable to be included in 1.7 since it's a long-standing issue.

Re: [PR] feat: allow empty projection in table scan [iceberg-rust]

2024-10-24 Thread via GitHub
sdd commented on PR #677: URL: https://github.com/apache/iceberg-rust/pull/677#issuecomment-2436019488 @Xuanwo I've got a fix for that build issue here: https://github.com/apache/iceberg-rust/pull/680

Re: [PR] Flink: Add RowConverter for Iceberg Source [iceberg]

2024-10-24 Thread via GitHub
abharath9 commented on code in PR #11301: URL: https://github.com/apache/iceberg/pull/11301#discussion_r1814251209 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/source/reader/RowConverter.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation

Re: [I] OpenDAL rename of `is_exist` to `exists` has broken the build [iceberg-rust]

2024-10-24 Thread via GitHub
liurenjie1024 closed issue #679: OpenDAL rename of `is_exist` to `exists` has broken the build URL: https://github.com/apache/iceberg-rust/issues/679

Re: [PR] AWS: Refresh vended credentials [iceberg]

2024-10-24 Thread via GitHub
nastra commented on code in PR #11389: URL: https://github.com/apache/iceberg/pull/11389#discussion_r1815083844 ## aws/src/main/java/org/apache/iceberg/aws/AwsClientProperties.java: ## @@ -66,21 +67,32 @@ public class AwsClientProperties implements Serializable { */ publi

Re: [PR] fix: OpenDAL `is_exist` => `exists` [iceberg-rust]

2024-10-24 Thread via GitHub
liurenjie1024 merged PR #680: URL: https://github.com/apache/iceberg-rust/pull/680

Re: [PR] feat: Implement list_views Method and __is_view Utility Function [iceberg-python]

2024-10-24 Thread via GitHub
sungwy commented on PR #1239: URL: https://github.com/apache/iceberg-python/pull/1239#issuecomment-2436218859 Hi @omkenge - thank you for working on this PR. Could we add a test to: https://github.com/apache/iceberg-python/blob/main/tests/catalog/test_glue.py so we can verify the API in our

Re: [PR] Spark 3.5: Fix NotSerializableException when migrating partitioned Spark tables [iceberg]

2024-10-24 Thread via GitHub
manuzhang commented on PR #11157: URL: https://github.com/apache/iceberg/pull/11157#issuecomment-2436574881 `ExecutorService` is used to parallelize reading files to build manifests on the Spark executors for Spark table migration procedures (`add_files`, `migrate`, `snapshot`).

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1815700879 ## format/spec.md: ## @@ -52,6 +52,8 @@ Version 3 of the Iceberg spec extends data types and existing metadata structure * Default value support for columns * Multi

Re: [PR] API: Add Variant data type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1815788236 ## api/src/main/java/org/apache/iceberg/types/TypeUtil.java: ## @@ -534,6 +534,7 @@ private static int estimateSize(Type type) { case FIXED: return ((T

Re: [PR] Parquet: Make row-group filters cooperate to filter [iceberg]

2024-10-24 Thread via GitHub
github-actions[bot] commented on PR #10090: URL: https://github.com/apache/iceberg/pull/10090#issuecomment-2436555213 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Core: Add portable Roaring bitmap for row positions [iceberg]

2024-10-24 Thread via GitHub
RussellSpitzer commented on code in PR #11372: URL: https://github.com/apache/iceberg/pull/11372#discussion_r1815455196 ## core/src/test/java/org/apache/iceberg/deletes/TestRoaringPositionBitmap.java: ## @@ -0,0 +1,516 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Exclude reading pos_ column if it's not in the scan list [iceberg]

2024-10-24 Thread via GitHub
huaxingao commented on PR #11390: URL: https://github.com/apache/iceberg/pull/11390#issuecomment-2436559370 cc @szehon-ho @pvary @viirya

Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]

2024-10-24 Thread via GitHub
github-actions[bot] commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2436555301 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Hive metastore does not update metdadata durring commit. [iceberg]

2024-10-24 Thread via GitHub
github-actions[bot] commented on issue #10101: URL: https://github.com/apache/iceberg/issues/10101#issuecomment-2436555321 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Build: Move build configurations to project dirs [iceberg]

2024-10-24 Thread via GitHub
github-actions[bot] commented on PR #10097: URL: https://github.com/apache/iceberg/pull/10097#issuecomment-2436555241 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Docs: Fix links of `Get Started` and `Community` sections in footer [iceberg]

2024-10-24 Thread via GitHub
github-actions[bot] commented on issue #10099: URL: https://github.com/apache/iceberg/issues/10099#issuecomment-2436555271 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

[PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
aihuaxu opened a new pull request, #10831: URL: https://github.com/apache/iceberg/pull/10831 Help: #10392 Spec: add variant type Proposal: https://docs.google.com/document/d/1QjhpG_SVNPZh3anFcpicMQx90ebwjL7rmzFYfUP89Iw/edit This is to lay out the spec for variant type. T

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1815696952 ## format/spec.md: ## @@ -585,13 +591,19 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | _optional_ |

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1815771516 ## format/spec.md: ## @@ -1297,54 +1308,56 @@ Example This serialization scheme is for storing single values as individual binary values in the lower and upper bou

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-24 Thread via GitHub
singhpk234 commented on code in PR #11346: URL: https://github.com/apache/iceberg/pull/11346#discussion_r1815762264 ## kafka-connect/kafka-connect/src/test/java/org/apache/iceberg/connect/data/RecordConverterTest.java: ## @@ -921,11 +948,26 @@ private void assertRecordValues(Rec

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-24 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1815807981 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestExpireSnapshotsProcedure.java: ## @@ -294,8 +294,8 @@ public void testEx

[PR] Exclude reading pos_ column if it's not in the scan list [iceberg]

2024-10-24 Thread via GitHub
huaxingao opened a new pull request, #11390: URL: https://github.com/apache/iceberg/pull/11390 In Spark batch reading, Iceberg reads additional columns when there are delete files. For instance, if we have a table `test (int id, string data)` and a query `SELECT id FROM test`, the reques

Re: [PR] API: Add Variant data type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1815799646 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributo

Re: [PR] API: Add Variant data type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1815799200 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributo

Re: [PR] Core: Remove one comment from FastAppend [iceberg]

2024-10-24 Thread via GitHub
gaborkaszab commented on PR #10995: URL: https://github.com/apache/iceberg/pull/10995#issuecomment-2435425999 Hi @nastra, do you disagree with my reasoning?

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on PR #10831: URL: https://github.com/apache/iceberg/pull/10831#issuecomment-2436457711 Oops. I didn't mean to close this.

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-10-24 Thread via GitHub
danielcweeks commented on code in PR #11093: URL: https://github.com/apache/iceberg/pull/11093#discussion_r1815793488 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestBaseWithCatalog.java: ## @@ -59,18 +70,45 @@ protected static Object[][] parameters() { }

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1815641041 ## format/spec.md: ## @@ -585,13 +591,19 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | _option

Re: [PR] API: Add Variant data type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1815791799 ## api/src/test/java/org/apache/iceberg/TestHelpers.java: ## @@ -402,6 +406,101 @@ public int hashCode() { } } + /** A VariantLike implementation for testin

Re: [PR] API: Add Variant data type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1815796072 ## api/src/main/java/org/apache/iceberg/VariantLike.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributo

Re: [PR] API: Add Variant data type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #11324: URL: https://github.com/apache/iceberg/pull/11324#discussion_r1815795425 ## api/src/test/java/org/apache/iceberg/TestAccessors.java: ## @@ -247,4 +252,70 @@ public void testEmptySchema() { Schema emptySchema = new Schema(); assert

Re: [PR] Core: Track data files by spec id instead of full PartitionSpec [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on PR #11323: URL: https://github.com/apache/iceberg/pull/11323#issuecomment-2436467451 One comment. Otherwise this looks good.

Re: [PR] [KafkaConnect] Fix RecordConverter for UUID and Fixed Types [iceberg]

2024-10-24 Thread via GitHub
RussellSpitzer commented on code in PR #11346: URL: https://github.com/apache/iceberg/pull/11346#discussion_r1815718572 ## kafka-connect/kafka-connect/src/test/java/org/apache/iceberg/connect/data/RecordConverterTest.java: ## @@ -84,11 +93,18 @@ import org.apache.kafka.connect.

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1815721061 ## format/spec.md: ## @@ -444,6 +449,9 @@ Sorting floating-point numbers should produce the following behavior: `-NaN` < ` A data or delete file is associated with

Re: [PR] GCS: Refresh vended credentials [iceberg]

2024-10-24 Thread via GitHub
danielcweeks commented on code in PR #11282: URL: https://github.com/apache/iceberg/pull/11282#discussion_r1815372401 ## gcp/src/main/java/org/apache/iceberg/gcp/gcs/OAuth2RefreshCredentialsHandler.java: ## @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (AS

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on PR #10831: URL: https://github.com/apache/iceberg/pull/10831#issuecomment-2436457533 @aihuaxu, I think there are a couple of things missing: * The Avro appendix should be updated to state that a Variant is stored as a Record with two fields, a required binary `metadata

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
rdblue closed pull request #10831: Spec: add variant type URL: https://github.com/apache/iceberg/pull/10831

Re: [PR] Core: Optimize MergingSnapshotProducer to use referenced manifests to determine if manifest needs to be rewritten [iceberg]

2024-10-24 Thread via GitHub
amogh-jahagirdar commented on code in PR #11131: URL: https://github.com/apache/iceberg/pull/11131#discussion_r1815732975 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -341,43 +378,36 @@ private ManifestFile filterManifest(Schema tableSchema, Manife

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1815761335 ## format/spec.md: ## @@ -1133,6 +1142,7 @@ Hash results are not dependent on decimal scale, which is part of the type, not 4. UUIDs are encoded using big endian. Th

Re: [PR] Core: Add reference snapshot ID/timestamps to AllEntriesTable and AllManifestsTable [iceberg]

2024-10-24 Thread via GitHub
szehon-ho commented on code in PR #9335: URL: https://github.com/apache/iceberg/pull/9335#discussion_r1777502004 ## core/src/main/java/org/apache/iceberg/AllManifestsTableTaskParser.java: ## @@ -39,6 +39,8 @@ class AllManifestsTableTaskParser { private static final String MAN

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1815761508 ## format/spec.md: ## @@ -1148,28 +1158,29 @@ Schemas are serialized as a JSON object with the same fields as a struct in the Types are serialized according to thi

Re: [PR] Spark 3.5: Fix NotSerializableException when migrating partitioned Spark tables [iceberg]

2024-10-24 Thread via GitHub
RussellSpitzer commented on PR #11157: URL: https://github.com/apache/iceberg/pull/11157#issuecomment-2436423651 @manuzhang Can you summarize the usage of ExecutorService on the Spark Executors? It looks like the current fix involves making a new Executor service per task and I'm not sure t

Re: [PR] Spec: add variant type [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #10831: URL: https://github.com/apache/iceberg/pull/10831#discussion_r1815741890 ## format/spec.md: ## @@ -178,6 +178,11 @@ A **`list`** is a collection of values with some element type. The element field A **`map`** is a collection of key-valu

Re: [PR] Implement Kerberos authentication support for Hive Catalog [iceberg-python]

2024-10-24 Thread via GitHub
uniqueinput commented on PR #766: URL: https://github.com/apache/iceberg-python/pull/766#issuecomment-2436364333 It would be really great if this could make it into 0.8. :)

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1815703373 ## format/spec.md: ## @@ -841,19 +855,45 @@ Notes: ## Delete Formats -This section details how to encode row-level deletes in Iceberg delete files. Row-level del

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
rdblue commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1815695732 ## format/spec.md: ## @@ -585,13 +591,19 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | _optional_ |

Re: [PR] Flink 1.20: Update Flink to use planned Avro reads [iceberg]

2024-10-24 Thread via GitHub
jbonofre commented on PR #11386: URL: https://github.com/apache/iceberg/pull/11386#issuecomment-2435789965 The problem seems to be related to: ``` java.lang.ClassCastException: class java.lang.String cannot be cast to class org.apache.flink.table.data.StringData (java.lang.String i

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1815647540 ## format/spec.md: ## @@ -585,13 +591,19 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | _option

Re: [PR] Spec v3: Add deletion vectors to the table spec [iceberg]

2024-10-24 Thread via GitHub
aokolnychyi commented on code in PR #11240: URL: https://github.com/apache/iceberg/pull/11240#discussion_r1815639704 ## format/spec.md: ## @@ -52,6 +52,8 @@ Version 3 of the Iceberg spec extends data types and existing metadata structure * Default value support for columns *

Re: [PR] feat: Implement list_views Method and __is_view Utility Function [iceberg-python]

2024-10-24 Thread via GitHub
kevinjqliu commented on PR #1239: URL: https://github.com/apache/iceberg-python/pull/1239#issuecomment-2436245388 on testing, it would be great to include an integration test with a Spark Iceberg view -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] feat: Implement list_views Method and __is_view Utility Function [iceberg-python]

2024-10-24 Thread via GitHub
omkenge commented on PR #1239: URL: https://github.com/apache/iceberg-python/pull/1239#issuecomment-2436191642 I successfully tested the code; it works fine for me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] AWS: Refresh vended credentials [iceberg]

2024-10-24 Thread via GitHub
nastra commented on code in PR #11389: URL: https://github.com/apache/iceberg/pull/11389#discussion_r1815120386 ## aws/src/main/java/org/apache/iceberg/aws/s3/VendedCredentialsProvider.java: ## @@ -0,0 +1,142 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Spec: Fix table of content generation [iceberg]

2024-10-24 Thread via GitHub
danielcweeks commented on PR #11067: URL: https://github.com/apache/iceberg/pull/11067#issuecomment-2436139576 @ajantha-bhat looks like we have conflicts. It would be good to get this in, but I don't think this section of the docs is tied to the 1.7.0 release. -- This is an automated mes

Re: [PR] Snapshot `summary` map must have `operation` key [iceberg]

2024-10-24 Thread via GitHub
kevinjqliu commented on PR #11354: URL: https://github.com/apache/iceberg/pull/11354#issuecomment-2436105637 thanks for the review @nastra. I've addressed your comments, please take another look -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] Kafka Connect: Include third party licenses and notices in distribution [iceberg]

2024-10-24 Thread via GitHub
bryanck commented on PR #10829: URL: https://github.com/apache/iceberg/pull/10829#issuecomment-2436066226 Yes, definitely, I’ll follow up, hopefully with something to make it automated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

Re: [PR] AWS: Refresh vended credentials [iceberg]

2024-10-24 Thread via GitHub
danielcweeks commented on code in PR #11389: URL: https://github.com/apache/iceberg/pull/11389#discussion_r1815510702 ## aws/src/main/java/org/apache/iceberg/aws/s3/VendedCredentialsProvider.java: ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Kafka Connect: Include third party licenses and notices in distribution [iceberg]

2024-10-24 Thread via GitHub
danielcweeks merged PR #10829: URL: https://github.com/apache/iceberg/pull/10829 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceb

Re: [PR] Snapshot `summary` map must have `operation` key [iceberg]

2024-10-24 Thread via GitHub
kevinjqliu commented on code in PR #11354: URL: https://github.com/apache/iceberg/pull/11354#discussion_r1815496440 ## core/src/main/java/org/apache/iceberg/SnapshotParser.java: ## @@ -140,6 +142,8 @@ static Snapshot fromJson(JsonNode node) { } } summary =
