Re: [PR] Hadoop: Log where the missing metadata file is located [iceberg]

2024-12-11 Thread via GitHub
manuzhang commented on PR #11643: URL: https://github.com/apache/iceberg/pull/11643#issuecomment-2538063691 @nastra any more comments? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] Spark3.4,3.5: Fix the BUG of iceberg views when resolved "group by ordinals" [iceberg]

2024-12-11 Thread via GitHub
nastra merged PR #11729: URL: https://github.com/apache/iceberg/pull/11729 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Docs: fix typo in spec [iceberg]

2024-12-11 Thread via GitHub
nastra merged PR #11759: URL: https://github.com/apache/iceberg/pull/11759 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2024-12-11 Thread via GitHub
nastra commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1881528468 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/AzureSasCredentialRefresher.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2024-12-11 Thread via GitHub
nastra commented on PR #11577: URL: https://github.com/apache/iceberg/pull/11577#issuecomment-2538047119 @danielcweeks could you also take a look at this PR please? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

Re: [PR] Docs: add note for `day` transform [iceberg]

2024-12-11 Thread via GitHub
Fokko commented on code in PR #11749: URL: https://github.com/apache/iceberg/pull/11749#discussion_r1881537732 ## format/spec.md: ## @@ -454,7 +454,7 @@ Partition field IDs must be reused if an existing partition spec contains an equ | **`truncate[W]`** | Value truncated to wi

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2024-12-11 Thread via GitHub
nastra commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1881537532 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/VendedAdlsCredentialProvider.java: ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2024-12-11 Thread via GitHub
nastra commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1881531755 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/AzureSasCredentialRefresher.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2024-12-11 Thread via GitHub
nastra commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1881533944 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/VendedAdlsCredentialProvider.java: ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2024-12-11 Thread via GitHub
nastra commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1881531528 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/AzureSasCredentialRefresher.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2024-12-11 Thread via GitHub
nastra commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1881531305 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/AzureSasCredentialRefresher.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2024-12-11 Thread via GitHub
nastra commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1881527842 ## azure/src/main/java/org/apache/iceberg/azure/AzureProperties.java: ## @@ -90,7 +117,9 @@ public Optional adlsWriteBlockSize() { */ public void applyClientCon

Re: [PR] Alter `Transform::Day` to map partition types to `Date` rather than `Int` for consistency with reference implementation [iceberg-rust]

2024-12-11 Thread via GitHub
ZENOTME commented on PR #479: URL: https://github.com/apache/iceberg-rust/pull/479#issuecomment-2538033235 Seems this PR affects the result type of Year, Month, Day transform. Should we only change the result type of Transform::Day.🤔 -- This is an automated message from the Apache Git Ser

Re: [PR] Core: Add missing REST endpoint definitions [iceberg]

2024-12-11 Thread via GitHub
nastra commented on code in PR #11756: URL: https://github.com/apache/iceberg/pull/11756#discussion_r1881520238 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -149,6 +149,7 @@ public class RESTSessionCatalog extends BaseViewSessionCatalog

Re: [PR] Core: Add missing REST endpoint definitions [iceberg]

2024-12-11 Thread via GitHub
nastra commented on code in PR #11756: URL: https://github.com/apache/iceberg/pull/11756#discussion_r1881519406 ## core/src/main/java/org/apache/iceberg/rest/Endpoint.java: ## @@ -70,6 +85,7 @@ public class Endpoint { public static final Endpoint V1_DELETE_VIEW = Endpoint.cre

Re: [PR] doc: add note for `day` transform [iceberg]

2024-12-11 Thread via GitHub
xxchan commented on code in PR #11749: URL: https://github.com/apache/iceberg/pull/11749#discussion_r1881514203 ## format/spec.md: ## @@ -454,7 +454,7 @@ Partition field IDs must be reused if an existing partition spec contains an equ | **`truncate[W]`** | Value truncated to w

Re: [PR] doc: add note for `day` transform [iceberg]

2024-12-11 Thread via GitHub
ZENOTME commented on code in PR #11749: URL: https://github.com/apache/iceberg/pull/11749#discussion_r1881503123 ## format/spec.md: ## @@ -454,7 +454,7 @@ Partition field IDs must be reused if an existing partition spec contains an equ | **`truncate[W]`** | Value truncated to

Re: [I] Kafka Connect runtime package is missing Nessie catalog jars [iceberg]

2024-12-11 Thread via GitHub
nikulaja commented on issue #11733: URL: https://github.com/apache/iceberg/issues/11733#issuecomment-2538002097 @Fokko Thank you for reply! Seems very reasonable. Could this information be added to documentation? I'm very new to iceberg so wouldn't know if this is a regular thing with other

Re: [PR] Reduce code duplication in VectorizedParquetDefinitionLevelReader [iceberg]

2024-12-11 Thread via GitHub
wypoon commented on PR #11661: URL: https://github.com/apache/iceberg/pull/11661#issuecomment-2537946114 Hi @nastra, I have addressed your feedback last week. Can you please review this again? -- This is an automated message from the Apache Git Service. To respond to the message, please l

Re: [PR] refactor: avoid async_trait macro for IcebergWriter and provide extra dyn trait for object safety [iceberg-rust]

2024-12-11 Thread via GitHub
wenym1 commented on PR #760: URL: https://github.com/apache/iceberg-rust/pull/760#issuecomment-2537946595 > I find that the implementation has some problems now, it will cause recursive calls endlessly and stack overflow finally. > > Reproduce: > > ``` >#[tokio::test]

Re: [PR] Hive: Add Hive 4 support and remove Hive 3 [iceberg]

2024-12-11 Thread via GitHub
pvary commented on PR #11750: URL: https://github.com/apache/iceberg/pull/11750#issuecomment-2537927187 Will check next week if it is not merged till then -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] Spark3.4,3.5,Api,Hive: Fix using NullType in View. [iceberg]

2024-12-11 Thread via GitHub
Ppei-Wang commented on code in PR #11728: URL: https://github.com/apache/iceberg/pull/11728#discussion_r1881446544 ## api/src/main/java/org/apache/iceberg/types/Types.java: ## @@ -412,6 +413,24 @@ public String toString() { } } + public static class NullType extends P

Re: [PR] refactor: avoid async_trait macro for IcebergWriter and provide extra dyn trait for object safety [iceberg-rust]

2024-12-11 Thread via GitHub
ZENOTME commented on PR #760: URL: https://github.com/apache/iceberg-rust/pull/760#issuecomment-2537842348 I find that the implementation has some problems now, it will cause recursive calls endlessly and stack overflow finally. Reproduce: ``` #[tokio::test] async fn

Re: [PR] doc: add note for `day` transform [iceberg]

2024-12-11 Thread via GitHub
xxchan commented on code in PR #11749: URL: https://github.com/apache/iceberg/pull/11749#discussion_r1881355866 ## format/spec.md: ## @@ -454,7 +454,7 @@ Partition field IDs must be reused if an existing partition spec contains an equ | **`truncate[W]`** | Value truncated to w

Re: [I] Compatibility Issue with pydantic and annotated-types in pyiceberg 0.8.1 [iceberg-python]

2024-12-11 Thread via GitHub
pawansanz commented on issue #1418: URL: https://github.com/apache/iceberg-python/issues/1418#issuecomment-2537769371 can anyone help with original issue that was reported regarding "Apache Iceberg version" -- This is an automated message from the Apache Git Service. To respond to the me

Re: [PR] Spark 3.5: Fix comment and assertion mismatch in PartitionedWritesTestBase/TestRewritePositionDeleteFilesAction [iceberg]

2024-12-11 Thread via GitHub
wzx140 commented on code in PR #11748: URL: https://github.com/apache/iceberg/pull/11748#discussion_r1881295827 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewritePositionDeleteFilesAction.java: ## @@ -275,7 +275,7 @@ public void testRewriteFilter() th

Re: [PR] Core, Spark3.5: Fix tests failure due to timeout [iceberg]

2024-12-11 Thread via GitHub
manuzhang closed pull request #11654: Core, Spark3.5: Fix tests failure due to timeout URL: https://github.com/apache/iceberg/pull/11654 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] Use SupportsPrefixOperations for Remove OrphanFile Procedure [iceberg]

2024-12-11 Thread via GitHub
Samreay commented on PR #7914: URL: https://github.com/apache/iceberg/pull/7914#issuecomment-2537695273 Has anyone got a nice workaround for how to remove orphan files for an S3-located iceberg table? -- This is an automated message from the Apache Git Service. To respond to the message,

[PR] Spark 3.5: Add query runner in test module [iceberg]

2024-12-11 Thread via GitHub
ebyhr opened a new pull request, #11758: URL: https://github.com/apache/iceberg/pull/11758 I propose adding query runners in tests so developers can debug Spark SQL without local jar publish. This is similar to query runners in Trino project. The project provides several query runne

[PR] Add plan tasks for TableScan [iceberg-python]

2024-12-11 Thread via GitHub
ConeyLiu opened a new pull request, #1427: URL: https://github.com/apache/iceberg-python/pull/1427 Now, we only support plan files. Plan tasks(split large file based on split_offset) would be more useful when we want to read in parallel. -- This is an automated message from the Apache Git

Re: [PR] refactor: avoid async_trait for FileRead and provide object safe dyn methods [iceberg-rust]

2024-12-11 Thread via GitHub
liurenjie1024 commented on PR #761: URL: https://github.com/apache/iceberg-rust/pull/761#issuecomment-2537679283 Hi, @wenym1 I saw you submitted several similar refactoring to current api. While the community appreciate your contribution, could you open an issue to raise discussion about th

[PR] infra: Dismiss stale reviews [iceberg-rust]

2024-12-11 Thread via GitHub
liurenjie1024 opened a new pull request, #779: URL: https://github.com/apache/iceberg-rust/pull/779 Add a protected branch rule to dismiss stale reviews when new commits are added. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

Re: [PR] Spec: Support geo type [iceberg]

2024-12-11 Thread via GitHub
mkaravel commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1881291822 ## format/spec.md: ## @@ -603,6 +608,10 @@ Notes: 4. Position delete metadata can use `referenced_data_file` when all deletes tracked by the entry are in a single

Re: [PR] Add CMake format [iceberg-cpp]

2024-12-11 Thread via GitHub
zhjwpku commented on code in PR #5: URL: https://github.com/apache/iceberg-cpp/pull/5#discussion_r1881282329 ## cmake-format.py: ## @@ -0,0 +1,74 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# d

Re: [PR] Add clang format [iceberg-cpp]

2024-12-11 Thread via GitHub
zhjwpku commented on code in PR #4: URL: https://github.com/apache/iceberg-cpp/pull/4#discussion_r1881257096 ## .clang-format: ## @@ -0,0 +1,22 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# dis

Re: [PR] [WIP][Core] Restrict adding column of StructType with Empty Fields [iceberg]

2024-12-11 Thread via GitHub
ebyhr commented on code in PR #11755: URL: https://github.com/apache/iceberg/pull/11755#discussion_r1881237568 ## core/src/test/java/org/apache/iceberg/TestSchemaUpdate.java: ## @@ -731,6 +731,17 @@ public void testAmbiguousAdd() { .hasMessageStartingWith("Cannot add co

Re: [PR] Spec: Support geo type [iceberg]

2024-12-11 Thread via GitHub
mkaravel commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1879219879 ## format/spec.md: ## @@ -205,13 +205,18 @@ Supported primitive types are defined in the table below. Primitive types added | | **`uuid`**

Re: [PR] Spark 3.5: Add ignore-invalid-options to RewriteDataFilesSparkAction and RewritePositionDeleteFilesSparkAction [iceberg]

2024-12-11 Thread via GitHub
manuzhang commented on PR #11737: URL: https://github.com/apache/iceberg/pull/11737#issuecomment-2537563264 @ajantha-bhat @RussellSpitzer @nastra Please help review whether this is a valid request. -- This is an automated message from the Apache Git Service. To respond to the message, ple

[PR] Fix: Resolve IDENTIFIER FIELDS with merge-on-read bug [iceberg]

2024-12-11 Thread via GitHub
601madman opened a new pull request, #11757: URL: https://github.com/apache/iceberg/pull/11757 ### Problem When IDENTIFIER FIELDS are set, and `merge-on-read` mode is used, a validation error occurs due to incorrect metadata schema checks. ### Solution Adjusted the `calculateMet

Re: [PR] doc: add note for `day` transform [iceberg]

2024-12-11 Thread via GitHub
manuzhang commented on code in PR #11749: URL: https://github.com/apache/iceberg/pull/11749#discussion_r1881244151 ## format/spec.md: ## @@ -454,7 +454,7 @@ Partition field IDs must be reused if an existing partition spec contains an equ | **`truncate[W]`** | Value truncated t

Re: [PR] Remove Hive 2 [iceberg]

2024-12-11 Thread via GitHub
manuzhang commented on PR #10996: URL: https://github.com/apache/iceberg/pull/10996#issuecomment-2537550479 We have a [consensus on the dev list](https://lists.apache.org/thread/jfcqfw9vhq4j7h0kwnlf338jgyzcq8s4) to drop hive-runtime and upgrade to Hive 4. I've submitted https://github.com/

Re: [PR] Core: Add missing REST endpoint definitions [iceberg]

2024-12-11 Thread via GitHub
ebyhr commented on code in PR #11756: URL: https://github.com/apache/iceberg/pull/11756#discussion_r1881226290 ## core/src/main/java/org/apache/iceberg/rest/Endpoint.java: ## @@ -61,6 +63,19 @@ public class Endpoint { Endpoint.create("POST", ResourcePaths.V1_TABLE_REGISTE

Re: [I] BUG: Bug: partition name stored in partition data in data file contains special character [iceberg-python]

2024-12-11 Thread via GitHub
github-actions[bot] commented on issue #175: URL: https://github.com/apache/iceberg-python/issues/175#issuecomment-2537469241 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity oc

Re: [I] Create iceberg table from existsing parquet files with slightly different schemas (schemas merge is possible). [iceberg-python]

2024-12-11 Thread via GitHub
github-actions[bot] commented on issue #601: URL: https://github.com/apache/iceberg-python/issues/601#issuecomment-2537469223 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apac

Re: [I] Create iceberg table from existsing parquet files with slightly different schemas (schemas merge is possible). [iceberg-python]

2024-12-11 Thread via GitHub
github-actions[bot] closed issue #601: Create iceberg table from existsing parquet files with slightly different schemas (schemas merge is possible). URL: https://github.com/apache/iceberg-python/issues/601 -- This is an automated message from the Apache Git Service. To respond to the message

Re: [PR] update PartitionSpec with snapshot'schema [iceberg]

2024-12-11 Thread via GitHub
github-actions[bot] closed pull request #11196: update PartitionSpec with snapshot'schema URL: https://github.com/apache/iceberg/pull/11196 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [PR] Core: Store schema and spec in TaskContext to avoid unnecessary deserialization (#11235) [iceberg]

2024-12-11 Thread via GitHub
github-actions[bot] closed pull request #11280: Core: Store schema and spec in TaskContext to avoid unnecessary deserialization (#11235) URL: https://github.com/apache/iceberg/pull/11280 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [I] No module named 'pyiceberg.table.partitioning' [iceberg]

2024-12-11 Thread via GitHub
github-actions[bot] commented on issue #10491: URL: https://github.com/apache/iceberg/issues/10491#issuecomment-2537466332 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Merge into using the exactly dataset copy the entire data [iceberg]

2024-12-11 Thread via GitHub
github-actions[bot] commented on issue #9736: URL: https://github.com/apache/iceberg/issues/9736#issuecomment-2537466213 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [PR] Workers gets stuck as there is no-coordinator for emitting Start_Commit request in Incremental Cooperative Rebalancing[ICR] Mode [iceberg]

2024-12-11 Thread via GitHub
github-actions[bot] commented on PR #11288: URL: https://github.com/apache/iceberg/pull/11288#issuecomment-2537466505 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] update PartitionSpec with snapshot'schema [iceberg]

2024-12-11 Thread via GitHub
github-actions[bot] commented on PR #11196: URL: https://github.com/apache/iceberg/pull/11196#issuecomment-2537466436 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Core: Store schema and spec in TaskContext to avoid unnecessary deserialization (#11235) [iceberg]

2024-12-11 Thread via GitHub
github-actions[bot] commented on PR #11280: URL: https://github.com/apache/iceberg/pull/11280#issuecomment-2537466476 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [I] Add support for Spark 4.0.0 [iceberg]

2024-12-11 Thread via GitHub
github-actions[bot] commented on issue #10497: URL: https://github.com/apache/iceberg/issues/10497#issuecomment-2537466356 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Merge into using the exactly dataset copy the entire data [iceberg]

2024-12-11 Thread via GitHub
github-actions[bot] closed issue #9736: Merge into using the exactly dataset copy the entire data URL: https://github.com/apache/iceberg/issues/9736 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

Re: [PR] Core: Add Variant implementation to read serialized objects [iceberg]

2024-12-11 Thread via GitHub
rdblue commented on code in PR #11415: URL: https://github.com/apache/iceberg/pull/11415#discussion_r1881180413 ## core/src/main/java/org/apache/iceberg/variants/PrimitiveWrapper.java: ## @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Spark: add property to disable client-side purging in spark [iceberg]

2024-12-11 Thread via GitHub
rdblue commented on code in PR #11317: URL: https://github.com/apache/iceberg/pull/11317#discussion_r1881174754 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkCatalog.java: ## @@ -365,6 +370,14 @@ public boolean purgeTable(Identifier ident) { String metad

Re: [PR] Spark: add property to disable client-side purging in spark [iceberg]

2024-12-11 Thread via GitHub
rdblue commented on code in PR #11317: URL: https://github.com/apache/iceberg/pull/11317#discussion_r1881170392 ## core/src/main/java/org/apache/iceberg/CatalogProperties.java: ## @@ -78,6 +78,14 @@ private CatalogProperties() {} public static final boolean IO_MANIFEST_CACH

Re: [PR] Add Support for Dynamic Overwrite [iceberg-python]

2024-12-11 Thread via GitHub
jqin61 commented on PR #931: URL: https://github.com/apache/iceberg-python/pull/931#issuecomment-2537415525 Thanks for fixing the jar issue, shall we rerun CI and merge? @Fokko Thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] Spec: Document Snapshot Summary Optional Fields for Standardization [iceberg]

2024-12-11 Thread via GitHub
RussellSpitzer commented on code in PR #11660: URL: https://github.com/apache/iceberg/pull/11660#discussion_r1881106923 ## format/spec.md: ## @@ -693,6 +686,64 @@ A snapshot's `first-row-id` is assigned to the table's current `next-row-id` on The snapshot's `first-row-id` is t

Re: [PR] Spec: Document Snapshot Summary Optional Fields for Standardization [iceberg]

2024-12-11 Thread via GitHub
RussellSpitzer commented on code in PR #11660: URL: https://github.com/apache/iceberg/pull/11660#discussion_r1881106567 ## format/spec.md: ## @@ -693,6 +686,64 @@ A snapshot's `first-row-id` is assigned to the table's current `next-row-id` on The snapshot's `first-row-id` is t

Re: [PR] Spec: Document Snapshot Summary Optional Fields for Standardization [iceberg]

2024-12-11 Thread via GitHub
RussellSpitzer commented on code in PR #11660: URL: https://github.com/apache/iceberg/pull/11660#discussion_r1881105847 ## format/spec.md: ## @@ -693,6 +686,64 @@ A snapshot's `first-row-id` is assigned to the table's current `next-row-id` on The snapshot's `first-row-id` is t

Re: [PR] Spec: Document Snapshot Summary Optional Fields for Standardization [iceberg]

2024-12-11 Thread via GitHub
RussellSpitzer commented on code in PR #11660: URL: https://github.com/apache/iceberg/pull/11660#discussion_r1881104193 ## format/spec.md: ## @@ -693,6 +686,64 @@ A snapshot's `first-row-id` is assigned to the table's current `next-row-id` on The snapshot's `first-row-id` is t

Re: [PR] Spec: Document Snapshot Summary Optional Fields for Standardization [iceberg]

2024-12-11 Thread via GitHub
RussellSpitzer commented on code in PR #11660: URL: https://github.com/apache/iceberg/pull/11660#discussion_r1881103731 ## format/spec.md: ## @@ -693,6 +686,64 @@ A snapshot's `first-row-id` is assigned to the table's current `next-row-id` on The snapshot's `first-row-id` is t

Re: [PR] Spec: Document Snapshot Summary Optional Fields for Standardization [iceberg]

2024-12-11 Thread via GitHub
RussellSpitzer commented on code in PR #11660: URL: https://github.com/apache/iceberg/pull/11660#discussion_r1881105376 ## format/spec.md: ## @@ -693,6 +686,64 @@ A snapshot's `first-row-id` is assigned to the table's current `next-row-id` on The snapshot's `first-row-id` is t

Re: [PR] Spec: Document Snapshot Summary Optional Fields for Standardization [iceberg]

2024-12-11 Thread via GitHub
RussellSpitzer commented on code in PR #11660: URL: https://github.com/apache/iceberg/pull/11660#discussion_r1881105008 ## format/spec.md: ## @@ -693,6 +686,64 @@ A snapshot's `first-row-id` is assigned to the table's current `next-row-id` on The snapshot's `first-row-id` is t

Re: [PR] Spec: Document Snapshot Summary Optional Fields for Standardization [iceberg]

2024-12-11 Thread via GitHub
RussellSpitzer commented on code in PR #11660: URL: https://github.com/apache/iceberg/pull/11660#discussion_r1881103356 ## format/spec.md: ## @@ -693,6 +686,64 @@ A snapshot's `first-row-id` is assigned to the table's current `next-row-id` on The snapshot's `first-row-id` is t

Re: [PR] [Views] Update view spec with table identifier requirements [iceberg]

2024-12-11 Thread via GitHub
RussellSpitzer commented on code in PR #11365: URL: https://github.com/apache/iceberg/pull/11365#discussion_r1881100218 ## format/view-spec.md: ## @@ -97,7 +97,10 @@ Summary is a string to string map of metadata about a view version. Common metad View definitions can be repr

Re: [PR] [Views] Update view spec with table identifier requirements [iceberg]

2024-12-11 Thread via GitHub
wmoustafa commented on code in PR #11365: URL: https://github.com/apache/iceberg/pull/11365#discussion_r1881085717 ## format/view-spec.md: ## @@ -97,7 +97,10 @@ Summary is a string to string map of metadata about a view version. Common metad View definitions can be represent

Re: [PR] Core: Add missing REST endpoint definitions [iceberg]

2024-12-11 Thread via GitHub
ajreid21 commented on PR #11756: URL: https://github.com/apache/iceberg/pull/11756#issuecomment-2537316670 @nastra -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsub

Re: [I] Distributed writes in the same iceberg transaction [iceberg-python]

2024-12-11 Thread via GitHub
jimmyxie-figma commented on issue #357: URL: https://github.com/apache/iceberg-python/issues/357#issuecomment-2537304523 Any update on supporting distributed write, we are also interested in adding iceberg write capability to Ray. https://github.com/ray-project/ray/issues/49032 -- Thi

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-11 Thread via GitHub
zeroshade commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2537284504 I should be able to give this a review tomorrow or Friday. In the meantime can you resolve the conflict in the go.mod? Thanks! -- This is an automated message from the Apache Git Ser

Re: [I] Rest Catalog: spark catalog api fails to work with rest based catalog [iceberg]

2024-12-11 Thread via GitHub
kazuyukitanimura commented on issue #11741: URL: https://github.com/apache/iceberg/issues/11741#issuecomment-2537281522 @sunny1154 I think you would need to specify the catalog in `TableIdentifier()` Otherwise, Spark tries to use `spark_catalog` https://github.com/apache/spark/blo

Re: [I] Rest Catalog: spark catalog api fails to work with rest based catalog [iceberg]

2024-12-11 Thread via GitHub
kazuyukitanimura commented on issue #11741: URL: https://github.com/apache/iceberg/issues/11741#issuecomment-2537255407 Just to add @huaxingao's point tableExists(dbName: String, tableName: String): Boolean it is meant to be only for the hardcoded spark_catalog only. But looks like

Re: [I] Rest Catalog: spark catalog api fails to work with rest based catalog [iceberg]

2024-12-11 Thread via GitHub
sunny1154 commented on issue #11741: URL: https://github.com/apache/iceberg/issues/11741#issuecomment-2537250201 thanks @huaxingao for looking into this. is `spark.sessionState.catalog.getTableMetadata(TableIdentifier(table, Some(database)))` also expected to work with HMS? currently

Re: [PR] Update StrictProjection tests [iceberg-python]

2024-12-11 Thread via GitHub
sungwy commented on code in PR #1422: URL: https://github.com/apache/iceberg-python/pull/1422#discussion_r1881033971 ## tests/test_transforms.py: ## @@ -988,608 +997,367 @@ def _test_projection(lhs: Optional[UnboundPredicate[L]], rhs: Optional[UnboundPr raise ValueErro

Re: [PR] [Views] Update view spec with table identifier requirements [iceberg]

2024-12-11 Thread via GitHub
RussellSpitzer commented on code in PR #11365: URL: https://github.com/apache/iceberg/pull/11365#discussion_r1881029096 ## format/view-spec.md: ## @@ -97,7 +97,10 @@ Summary is a string to string map of metadata about a view version. Common metad View definitions can be repr

Re: [PR] docker: The `archive` seems unstable [iceberg-python]

2024-12-11 Thread via GitHub
Fokko commented on PR #1425: URL: https://github.com/apache/iceberg-python/pull/1425#issuecomment-2537217676 Thanks @sungwy 🙌 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

Re: [PR] docker: The `archive` seems unstable [iceberg-python]

2024-12-11 Thread via GitHub
Fokko merged PR #1425: URL: https://github.com/apache/iceberg-python/pull/1425 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceber

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-11 Thread via GitHub
loicalleyne commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2537210180 @zeroshade hoping you can review when you've got time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

Re: [PR] docker: Build for `arm64` architecture [iceberg]

2024-12-11 Thread via GitHub
amogh-jahagirdar merged PR #11753: URL: https://github.com/apache/iceberg/pull/11753 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@

Re: [PR] Added support for lowercase FileFormat for Issue #1340 [iceberg-python]

2024-12-11 Thread via GitHub
Fokko commented on PR #1362: URL: https://github.com/apache/iceberg-python/pull/1362#issuecomment-2537057846 @hgollakota It looks like there is a formatting issue, could you run `make lint`? :) -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] Ignore partition fields that reference a dropped source-id [iceberg-python]

2024-12-11 Thread via GitHub
Fokko commented on PR #1393: URL: https://github.com/apache/iceberg-python/pull/1393#issuecomment-2537019109 This is actually dangerous in the case of V1 tables. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] Ignore partition fields that reference a dropped source-id [iceberg-python]

2024-12-11 Thread via GitHub
Fokko closed pull request #1393: Ignore partition fields that reference a dropped source-id URL: https://github.com/apache/iceberg-python/pull/1393 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

[PR] [WIP][Core] Restrict adding column of StructType with Empty Fields [iceberg]

2024-12-11 Thread via GitHub
singhpk234 opened a new pull request, #11755: URL: https://github.com/apache/iceberg/pull/11755 ## About the change Recently stumbled on a schema where a column was of struct type but the underlying struct was empty; this led to a failure when writing the Parquet file because :

Re: [PR] Fix `Table.scan` to enable case sensitive argument [iceberg-python]

2024-12-11 Thread via GitHub
sungwy commented on PR #1423: URL: https://github.com/apache/iceberg-python/pull/1423#issuecomment-2536972352 Yes, I think updating this PR to include the changes for both makes sense @jiakai-li 👍 Thank you again for tackling this issue! -- This is an automated message from the Ap

Re: [PR] Spark 3.5: Fix comment and assertion mismatch in PartitionedWritesTestBase/TestRewritePositionDeleteFilesAction [iceberg]

2024-12-11 Thread via GitHub
szehon-ho commented on code in PR #11748: URL: https://github.com/apache/iceberg/pull/11748#discussion_r1880821876 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewritePositionDeleteFilesAction.java: ## @@ -275,7 +275,7 @@ public void testRewriteFilter()

Re: [PR] Add `all_manifests` metadata table with tests [iceberg-python]

2024-12-11 Thread via GitHub
soumya-ghosh commented on PR #1241: URL: https://github.com/apache/iceberg-python/pull/1241#issuecomment-2536965547 @Fokko bumping this up for review -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [I] Support more complex types when reading into arrow record batch. [iceberg-rust]

2024-12-11 Thread via GitHub
ryzhyk commented on issue #405: URL: https://github.com/apache/iceberg-rust/issues/405#issuecomment-2536958987 Thanks for the update @sdd ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe

Re: [PR] [Views] Update view spec with table identifier requirements [iceberg]

2024-12-11 Thread via GitHub
wmoustafa commented on code in PR #11365: URL: https://github.com/apache/iceberg/pull/11365#discussion_r1880814678 ## format/view-spec.md: ## @@ -97,7 +97,10 @@ Summary is a string to string map of metadata about a view version. Common metad View definitions can be represent

Re: [I] Support more complex types when reading into arrow record batch. [iceberg-rust]

2024-12-11 Thread via GitHub
sdd commented on issue #405: URL: https://github.com/apache/iceberg-rust/issues/405#issuecomment-2536941270 Hi @ryzhyk - I implemented the default value handling and type promotion limitations mentioned in @liurenjie1024's [comment at the top of the issue](https://github.com/apache/iceberg-

Re: [PR] Fix `Table.scan` to enable case sensitive argument [iceberg-python]

2024-12-11 Thread via GitHub
jiakai-li commented on PR #1423: URL: https://github.com/apache/iceberg-python/pull/1423#issuecomment-2536887038 Thanks very much for the guidance guys @sungwy and @Fokko . Is it ok for me to pick up the delete part as well? I'll update this PR to include both operations if that's ok. Thank

Re: [I] Rest Catalog: spark catalog api fails to work with rest based catalog [iceberg]

2024-12-11 Thread via GitHub
dramaticlly commented on issue #11741: URL: https://github.com/apache/iceberg/issues/11741#issuecomment-2536868437 > After taking a closer look at the [Java Doc](https://github.com/apache/spark/blob/branch-3.5/sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala#L224), I found

[I] Decouple building and serialization [iceberg-rust]

2024-12-11 Thread via GitHub
Sl1mb0 opened a new issue, #778: URL: https://github.com/apache/iceberg-rust/issues/778 At the moment, the building and serialization of Iceberg metadata is coupled together. For example, let's say I want to build a `ManifestFile` that I then add to a `ManifestList`: (some cod

Re: [PR] Spark: add property to disable client-side purging in spark [iceberg]

2024-12-11 Thread via GitHub
RussellSpitzer commented on PR #11317: URL: https://github.com/apache/iceberg/pull/11317#issuecomment-2536685955 https://docs.google.com/document/d/1iPGVCIcr-M0XtAiudOguWAvmqIdVgpYN5vz5ohO8PKw/edit?tab=t.0#heading=h.cr6o1g2rn5hc -- This is an automated message from the Apache Git Service.

[I] Drop behavioral change for Spark with REST Catalogs [iceberg]

2024-12-11 Thread via GitHub
c-thiel opened a new issue, #11754: URL: https://github.com/apache/iceberg/issues/11754 ### Feature Request / Improvement Currently when purge-dropping tables with Spark and the REST Catalog, Spark deletes all files of the tables before sending the drop request to the REST Catalog. I
