Re: [PR] fix: wrong compute of partitions in manifest [iceberg-rust]

2024-12-13 Thread via GitHub
Xuanwo merged PR #794: URL: https://github.com/apache/iceberg-rust/pull/794 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

Re: [PR] fix: wrong compute of partitions in manifest [iceberg-rust]

2024-12-13 Thread via GitHub
Xuanwo commented on code in PR #794: URL: https://github.com/apache/iceberg-rust/pull/794#discussion_r1884928592 ## crates/iceberg/src/spec/manifest.rs: ## @@ -128,7 +130,61 @@ pub struct ManifestWriter { key_metadata: Vec, -field_summary: HashMap, +partitions:

Re: [PR] fix: wrong compute of partitions in manifest [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME commented on PR #794: URL: https://github.com/apache/iceberg-rust/pull/794#issuecomment-2542949738 Thanks @Xuanwo's suggestion to make the code more clear!

Re: [PR] fix: wrong compute of partitions in manifest [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME commented on code in PR #794: URL: https://github.com/apache/iceberg-rust/pull/794#discussion_r1884917390 ## crates/iceberg/src/spec/manifest.rs: ## @@ -128,7 +130,61 @@ pub struct ManifestWriter { key_metadata: Vec, -field_summary: HashMap, +partitions:

[PR] fix(catalog/rest): Ensure token been reused correctly [iceberg-rust]

2024-12-13 Thread via GitHub
Xuanwo opened a new pull request, #801: URL: https://github.com/apache/iceberg-rust/pull/801 Fix https://github.com/apache/iceberg-rust/issues/791 I discovered that we were not reusing tokens correctly, which could result in sending an unexpectedly high number of token authentication

Re: [PR] fix: wrong compute of partitions in manifest [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME commented on code in PR #794: URL: https://github.com/apache/iceberg-rust/pull/794#discussion_r1884874229 ## crates/iceberg/src/spec/manifest.rs: ## @@ -128,7 +130,61 @@ pub struct ManifestWriter { key_metadata: Vec, -field_summary: HashMap, +partitions:

Re: [PR] fix: gurantee the deserialize order of struct is same as the struct type [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME commented on code in PR #795: URL: https://github.com/apache/iceberg-rust/pull/795#discussion_r1884873126 ## crates/iceberg/src/spec/values.rs: ## @@ -3439,11 +3443,13 @@ mod tests { "bar".to_string(), ))), None, +

Re: [PR] fix: set key_metadata to Null by default [iceberg-rust]

2024-12-13 Thread via GitHub
Xuanwo commented on code in PR #800: URL: https://github.com/apache/iceberg-rust/pull/800#discussion_r1884868013 ## crates/iceberg/src/expr/visitors/expression_evaluator.rs: ## @@ -338,7 +338,7 @@ mod tests { nan_value_counts: HashMap::new(), lower_boun

Re: [PR] fix: set key_metadata to Null by default [iceberg-rust]

2024-12-13 Thread via GitHub
feniljain commented on code in PR #800: URL: https://github.com/apache/iceberg-rust/pull/800#discussion_r1884869867 ## crates/iceberg/src/expr/visitors/expression_evaluator.rs: ## @@ -338,7 +338,7 @@ mod tests { nan_value_counts: HashMap::new(), lower_b

Re: [PR] fix: gurantee the deserialize order of struct is same as the struct type [iceberg-rust]

2024-12-13 Thread via GitHub
Xuanwo commented on code in PR #795: URL: https://github.com/apache/iceberg-rust/pull/795#discussion_r1884868240 ## crates/iceberg/src/spec/values.rs: ## @@ -3439,11 +3443,13 @@ mod tests { "bar".to_string(), ))), None, +

Re: [PR] fix: wrong compute of partitions in manifest [iceberg-rust]

2024-12-13 Thread via GitHub
Xuanwo commented on code in PR #794: URL: https://github.com/apache/iceberg-rust/pull/794#discussion_r1884853758 ## crates/iceberg/src/spec/manifest.rs: ## @@ -128,7 +130,61 @@ pub struct ManifestWriter { key_metadata: Vec, -field_summary: HashMap, +partitions:

[PR] fix: set key_metadata to Null by default [iceberg-rust]

2024-12-13 Thread via GitHub
feniljain opened a new pull request, #800: URL: https://github.com/apache/iceberg-rust/pull/800 ## Issue Resolved Closes #753 ## About - Converted Manifest spec's `key_metadata` field to be `Option` instead of just `Vec` - Updated tests to reflect the same - Ran

Re: [PR] fix: gurantee the deserialize order of struct is same as the struct type [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME commented on code in PR #795: URL: https://github.com/apache/iceberg-rust/pull/795#discussion_r1884853773 ## crates/iceberg/src/spec/values.rs: ## @@ -3439,11 +3443,13 @@ mod tests { "bar".to_string(), ))), None, +

Re: [PR] feat: support to append delete type data file [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME commented on PR #798: URL: https://github.com/apache/iceberg-rust/pull/798#issuecomment-2542902331 Sorry, I think I have some misunderstanding here since the action which support to append data file and delete file is RowDelta.🤔 So I guess what we need is right: 1. MergingSnapsho

Re: [PR] fix: gurantee the deserialize order of struct is same as the struct type [iceberg-rust]

2024-12-13 Thread via GitHub
Xuanwo commented on code in PR #795: URL: https://github.com/apache/iceberg-rust/pull/795#discussion_r1884848752 ## crates/iceberg/src/spec/values.rs: ## @@ -3439,11 +3443,13 @@ mod tests { "bar".to_string(), ))), None, +

Re: [PR] feat: TableMetadata Statistic Files [iceberg-rust]

2024-12-13 Thread via GitHub
c-thiel commented on code in PR #799: URL: https://github.com/apache/iceberg-rust/pull/799#discussion_r1884798642 ## crates/iceberg/src/spec/table_metadata.rs: ## @@ -158,11 +160,15 @@ pub struct TableMetadata { /// writers, but is not used when reading because reads use th

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
liurenjie1024 commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884790587 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations o

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
liurenjie1024 commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884789740 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations o

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
liurenjie1024 commented on PR #11772: URL: https://github.com/apache/iceberg/pull/11772#issuecomment-2542722986 > For example, I'm missing the [procedures](https://iceberg.apache.org/docs/latest/spark-procedures/), such as expire-snapshots, compaction etc. The reason I didn't add thi

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
liurenjie1024 commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884757874 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations o

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
liurenjie1024 commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884756239 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations o

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
sungwy commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884752146 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations of the i

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
liurenjie1024 commented on PR #11772: URL: https://github.com/apache/iceberg/pull/11772#issuecomment-2542710889 > Given iceberg-cpp is in development, should we add the columns for it even though it will be N for everything right now? I'm open to this, but currently this page lists ca

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
sungwy commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884750217 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations of the i

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
sungwy commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884751449 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations of the i

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
sungwy commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884748942 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations of the i

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
liurenjie1024 commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884751807 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations o

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
liurenjie1024 commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884751870 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations o

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
sungwy commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884751449 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations of the i

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
liurenjie1024 commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884751636 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations o

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
sungwy commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884751449 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations of the i

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
sungwy commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884750217 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations of the i

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
sungwy commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884748942 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations of the i

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
sungwy commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884748942 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations of the i

Re: [PR] feat: support create partition table for non REST catalog [iceberg-rust]

2024-12-13 Thread via GitHub
FANNG1 closed pull request #577: feat: support create partition table for non REST catalog URL: https://github.com/apache/iceberg-rust/pull/577

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
liurenjie1024 commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884744299 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations o

Re: [I] Data loss bug in MergeIntoCommand [iceberg]

2024-12-13 Thread via GitHub
BsoBird commented on issue #11765: URL: https://github.com/apache/iceberg/issues/11765#issuecomment-2542688203 @RussellSpitzer Sir.If you have time, please review this PR for me. I believe we need to warn users against doing this in the documentation. https://github.com/apache/iceberg

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
liurenjie1024 commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884744385 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations o

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-13 Thread via GitHub
dwilson1988 commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2542660861 @zeroshade - I'll take a look this weekend!

Re: [PR] Fix `Table.scan` to enable case sensitive argument [iceberg-python]

2024-12-13 Thread via GitHub
jiakai-li commented on code in PR #1423: URL: https://github.com/apache/iceberg-python/pull/1423#discussion_r1884721249 ## tests/table/test_init.py: ## @@ -310,6 +310,19 @@ def test_table_scan_row_filter(table_v2: Table) -> None: assert scan.filter(EqualTo("x", 10)).filter(

Re: [PR] Fix `Table.scan` to enable case sensitive argument [iceberg-python]

2024-12-13 Thread via GitHub
jiakai-li commented on PR #1423: URL: https://github.com/apache/iceberg-python/pull/1423#issuecomment-2542657647 Hey guys, thanks a lot for your kind guidance and great suggestions. I've updated the PR to: - Enable `Table.delete` and `Table.overwrite` operations to control case-sensitivi

Re: [PR] [WIP][Core] Restrict adding column of StructType with Empty Fields [iceberg]

2024-12-13 Thread via GitHub
singhpk234 commented on PR #11755: URL: https://github.com/apache/iceberg/pull/11755#issuecomment-2542608870 interesting cases @ebyhr ! [1] Struct of Struct and the inner struct is empty [2] The above handles only add column, we can land this situation for dropping to the column

Re: [PR] API: add hashcode cache in StructType [iceberg]

2024-12-13 Thread via GitHub
singhpk234 commented on PR #11764: URL: https://github.com/apache/iceberg/pull/11764#issuecomment-2542605844 > Pre-execution Preparation Time: the time interval from the first table load to the start of the first stage execution Scan Spec Time: added a timer to the method SparkPartitionin

Re: [PR] API: add hashcode cache in StructType [iceberg]

2024-12-13 Thread via GitHub
singhpk234 commented on code in PR #11764: URL: https://github.com/apache/iceberg/pull/11764#discussion_r1884691655 ## api/src/main/java/org/apache/iceberg/types/Types.java: ## @@ -824,7 +827,10 @@ public boolean equals(Object o) { @Override public int hashCode() { -

Re: [PR] Spark: Relativize in-memory paths for data file and rewritable delete file locations [iceberg]

2024-12-13 Thread via GitHub
github-actions[bot] commented on PR #11525: URL: https://github.com/apache/iceberg/pull/11525#issuecomment-2542578530 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] add Status data structure [iceberg-cpp]

2024-12-13 Thread via GitHub
zhjwpku commented on code in PR #8: URL: https://github.com/apache/iceberg-cpp/pull/8#discussion_r1884658885 ## api/iceberg/status.h: ## @@ -0,0 +1,435 @@ +// Copyright (c) 2011 The LevelDB Authors. All rights reserved. +// Use of this source code is governed by a BSD-style lice

Re: [PR] add Status data structure [iceberg-cpp]

2024-12-13 Thread via GitHub
zhjwpku commented on code in PR #8: URL: https://github.com/apache/iceberg-cpp/pull/8#discussion_r1884656709 ## src/common/CMakeLists.txt: ## @@ -0,0 +1,28 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE

Re: [PR] Spark: Change Delete granularity to file for Spark 3.5 [iceberg]

2024-12-13 Thread via GitHub
aokolnychyi commented on code in PR #11478: URL: https://github.com/apache/iceberg/pull/11478#discussion_r1884625538 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkWriteConf.java: ## @@ -719,7 +719,7 @@ public DeleteGranularity deleteGranularity() {

Re: [I] API table.scan does not conform to Iceberg spec for identity partition columns [iceberg-python]

2024-12-13 Thread via GitHub
gabeiglio commented on issue #1401: URL: https://github.com/apache/iceberg-python/issues/1401#issuecomment-2542474285 Im open for feedback but as I investigated this issue im inclined that the fix would need to be in [_task_to_record_batches](https://github.com/apache/iceberg-python/blob/a

Re: [PR] Spark 3.5: Fix comment and assertion mismatch in PartitionedWritesTestBase/TestRewritePositionDeleteFilesAction [iceberg]

2024-12-13 Thread via GitHub
szehon-ho commented on code in PR #11748: URL: https://github.com/apache/iceberg/pull/11748#discussion_r1884599985 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewritePositionDeleteFilesAction.java: ## @@ -275,7 +275,7 @@ public void testRewriteFilter()

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-13 Thread via GitHub
loicalleyne commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2542428675 LGTM 👍

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-13 Thread via GitHub
zeroshade commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2542436725 @dwilson1988 When you get a chance, can you take a look at the changes I made here. I liked your thought on isolating things, but there was still a bunch of specific options for partic

Re: [PR] Feat: support aliyun oss backend. [iceberg-go]

2024-12-13 Thread via GitHub
zeroshade commented on PR #216: URL: https://github.com/apache/iceberg-go/pull/216#issuecomment-2542403956 This seems generally good to me. Does Aliyun have something similar to how MinIO works for S3 that can be added to the integration tests to have CI testing the backend? i.e. is there a

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-13 Thread via GitHub
zeroshade commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2542399559 @loicalleyne following pyiceberg's example, I've added an option to force virtual addressing. That work for you?

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-13 Thread via GitHub
corleyma commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884536195 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-13 Thread via GitHub
corleyma commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884536195 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2024-12-13 Thread via GitHub
sopel39 commented on code in PR #11781: URL: https://github.com/apache/iceberg/pull/11781#discussion_r1884535343 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -257,17 +257,17 @@ private static class Task implements Supplier>>, Closeable { @Over

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-13 Thread via GitHub
corleyma commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884533676 ## pyiceberg/table/__init__.py: ## @@ -1423,6 +1451,66 @@ def plan_files(self) -> Iterable[FileScanTask]: for data_entry in data_entries ]

Re: [I] Data loss bug in MergeIntoCommand [iceberg]

2024-12-13 Thread via GitHub
RussellSpitzer commented on issue #11765: URL: https://github.com/apache/iceberg/issues/11765#issuecomment-2542309392 > However, sir, I might have discovered some issues. When executing the COW-MERGE-INTO command, Spark needs to use the ods_table twice. The first time is to match data

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2024-12-13 Thread via GitHub
RussellSpitzer commented on code in PR #11781: URL: https://github.com/apache/iceberg/pull/11781#discussion_r1884527019 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -257,17 +257,17 @@ private static class Task implements Supplier>>, Closeable {

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-13 Thread via GitHub
corleyma commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884524371 ## pyiceberg/table/__init__.py: ## @@ -1253,6 +1265,22 @@ def __init__( self.start = start or 0 self.length = length or data_file.file_size_in

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-13 Thread via GitHub
corleyma commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884523394 ## pyiceberg/table/__init__.py: ## @@ -1229,7 +1240,8 @@ def with_case_sensitive(self: S, case_sensitive: bool = True) -> S: class ScanTask(ABC): -pas

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2024-12-13 Thread via GitHub
sopel39 commented on code in PR #11781: URL: https://github.com/apache/iceberg/pull/11781#discussion_r1884520727 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -257,17 +257,17 @@ private static class Task implements Supplier>>, Closeable { @Over

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-13 Thread via GitHub
loicalleyne commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2542287393 Is it intended to not provide the choice between virtual hosted bucket addressing and path-style addressing? LGTM otherwise - the tests are passing :)

Re: [PR] ci(infra): Remove sha256 [iceberg-go]

2024-12-13 Thread via GitHub
zeroshade merged PR #226: URL: https://github.com/apache/iceberg-go/pull/226

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
RussellSpitzer commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884494856 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
RussellSpitzer commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884494396 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
RussellSpitzer commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884493485 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations

Re: [PR] Doc: Add staus page for different implementations. [iceberg]

2024-12-13 Thread via GitHub
RussellSpitzer commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884491976 ## site/docs/status.md: ## @@ -0,0 +1,362 @@ +--- +title: "Implementation Status" +--- + + +# Implementations Status + +Apache iceberg now has implementations

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-13 Thread via GitHub
zeroshade commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2542213486 @loicalleyne can you take a look at the latest changes I made here?

[I] Hivemetastore unable to create hive lock after upgrading from hivemetastore 3.1.3 to 4.0.0 during iceberg operations [iceberg]

2024-12-13 Thread via GitHub
mAlf1999 opened a new issue, #11784: URL: https://github.com/apache/iceberg/issues/11784 ### Apache Iceberg version 1.6.0 ### Query engine Spark ### Please describe the bug 🐞 We are currently using iceberg version 1.6.0 and have been successfully using it a

Re: [PR] Spark: Read DVs when reading from .position_deletes table [iceberg]

2024-12-13 Thread via GitHub
aokolnychyi commented on code in PR #11657: URL: https://github.com/apache/iceberg/pull/11657#discussion_r1884381201 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/DVIterator.java: ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] Core: Add TableUtil to provide access to a table's format version [iceberg]

2024-12-13 Thread via GitHub
aokolnychyi commented on code in PR #11620: URL: https://github.com/apache/iceberg/pull/11620#discussion_r1884369789 ## core/src/main/java/org/apache/iceberg/SerializableTable.java: ## @@ -158,6 +160,21 @@ public Map properties() { return properties; } + public int fo

Re: [PR] Core: Add TableUtil to provide access to a table's format version [iceberg]

2024-12-13 Thread via GitHub
aokolnychyi commented on code in PR #11620: URL: https://github.com/apache/iceberg/pull/11620#discussion_r1884367833 ## core/src/main/java/org/apache/iceberg/TableUtil.java: ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contri

Re: [PR] Hive: Add Hive 4 support and remove Hive runtime [iceberg]

2024-12-13 Thread via GitHub
rdblue commented on code in PR #11750: URL: https://github.com/apache/iceberg/pull/11750#discussion_r1884343692 ## gradle.properties: ## @@ -18,8 +18,8 @@ jmhJsonOutputPath=build/reports/jmh/results.json jmhIncludeRegex=.* systemProp.defaultFlinkVersions=1.20 systemProp.known

Re: [PR] Core: Add Variant implementation to read serialized objects [iceberg]

2024-12-13 Thread via GitHub
rdblue commented on PR #11415: URL: https://github.com/apache/iceberg/pull/11415#issuecomment-2542017239 The Spark failures are a port conflict. I think it's unrelated to these changes. We'll see the next time CI runs (I'm sure we'll have more changes to trigger them)

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2024-12-13 Thread via GitHub
osscm commented on code in PR #11781: URL: https://github.com/apache/iceberg/pull/11781#discussion_r1884308287 ## core/src/main/java/org/apache/iceberg/util/ParallelIterable.java: ## @@ -257,17 +257,17 @@ private static class Task implements Supplier>>, Closeable { @Overri

Re: [PR] feat: TableMetadata Statistic Files [iceberg-rust]

2024-12-13 Thread via GitHub
c-thiel commented on code in PR #799: URL: https://github.com/apache/iceberg-rust/pull/799#discussion_r1884298259 ## crates/iceberg/src/catalog/mod.rs: ## @@ -446,6 +446,30 @@ pub enum TableUpdate { /// Properties to remove removals: Vec, }, +/// Set s

Re: [PR] feat(puffin): Parse Puffin FileMetadata [iceberg-rust]

2024-12-13 Thread via GitHub
c-thiel commented on PR #765: URL: https://github.com/apache/iceberg-rust/pull/765#issuecomment-2541965685 @fqaiser94, just added the higher level statistic files in https://github.com/apache/iceberg-rust/pull/799 FYI. I would guess you would end up building those soon too.

[PR] feat: TableMetadata Statistics [iceberg-rust]

2024-12-13 Thread via GitHub
c-thiel opened a new pull request, #799: URL: https://github.com/apache/iceberg-rust/pull/799 Adds `StatisticFile` and `PartitionStatisticsFile` to spec, builder and REST TableUpdate.

Re: [PR] add .gitignore [iceberg-cpp]

2024-12-13 Thread via GitHub
pitrou commented on code in PR #9: URL: https://github.com/apache/iceberg-cpp/pull/9#discussion_r1884280319 ## .gitignore: ## @@ -0,0 +1,18 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distrib

[I] HiveCatalog incorrectly uses FileIOTracker [iceberg]

2024-12-13 Thread via GitHub
tom-s-powell opened a new issue, #11783: URL: https://github.com/apache/iceberg/issues/11783 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine Spark ### Please describe the bug 🐞 Encountering an issue with `HiveCatalog` and `S3FileIO`. I bel

Re: [PR] feat: support to append delete type data file [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME commented on PR #798: URL: https://github.com/apache/iceberg-rust/pull/798#issuecomment-2541922752 cc @liurenjie1024 @Xuanwo @Fokko @sdd

[PR] feat: support to append delete type data file [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME opened a new pull request, #798: URL: https://github.com/apache/iceberg-rust/pull/798 This PR adds support for appending delete type data files.

Re: [I] [Investigate] Whether `data_files` metadata table requires both pyarrow and s3fs [iceberg-python]

2024-12-13 Thread via GitHub
kevinjqliu commented on issue #1317: URL: https://github.com/apache/iceberg-python/issues/1317#issuecomment-2541919224 Hi @jiakai-li thanks for looking into this! > It was run with the s3fs module removed from the environment, which runs ok: I think you'd want to remove `pyarrow`

[PR] [INFRA] Remove sha256 [iceberg-go]

2024-12-13 Thread via GitHub
kevinjqliu opened a new pull request, #226: URL: https://github.com/apache/iceberg-go/pull/226 Sha512 is enough; Sha256 is not necessary. devlist: https://lists.apache.org/thread/rsl3rj9rcqvchb8dqr8tjky97rt5pm22 Part of #204

Re: [I] Discussion: make DataFile Serializable && Deserializable [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME commented on issue #774: URL: https://github.com/apache/iceberg-rust/issues/774#issuecomment-2541894094 > Hey @ZENOTME thanks for raising this. > > Technically the `Datafile` is already serializable, you can encode it into Iceberg Avro :) I know how this works in Java and Pyth

Re: [PR] feat: expose _serde::DataFile [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME commented on PR #797: URL: https://github.com/apache/iceberg-rust/pull/797#issuecomment-2541894721 cc @liurenjie1024 @Xuanwo @Fokko @sdd

[PR] fix: day transform compute [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME opened a new pull request, #796: URL: https://github.com/apache/iceberg-rust/pull/796 https://github.com/apache/iceberg-rust/pull/479 changed the result type from int to date. We should also change the computed result to match; otherwise, it will cause an inconsistency error. E.g

Re: [PR] fix: day transform compute [iceberg-rust]

2024-12-13 Thread via GitHub
ZENOTME commented on PR #796: URL: https://github.com/apache/iceberg-rust/pull/796#issuecomment-2541879394 cc @liurenjie1024 @Fokko @Xuanwo @sdd

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-13 Thread via GitHub
loicalleyne commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2541868880 [s3.UsePathStyle](https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/s3#Options.UsePathStyle ) ``` // Allows you to enable the client to use path-style addressing, i.e.,

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-13 Thread via GitHub
kevinjqliu commented on code in PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#discussion_r1884198201 ## pyiceberg/table/__init__.py: ## @@ -191,6 +193,15 @@ class TableProperties: DELETE_MODE_MERGE_ON_READ = "merge-on-read" DELETE_MODE_DEFAULT = DEL

Re: [PR] Core: Add missing REST endpoint definitions [iceberg]

2024-12-13 Thread via GitHub
ajreid21 commented on code in PR #11756: URL: https://github.com/apache/iceberg/pull/11756#discussion_r1884198902 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -138,11 +138,13 @@ public class RESTSessionCatalog extends BaseViewSessionCatalog

Re: [PR] Core: Add missing REST endpoint definitions [iceberg]

2024-12-13 Thread via GitHub
ajreid21 commented on PR #11756: URL: https://github.com/apache/iceberg/pull/11756#issuecomment-2541836163 @nastra I addressed the comments and added the new checks if you want to take another look. Thanks.

Re: [PR] Core: Add missing REST endpoint definitions [iceberg]

2024-12-13 Thread via GitHub
ajreid21 commented on code in PR #11756: URL: https://github.com/apache/iceberg/pull/11756#discussion_r1884197553 ## core/src/main/java/org/apache/iceberg/rest/Endpoint.java: ## @@ -46,6 +46,8 @@ public class Endpoint { Endpoint.create("POST", ResourcePaths.V1_NAMESPACE_P

Re: [PR] Fix `Table.scan` to enable case sensitive argument [iceberg-python]

2024-12-13 Thread via GitHub
kevinjqliu commented on code in PR #1423: URL: https://github.com/apache/iceberg-python/pull/1423#discussion_r1884193700 ## tests/table/test_init.py: ## @@ -310,6 +310,19 @@ def test_table_scan_row_filter(table_v2: Table) -> None: assert scan.filter(EqualTo("x", 10)).filter

Re: [PR] Remove unneeded partitoning [iceberg-python]

2024-12-13 Thread via GitHub
kevinjqliu commented on PR #1417: URL: https://github.com/apache/iceberg-python/pull/1417#issuecomment-2541817185 Looks like some tests are failing due to number of partitions being hardcoded in tests ``` === short test summary info ===

Re: [PR] fix: field id in name mapping should be optional [iceberg-python]

2024-12-13 Thread via GitHub
kevinjqliu commented on PR #1426: URL: https://github.com/apache/iceberg-python/pull/1426#issuecomment-2541812934 @barronw looks like there's a linter issue; could you try running `make lint`?

Re: [PR] IO Implementation using Go CDK [iceberg-go]

2024-12-13 Thread via GitHub
loicalleyne commented on PR #176: URL: https://github.com/apache/iceberg-go/pull/176#issuecomment-2541810573 My understanding is that it's just another property to pass in `props`. Would also have to add it as a recognized property/constant in io/s3.go I should think.

Re: [I] PyIceberg appending data creates snapshots incompatible with Athena/Spark [iceberg-python]

2024-12-13 Thread via GitHub
kevinjqliu commented on issue #1424: URL: https://github.com/apache/iceberg-python/issues/1424#issuecomment-2541805668 Hi @Samreay, thanks for reporting this issue! Very odd that it's 1+MAX_VALUE. I took a look at the write path and didn't see anything that stood out that would cause
