[I] Some column statistics are missing after writing data to a table [iceberg-python]

2025-01-01 Thread via GitHub
rotem-ad opened a new issue, #1482: URL: https://github.com/apache/iceberg-python/issues/1482 ### Apache Iceberg version 0.8.1 (latest release) ### Please describe the bug 🐞 Following this [Slack thread](https://apache-iceberg.slack.com/archives/C025PH0G1D4/p173470407710

[I] Column Names in REST calls [iceberg]

2025-01-01 Thread via GitHub
kpkab opened a new issue, #11898: URL: https://github.com/apache/iceberg/issues/11898 ### Query engine Spark, Trino, Snowflake, EMR ### Question Hi, Happy New Year! I would like to know if there's a way to capture the list of columns that is being accessed by a user. F

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
wgtmac commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900365823 ## api/iceberg/visibility.h: ## @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NO

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
wgtmac commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900363610 ## api/iceberg/visibility.h: ## @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NO

[PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-01-01 Thread via GitHub
rshkv opened a new pull request, #863: URL: https://github.com/apache/iceberg-rust/pull/863 Re #823. This adds support for the [Manifest Entries (docs)](https://iceberg.apache.org/docs/latest/spark-queries/#entries) which lists entries in the current snapshot's manifest files. I'

Re: [PR] feat: support serialize/deserialize DataFile into avro bytes [iceberg-rust]

2025-01-01 Thread via GitHub
Xuanwo commented on code in PR #797: URL: https://github.com/apache/iceberg-rust/pull/797#discussion_r1900533257 ## crates/iceberg/src/spec/manifest.rs: ## @@ -1189,6 +1212,49 @@ impl DataFile { self.sort_order_id } } + +/// Convert data files to avro bytes and wr

Re: [PR] fix: parse var len of decimal for parquet statistic [iceberg-rust]

2025-01-01 Thread via GitHub
Xuanwo commented on PR #837: URL: https://github.com/apache/iceberg-rust/pull/837#issuecomment-2567369698 > Hi @liurenjie1024 @Xuanwo , is there other question about this PR? Hi, I don't quite understand what issue this PR is trying to address. The test added in this PR is incorrect,

Re: [PR] fix: parse var len of decimal for parquet statistic [iceberg-rust]

2025-01-01 Thread via GitHub
ZENOTME commented on PR #837: URL: https://github.com/apache/iceberg-rust/pull/837#issuecomment-2567382899 > > Hi @liurenjie1024 @Xuanwo , is there other question about this PR? > > Hi, I don't quite understand what issue this PR is trying to address. The test added in this PR is inco

Re: [PR] fix: parse var len of decimal for parquet statistic [iceberg-rust]

2025-01-01 Thread via GitHub
Xuanwo commented on PR #837: URL: https://github.com/apache/iceberg-rust/pull/837#issuecomment-2567390885 Thank you, @ZENOTME, for the clarification. Tagging @liurenjie1024 and @Fokko to review this again. -- This is an automated message from the Apache Git Service. To respond to the mess

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
wgtmac commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900615479 ## cmake_modules/BuildUtils.cmake: ## @@ -201,23 +202,29 @@ function(ADD_ICEBERG_LIB LIB_NAME) PUBLIC "$") endif() -install(TARGET

Re: [PR] feat(puffin): Parse Puffin FileMetadata [iceberg-rust]

2025-01-01 Thread via GitHub
liurenjie1024 commented on code in PR #765: URL: https://github.com/apache/iceberg-rust/pull/765#discussion_r1900504741 ## crates/iceberg/src/puffin/metadata.rs: ## @@ -0,0 +1,797 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license a

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
kou commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900516441 ## cmake_modules/BuildUtils.cmake: ## @@ -201,23 +202,29 @@ function(ADD_ICEBERG_LIB LIB_NAME) PUBLIC "$") endif() -install(TARGETS $

Re: [I] Snapshot Testing for Integration Tests [iceberg-rust]

2025-01-01 Thread via GitHub
liurenjie1024 commented on issue #803: URL: https://github.com/apache/iceberg-rust/issues/803#issuecomment-2567254972 Thanks @feniljain for raising this. I'm fine with snapshot testing, but currently didn't come up with a concrete example to do it. -- This is an automated message from the

Re: [PR] feat: support serialize/deserialize DataFile into avro bytes [iceberg-rust]

2025-01-01 Thread via GitHub
Xuanwo commented on code in PR #797: URL: https://github.com/apache/iceberg-rust/pull/797#discussion_r1900575269 ## crates/iceberg/src/spec/manifest.rs: ## @@ -1237,7 +1237,7 @@ pub fn write_data_files_to_avro( } /// Parse data files from avro bytes. -pub fn parse_data_file_

Re: [PR] feat: support serialize/deserialize DataFile into avro bytes [iceberg-rust]

2025-01-01 Thread via GitHub
ZENOTME commented on code in PR #797: URL: https://github.com/apache/iceberg-rust/pull/797#discussion_r1900577097 ## crates/iceberg/src/spec/manifest.rs: ## @@ -1237,7 +1237,7 @@ pub fn write_data_files_to_avro( } /// Parse data files from avro bytes. -pub fn parse_data_file

Re: [PR] Doc: Do Not Modify the Source Data Table During MergeIntoCommand Exec… [iceberg]

2025-01-01 Thread via GitHub
BsoBird closed pull request #11787: Doc: Do Not Modify the Source Data Table During MergeIntoCommand Exec… URL: https://github.com/apache/iceberg/pull/11787 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] Data loss bug in MergeIntoCommand [iceberg]

2025-01-01 Thread via GitHub
BsoBird closed issue #11765: Data loss bug in MergeIntoCommand URL: https://github.com/apache/iceberg/issues/11765 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscrib

[PR] Spark 3.5, Core: Supplement test case for metadata_log_entries after expire snapshot [iceberg]

2025-01-01 Thread via GitHub
hantangwangd opened a new pull request, #11901: URL: https://github.com/apache/iceberg/pull/11901 When we query `metadata_log_entries` after expiring some intermediate snapshots, the log entries before the expired snapshot would not get their corresponding snapshots and then would show `nul

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
kou commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900516042 ## api/iceberg/visibility.h: ## @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTIC

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
kou commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900516103 ## api/iceberg/visibility.h: ## @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTIC

Re: [PR] Kafka-connect-runtime: remove code duplications in integration tests [iceberg]

2025-01-01 Thread via GitHub
wombatu-kun commented on code in PR #11883: URL: https://github.com/apache/iceberg/pull/11883#discussion_r1900557397 ## kafka-connect/kafka-connect-runtime/src/integration/java/org/apache/iceberg/connect/IntegrationMultiTableTest.java: ## @@ -20,55 +20,27 @@ import static org

Re: [PR] Kafka-connect-runtime: remove code duplications in integration tests [iceberg]

2025-01-01 Thread via GitHub
wombatu-kun commented on code in PR #11883: URL: https://github.com/apache/iceberg/pull/11883#discussion_r1900557538 ## kafka-connect/kafka-connect-runtime/src/integration/java/org/apache/iceberg/connect/IntegrationTestBase.java: ## @@ -84,10 +101,17 @@ public void baseBefore()

Re: [I] Fields are out of order in equality delete files if equality fields are not together [iceberg]

2025-01-01 Thread via GitHub
beyond-up commented on issue #11891: URL: https://github.com/apache/iceberg/issues/11891#issuecomment-2567262917 > > But this equality delete file is out of order and this record and still be read in iceberg table > > Equality delete file written had ptr as **111** instead of **202412

Re: [PR] feat: Support metadata table "Manifests" [iceberg-rust]

2025-01-01 Thread via GitHub
Xuanwo merged PR #861: URL: https://github.com/apache/iceberg-rust/pull/861 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

Re: [PR] feat: support serialize/deserialize DataFile into avro bytes [iceberg-rust]

2025-01-01 Thread via GitHub
Xuanwo merged PR #797: URL: https://github.com/apache/iceberg-rust/pull/797 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-01-01 Thread via GitHub
Xuanwo commented on code in PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#discussion_r1900592922 ## crates/iceberg/src/metadata_table.rs: ## @@ -0,0 +1,1031 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreemen

Re: [I] [Question] Why does plan_files not seem to get multi-threading improvement [iceberg-python]

2025-01-01 Thread via GitHub
gitzwz commented on issue #1479: URL: https://github.com/apache/iceberg-python/issues/1479#issuecomment-2567363410 The most time-consuming process is this : ```Python for manifest_entry in chain( *executor.map( lambda args: _open_manifest(*arg

[PR] Doc: Add missing fields to metadata tables [iceberg]

2025-01-01 Thread via GitHub
ebyhr opened a new pull request, #11897: URL: https://github.com/apache/iceberg/pull/11897 ``` spark-sql (default)> DESC default.test.manifests; content int path string length bigint partition_spec_id int added_snapshot_id big

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
wgtmac commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900350764 ## cmake_modules/BuildUtils.cmake: ## @@ -182,13 +183,7 @@ function(ADD_ICEBERG_LIB LIB_NAME) target_include_directories(${LIB_NAME}_static PRIVATE ${ARG_PRIVATE_I

Re: [I] Support for timestamp downcasting when loading data to iceberg tables [iceberg-python]

2025-01-01 Thread via GitHub
rotem-ad commented on issue #1045: URL: https://github.com/apache/iceberg-python/issues/1045#issuecomment-2566918400 I've faced the same issue when loading data using [Table.add_files](https://py.iceberg.apache.org/reference/pyiceberg/table/#pyiceberg.table.Table.add_files) method. It fail

Re: [PR] Core: add variant type support [iceberg]

2025-01-01 Thread via GitHub
rdblue commented on code in PR #11831: URL: https://github.com/apache/iceberg/pull/11831#discussion_r1900477885 ## api/src/main/java/org/apache/iceberg/transforms/Identity.java: ## @@ -39,7 +42,7 @@ class Identity implements Transform { @Deprecated public static Identity

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
lidavidm commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900474985 ## api/iceberg/visibility.h: ## @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the

Re: [PR] Kafka-connect-runtime: remove code duplications in integration tests [iceberg]

2025-01-01 Thread via GitHub
bryanck commented on code in PR #11883: URL: https://github.com/apache/iceberg/pull/11883#discussion_r1900479843 ## kafka-connect/kafka-connect-runtime/src/integration/java/org/apache/iceberg/connect/IntegrationMultiTableTest.java: ## @@ -20,55 +20,27 @@ import static org.ass

Re: [PR] Flink: Replace use of deprecated methods [iceberg]

2025-01-01 Thread via GitHub
github-actions[bot] commented on PR #11658: URL: https://github.com/apache/iceberg/pull/11658#issuecomment-2567194839 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] REST Spec: Server-side Metadata Tables [iceberg]

2025-01-01 Thread via GitHub
github-actions[bot] commented on issue #10645: URL: https://github.com/apache/iceberg/issues/10645#issuecomment-2567194812 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Kafka-connect-runtime: remove code duplications in integration tests [iceberg]

2025-01-01 Thread via GitHub
bryanck commented on code in PR #11883: URL: https://github.com/apache/iceberg/pull/11883#discussion_r1900481220 ## kafka-connect/kafka-connect-runtime/src/integration/java/org/apache/iceberg/connect/IntegrationTestBase.java: ## @@ -84,10 +101,17 @@ public void baseBefore() {

Re: [PR] Kafka-connect-runtime: remove code duplications in integration tests [iceberg]

2025-01-01 Thread via GitHub
bryanck commented on PR #11883: URL: https://github.com/apache/iceberg/pull/11883#issuecomment-2567193909 Thanks @wombatu-kun , it mostly looks good, I added a couple of comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
kou commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900344597 ## cmake_modules/BuildUtils.cmake: ## @@ -182,13 +183,7 @@ function(ADD_ICEBERG_LIB LIB_NAME) target_include_directories(${LIB_NAME}_static PRIVATE ${ARG_PRIVATE_INCL

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
wgtmac commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900359799 ## api/iceberg/visibility.h: ## @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NO

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
wgtmac commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900360320 ## cmake_modules/BuildUtils.cmake: ## @@ -182,13 +183,7 @@ function(ADD_ICEBERG_LIB LIB_NAME) target_include_directories(${LIB_NAME}_static PRIVATE ${ARG_PRIVATE_I

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
kou commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900361184 ## cmake_modules/BuildUtils.cmake: ## @@ -182,13 +183,7 @@ function(ADD_ICEBERG_LIB LIB_NAME) target_include_directories(${LIB_NAME}_static PRIVATE ${ARG_PRIVATE_INCL

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
kou commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900361644 ## api/iceberg/visibility.h: ## @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTIC

[I] [Java API] Rough edges when partitioning by time types [iceberg]

2025-01-01 Thread via GitHub
ahmedabu98 opened a new issue, #11899: URL: https://github.com/apache/iceberg/issues/11899 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine Other ### Please describe the bug 🐞 We've been developing an Iceberg connector at [Apache Beam](htt

Re: [PR] ParallelIterable: Queue Size w/ O(1) [iceberg]

2025-01-01 Thread via GitHub
shanielh commented on PR #11895: URL: https://github.com/apache/iceberg/pull/11895#issuecomment-2566977078 > LGTM as well ! Thank you for the fix ! > > > have a JFR dump that shows this method uses 35% CPU utilization, this > > is why I think this commit is important > > inte

[I] [Java API] Rough edges when recreating a DataFile that is partitioned by month or hour [iceberg]

2025-01-01 Thread via GitHub
ahmedabu98 opened a new issue, #11900: URL: https://github.com/apache/iceberg/issues/11900 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine None ### Please describe the bug 🐞 Part of our workflow in Apache Beam's Iceberg connector requires

Re: [PR] Add iceberg_arrow library [iceberg-cpp]

2025-01-01 Thread via GitHub
kou commented on code in PR #6: URL: https://github.com/apache/iceberg-cpp/pull/6#discussion_r1900355716 ## api/iceberg/visibility.h: ## @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTIC

Re: [PR] Spec: Support geo type [iceberg]

2025-01-01 Thread via GitHub
paleolimbot commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1900452764 ## format/spec.md: ## @@ -584,8 +589,8 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | _optional