Re: [I] Remove `paste` dependency. [iceberg-rust]

2025-03-11 Thread via GitHub
Xuanwo commented on issue #1064: URL: https://github.com/apache/iceberg-rust/issues/1064#issuecomment-2712914649 1. iceberg-rust needs to work under stable rust, so we can't. 2. `eval-macro` is relatively new. Do we have any other options? 3. It looks like we only use `paste!()` here:

Re: [PR] chore(deps): Bump aws-sdk-glue from 1.82.0 to 1.84.0 [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on code in PR #1057: URL: https://github.com/apache/iceberg-rust/pull/1057#discussion_r1988540238 ## Cargo.toml: ## @@ -53,7 +53,7 @@ async-stream = "0.3.5" async-trait = "0.1.86" async-std = "1.12" aws-config = "1" -aws-sdk-glue = "1.39" +aws-sdk-glue

Re: [PR] Flink: Support source watermark for flink sql windows [iceberg]

2025-03-11 Thread via GitHub
pvary commented on code in PR #12191: URL: https://github.com/apache/iceberg/pull/12191#discussion_r1988457157 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceSql.java: ## @@ -53,7 +55,11 @@ public class TestIcebergSourceSql extends TestSqlBas

[PR] Parquet: Fix Reader leak by removing useless copy [iceberg]

2025-03-11 Thread via GitHub
zizon opened a new pull request, #12079: URL: https://github.com/apache/iceberg/pull/12079 The ReadConf copy constructor will nullify the reader of source, leaving the reader of original unclosed -- This is an automated message from the Apache Git Service. To respond to the message, pleas

Re: [I] Remove `paste` dependency. [iceberg-rust]

2025-03-11 Thread via GitHub
Xuanwo commented on issue #1064: URL: https://github.com/apache/iceberg-rust/issues/1064#issuecomment-2712784549 > Does the crate really need maintain? It's really easy, outputs deterministic results, and has no runtime dependencies. Hi, I fully agree with your statements here.

Re: [PR] chore(deps): Bump aws-sdk-glue from 1.82.0 to 1.84.0 [iceberg-rust]

2025-03-11 Thread via GitHub
Xuanwo commented on code in PR #1057: URL: https://github.com/apache/iceberg-rust/pull/1057#discussion_r1988493490 ## Cargo.toml: ## @@ -53,7 +53,7 @@ async-stream = "0.3.5" async-trait = "0.1.86" async-std = "1.12" aws-config = "1" -aws-sdk-glue = "1.39" +aws-sdk-glue = "1.8

Re: [I] Missing records after compaction using `rewrite_data_files` [iceberg]

2025-03-11 Thread via GitHub
github-actions[bot] commented on issue #11014: URL: https://github.com/apache/iceberg/issues/11014#issuecomment-2712135191 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Remove `paste` dependency. [iceberg-rust]

2025-03-11 Thread via GitHub
sundy-li commented on issue #1064: URL: https://github.com/apache/iceberg-rust/issues/1064#issuecomment-2712896274 There are three approaches 1. Use rust nightly features [concat_idents!](https://doc.rust-lang.org/std/macro.concat_idents.html) instead 2. Use other similar crates li

Re: [PR] Build: Bump getdaft from 0.4.4 to 0.4.6 [iceberg-python]

2025-03-11 Thread via GitHub
samster25 commented on PR #1758: URL: https://github.com/apache/iceberg-python/pull/1758#issuecomment-2712903045 @Fokko https://github.com/apache/iceberg-python/pull/1780 should have that fix!

Re: [PR] fix: refine doc for write support [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 merged PR #999: URL: https://github.com/apache/iceberg-rust/pull/999

Re: [I] Replace parquet metadata thrift version with in memory version. [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on issue #1004: URL: https://github.com/apache/iceberg-rust/issues/1004#issuecomment-2713399676 Hi, @jonathanc-n I found [this method](https://docs.rs/parquet/latest/parquet/file/metadata/struct.ParquetMetaDataReader.html#method.decode_metadata) in `parquet` crate. I

Re: [PR] Core: Add KLL Datasketch as standard blob types to puffin file [iceberg]

2025-03-11 Thread via GitHub
deniskuzZ commented on PR #8202: URL: https://github.com/apache/iceberg/pull/8202#issuecomment-2713099103 hi @nastra, could you please help push it forward? I've updated the PR to include the KLL sketch only.

Re: [PR] Spark: Support singular form of years, months, days, and hours functions [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on code in PR #12117: URL: https://github.com/apache/iceberg/pull/12117#discussion_r1989631278 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/functions/DaysFunction.java: ## @@ -31,6 +31,8 @@ * A Spark function implementation for the Iceber

Re: [I] How to understand "Partition evolution is a metadata operation and does not eagerly rewrite files." [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on issue #12492: URL: https://github.com/apache/iceberg/issues/12492#issuecomment-2714892908 Yep rewrite_data_files or doing a Copy on Write update. Anything that would modify or rewrite the old datafiles would default to writing new versions of the files in the tab
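The point made in this reply can be pictured with a toy model. The sketch below is not the Iceberg API; `ToyTable`, `DataFile`, and their methods are hypothetical names used only to illustrate the idea that evolving the partition spec changes metadata alone, while existing files keep their old spec until something rewrites them.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    path: str
    spec_id: int  # ID of the partition spec the file was written with


@dataclass
class ToyTable:
    # Hypothetical toy model, not the Iceberg API.
    specs: dict = field(default_factory=lambda: {0: "day(ts)"})
    default_spec_id: int = 0
    files: list = field(default_factory=list)

    def append(self, path: str) -> None:
        # New files are always written with the current default spec.
        self.files.append(DataFile(path, self.default_spec_id))

    def evolve_spec(self, new_spec: str) -> None:
        # Metadata-only change: register a spec, touch no data files.
        new_id = max(self.specs) + 1
        self.specs[new_id] = new_spec
        self.default_spec_id = new_id

    def rewrite_data_files(self) -> None:
        # Rewriting replaces old files with new ones in the current spec.
        self.files = [
            DataFile(f"rewritten-{f.path}", self.default_spec_id)
            for f in self.files
        ]


table = ToyTable()
table.append("a.parquet")
table.evolve_spec("hour(ts)")
table.append("b.parquet")
before = [f.spec_id for f in table.files]  # old file still uses spec 0
table.rewrite_data_files()
after = [f.spec_id for f in table.files]   # every file now uses spec 1
print(before, after)
```

In this model the first file keeps spec 0 after the spec evolves, and only the explicit rewrite moves it to spec 1.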

Re: [I] Transactions do not support Upsert [iceberg-python]

2025-03-11 Thread via GitHub
koenvo commented on issue #1776: URL: https://github.com/apache/iceberg-python/issues/1776#issuecomment-2714877509 I would like to take a look at this one.

Re: [PR] Spark: Support singular form of years, months, days, and hours functions [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on PR #12117: URL: https://github.com/apache/iceberg/pull/12117#issuecomment-2714905429 I took a brief look through the whole PR. I really am a bit worried about the scope of code required to make this change and I'm not sure it matches the benefit of being able to

Re: [PR] API: Speed up Timestamps#toHumanString [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on PR #12447: URL: https://github.com/apache/iceberg/pull/12447#issuecomment-2714915616 @suneet-s What is the motivation for this change? Is toHumanString on a critical path somewhere? I don't think I have an issue with putting in a faster implementation as long as

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-03-11 Thread via GitHub
zratkai commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r1989287785 ## hive-metastore/src/main/java/org/apache/iceberg/hive/IcebergTableConverter.java: ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-03-11 Thread via GitHub
zratkai commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r1989297201 ## hive-metastore/src/main/java/org/apache/iceberg/hive/IcebergTableConverter.java: ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Core: Bulk deletion in RemoveSnapshots [iceberg]

2025-03-11 Thread via GitHub
pvary commented on code in PR #11837: URL: https://github.com/apache/iceberg/pull/11837#discussion_r1989403001 ## core/src/main/java/org/apache/iceberg/FileCleanupStrategy.java: ## @@ -23,15 +23,25 @@ import java.util.function.Consumer; import org.apache.iceberg.avro.Avro; im

Re: [I] Spark can't get information from metadata tables [iceberg]

2025-03-11 Thread via GitHub
singhpk234 commented on issue #12466: URL: https://github.com/apache/iceberg/issues/12466#issuecomment-2714622891 rest spec, just states 404 for table not found : https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L967 But I see where you are coming from you a

Re: [PR] Core: lazy init workerPool [iceberg]

2025-03-11 Thread via GitHub
deniskuzZ commented on code in PR #12427: URL: https://github.com/apache/iceberg/pull/12427#discussion_r1989472124 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -197,7 +198,7 @@ protected String targetBranch() { } protected ExecutorService workerP

Re: [PR] Core: lazy init workerPool [iceberg]

2025-03-11 Thread via GitHub
deniskuzZ commented on code in PR #12427: URL: https://github.com/apache/iceberg/pull/12427#discussion_r1989474677 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -197,7 +198,7 @@ protected String targetBranch() { } protected ExecutorService workerP

Re: [PR] Spark: Support singular form of years, months, days, and hours functions [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on code in PR #12117: URL: https://github.com/apache/iceberg/pull/12117#discussion_r1989620464 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestSystemFunctionPushDownDQL.java: ## @@ -84,20 +84,24 @@ public void removeT

Re: [PR] chore(deps): Bump tokio from 1.43.0 to 1.44.0 [iceberg-rust]

2025-03-11 Thread via GitHub
dependabot[bot] commented on PR #1058: URL: https://github.com/apache/iceberg-rust/pull/1058#issuecomment-2713194331 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version

Re: [I] Remove `paste` dependency. [iceberg-rust]

2025-03-11 Thread via GitHub
TennyZhuang commented on issue #1064: URL: https://github.com/apache/iceberg-rust/issues/1064#issuecomment-2712476091 Does the crate really need maintain? It's really easy, outputs deterministic results, and has no runtime dependencies.

Re: [I] Remove `paste` dependency. [iceberg-rust]

2025-03-11 Thread via GitHub
TennyZhuang commented on issue #1064: URL: https://github.com/apache/iceberg-rust/issues/1064#issuecomment-2713596239 4. Fork the crate and maintain it. (In fact, it’s likely that no work need to be done.)

Re: [I] Spark Procedure Azure Exception Signed expiry time must be after signed start time [iceberg]

2025-03-11 Thread via GitHub
David-N-Perkins commented on issue #12446: URL: https://github.com/apache/iceberg/issues/12446#issuecomment-2713875111 After more testing, this seems to be related to how long the Spark job takes. I don't know the exact cutoff, but shorter jobs that take under 30 minutes work fine. Longer ones f

[PR] chore(deps): Bump once_cell from 1.20.3 to 1.21.0 [iceberg-rust]

2025-03-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1070: URL: https://github.com/apache/iceberg-rust/pull/1070 Bumps [once_cell](https://github.com/matklad/once_cell) from 1.20.3 to 1.21.0. Changelog Sourced from https://github.com/matklad/once_cell/blob/master/CHANGELOG.md once_cell's

Re: [PR] Update dependabot to update lock file only [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 merged PR #1068: URL: https://github.com/apache/iceberg-rust/pull/1068

[I] Provide access to `context` of `iceberg::error::Error` [iceberg-rust]

2025-03-11 Thread via GitHub
jshmchenxi opened a new issue, #1071: URL: https://github.com/apache/iceberg-rust/issues/1071 ### Is your feature request related to a problem or challenge? Currently, only `kind` and `message` of `iceberg::error::Error` are accessible by users. Some users might want to handle errors

Re: [I] Provide access to `context` of `iceberg::error::Error` [iceberg-rust]

2025-03-11 Thread via GitHub
Xuanwo commented on issue #1071: URL: https://github.com/apache/iceberg-rust/issues/1071#issuecomment-2714693408 Hi, context is not designed to be accessible to users, and if we expose it this way, we may unintentionally introduce breaking changes. > retry based on status code of REST

Re: [PR] refactor(manifests): consolidate ManifestEntryV1 and V2 [iceberg-go]

2025-03-11 Thread via GitHub
zeroshade commented on code in PR #327: URL: https://github.com/apache/iceberg-go/pull/327#discussion_r1989727738 ## manifest.go: ## @@ -1477,131 +1470,68 @@ func (d *dataFile) EqualityFieldIDs() []int { func (d *dataFile) SortOrderID() *int { return d.SortOrder } -// Manif

Re: [PR] feat(table): Add computation of iceberg stats from parquet files [iceberg-go]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #329: URL: https://github.com/apache/iceberg-go/pull/329#discussion_r1989722553 ## table/arrow_utils.go: ## @@ -892,3 +899,356 @@ func ToRequestedSchema(ctx context.Context, requested, fileSchema *iceberg.Schem return out, nil } + +

Re: [PR] refactor(manifests): consolidate ManifestEntryV1 and V2 [iceberg-go]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #327: URL: https://github.com/apache/iceberg-go/pull/327#discussion_r1989735973 ## manifest.go: ## @@ -1477,131 +1470,68 @@ func (d *dataFile) EqualityFieldIDs() []int { func (d *dataFile) SortOrderID() *int { return d.SortOrder } -// Mani

Re: [PR] chore(deps): Bump crate-ci/typos from 1.30.0 to 1.30.2 [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 merged PR #1069: URL: https://github.com/apache/iceberg-rust/pull/1069

Re: [PR] feat: add apply in transaction to support stack action [iceberg-rust]

2025-03-11 Thread via GitHub
ZENOTME commented on PR #949: URL: https://github.com/apache/iceberg-rust/pull/949#issuecomment-2713273786 I think this PR is ready to review. cc @Fokko @liurenjie1024 @Xuanwo @sdd

Re: [PR] Docs: Update Iceberg talks with recent Iceberg meetup sessions [iceberg]

2025-03-11 Thread via GitHub
nastra commented on code in PR #12481: URL: https://github.com/apache/iceberg/pull/12481#discussion_r1988594983 ## site/docs/talks.md: ## @@ -21,6 +21,101 @@ title: "Talks" ## Iceberg Talks Here is a list of talks and other videos related to Iceberg. +### [Supporting S3 Tabl

Re: [PR] Spark: Support singular form of years, months, days, and hours functions [iceberg]

2025-03-11 Thread via GitHub
nastra commented on PR #12117: URL: https://github.com/apache/iceberg/pull/12117#issuecomment-2713074660 @RussellSpitzer I'm also a +0 but I currently don't have capacity to review this PR, so feel free to merge if you're happy with the changes

Re: [PR] Core: Add KLL Datasketch as standard blob types to puffin file [iceberg]

2025-03-11 Thread via GitHub
nastra commented on PR #8202: URL: https://github.com/apache/iceberg/pull/8202#issuecomment-2713113344 @deniskuzZ this is a spec change and needs to go through a DISCUSS & VOTE thread on the mailing list

[PR] Update dependabot to update lock file only [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 opened a new pull request, #1068: URL: https://github.com/apache/iceberg-rust/pull/1068 ## What changes are included in this PR? Update dependabot to update lock file only, as discussed in community sync meeting.

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-03-11 Thread via GitHub
zratkai commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r1989176991 ## hive-metastore/src/main/java/org/apache/iceberg/hive/IcebergTableConverter.java: ## @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-03-11 Thread via GitHub
zratkai commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r1989173702 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -230,12 +220,12 @@ protected void doCommit(TableMetadata base, TableMetadata

Re: [I] Move JUnit4 tests to JUnit5 [iceberg]

2025-03-11 Thread via GitHub
tomtongue commented on issue #7160: URL: https://github.com/apache/iceberg/issues/7160#issuecomment-2713013874 @nastra Thank you. Yes, it's still a big task and I understand the current prioritization. I will start it again gradually from adding the base tests like other versions for the fu

Re: [PR] chore(deps): Bump tokio from 1.43.0 to 1.44.0 [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on PR #1058: URL: https://github.com/apache/iceberg-rust/pull/1058#issuecomment-2713194233 Can't update due to version conflict.

Re: [PR] chore(deps): Bump mockito from 1.6.1 to 1.7.0 [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 closed pull request #1056: chore(deps): Bump mockito from 1.6.1 to 1.7.0 URL: https://github.com/apache/iceberg-rust/pull/1056

Re: [PR] Core, Spark 3.5: Apply Equality Deletes when Doing Copy on Write [iceberg]

2025-03-11 Thread via GitHub
pvary commented on PR #12479: URL: https://github.com/apache/iceberg/pull/12479#issuecomment-2713693155 > I checked out this change and ran the newly added test with and without the fix. The behavior is as expected, except that even without the fix, two of the six cases in the parameterized

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-03-11 Thread via GitHub
zratkai commented on PR #12461: URL: https://github.com/apache/iceberg/pull/12461#issuecomment-2714028265 @gaborkaszab here is the origin of the PR: https://github.com/apache/hive/pull/5628#discussion_r1979230748

Re: [PR] Docs: Update Iceberg talks with recent Iceberg meetup sessions [iceberg]

2025-03-11 Thread via GitHub
sida-shen commented on code in PR #12481: URL: https://github.com/apache/iceberg/pull/12481#discussion_r1987886044 ## site/docs/talks.md: ## @@ -21,6 +21,86 @@ title: "Talks" ## Iceberg Talks Here is a list of talks and other videos related to Iceberg. +### [Supporting S3 Ta

Re: [PR] chore(deps): Bump aws-sdk-glue from 1.82.0 to 1.84.0 [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on code in PR #1057: URL: https://github.com/apache/iceberg-rust/pull/1057#discussion_r1988690352 ## Cargo.toml: ## @@ -53,7 +53,7 @@ async-stream = "0.3.5" async-trait = "0.1.86" async-std = "1.12" aws-config = "1" -aws-sdk-glue = "1.39" +aws-sdk-glue

Re: [PR] chore(deps): Bump tokio from 1.43.0 to 1.44.0 [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 closed pull request #1058: chore(deps): Bump tokio from 1.43.0 to 1.44.0 URL: https://github.com/apache/iceberg-rust/pull/1058

Re: [PR] chore(deps): Bump mockito from 1.6.1 to 1.7.0 [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on PR #1056: URL: https://github.com/apache/iceberg-rust/pull/1056#issuecomment-2713196367 Close it as we should only update lock file.

Re: [I] Provide differert read interface for reader [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on issue #1047: URL: https://github.com/apache/iceberg-rust/issues/1047#issuecomment-2713343172 Thanks @ZENOTME for raising this. I think what's missing is a `FileReader` which accepts the following arguments: 1. File path 2. File range 3. Expected schema

Re: [PR] chore(deps): Bump crate-ci/typos from 1.30.0 to 1.30.1 [iceberg-rust]

2025-03-11 Thread via GitHub
dependabot[bot] commented on PR #1054: URL: https://github.com/apache/iceberg-rust/pull/1054#issuecomment-2713346178 Superseded by #1069.

Re: [PR] chore(deps): Bump crate-ci/typos from 1.30.0 to 1.30.1 [iceberg-rust]

2025-03-11 Thread via GitHub
dependabot[bot] closed pull request #1054: chore(deps): Bump crate-ci/typos from 1.30.0 to 1.30.1 URL: https://github.com/apache/iceberg-rust/pull/1054

Re: [PR] doc: run doc test [iceberg-rust]

2025-03-11 Thread via GitHub
de-sh closed pull request #1066: doc: run doc test URL: https://github.com/apache/iceberg-rust/pull/1066

Re: [I] AWS: Add View support for Glue Catalog [iceberg]

2025-03-11 Thread via GitHub
hussein-awala commented on issue #12488: URL: https://github.com/apache/iceberg/issues/12488#issuecomment-2713725476 > While AWS Glue now provides a REST Catalog endpoint FYI, even the REST catalog endpoint does not support creating Iceberg views, which makes this feature interesting

Re: [PR] doc: run doc test [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on PR #1066: URL: https://github.com/apache/iceberg-rust/pull/1066#issuecomment-2713292241 Thanks @de-sh, it duplicates #999, should we close this now?

Re: [I] Remove `paste` dependency. [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on issue #1064: URL: https://github.com/apache/iceberg-rust/issues/1064#issuecomment-2713173278 > 1. iceberg-rust needs to work under stable rust, so we can't. > 2. `eval-macro` is relatively new. Do we have any other options? > 3. It looks like we only use `past

Re: [PR] Core: Add KLL Datasketch as standard blob types to puffin file [iceberg]

2025-03-11 Thread via GitHub
deniskuzZ commented on PR #8202: URL: https://github.com/apache/iceberg/pull/8202#issuecomment-2713305090 > @deniskuzZ this is a spec change and needs to go through a DISCUSS & VOTE thread on the mailing list This was a thread: https://lists.apache.org/thread/ws63loz1snsrwtdl7f9yqgr

Re: [PR] feat: Make duplicate check optional for adding parquet files [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on code in PR #1034: URL: https://github.com/apache/iceberg-rust/pull/1034#discussion_r1988854780 ## crates/iceberg/src/transaction.rs: ## @@ -236,57 +240,59 @@ impl<'a> FastAppendAction<'a> { self.add_data_files(data_files)?; -self.a

Re: [PR] doc: run doc test [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on code in PR #1066: URL: https://github.com/apache/iceberg-rust/pull/1066#discussion_r1988775552 ## crates/iceberg/src/writer/mod.rs: ## @@ -26,23 +26,58 @@ //! 2. IcebergWriter: Focus on the logical format of iceberg table. It will write the data usin

Re: [PR] chore(deps): Bump mockito from 1.6.1 to 1.7.0 [iceberg-rust]

2025-03-11 Thread via GitHub
dependabot[bot] commented on PR #1056: URL: https://github.com/apache/iceberg-rust/pull/1056#issuecomment-2713196468 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version

Re: [I] Can not create partiton on timestamp field two fields i.e with month(), year(). RUNNING DOCKER WITH tabulario/spark-iceberg:3.5.1_1.5.0 [iceberg]

2025-03-11 Thread via GitHub
Fokko commented on issue #12442: URL: https://github.com/apache/iceberg/issues/12442#issuecomment-2711856445 @dvnageshpatil If you have a timestamp value: ```sql ts = '2024-10-22T19:25:00' ``` Then the transforms will produce: ```sql month(ts) = '2024-10-00' y
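As a rough illustration of the transforms discussed in this comment: the Iceberg spec defines `year` as whole years since 1970 and `month` as whole months since 1970-01, and date-like strings such as the ones shown in the comment are human-readable renderings of those values. The helper names below are hypothetical and do not belong to any Iceberg library; this is a minimal sketch of the arithmetic only.

```python
from datetime import datetime

EPOCH_YEAR = 1970


def year_transform(ts: datetime) -> int:
    """Whole years since 1970 (hypothetical helper mirroring the spec's year transform)."""
    return ts.year - EPOCH_YEAR


def month_transform(ts: datetime) -> int:
    """Whole months since 1970-01 (hypothetical helper mirroring the spec's month transform)."""
    return (ts.year - EPOCH_YEAR) * 12 + (ts.month - 1)


# Timestamp taken from the comment above.
ts = datetime.fromisoformat("2024-10-22T19:25:00")
year_ordinal = year_transform(ts)
month_ordinal = month_transform(ts)
print(year_ordinal, month_ordinal)
```

Both values are derived from the same source column, which is why the two transforms carry overlapping information about the timestamp.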

Re: [PR] Spark: Detect dangling DVs properly [iceberg]

2025-03-11 Thread via GitHub
nastra commented on code in PR #12270: URL: https://github.com/apache/iceberg/pull/12270#discussion_r1987589793 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RemoveDanglingDeletesSparkAction.java: ## @@ -124,10 +125,15 @@ private List findDanglingDeletes() {

[I] Partition Pruning not working with different data type [iceberg]

2025-03-11 Thread via GitHub
shivdeep-singh3 opened a new issue, #12491: URL: https://github.com/apache/iceberg/issues/12491 ### Apache Iceberg version 1.6.1 ### Query engine Spark ### Please describe the bug 🐞 When querying iceberg table on a partitioned column, the partition pruning

Re: [I] Support metadata compaction [iceberg-python]

2025-03-11 Thread via GitHub
kevinjqliu commented on issue #270: URL: https://github.com/apache/iceberg-python/issues/270#issuecomment-2715129289 Hi @ZENOTME thanks for bringing this up. In pyiceberg, `_SnapshotProducer` defines the general structure of "things that are changed to produce a new snapshot." The `

Re: [I] Add files to add existing Parquet files to a table [iceberg-rust]

2025-03-11 Thread via GitHub
jonathanc-n commented on issue #932: URL: https://github.com/apache/iceberg-rust/issues/932#issuecomment-2711368305 Don't believe so, there are a bunch of follow-up PRs that should be done before this is closed

Re: [I] Support full table scanning of partitioned table is prohibited [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on issue #12474: URL: https://github.com/apache/iceberg/issues/12474#issuecomment-279346 This probably couldn't be done at the table format level since engines are the ones that decide what files they need to plan a query. This means you could do this but it wou

Re: [PR] Add pull-request template [iceberg-python]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #1777: URL: https://github.com/apache/iceberg-python/pull/1777#discussion_r1989768908 ## .github/pull_request_template.md: ## Review Comment: Need the Apache license headers in the file.

Re: [I] Refactor setting the `max_changed_partitions_for_summaries` [iceberg-python]

2025-03-11 Thread via GitHub
kevinjqliu commented on issue #1779: URL: https://github.com/apache/iceberg-python/issues/1779#issuecomment-2715112227 @stevie9868 assigned to you, thanks!

Re: [PR] feat(internal): Adding BinPacker [iceberg-go]

2025-03-11 Thread via GitHub
zeroshade merged PR #321: URL: https://github.com/apache/iceberg-go/pull/321

Re: [I] Data Integrity Issue with DELETE Operation Using Copy-on-Write (COW) and Equality Deletes [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on issue #12467: URL: https://github.com/apache/iceberg/issues/12467#issuecomment-2704499284 Checking this out in the debugger, the Scan produced by CopyOnWrite is not including the equality delete in its Row level Operation. Will update when I find out why.

Re: [PR] Build: Bump getdaft from 0.4.4 to 0.4.7 [iceberg-python]

2025-03-11 Thread via GitHub
kevinjqliu commented on PR #1780: URL: https://github.com/apache/iceberg-python/pull/1780#issuecomment-2715094167 @dependabot rebase

Re: [PR] Spark: Add some tests for variant fixup [iceberg]

2025-03-11 Thread via GitHub
XBaith commented on code in PR #12497: URL: https://github.com/apache/iceberg/pull/12497#discussion_r1988669139 ## core/src/test/java/org/apache/iceberg/RandomVariants.java: ## @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contr

[PR] Parquet: Implement Variant metrics [iceberg]

2025-03-11 Thread via GitHub
rdblue opened a new pull request, #12496: URL: https://github.com/apache/iceberg/pull/12496 This implements metrics for Variant types stored in Parquet files, using new visitors to produce the metrics. This also refactors the existing metrics code to use a visitor. If I remember corr

Re: [PR] fix: refine doc for write support [iceberg-rust]

2025-03-11 Thread via GitHub
ZENOTME commented on code in PR #999: URL: https://github.com/apache/iceberg-rust/pull/999#discussion_r1987508957 ## crates/iceberg/src/lib.rs: ## @@ -50,6 +50,87 @@ //! Ok(()) //! } //! ``` +//! +//! ## Fast append data to table +//! +//! ```rust, no_run Review Comment:

Re: [PR] Core: lazy init workerPool [iceberg]

2025-03-11 Thread via GitHub
pvary commented on PR #12427: URL: https://github.com/apache/iceberg/pull/12427#issuecomment-2714565202 Any idea how to add a unit test to prevent regression here?

Re: [PR] Core: Apply correct metric configs in GenericAppenderFactory [iceberg]

2025-03-11 Thread via GitHub
pvary commented on PR #12366: URL: https://github.com/apache/iceberg/pull/12366#issuecomment-2714578594 Merged to main. Thanks for the PR @XBaith!

Re: [PR] Core: Apply correct metric configs in GenericAppenderFactory [iceberg]

2025-03-11 Thread via GitHub
pvary merged PR #12366: URL: https://github.com/apache/iceberg/pull/12366

Re: [PR] Core: Bulk deletion in RemoveSnapshots [iceberg]

2025-03-11 Thread via GitHub
pvary commented on code in PR #11837: URL: https://github.com/apache/iceberg/pull/11837#discussion_r1989419495 ## core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java: ## @@ -1621,6 +1627,82 @@ public void testRetainFilesOnRetainedBranches() { assertThat(deletedFi

Re: [PR] Add unit test for AddFilesProcedure to check invalid column in partition filter [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on PR #12456: URL: https://github.com/apache/iceberg/pull/12456#issuecomment-2714661123 Thanks @ebyhr and @anuragmantri for reviewing! And thank you @bharos for the PR. Merging!

Re: [PR] Add unit test for AddFilesProcedure to check invalid column in partition filter [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer merged PR #12456: URL: https://github.com/apache/iceberg/pull/12456

Re: [PR] Core: lazy init workerPool [iceberg]

2025-03-11 Thread via GitHub
deniskuzZ commented on code in PR #12427: URL: https://github.com/apache/iceberg/pull/12427#discussion_r1989474677 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -197,7 +198,7 @@ protected String targetBranch() { } protected ExecutorService workerP

Re: [I] Bug: Flink data loss after failed to refresh table [iceberg]

2025-03-11 Thread via GitHub
eugeny-stoyka commented on issue #9753: URL: https://github.com/apache/iceberg/issues/9753#issuecomment-2714663613 @Aireed do you know whether they fixed it?

Re: [PR] feat: Introduce C FFI for iceberg rust [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on PR #966: URL: https://github.com/apache/iceberg-rust/pull/966#issuecomment-2713350950 > cc @manuzhang @liurenjie1024 @Fokko @sdd, what do you think about merging this PR and enabling the community to build on it? I'm fine with this initial version. But still

Re: [I] Hive 4.0.0 can't read Iceberg table while inserting data works fine [iceberg]

2025-03-11 Thread via GitHub
eugeny-stoyka commented on issue #10506: URL: https://github.com/apache/iceberg/issues/10506#issuecomment-2714412976 @nadialeiden have you fixed it?

Re: [PR] Core: Bulk deletion in RemoveSnapshots [iceberg]

2025-03-11 Thread via GitHub
pvary commented on code in PR #11837: URL: https://github.com/apache/iceberg/pull/11837#discussion_r1989420919 ## core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java: ## @@ -1621,6 +1627,82 @@ public void testRetainFilesOnRetainedBranches() { assertThat(deletedFi

Re: [PR] Core: Bulk deletion in RemoveSnapshots [iceberg]

2025-03-11 Thread via GitHub
pvary commented on PR #11837: URL: https://github.com/apache/iceberg/pull/11837#issuecomment-2714512533 I'm fine with the current approach. Could you please rebase, and address the little nit? Thanks, Peter

Re: [I] I do not understand the partition error: ValueError: Could not find in old schema: 2: {field}: identity(2) [iceberg-python]

2025-03-11 Thread via GitHub
christophediprima commented on issue #1100: URL: https://github.com/apache/iceberg-python/issues/1100#issuecomment-2714636858 This is not ideal, as it always adds a partition evolution to the tables... I would like to use StarRocks materialized views, which do not support those.

Re: [PR] API: Speed up Timestamps#toHumanString [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on code in PR #12447: URL: https://github.com/apache/iceberg/pull/12447#discussion_r1989658472 ## api/src/test/java/org/apache/iceberg/transforms/TestTransformUtil.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] API: Speed up Timestamps#toHumanString [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on code in PR #12447: URL: https://github.com/apache/iceberg/pull/12447#discussion_r1989657396 ## api/src/test/java/org/apache/iceberg/transforms/TestTransformUtil.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [I] How to understand "Partition evolution is a metadata operation and does not eagerly rewrite files." [iceberg]

2025-03-11 Thread via GitHub
madeirak commented on issue #12492: URL: https://github.com/apache/iceberg/issues/12492#issuecomment-2714926910 thanks a lot sir

Re: [PR] API: Speed up Timestamps#toHumanString [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on code in PR #12447: URL: https://github.com/apache/iceberg/pull/12447#discussion_r1989662498 ## api/src/test/java/org/apache/iceberg/transforms/TestTransformUtil.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Spark: Add some tests for variant fixup [iceberg]

2025-03-11 Thread via GitHub
sfc-gh-aixu commented on code in PR #12497: URL: https://github.com/apache/iceberg/pull/12497#discussion_r1989664866 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestSparkFixupTypes.java: ## @@ -0,0 +1,162 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [I] How to understand "Partition evolution is a metadata operation and does not eagerly rewrite files." [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer closed issue #12492: How to understand "Partition evolution is a metadata operation and does not eagerly rewrite files." URL: https://github.com/apache/iceberg/issues/12492

Re: [I] Data Integrity Issue with DELETE Operation Using Copy-on-Write (COW) and Equality Deletes [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on issue #12467: URL: https://github.com/apache/iceberg/issues/12467#issuecomment-2714958210 https://github.com/apache/iceberg/pull/12479/files Fix posted

Re: [I] Refactor setting the `max_changed_partitions_for_summaries` [iceberg-python]

2025-03-11 Thread via GitHub
stevie9868 commented on issue #1779: URL: https://github.com/apache/iceberg-python/issues/1779#issuecomment-2714793794 @Fokko Can I take this up? Thanks!

Re: [PR] Core, Spark 3.5: Apply Equality Deletes when Doing Copy on Write [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on PR #12479: URL: https://github.com/apache/iceberg/pull/12479#issuecomment-2714784523 > For my edification, can you explain why the bug doesn't affect Avro? :) Good eyes. The reason is that Avro files don't have metrics saved! So it doesn't matter if we ign
