Re: [PR] feat(core): expose remove_snapshots at Transaction API [iceberg-rust]

2025-03-11 Thread via GitHub
ZENOTME commented on PR #884: URL: https://github.com/apache/iceberg-rust/pull/884#issuecomment-2716708182 > > This is a hard limit in the server implementation. I am not sure about the reason for this design, as I am not familiar enough with the specification of iceberg. > > +1, I a

Re: [PR] feat(core): expose remove_snapshots at Transaction API [iceberg-rust]

2025-03-11 Thread via GitHub
Li0k commented on PR #884: URL: https://github.com/apache/iceberg-rust/pull/884#issuecomment-2716700631 > This is a hard limit in the server implementation. I am not sure about the reason for this design, as I am not familiar enough with the specification of iceberg. +1, I also encou

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 merged PR #12299: URL: https://github.com/apache/iceberg/pull/12299 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceber

Re: [PR] Added `FsspecFileIO` method for OSS, virtual hosted style default to true, standardized key configurations for OSS [iceberg-python]

2025-03-11 Thread via GitHub
helmiazizm commented on PR #1788: URL: https://github.com/apache/iceberg-python/pull/1788#issuecomment-2716592898 Local test result for `s3fs.S3FileSystem` ![image](https://github.com/user-attachments/assets/7a78dccf-0cf3-403b-a26a-69a309eb27d9) -- This is an automated message

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2716569441 Looks like all comments are addressed, thanks @SanjayMarreddi for all the work! Let us know when you have the follow up PRs for async client configs and doc update! -- This is an

[PR] Added `FsspecFileIO` method for OSS, virtual hosted style default to true, standardized key configurations for OSS [iceberg-python]

2025-03-11 Thread via GitHub
helmiazizm opened a new pull request, #1788: URL: https://github.com/apache/iceberg-python/pull/1788 This pull request introduces an `FsspecFileIO` configuration method for OSS as a fallback when `PyArrowFileIO` fails. Using the `S3FileSystem` class, the method should work as long as the virtual host

Re: [I] Spark mistakenly cleanup written file with successful IRC commits [iceberg]

2025-03-11 Thread via GitHub
puchengy commented on issue #12499: URL: https://github.com/apache/iceberg/issues/12499#issuecomment-2716503961 @nastra thank you very much, I will conduct the relevant back ports. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

Re: [I] Spark mistakenly cleanup written file with successful IRC commits [iceberg]

2025-03-11 Thread via GitHub
puchengy closed issue #12499: Spark mistakenly cleanup written file with successful IRC commits URL: https://github.com/apache/iceberg/issues/12499 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [I] [feat] Ability to read table using `version-hint.txt` [iceberg-python]

2025-03-11 Thread via GitHub
djouallah commented on issue #763: URL: https://github.com/apache/iceberg-python/issues/763#issuecomment-2716383319 FWIW, I just gave up and am using DuckDB to read the Iceberg table; PyIceberg is clearly not interested in this scenario -- This is an automated message from the Apache Git S

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990534076 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -121,23 +136,49 @@ public S3FileIO(SerializableSupplier s3) { * @param s3FileIOProperties S

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990522002 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3InputFile.java: ## @@ -36,20 +37,53 @@ public static S3InputFile fromLocation( MetricsContext metrics) {

Re: [PR] Scan Delete Support Part 2: introduce `DeleteFileManager` skeleton. Use in `ArrowReader` [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on code in PR #950: URL: https://github.com/apache/iceberg-rust/pull/950#discussion_r1990499327 ## crates/iceberg/src/arrow/delete_file_manager.rs: ## @@ -0,0 +1,64 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

Re: [I] when drop a non-Iceberg table , the directory associated with the table was not deleted [iceberg]

2025-03-11 Thread via GitHub
terrytlu commented on issue #11820: URL: https://github.com/apache/iceberg/issues/11820#issuecomment-2716347264 I think there is some confusion about the meaning of `purge`: https://spark.apache.org/docs/3.5.3/sql-ref-syntax-ddl-drop-table.html In the Spark reference, purge means skipping hdfs tras

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
HonahX commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990498449 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3InputFile.java: ## @@ -36,20 +37,53 @@ public static S3InputFile fromLocation( MetricsContext metrics) { r

Re: [PR] Make willingness to contribute in pr template a dropdown [iceberg-rust]

2025-03-11 Thread via GitHub
Xuanwo merged PR #1076: URL: https://github.com/apache/iceberg-rust/pull/1076 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] feat: Make duplicate check optional for adding parquet files [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 merged PR #1034: URL: https://github.com/apache/iceberg-rust/pull/1034 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@

Re: [I] Add an option to skip checking duplicated files when adding existing file in `FastAppendAction`. [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 closed issue #1031: Add an option to skip checking duplicated files when adding existing file in `FastAppendAction`. URL: https://github.com/apache/iceberg-rust/issues/1031 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [PR] feat: Make duplicate check optional for adding parquet files [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on PR #1034: URL: https://github.com/apache/iceberg-rust/pull/1034#issuecomment-2716252382 Thanks @jonathanc-n for this pr! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[PR] Make willingness to contribute in pr template a dropdown [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 opened a new pull request, #1076: URL: https://github.com/apache/iceberg-rust/pull/1076 ## What changes are included in this PR? Make willingness in the PR template a dropdown rather than a checkbox, since it only allows one selection. -- This is an automated message

Re: [PR] Support In and notIn operators in ParquetFilters.ConvertFilterToParquet [iceberg]

2025-03-11 Thread via GitHub
wypoon commented on code in PR #12449: URL: https://github.com/apache/iceberg/pull/12449#discussion_r1990442821 ## api/src/main/java/org/apache/iceberg/Schema.java: ## @@ -204,6 +204,10 @@ public Schema(int schemaId, NestedField... columns) { this(schemaId, Arrays.asList(co

Re: [I] Replace `humantime` crate with `jiff`. [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 commented on issue #1075: URL: https://github.com/apache/iceberg-rust/issues/1075#issuecomment-2716239620 It seems there is not much we can do here; it is introduced by `object_store` and `env_logger`, so we can only wait for upstream to upgrade. -- This is an automated message fro

Re: [PR] Support In and notIn operators in ParquetFilters.ConvertFilterToParquet [iceberg]

2025-03-11 Thread via GitHub
wypoon commented on code in PR #12449: URL: https://github.com/apache/iceberg/pull/12449#discussion_r1990431986 ## parquet/src/test/java/org/apache/iceberg/parquet/TestParquetFilters.java: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

[I] Replace `humantime` crate with `jiff`. [iceberg-rust]

2025-03-11 Thread via GitHub
liurenjie1024 opened a new issue, #1075: URL: https://github.com/apache/iceberg-rust/issues/1075 ### Is your feature request related to a problem or challenge? Related #1073, `humantime` is no longer maintained. ### Describe the solution you'd like Replace with `jiff`

Re: [PR] Spark: Structured Streaming read limit support follow-up [iceberg]

2025-03-11 Thread via GitHub
wypoon commented on PR #12260: URL: https://github.com/apache/iceberg/pull/12260#issuecomment-2716153564 @RussellSpitzer would you mind reviewing this when you have some time? It is a small change which @singhpk234 has already reviewed and approved. -- This is an automated message from th

Re: [PR] Core: Add KLL Datasketch as standard blob types to puffin file [iceberg]

2025-03-11 Thread via GitHub
nastra commented on PR #8202: URL: https://github.com/apache/iceberg/pull/8202#issuecomment-2713408790 @deniskuzZ you might just want to follow-up on the dev list to revive the discussion. Once it's clear in which direction the proposal is going and what Spec changes are required, you'd go

Re: [PR] Spark: Support singular form of years, months, days, and hours functions [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on PR #12117: URL: https://github.com/apache/iceberg/pull/12117#issuecomment-2711124370 @nastra I'm a +0 on this, i'm not sure we really are making the situation better since I don't like having two methods that do the same thing (especially when it's just a single

Re: [PR] Spark: Support singular form of years, months, days, and hours functions [iceberg]

2025-03-11 Thread via GitHub
wypoon commented on PR #12117: URL: https://github.com/apache/iceberg/pull/12117#issuecomment-2716148461 @RussellSpitzer thank you for reviewing the PR. I understand that you're not thrilled with the idea of two functions to do the same thing. However, this is already the case with the part

[PR] feat: (catalog/glue) Add support for CreateTable [iceberg-go]

2025-03-11 Thread via GitHub
dttung2905 opened a new pull request, #326: URL: https://github.com/apache/iceberg-go/pull/326 Hi team, This PR aims to support CreateTable for the Glue catalog. Below is the list of what (I think) remains to be done: - [x] Tested out on a real Glue Catalog. Table was created successfully - [ ] Add un

Re: [PR] Spark: Support singular form of years, months, days, and hours functions [iceberg]

2025-03-11 Thread via GitHub
wypoon commented on code in PR #12117: URL: https://github.com/apache/iceberg/pull/12117#discussion_r1990386780 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestSystemFunctionPushDownDQL.java: ## @@ -84,20 +84,24 @@ public void removeTables()

Re: [I] Consolidate methods of converting parquet file to data file builder. [iceberg-rust]

2025-03-11 Thread via GitHub
jonathanc-n commented on issue #1033: URL: https://github.com/apache/iceberg-rust/issues/1033#issuecomment-2716131268 @mnpw This pull request should be completed by #1074. Sorry about that, the two issues were intertwined. I was only able to test the metadata conversion by completing this a

[PR] feat: Add conversion from `FileMetaData` to `ParquetMetadata` [iceberg-rust]

2025-03-11 Thread via GitHub
jonathanc-n opened a new pull request, #1074: URL: https://github.com/apache/iceberg-rust/pull/1074 ## Which issue does this PR close? - Closes #1033 and #1004. ## What changes are included in this PR? Add conversion from `FileMetaData` to `ParquetMetadata` using thrift

Re: [PR] feat: Make duplicate check optional for adding parquet files [iceberg-rust]

2025-03-11 Thread via GitHub
jonathanc-n commented on code in PR #1034: URL: https://github.com/apache/iceberg-rust/pull/1034#discussion_r1990344431 ## crates/iceberg/src/transaction.rs: ## @@ -236,57 +240,59 @@ impl<'a> FastAppendAction<'a> { self.add_data_files(data_files)?; -self.app

Re: [I] PyIceberg - MetaException(message='java.lang.IllegalArgumentException: bucket is null/empty') [iceberg-python]

2025-03-11 Thread via GitHub
github-actions[bot] commented on issue #1165: URL: https://github.com/apache/iceberg-python/issues/1165#issuecomment-2716022203 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity

Re: [I] Iceberg defaulting to URLConnectionHttpClient instead of Apache HTTP Client [iceberg]

2025-03-11 Thread via GitHub
github-actions[bot] commented on issue #6: URL: https://github.com/apache/iceberg/issues/6#issuecomment-2716018779 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Kafka Connect: Add table to topics mapping property [iceberg]

2025-03-11 Thread via GitHub
github-actions[bot] commented on PR #10422: URL: https://github.com/apache/iceberg/pull/10422#issuecomment-2716018681 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] support equality/positional deletes in vectorized arrow reader [iceberg]

2025-03-11 Thread via GitHub
github-actions[bot] commented on issue #11120: URL: https://github.com/apache/iceberg/issues/11120#issuecomment-2716018807 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Support queries all branches and tags java api [iceberg]

2025-03-11 Thread via GitHub
github-actions[bot] closed issue #11042: Support queries all branches and tags java api URL: https://github.com/apache/iceberg/issues/11042 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [I] Support partial insert in merge into command [iceberg]

2025-03-11 Thread via GitHub
github-actions[bot] commented on issue #8199: URL: https://github.com/apache/iceberg/issues/8199#issuecomment-2716018617 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Support queries all branches and tags java api [iceberg]

2025-03-11 Thread via GitHub
github-actions[bot] commented on issue #11042: URL: https://github.com/apache/iceberg/issues/11042#issuecomment-2716018745 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [PR] Auth Manager API part 6: API enablement [iceberg]

2025-03-11 Thread via GitHub
danielcweeks commented on code in PR #12197: URL: https://github.com/apache/iceberg/pull/12197#discussion_r1990291751 ## aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java: ## @@ -200,86 +151,40 @@ private RESTClient httpClient() { return httpClien

Re: [PR] Support for REPLACE TABLE operation [iceberg-python]

2025-03-11 Thread via GitHub
srilman commented on PR #433: URL: https://github.com/apache/iceberg-python/pull/433#issuecomment-2715949876 @anupam-saini Yep was planning to. Feel free to close this one -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

[I] Spark mistakenly cleanup written file with successful IRC commits [iceberg]

2025-03-11 Thread via GitHub
puchengy opened a new issue, #12499: URL: https://github.com/apache/iceberg/issues/12499 ### Apache Iceberg version 1.3.0 ### Query engine Spark ### Please describe the bug 🐞 When OOM happens with IRC successful commits, Spark will mistakenly cleanup commit

Re: [I] Applying Filter on Top-Level Struct Columns Throws Error [iceberg-python]

2025-03-11 Thread via GitHub
srilman commented on issue #1778: URL: https://github.com/apache/iceberg-python/issues/1778#issuecomment-2715886974 Sounds good, here is the full stacktrace just in case. Sorry about that, I truncated it to keep the issue description short. ``` /Users/slade/bodo/mono/develop/.pix

Re: [PR] Flink: Support source watermark for flink sql windows [iceberg]

2025-03-11 Thread via GitHub
swapna267 commented on code in PR #12191: URL: https://github.com/apache/iceberg/pull/12191#discussion_r1990251133 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceSql.java: ## @@ -53,7 +55,11 @@ public class TestIcebergSourceSql extends TestSq

Re: [PR] Flink: Support source watermark for flink sql windows [iceberg]

2025-03-11 Thread via GitHub
swapna267 commented on code in PR #12191: URL: https://github.com/apache/iceberg/pull/12191#discussion_r1990246723 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/source/IcebergTableSource.java: ## @@ -175,6 +178,18 @@ public Result applyFilters(List flinkFilters) {

Re: [I] [feat] Ability to read table using `version-hint.txt` [iceberg-python]

2025-03-11 Thread via GitHub
srilman commented on issue #763: URL: https://github.com/apache/iceberg-python/issues/763#issuecomment-2715883331 @Fokko is this issue still open for working on? For context, we had to build a PyIceberg-based Hadoop Catalog with a subset of features for backwards compatibility when moving B

[PR] Build: Bump sqlalchemy from 2.0.38 to 2.0.39 [iceberg-python]

2025-03-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1787: URL: https://github.com/apache/iceberg-python/pull/1787 Bumps [sqlalchemy](https://github.com/sqlalchemy/sqlalchemy) from 2.0.38 to 2.0.39. Release notes Sourced from https://github.com/sqlalchemy/sqlalchemy/releases sqlalchemy's

[PR] Build: Bump mkdocstrings-python from 1.16.2 to 1.16.5 [iceberg-python]

2025-03-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1786: URL: https://github.com/apache/iceberg-python/pull/1786 Bumps [mkdocstrings-python](https://github.com/mkdocstrings/python) from 1.16.2 to 1.16.5. Release notes Sourced from https://github.com/mkdocstrings/python/releases mkdocstr

[PR] feat: Add `NameMapping` [iceberg-rust]

2025-03-11 Thread via GitHub
jonathanc-n opened a new pull request, #1072: URL: https://github.com/apache/iceberg-rust/pull/1072 ## Which issue does this PR close? - Related to #1030. ## What changes are included in this PR? Added `NameMapping` implementation. Includes updating, creating, and ap

Re: [I] Issue during Upsert [iceberg-python]

2025-03-11 Thread via GitHub
mattmartin14 commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2715872005 Hey @kevinjqliu , From my original testing, insert filters were not affected by this problem. It was only the overwrite filters that were an issue. Has somethin

Re: [PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-03-11 Thread via GitHub
dramaticlly commented on code in PR #12228: URL: https://github.com/apache/iceberg/pull/12228#discussion_r1990229315 ## api/src/main/java/org/apache/iceberg/catalog/Catalog.java: ## @@ -344,6 +344,24 @@ default void invalidateTable(TableIdentifier identifier) {} * @throws Al

Re: [PR] Spark: Rewrite V2 deletes to V3 DVs [iceberg]

2025-03-11 Thread via GitHub
danielcweeks merged PR #12250: URL: https://github.com/apache/iceberg/pull/12250 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceb

Re: [PR] Auth Manager API part 6: API enablement [iceberg]

2025-03-11 Thread via GitHub
adutra commented on code in PR #12197: URL: https://github.com/apache/iceberg/pull/12197#discussion_r1990205373 ## aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java: ## @@ -200,86 +151,40 @@ private RESTClient httpClient() { return httpClient;

Re: [PR] Auth Manager API part 6: API enablement [iceberg]

2025-03-11 Thread via GitHub
danielcweeks commented on code in PR #12197: URL: https://github.com/apache/iceberg/pull/12197#discussion_r1990185526 ## aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java: ## @@ -200,86 +151,40 @@ private RESTClient httpClient() { return httpClien

Re: [PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-11 Thread via GitHub
zeroshade commented on code in PR #330: URL: https://github.com/apache/iceberg-go/pull/330#discussion_r1990190441 ## table/table_test.go: ## @@ -128,3 +138,235 @@ func (t *TableTestSuite) TestSnapshotByName() { t.True(testSnapshot.Equals(*t.tbl.SnapshotByName("test")))

Re: [PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #330: URL: https://github.com/apache/iceberg-go/pull/330#discussion_r1990183802 ## table/table_test.go: ## @@ -128,3 +138,235 @@ func (t *TableTestSuite) TestSnapshotByName() { t.True(testSnapshot.Equals(*t.tbl.SnapshotByName("test"))

Re: [PR] feat: add support for azure blob with connection string/sas token/account key [iceberg-go]

2025-03-11 Thread via GitHub
xuhui-lu commented on PR #313: URL: https://github.com/apache/iceberg-go/pull/313#issuecomment-2709438488 > @kevinjqliu @Fokko do you know of any equivalent to running Minio that we could use via a docker image to test the ADLS integration? I am not sure if I could just use the https:

[PR] chore(deps): Bump crate-ci/typos from 1.30.0 to 1.30.2 [iceberg-rust]

2025-03-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1069: URL: https://github.com/apache/iceberg-rust/pull/1069 Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.30.0 to 1.30.2. Release notes Sourced from https://github.com/crate-ci/typos/releases crate-ci/typos's release

Re: [PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on code in PR #12228: URL: https://github.com/apache/iceberg/pull/12228#discussion_r1990162684 ## core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java: ## @@ -71,23 +70,35 @@ public Table loadTable(TableIdentifier identifier) { } @Over

Re: [PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on code in PR #12228: URL: https://github.com/apache/iceberg/pull/12228#discussion_r1990161331 ## core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java: ## @@ -71,23 +70,35 @@ public Table loadTable(TableIdentifier identifier) { } @Over

Re: [PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-03-11 Thread via GitHub
RussellSpitzer commented on code in PR #12228: URL: https://github.com/apache/iceberg/pull/12228#discussion_r1990157004 ## api/src/main/java/org/apache/iceberg/catalog/Catalog.java: ## @@ -344,6 +344,24 @@ default void invalidateTable(TableIdentifier identifier) {} * @throws

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715732944 @HonahX could you take a look? Given the fact that we plan to refactor the HTTPClientProperties and other related classes as the next step, it's probably good for you to take a look

Re: [PR] Spark: Use correct statistics file in SparkScan::estimateStatistics(Snapshot) [iceberg]

2025-03-11 Thread via GitHub
wypoon commented on PR #12482: URL: https://github.com/apache/iceberg/pull/12482#issuecomment-2711246177 @huaxingao can you please review this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [I] Support metadata compaction [iceberg-python]

2025-03-11 Thread via GitHub
ZENOTME commented on issue #270: URL: https://github.com/apache/iceberg-python/issues/270#issuecomment-2713768454 Hi, I have recently been investigating support for rewriting manifests in iceberg-rust. The design of iceberg-rust basically follows iceberg-python, but for now, rewrite mani

[PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-11 Thread via GitHub
zeroshade opened a new pull request, #330: URL: https://github.com/apache/iceberg-go/pull/330 And here we go! Snapshot producers, basic Transaction object, and a basic implementation of adding files to a table! Complete with unit tests and a general TableWritingTestSuite. --

Re: [PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-11 Thread via GitHub
zeroshade commented on PR #330: URL: https://github.com/apache/iceberg-go/pull/330#issuecomment-2715602441 We're in the home stretch @Fokko @kevinjqliu!! Thanks so much for the quick reviews and feedback on all of these. -- This is an automated message from the Apache Git Service. To res

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990064772 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputFile.java: ## @@ -75,7 +95,7 @@ public PositionOutputStream createOrOverwrite() { @Override public I

Re: [PR] Update-schema: Add support for `initial-default` [iceberg-python]

2025-03-11 Thread via GitHub
malhotrashivam commented on code in PR #1770: URL: https://github.com/apache/iceberg-python/pull/1770#discussion_r1989951348 ## pyiceberg/table/update/schema.py: ## @@ -414,6 +416,7 @@ def update_column( doc=doc if doc is not None else updated.doc,

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715593232 > Are there plans to replace the current s3 client with the async client? Maybe after many versions, once we have enough confidence that it is stable. But probably not in the s

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990058529 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputFile.java: ## @@ -75,7 +95,7 @@ public PositionOutputStream createOrOverwrite() { @Override public I

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715582900 > It would also be great to outline the migration path going forward. Yes, I think in general there is data point supporting using async client & CRT client makes the performa

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990053279 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputFile.java: ## @@ -75,7 +95,7 @@ public PositionOutputStream createOrOverwrite() { @Override public I

Re: [PR] feat(table): Add computation of iceberg stats from parquet files [iceberg-go]

2025-03-11 Thread via GitHub
zeroshade merged PR #329: URL: https://github.com/apache/iceberg-go/pull/329 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715574083 I verified that the async client should only affect S3 FileIO when the feature flag is enabled. `s3Async()` is the factory function that returns a `S3AsyncClient`. It is ca

[PR] Build: Bump getdaft from 0.4.4 to 0.4.7 [iceberg-python]

2025-03-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1780: URL: https://github.com/apache/iceberg-python/pull/1780 Bumps [getdaft](https://github.com/Eventual-Inc/Daft) from 0.4.4 to 0.4.7. Release notes Sourced from [getdaft's releases](https://github.com/Eventual-Inc/Daft/releases).

Re: [PR] Parquet: Implement Variant metrics [iceberg]

2025-03-11 Thread via GitHub
rdblue commented on code in PR #12496: URL: https://github.com/apache/iceberg/pull/12496#discussion_r1988118292 ## parquet/src/main/java/org/apache/iceberg/parquet/TypeWithSchemaVisitor.java: ## @@ -211,13 +211,13 @@ private static List visitFields( } private static T

Re: [PR] feat: (catalog/glue) Add support for CreateTable [iceberg-go]

2025-03-11 Thread via GitHub
zeroshade commented on code in PR #326: URL: https://github.com/apache/iceberg-go/pull/326#discussion_r1987481597 ## catalog/glue/glue.go: ## @@ -582,3 +633,16 @@ func filterDatabaseListByType(databases []types.Database, databaseType string) [ return filtered } + +fu

Re: [PR] refactor(manifests): consolidate ManifestEntryV1 and V2 [iceberg-go]

2025-03-11 Thread via GitHub
zeroshade merged PR #327: URL: https://github.com/apache/iceberg-go/pull/327 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [PR] Core, Spark 3.5: Apply Equality Deletes when Doing Copy on Write [iceberg]

2025-03-11 Thread via GitHub
wypoon commented on PR #12479: URL: https://github.com/apache/iceberg/pull/12479#issuecomment-2715528579 @pvary @RussellSpitzer thanks for answering my question! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989998534 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -72,6 +72,32 @@ public class S3FileIOProperties implements Serializable { pu

Re: [PR] Flink: Support source watermark for flink sql windows [iceberg]

2025-03-11 Thread via GitHub
pvary commented on code in PR #12191: URL: https://github.com/apache/iceberg/pull/12191#discussion_r1988451569 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/source/IcebergTableSource.java: ## @@ -175,6 +178,18 @@ public Result applyFilters(List flinkFilters) {

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1990004839 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3FileIOIntegration.java: ## @@ -255,6 +256,48 @@ public void testNewInputStreamWithMultiRegionAccessPoin

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715494642 > If you are using the normal code path today with the feature off, with all the separated code paths, you should not be affected at all. yea looking at `aws/src/main/java/org/

Re: [I] Add files to add existing Parquet files to a table [iceberg-rust]

2025-03-11 Thread via GitHub
mkarbo commented on issue #932: URL: https://github.com/apache/iceberg-rust/issues/932#issuecomment-2710478698 @liurenjie1024 @jonathanc-n should this be closed now that https://github.com/apache/iceberg-rust/pull/960 is in?

Re: [PR] feat(table): Adds updateSnapshotSummary internal function [iceberg-go]

2025-03-11 Thread via GitHub
zeroshade merged PR #317: URL: https://github.com/apache/iceberg-go/pull/317

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989992832 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3FileIOIntegration.java: ## @@ -255,6 +256,48 @@ public void testNewInputStreamWithMultiRegionAccess

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989988680 ## kafka-connect/build.gradle: ## Review Comment: Yeah sure, noted. Thanks

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989988126 ## aws/src/main/java/org/apache/iceberg/aws/AwsClientFactories.java: ## @@ -118,6 +119,14 @@ public S3Client s3() { .build(); } +@Overrid

Re: [PR] [Do not merge] Iterative `bind` with a stack instead of recursion [iceberg-python]

2025-03-11 Thread via GitHub
Fokko commented on PR #1783: URL: https://github.com/apache/iceberg-python/pull/1783#issuecomment-2715461531 I like the solution! > changing the visitor to an iterative approach seems like a sound solution. are there any reasons we don't want to do this? My only concern is perfo

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989985092 ## kafka-connect/build.gradle: ## Review Comment: We are not adding this to the aws-bundle yet, so it should be fine, but @SanjayMarreddi we should probably

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on PR #12299: URL: https://github.com/apache/iceberg/pull/12299#issuecomment-2715451404 > Have you posted this on the iceberg devlist? Not really, I did not really expect it to be a community discussion since this is a very vendor specific integration for S3 (alth

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
kevinjqliu commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989925583 ## gradle/libs.versions.toml: ## @@ -22,6 +22,7 @@ [versions] activation = "1.1.1" aliyun-sdk-oss = "3.10.2" +analyticsaccelerator = "1.0.0" Review Comment:

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
jackye1995 commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989966659 ## aws/src/main/java/org/apache/iceberg/aws/AwsClientFactories.java: ## @@ -118,6 +119,14 @@ public S3Client s3() { .build(); } +@Override +

Re: [PR] Update-schema: Add support for `initial-default` [iceberg-python]

2025-03-11 Thread via GitHub
Fokko commented on code in PR #1770: URL: https://github.com/apache/iceberg-python/pull/1770#discussion_r1989963103 ## pyiceberg/table/update/schema.py: ## @@ -338,6 +363,7 @@ def _set_column_requirement(self, path: Union[str, Tuple[str, ...]], required: b fiel

Re: [PR] Update-schema: Add support for `initial-default` [iceberg-python]

2025-03-11 Thread via GitHub
malhotrashivam commented on code in PR #1770: URL: https://github.com/apache/iceberg-python/pull/1770#discussion_r1989951348 ## pyiceberg/table/update/schema.py: ## @@ -414,6 +416,7 @@ def update_column( doc=doc if doc is not None else updated.doc,

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-03-11 Thread via GitHub
geruh commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1989952400 ## aws/src/main/java/org/apache/iceberg/aws/AwsClientFactories.java: ## @@ -118,6 +119,14 @@ public S3Client s3() { .build(); } +@Override +pu

Re: [PR] API, Core: Add geometry and geography types support [iceberg]

2025-03-11 Thread via GitHub
szehon-ho commented on PR #12346: URL: https://github.com/apache/iceberg/pull/12346#issuecomment-2705309035 Also, (as can't comment on files that are not in the change) Do we need to add Geo types to following places? 1. Types.java TYPES constant? 2. TestSchemaUnionByFieldNa

Re: [PR] Update-schema: Add support for `initial-default` [iceberg-python]

2025-03-11 Thread via GitHub
Fokko commented on code in PR #1770: URL: https://github.com/apache/iceberg-python/pull/1770#discussion_r1989938216 ## pyiceberg/table/update/schema.py: ## @@ -212,13 +215,34 @@ def add_column( # assign new IDs in order new_id = self.assign_new_column_id() +

Re: [I] [bug] `bind` visitor causes `RecursionError: maximum recursion depth exceeded` [iceberg-python]

2025-03-11 Thread via GitHub
kevinjqliu commented on issue #1785: URL: https://github.com/apache/iceberg-python/issues/1785#issuecomment-2715331385 Perhaps we'd want to convert the visitor to an iterative approach, for example #1783
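
The comment above suggests replacing a recursive visitor with an iterative one. A minimal sketch of that general idea follows; it is not the code from #1783 and does not use pyiceberg's actual `bind` visitor. The `Leaf` and `And` classes and both function names are hypothetical, chosen only to show a post-order walk driven by an explicit stack.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    """Hypothetical terminal expression (stands in for a predicate)."""
    name: str


@dataclass(frozen=True)
class And:
    """Hypothetical binary expression node."""
    left: "Expr"
    right: "Expr"


Expr = Union[Leaf, And]


def count_leaves_iterative(root: Expr) -> int:
    """Post-order walk using an explicit stack instead of recursion."""
    # Each stack entry is (node, children_already_pushed).
    stack: List[Tuple[Expr, bool]] = [(root, False)]
    # Results of already-visited subtrees, in post-order.
    results: List[int] = []
    while stack:
        node, expanded = stack.pop()
        if isinstance(node, Leaf):
            results.append(1)
        elif not expanded:
            # Revisit this node after both children have been processed.
            stack.append((node, True))
            stack.append((node.right, False))
            stack.append((node.left, False))
        else:
            # Both child results are on top: right was finished last.
            right = results.pop()
            left = results.pop()
            results.append(left + right)
    return results.pop()


def build_left_deep(depth: int) -> Expr:
    """Build a left-deep chain of And nodes without recursing."""
    expr: Expr = Leaf("x0")
    for i in range(1, depth + 1):
        expr = And(expr, Leaf(f"x{i}"))
    return expr
```

Because the pending work lives in a Python list rather than on the call stack, the tree depth is bounded by available memory and not by the interpreter's recursion limit. For instance, `count_leaves_iterative(build_left_deep(5000))` walks a tree much deeper than CPython's default limit of 1000 frames.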
