Re: [PR] AWS: Update S3 async client configurations and docs for analytics-accelerator-s3 [iceberg]

2025-03-12 Thread via GitHub
HonahX commented on code in PR #12503: URL: https://github.com/apache/iceberg/pull/12503#discussion_r1992593830 ## docs/docs/aws.md: ## @@ -565,6 +565,81 @@ spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCata For more details on using S3 Accelera

Re: [I] IRC does not support creating V1 table via snapshot procedure (stage create) [iceberg]

2025-03-12 Thread via GitHub
puchengy commented on issue #12500: URL: https://github.com/apache/iceberg/issues/12500#issuecomment-2720114245 @nastra yes, sorry about that, I confirmed backport fixed it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [I] Add more variants to `ErrorKind` [iceberg-rust]

2025-03-12 Thread via GitHub
Xuanwo commented on issue #1038: URL: https://github.com/apache/iceberg-rust/issues/1038#issuecomment-2720050597 Before continuing the discussion, I would like to first explain our current error design goals. In general, error design needs to serve the following objectives: - The imp

Re: [PR] pass proxy configuration from environment vars to http client [iceberg]

2025-03-12 Thread via GitHub
akhilputhiry commented on PR #12406: URL: https://github.com/apache/iceberg/pull/12406#issuecomment-2718419918 @amogh-jahagirdar Could you please take a look at this

Re: [I] IRC does not support creating V1 table via snapshot procedure (stage create) [iceberg]

2025-03-12 Thread via GitHub
nastra commented on issue #12500: URL: https://github.com/apache/iceberg/issues/12500#issuecomment-2720039171 @puchengy can we close this issue?

Re: [PR] support create table like in flink catalog and watermark in windows [iceberg]

2025-03-12 Thread via GitHub
github-actions[bot] commented on PR #12116: URL: https://github.com/apache/iceberg/pull/12116#issuecomment-2719403005 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] AWS: Don't fetch credential from endpoint if properties contain a valid credential [iceberg]

2025-03-12 Thread via GitHub
nastra merged PR #12504: URL: https://github.com/apache/iceberg/pull/12504

Re: [PR] fix: fix delete files sequence comparison [iceberg-rust]

2025-03-12 Thread via GitHub
chenzl25 commented on code in PR #1077: URL: https://github.com/apache/iceberg-rust/pull/1077#discussion_r1992778027 ## crates/iceberg/src/delete_file_index.rs: ## @@ -147,21 +147,21 @@ impl PopulatedDeleteFileIndex { self.global_deletes .iter() -

Re: [I] Use spark sql to insert data into iceberg table will change InputFormat&OutputFormat of the table catalog [iceberg]

2025-03-12 Thread via GitHub
b7wch commented on issue #12510: URL: https://github.com/apache/iceberg/issues/12510#issuecomment-2719933662 I have found the reason. If I want to make the table Hive-readable, I should add 'engine.hive.enabled'='true' in the table DDL, or execute `ALTER TABLE x SET TBLPROPERTIES ('en

Re: [I] Use spark sql to insert data into iceberg table will change InputFormat&OutputFormat of the table catalog [iceberg]

2025-03-12 Thread via GitHub
b7wch closed issue #12510: Use spark sql to insert data into iceberg table will change InputFormat&OutputFormat of the table catalog URL: https://github.com/apache/iceberg/issues/12510

Re: [PR] Core: JDBCCatalog's dropView() should purge metadata files if GC is enabled [iceberg]

2025-03-12 Thread via GitHub
hsiang-c commented on PR #12511: URL: https://github.com/apache/iceberg/pull/12511#issuecomment-2719907863 cc @jbonofre @ajantha-bhat Please take a look for me, thanks.

Re: [PR] feat: Add conversion from `FileMetaData` to `ParquetMetadata` [iceberg-rust]

2025-03-12 Thread via GitHub
liurenjie1024 commented on code in PR #1074: URL: https://github.com/apache/iceberg-rust/pull/1074#discussion_r1992696521 ## Cargo.toml: ## @@ -94,6 +94,7 @@ serde_json = "1.0.138" serde_repr = "0.1.16" serde_with = "3.4" tempfile = "3.18" +thrift = "0.17.0" Review Comment

Re: [PR] refactor: Split transaction module [iceberg-rust]

2025-03-12 Thread via GitHub
liurenjie1024 commented on code in PR #1080: URL: https://github.com/apache/iceberg-rust/pull/1080#discussion_r1992675152 ## crates/iceberg/src/transaction/snapshot.rs: ## @@ -0,0 +1,309 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

Re: [PR] feat: Add `NameMapping` [iceberg-rust]

2025-03-12 Thread via GitHub
liurenjie1024 commented on code in PR #1072: URL: https://github.com/apache/iceberg-rust/pull/1072#discussion_r1992654031 ## crates/iceberg/src/spec/name_mapping.rs: ## @@ -33,14 +44,631 @@ pub struct NameMapping { #[serde(rename_all = "kebab-case")] pub struct MappedField {

Re: [I] De-Duping Rows While Compacting [iceberg]

2025-03-12 Thread via GitHub
haggy commented on issue #8702: URL: https://github.com/apache/iceberg/issues/8702#issuecomment-2719655248 @zenfenan Was there ever any progress made on this?

[PR] refactor: Split transaction module [iceberg-rust]

2025-03-12 Thread via GitHub
jonathanc-n opened a new pull request, #1080: URL: https://github.com/apache/iceberg-rust/pull/1080 ## Which issue does this PR close? - Closes #980 . ## What changes are included in this PR? Split transactions module

[PR] Core: JDBCCatalog's dropView() should purge metadata files if GC is enabled [iceberg]

2025-03-12 Thread via GitHub
hsiang-c opened a new pull request, #12511: URL: https://github.com/apache/iceberg/pull/12511 - `HiveCatalog` implemented `dropView` in https://github.com/apache/iceberg/pull/9852 and the view metadata will be purged from storage if `TableProperties.GC_ENABLED` is `true`(default) - I t

Re: [I] Add more variants to `ErrorKind` [iceberg-rust]

2025-03-12 Thread via GitHub
connortsui20 commented on issue #1038: URL: https://github.com/apache/iceberg-rust/issues/1038#issuecomment-2719630195 I'll circle back to one of the ideas I brought up: using `thiserror` or `snafu` to help build better error handling. Using either would provide strictly more functionality

Re: [I] Add more variants to `ErrorKind` [iceberg-rust]

2025-03-12 Thread via GitHub
liurenjie1024 commented on issue #1038: URL: https://github.com/apache/iceberg-rust/issues/1038#issuecomment-2719611356 I think this is an interesting problem to discuss. `iceberg-rust` is now a multi-crate project, so maybe it's worth rethinking the design of error handling. But I think we

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-03-12 Thread via GitHub
rshkv commented on PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#issuecomment-2719624698 With field ids and serving an Iceberg schema (as opposed to Arrow) addressed, this is ready for another view.

Re: [PR] fix: fix delete files sequence comparison [iceberg-rust]

2025-03-12 Thread via GitHub
liurenjie1024 commented on code in PR #1077: URL: https://github.com/apache/iceberg-rust/pull/1077#discussion_r1992561376 ## crates/iceberg/src/delete_file_index.rs: ## @@ -147,21 +147,21 @@ impl PopulatedDeleteFileIndex { self.global_deletes .iter() -

[PR] build(deps): bump golang.org/x/net from 0.35.0 to 0.36.0 [iceberg-go]

2025-03-12 Thread via GitHub
dependabot[bot] opened a new pull request, #331: URL: https://github.com/apache/iceberg-go/pull/331 Bumps [golang.org/x/net](https://github.com/golang/net) from 0.35.0 to 0.36.0. Commits: [85d1d54](https://github.com/golang/net/commit/85d1d54551b68719346cb9fec24b911da4e452a1) go

[I] Use spark sql to insert data into iceberg table will change InputFormat&OutputFormat of the table catalog [iceberg]

2025-03-12 Thread via GitHub
b7wch opened a new issue, #12510: URL: https://github.com/apache/iceberg/issues/12510 ### Query engine **Versions:** Spark Version: version 3.5.5 Iceberg Version: iceberg-spark-runtime-3.5_2.12:1.8.1 Hive Version: 4.0.0 Catalog: HMT Spark-shell Command: ``` spa

Re: [I] Allow non offset based timezones [iceberg-rust]

2025-03-12 Thread via GitHub
liurenjie1024 commented on issue #1078: URL: https://github.com/apache/iceberg-rust/issues/1078#issuecomment-2719602844 It seems this is caused by `arrow`, see https://github.com/apache/arrow-rs/blob/d5339f31a60a4bd8a4256e7120fe32603249d88e/Cargo.toml#L98C51-L98C56 . The temporal transfo

Re: [PR] Auth Manager API part 6: API enablement [iceberg]

2025-03-12 Thread via GitHub
jbonofre commented on code in PR #12197: URL: https://github.com/apache/iceberg/pull/12197#discussion_r1991068708 ## aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java: ## @@ -200,86 +151,40 @@ private RESTClient httpClient() { return httpClient;

Re: [PR] Added `FsspecFileIO` method for OSS, virtual hosted style default to true, standardized key configurations for OSS [iceberg-python]

2025-03-12 Thread via GitHub
helmiazizm commented on code in PR #1788: URL: https://github.com/apache/iceberg-python/pull/1788#discussion_r1992537452 ## pyiceberg/io/pyarrow.py: ## @@ -398,31 +402,13 @@ def _initialize_oss_fs(self) -> FileSystem: from pyarrow.fs import S3FileSystem clien

Re: [PR] Added `FsspecFileIO` method for OSS, virtual hosted style default to true, standardized key configurations for OSS [iceberg-python]

2025-03-12 Thread via GitHub
helmiazizm commented on code in PR #1788: URL: https://github.com/apache/iceberg-python/pull/1788#discussion_r1992533665 ## pyiceberg/io/fsspec.py: ## @@ -124,6 +128,22 @@ def _file(_: Properties) -> LocalFileSystem: return LocalFileSystem(auto_mkdir=True) +def _oss(pro

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-03-12 Thread via GitHub
rshkv commented on code in PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#discussion_r1992530783 ## crates/iceberg/src/inspect/metadata_table.rs: ## @@ -59,12 +70,14 @@ pub mod tests { /// or use rust-analyzer (see [video](https://github.com/rust-analyzer/

[PR] feat: Infer partition values statistics [iceberg-rust]

2025-03-12 Thread via GitHub
jonathanc-n opened a new pull request, #1079: URL: https://github.com/apache/iceberg-rust/pull/1079 ## Which issue does this PR close? - Closes #1035. ## What changes are included in this PR? Added API for creating partition struct from statistics ## Are th

Re: [PR] Core: Use InternalData with avro and common DataIterable for readers. [iceberg]

2025-03-12 Thread via GitHub
danielcweeks commented on code in PR #12476: URL: https://github.com/apache/iceberg/pull/12476#discussion_r1992243026 ## core/src/main/java/org/apache/iceberg/AllManifestsTable.java: ## @@ -192,13 +191,11 @@ public List deletes() { @Override public CloseableIterable ro

Re: [PR] Core: Add list/map block sizes [iceberg]

2025-03-12 Thread via GitHub
github-actions[bot] commented on PR #10973: URL: https://github.com/apache/iceberg/pull/10973#issuecomment-2719402646 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Adding new rewrite manifest spark action to accept custom partition o… [iceberg]

2025-03-12 Thread via GitHub
github-actions[bot] commented on PR #11881: URL: https://github.com/apache/iceberg/pull/11881#issuecomment-2719402957 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Flink: Replace use of deprecated methods [iceberg]

2025-03-12 Thread via GitHub
github-actions[bot] closed pull request #11658: Flink: Replace use of deprecated methods URL: https://github.com/apache/iceberg/pull/11658

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-03-12 Thread via GitHub
rshkv commented on PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#issuecomment-2719489203 The `arrow-rs` change we needed [(here)][1] got shipped in `54.2.0` which we already picked up [here][2]. That means the `MapBuilder` instances have a key field that preserves our

Re: [I] Cannot commit identity partition on datatypes time,timestamp* using 'fromPartitionString' [iceberg]

2025-03-12 Thread via GitHub
github-actions[bot] commented on issue #11085: URL: https://github.com/apache/iceberg/issues/11085#issuecomment-2719402755 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Can we enable adaptive clustering? [iceberg-python]

2025-03-12 Thread via GitHub
myz540 commented on issue #1790: URL: https://github.com/apache/iceberg-python/issues/1790#issuecomment-2719439867 > [@myz540](https://github.com/myz540) can you double check if you're using the latest version? my `pyarrow==16.0.0` and `pyiceberg==0.8.1` I upgraded them to the

Re: [I] Spark mistakenly cleanup written file with successful IRC commits [iceberg]

2025-03-12 Thread via GitHub
nastra commented on issue #12499: URL: https://github.com/apache/iceberg/issues/12499#issuecomment-2716499329 This is something that was fixed by https://github.com/apache/iceberg/issues/8397 and went into Iceberg 1.4.0

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2025-03-12 Thread via GitHub
nastra commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1990928479 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/VendedAdlsCredentialProvider.java: ## @@ -0,0 +1,157 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [I] Having 'object name contains unsupported characters' when inserting using partitionedWriter [iceberg]

2025-03-12 Thread via GitHub
github-actions[bot] commented on issue #11051: URL: https://github.com/apache/iceberg/issues/11051#issuecomment-2719402717 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] How to get the specific catalog config from Iceberg REST get config interface? [iceberg]

2025-03-12 Thread via GitHub
github-actions[bot] commented on issue #11124: URL: https://github.com/apache/iceberg/issues/11124#issuecomment-2719402826 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Flink: Replace use of deprecated methods [iceberg]

2025-03-12 Thread via GitHub
github-actions[bot] commented on PR #11658: URL: https://github.com/apache/iceberg/pull/11658#issuecomment-2719402896 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Core, Spark 3.5: Apply Equality Deletes when Doing Copy on Write [iceberg]

2025-03-12 Thread via GitHub
aokolnychyi commented on code in PR #12479: URL: https://github.com/apache/iceberg/pull/12479#discussion_r1992466085 ## core/src/main/java/org/apache/iceberg/DeleteFileIndex.java: ## @@ -541,6 +547,10 @@ private void add( } private Iterable>> deleteManifestReaders()

Re: [PR] Adding new rewrite manifest spark action to accept custom partition o… [iceberg]

2025-03-12 Thread via GitHub
github-actions[bot] closed pull request #11881: Adding new rewrite manifest spark action to accept custom partition o… URL: https://github.com/apache/iceberg/pull/11881

Re: [I] Having 'object name contains unsupported characters' when inserting using partitionedWriter [iceberg]

2025-03-12 Thread via GitHub
github-actions[bot] closed issue #11051: Having 'object name contains unsupported characters' when inserting using partitionedWriter URL: https://github.com/apache/iceberg/issues/11051

Re: [I] Huge amount of Aws s3 Exception "Unable to execute HTTP request: The target server failed to respond" during Iceberg v2 table merge with some DeleteFiles + DataFiles in a partition [iceberg]

2025-03-12 Thread via GitHub
github-actions[bot] commented on issue #8218: URL: https://github.com/apache/iceberg/issues/8218#issuecomment-2719402503 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] AWS: Update S3 async client configurations and docs for analytics-accelerator-s3 [iceberg]

2025-03-12 Thread via GitHub
SanjayMarreddi commented on code in PR #12503: URL: https://github.com/apache/iceberg/pull/12503#discussion_r1991923527 ## docs/docs/aws.md: ## @@ -565,6 +565,29 @@ spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCata For more details on using S3

Re: [PR] Spark 3.4: Backport partition spec inference in spark ADD_FILES procedure [iceberg]

2025-03-12 Thread via GitHub
bharos commented on PR #12508: URL: https://github.com/apache/iceberg/pull/12508#issuecomment-2719308215 @RussellSpitzer PTAL at this backport, thanks

Re: [PR] Add pull-request template [iceberg-python]

2025-03-12 Thread via GitHub
Fokko commented on code in PR #1777: URL: https://github.com/apache/iceberg-python/pull/1777#discussion_r1992355841 ## .github/pull_request_template.md: ## Review Comment: Since we don't include the `.github/*` directory in the source release tar, we can ignore this. I don

Re: [PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-12 Thread via GitHub
Fokko commented on code in PR #330: URL: https://github.com/apache/iceberg-go/pull/330#discussion_r1991195280 ## table/table_test.go: ## @@ -128,3 +138,236 @@ func (t *TableTestSuite) TestSnapshotByName() { t.True(testSnapshot.Equals(*t.tbl.SnapshotByName("test"))) }

Re: [PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-12 Thread via GitHub
zeroshade commented on PR #330: URL: https://github.com/apache/iceberg-go/pull/330#issuecomment-2718373140 > There is quite a bit of code here, but if I understand it correctly, we first write Parquet files, and then add them to the table using `AddFiles`. I think this is wrong, since `AddFile

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2025-03-12 Thread via GitHub
nastra commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1990926030 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/VendedAdlsCredentialProvider.java: ## @@ -0,0 +1,157 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [I] Add more variants to `ErrorKind` [iceberg-rust]

2025-03-12 Thread via GitHub
Xuanwo commented on issue #1038: URL: https://github.com/apache/iceberg-rust/issues/1038#issuecomment-2717517040 Adding a separate error type for the catalog doesn't make it easier for users; instead, it adds more work on their end. At this stage, I think it would be good to have `Er

Re: [I] Add more variants to `ErrorKind` [iceberg-rust]

2025-03-12 Thread via GitHub
jonathanc-n commented on issue #1038: URL: https://github.com/apache/iceberg-rust/issues/1038#issuecomment-2719288938 I agree with @connortsui20's idea. The current set of `ErrorKind` variants is too bare; I think a `NotFound` error would be pretty useful (works well as a generalized erro

Re: [PR] Spark: Remove unused methods from SparkTableUtil [iceberg]

2025-03-12 Thread via GitHub
bharos closed pull request #12509: Spark: Remove unused methods from SparkTableUtil URL: https://github.com/apache/iceberg/pull/12509

[PR] Spark: Remove unused methods from SparkTableUtil [iceberg]

2025-03-12 Thread via GitHub
bharos opened a new pull request, #12509: URL: https://github.com/apache/iceberg/pull/12509 (no comment)

[PR] Bump version to 0.10.0 [iceberg-python]

2025-03-12 Thread via GitHub
Fokko opened a new pull request, #1791: URL: https://github.com/apache/iceberg-python/pull/1791 (no comment)

Re: [I] [Feature] Add Support for Distributed Write [iceberg-python]

2025-03-12 Thread via GitHub
Fokko commented on issue #1751: URL: https://github.com/apache/iceberg-python/issues/1751#issuecomment-2719219162 Hey @andormarkus Thanks for sharing. that looks great! I'm all in favor of supporting this. Very much looking forward to the PR Should we support `__bytes__` to return th

[I] Allow non offset based timezones [iceberg-rust]

2025-03-12 Thread via GitHub
Fokko opened a new issue, #1078: URL: https://github.com/apache/iceberg-rust/issues/1078 ### Apache Iceberg Rust version None ### Describe the bug I was trying to make PyIceberg rely solely on Iceberg-rust for the partition transforms, but I ran into the following:

Re: [PR] feat: Add unknown type [iceberg-python]

2025-03-12 Thread via GitHub
Fokko merged PR #1681: URL: https://github.com/apache/iceberg-python/pull/1681

Re: [I] Add V3 type `unknown` [iceberg-python]

2025-03-12 Thread via GitHub
Fokko closed issue #1553: Add V3 type `unknown` URL: https://github.com/apache/iceberg-python/issues/1553

Re: [I] Can we enable adaptive clustering? [iceberg-python]

2025-03-12 Thread via GitHub
Fokko commented on issue #1790: URL: https://github.com/apache/iceberg-python/issues/1790#issuecomment-2719153302 @myz540 can you double check if you're using the latest version?

[PR] [Spark] Backport partition spec inference in spark ADD_FILES procedure to spark3.4 [iceberg]

2025-03-12 Thread via GitHub
bharos opened a new pull request, #12508: URL: https://github.com/apache/iceberg/pull/12508 (no comment)

Re: [PR] Auth Manager API part 6: API enablement [iceberg]

2025-03-12 Thread via GitHub
jbonofre commented on code in PR #12197: URL: https://github.com/apache/iceberg/pull/12197#discussion_r1992288478 ## aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java: ## @@ -200,86 +151,40 @@ private RESTClient httpClient() { return httpClient;

Re: [PR] feat: Add unknown type [iceberg-python]

2025-03-12 Thread via GitHub
kaushiksrini commented on PR #1681: URL: https://github.com/apache/iceberg-python/pull/1681#issuecomment-2719104837 @Fokko, thanks - linted :)

Re: [PR] AWS: Update S3 async client configurations and docs for analytics-accelerator-s3 [iceberg]

2025-03-12 Thread via GitHub
geruh commented on code in PR #12503: URL: https://github.com/apache/iceberg/pull/12503#discussion_r1992204113 ## docs/docs/aws.md: ## @@ -565,6 +565,81 @@ spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCata For more details on using S3 Accelerat

Re: [I] Can we enable adaptive clustering? [iceberg-python]

2025-03-12 Thread via GitHub
myz540 commented on issue #1790: URL: https://github.com/apache/iceberg-python/issues/1790#issuecomment-2719096727 > [@myz540](https://github.com/myz540) That's not in there today. However, if you pre-cluster the table before writing, it should maintain order. thanks for your reply.

Re: [I] catalog table-default and table-override properties are not supported in CREATE_OR_REPLACE operation in IRC [iceberg]

2025-03-12 Thread via GitHub
puchengy commented on issue #12506: URL: https://github.com/apache/iceberg/issues/12506#issuecomment-2719094343 Hi @nastra, thank you so much for all the help in the previous issues. I wonder if you can help take a look at this one as well? Thanks

Re: [PR] Core: Use InternalData with avro and common DataIterable for readers. [iceberg]

2025-03-12 Thread via GitHub
danielcweeks commented on code in PR #12476: URL: https://github.com/apache/iceberg/pull/12476#discussion_r1992257554 ## core/src/main/java/org/apache/iceberg/ManifestReader.java: ## @@ -133,12 +131,18 @@ private > PartitionSpec readPartitionSpec(InputFile inp private static

Re: [PR] Incremental Append Scan [iceberg-python]

2025-03-12 Thread via GitHub
glesperance commented on PR #533: URL: https://github.com/apache/iceberg-python/pull/533#issuecomment-2719072412 @mrendi29 unclear. For now we're running with this: https://github.com/apache/iceberg-python/issues/240#issuecomment-2248323987 . @Fokko would be code in my comment

Re: [PR] Core: Use InternalData with avro and common DataIterable for readers. [iceberg]

2025-03-12 Thread via GitHub
danielcweeks commented on code in PR #12476: URL: https://github.com/apache/iceberg/pull/12476#discussion_r1992228059 ## core/src/main/java/org/apache/iceberg/AllManifestsTable.java: ## @@ -192,13 +191,11 @@ public List deletes() { @Override public CloseableIterable ro

Re: [PR] Core: Use InternalData with avro and common DataIterable for readers. [iceberg]

2025-03-12 Thread via GitHub
danielcweeks commented on code in PR #12476: URL: https://github.com/apache/iceberg/pull/12476#discussion_r1992245698 ## core/src/main/java/org/apache/iceberg/InternalData.java: ## @@ -163,6 +180,11 @@ public interface ReadBuilder { /** Set a custom class for in-memory obje

Re: [PR] AWS: Update S3 async client configurations and docs for analytics-accelerator-s3 [iceberg]

2025-03-12 Thread via GitHub
SanjayMarreddi commented on code in PR #12503: URL: https://github.com/apache/iceberg/pull/12503#discussion_r1992237078 ## docs/docs/aws.md: ## @@ -565,6 +565,81 @@ spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCata For more details on using S3

Re: [PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-12 Thread via GitHub
Fokko commented on code in PR #330: URL: https://github.com/apache/iceberg-go/pull/330#discussion_r1992184908 ## table/arrow_utils_internal_test.go: ## @@ -408,3 +478,158 @@ func TestStatsTypes(t *testing.T) { iceberg.PrimitiveTypes.Int32, }, actual) } +

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2025-03-12 Thread via GitHub
nastra commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r1990998684 ## azure/src/test/java/org/apache/iceberg/azure/AzurePropertiesTest.java: ## @@ -71,6 +78,49 @@ public void testWithSasToken() { verify(clientBuilder, never()).c

Re: [PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-12 Thread via GitHub
Fokko commented on code in PR #330: URL: https://github.com/apache/iceberg-go/pull/330#discussion_r1992169082 ## table/table_test.go: ## @@ -128,3 +138,236 @@ func (t *TableTestSuite) TestSnapshotByName() { t.True(testSnapshot.Equals(*t.tbl.SnapshotByName("test"))) }

Re: [I] Issue during Upsert [iceberg-python]

2025-03-12 Thread via GitHub
kevinjqliu commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2718968518 @mattmartin14 it's based on the scale of the input. For insert filters, it depends on the output of `create_match_filter` and the table schema. Essentially anytime `bind

Re: [I] Issue during Upsert [iceberg-python]

2025-03-12 Thread via GitHub
mattmartin14 commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2718975527 What if instead of using a filter for the insert rows, we instead do a pyarrow compute anti-left join to identify the rows needing to be inserted? That might help avoid
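
The anti-join idea in the comment above can be sketched in plain Python. This is only an illustration of the technique under stated assumptions, not PyIceberg's upsert code and not the PyArrow API: the function name `rows_to_insert`, the row-as-dict representation, and the sample data are all hypothetical.

```python
def rows_to_insert(source_rows, target_rows, key_columns):
    """Return the source rows whose key is absent from the target.

    This is a left anti join on key_columns: rows already present in the
    target (by key) are dropped, so only rows needing an insert remain.
    """
    # Build the set of keys that already exist in the target table.
    target_keys = {tuple(row[col] for col in key_columns) for row in target_rows}
    # Keep only source rows whose key is not in that set.
    return [
        row
        for row in source_rows
        if tuple(row[col] for col in key_columns) not in target_keys
    ]


# Hypothetical sample data: id 2 already exists, id 3 is new.
target = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
source = [{"id": 2, "name": "b2"}, {"id": 3, "name": "c"}]
new_rows = rows_to_insert(source, target, ["id"])
```

The point of the design is that the work scales with a hash lookup per row rather than with the size of a generated filter expression, which is the cost the thread attributes to the filter-based approach.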

Re: [PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-12 Thread via GitHub
zeroshade commented on code in PR #330: URL: https://github.com/apache/iceberg-go/pull/330#discussion_r1992213770 ## table/table_test.go: ## @@ -128,3 +138,236 @@ func (t *TableTestSuite) TestSnapshotByName() { t.True(testSnapshot.Equals(*t.tbl.SnapshotByName("test")))

Re: [I] IcebergSinkConnector java.lang.NoClassDefFoundError: org/apache/iceberg/IcebergBuild [iceberg]

2025-03-12 Thread via GitHub
RussellSpitzer commented on issue #12507: URL: https://github.com/apache/iceberg/issues/12507#issuecomment-2718967020 @bryanck Could you confirm on the above points?

Re: [I] IcebergSinkConnector java.lang.NoClassDefFoundError: org/apache/iceberg/IcebergBuild [iceberg]

2025-03-12 Thread via GitHub
RussellSpitzer commented on issue #12507: URL: https://github.com/apache/iceberg/issues/12507#issuecomment-2718953625 I believe we typically use *-runtime to indicate the shaded version of the jar, which doesn't require external dependencies. So in this case it would be iceberg-kafka-connect

Re: [PR] AWS: Update S3 async client configurations and docs for analytics-accelerator-s3 [iceberg]

2025-03-12 Thread via GitHub
geruh commented on code in PR #12503: URL: https://github.com/apache/iceberg/pull/12503#discussion_r1992204113 ## docs/docs/aws.md: ## @@ -565,6 +565,81 @@ spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCata For more details on using S3 Accelerat

Re: [PR] feat: Add unknown type [iceberg-python]

2025-03-12 Thread via GitHub
Fokko commented on PR #1681: URL: https://github.com/apache/iceberg-python/pull/1681#issuecomment-2718949160 @kaushiksrini can you run `make lint`?

Re: [PR] Added `FsspecFileIO` method for OSS, virtual hosted style default to true, standardized key configurations for OSS [iceberg-python]

2025-03-12 Thread via GitHub
Fokko commented on code in PR #1788: URL: https://github.com/apache/iceberg-python/pull/1788#discussion_r1992191172 ## pyiceberg/io/pyarrow.py: ## @@ -398,31 +402,13 @@ def _initialize_oss_fs(self) -> FileSystem: from pyarrow.fs import S3FileSystem client_kwa

Re: [PR] Added `FsspecFileIO` method for OSS, virtual hosted style default to true, standardized key configurations for OSS [iceberg-python]

2025-03-12 Thread via GitHub
Fokko commented on code in PR #1788: URL: https://github.com/apache/iceberg-python/pull/1788#discussion_r1992187967 ## pyiceberg/io/fsspec.py: ## @@ -124,6 +128,22 @@ def _file(_: Properties) -> LocalFileSystem: return LocalFileSystem(auto_mkdir=True) +def _oss(properti

Re: [PR] Added `FsspecFileIO` method for OSS, virtual hosted style default to true, standardized key configurations for OSS [iceberg-python]

2025-03-12 Thread via GitHub
Fokko commented on code in PR #1788: URL: https://github.com/apache/iceberg-python/pull/1788#discussion_r1992187633 ## pyiceberg/io/fsspec.py: ## @@ -124,6 +128,22 @@ def _file(_: Properties) -> LocalFileSystem: return LocalFileSystem(auto_mkdir=True) +def _oss(properti

Re: [I] Can we enable adaptive clustering? [iceberg-python]

2025-03-12 Thread via GitHub
Fokko commented on issue #1790: URL: https://github.com/apache/iceberg-python/issues/1790#issuecomment-2718898713 @myz540 That's not in there today. However, if you pre-cluster the table before writing, it should maintain order.

Re: [PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-12 Thread via GitHub
Fokko commented on code in PR #330: URL: https://github.com/apache/iceberg-go/pull/330#discussion_r1992170224 ## table/arrow_utils.go: ## @@ -908,6 +915,260 @@ func must[T any](v T, err error) T { return v } +func primitiveToPhysicalType(typ iceberg.Type) string { +

Re: [I] IcebergSinkConnector java.lang.NoClassDefFoundError: org/apache/iceberg/IcebergBuild [iceberg]

2025-03-12 Thread via GitHub
lk-1984 commented on issue #12507: URL: https://github.com/apache/iceberg/issues/12507#issuecomment-2718880954 As general feedback, your webpage and kafka-connect.md have zero references to the JARs required for using the IcebergSinkConnector. https://iceberg.apache.org/docs/nightly/

[I] Correctly handle decimal physical type mapping [iceberg-python]

2025-03-12 Thread via GitHub
Fokko opened a new issue, #1789: URL: https://github.com/apache/iceberg-python/issues/1789 ### Apache Iceberg version None ### Please describe the bug 🐞 According to the spec: ![image](https://github.com/user-attachments/assets/51c3a4b9-8966-4ad4-bef0-a63b6bb3f5e4)

Re: [PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-12 Thread via GitHub
Fokko commented on code in PR #330: URL: https://github.com/apache/iceberg-go/pull/330#discussion_r1992165663 ## table/transaction.go: ## @@ -0,0 +1,346 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE

[I] IcebergSinkConnector java.lang.NoClassDefFoundError: org/apache/iceberg/IcebergBuild [iceberg]

2025-03-12 Thread via GitHub
lk-1984 opened a new issue, #12507: URL: https://github.com/apache/iceberg/issues/12507 ### Apache Iceberg version 1.8.1 (latest release) ### Query engine None ### Please describe the bug 🐞 This class is in the `iceberg-api` which is the compile time depende

Re: [PR] AWS: Update S3 async client configurations and docs for analytics-accelerator-s3 [iceberg]

2025-03-12 Thread via GitHub
SanjayMarreddi commented on code in PR #12503: URL: https://github.com/apache/iceberg/pull/12503#discussion_r1992150004 ## aws/src/main/java/org/apache/iceberg/aws/s3/DefaultS3FileIOAwsClientFactory.java: ## @@ -63,8 +63,15 @@ public S3Client s3() { @Override public S3Asyn

[I] Can we enable adaptive clustering? [iceberg-python]

2025-03-12 Thread via GitHub
myz540 opened a new issue, #1790: URL: https://github.com/apache/iceberg-python/issues/1790 ### Question Does pyiceberg allow us to enable adaptive clustering when creating a table, or to enable it on an existing table? The relevant SQL would be something like ```sql ALT

Re: [PR] feat(table): Basic Transaction and AddFiles [iceberg-go]

2025-03-12 Thread via GitHub
Fokko commented on code in PR #330: URL: https://github.com/apache/iceberg-go/pull/330#discussion_r1992135200 ## table/arrow_utils.go: ## @@ -908,6 +915,260 @@ func must[T any](v T, err error) T { return v } +func primitiveToPhysicalType(typ iceberg.Type) string { +

Re: [I] Status Code for `NamespaceNotEmpty` exception? [iceberg]

2025-03-12 Thread via GitHub
Fokko commented on issue #12502: URL: https://github.com/apache/iceberg/issues/12502#issuecomment-2718819564 I checked using PyIceberg against the Java `rest-fixture` (which comes from this repo), and it returns a 400: ![Image](https://github.com/user-attachments/assets/9fc64eff-150e

[I] catalog table-default and table-override properties are not supported in CREATE_OR_REPLACE operation in IRC [iceberg]

2025-03-12 Thread via GitHub
puchengy opened a new issue, #12506: URL: https://github.com/apache/iceberg/issues/12506 ### Feature Request / Improvement Currently, catalog default/override table properties are supported when: * Iceberg client is using Hive Catalog, or * Iceberg client is using Rest Catalog wi

Re: [PR] pass proxy configuration from environment vars to http client [iceberg]

2025-03-12 Thread via GitHub
flyrain commented on code in PR #12406: URL: https://github.com/apache/iceberg/pull/12406#discussion_r1992052511 ## core/src/main/java/org/apache/iceberg/rest/RESTCatalog.java: ## @@ -55,7 +59,50 @@ public class RESTCatalog public RESTCatalog() { this( SessionCa

Re: [PR] AWS: Update S3 async client configurations and docs for analytics-accelerator-s3 [iceberg]

2025-03-12 Thread via GitHub
geruh commented on code in PR #12503: URL: https://github.com/apache/iceberg/pull/12503#discussion_r1992036454 ## aws/src/main/java/org/apache/iceberg/aws/s3/DefaultS3FileIOAwsClientFactory.java: ## @@ -63,8 +63,15 @@ public S3Client s3() { @Override public S3AsyncClient s
