Re: [PR] feat: Allow reuse http client in rest catalog [iceberg-rust]

2025-04-15 Thread via GitHub
liurenjie1024 merged PR #1221: URL: https://github.com/apache/iceberg-rust/pull/1221 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@

Re: [PR] Catalog: Add BigQuery Metastore Catalog Support [iceberg]

2025-04-15 Thread via GitHub
talatuyarer commented on PR #12808: URL: https://github.com/apache/iceberg/pull/12808#issuecomment-2808517721 @nastra Would you like to review a brand new Catalog which uses CatalogTests 😃

Re: [PR] GCP: Add Iceberg Catalog for GCP BigQuery Metastore [iceberg]

2025-04-15 Thread via GitHub
talatuyarer commented on PR #11039: URL: https://github.com/apache/iceberg/pull/11039#issuecomment-2808489191 Thank you @hesham-medhat for this PR. I created a new PR to address the comments. Let's close this one and continue on the new PR: https://github.com/apache/iceberg/pull/12808

[PR] Catalog: Add BigQuery Metastore Catalog Support [iceberg]

2025-04-15 Thread via GitHub
talatuyarer opened a new pull request, #12808: URL: https://github.com/apache/iceberg/pull/12808 This PR addresses comments from [the earlier PR](https://github.com/apache/iceberg/pull/11039), which introduces initial support for using Google BigQuery as a Metastore Catalog for Apache Iceberg. Unfort

[PR] feat: Allow reuse http client in rest catalog [iceberg-rust]

2025-04-15 Thread via GitHub
Xuanwo opened a new pull request, #1221: URL: https://github.com/apache/iceberg-rust/pull/1221 ## Which issue does this PR close? Creating a reqwest client is not without cost; it involves DNS caching, HTTP connection pooling, and TLS certificate loading. The previous design also made

Re: [PR] Migrate Spark 3.4 ExtensionsTestBase-related tests for Snapshot manipulation, ChangeLogView and Distribution/Ordering [iceberg]

2025-04-15 Thread via GitHub
tomtongue commented on PR #12807: URL: https://github.com/apache/iceberg/pull/12807#issuecomment-2808322499 The test `TestSerializedMetadata.testEmptyVariantMetadata` failed, which seems unrelated to this commit. Let me retry the CI tests.

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-15 Thread via GitHub
huaxingao closed pull request #12494: Spark 4.0 integration URL: https://github.com/apache/iceberg/pull/12494

[PR] Migrate Spark 3.4 ExtensionsTestBase-related tests for Snapshot manipulation, ChangeLogView and Distribution/Ordering [iceberg]

2025-04-15 Thread via GitHub
tomtongue opened a new pull request, #12807: URL: https://github.com/apache/iceberg/pull/12807 *Migrate Spark 3.4 tests based on JUnit 4 to JUnit 5 with AssertJ style. This is related to https://github.com/apache/iceberg/issues/7160* This PR migrates the below tests, which are related

Re: [PR] Spark: prefix SparkTable with 'iceberg' to clearly identify Iceberg table [iceberg]

2025-04-15 Thread via GitHub
manuzhang commented on PR #12543: URL: https://github.com/apache/iceberg/pull/12543#issuecomment-2808297951 My example might not be valid, but I'm just thinking whether parsing from a string is a solid solution.

Re: [PR] Implement basic support for `rest.sigv4_enabled` for the Iceberg REST catalog [iceberg-rust]

2025-04-15 Thread via GitHub
phillipleblanc commented on PR #917: URL: https://github.com/apache/iceberg-rust/pull/917#issuecomment-2808148922 > Hi @phillipleblanc > > I found this PR while digging into how to query AWS' REST catalog implementation for S3 Tables. What would it take to get this PR merged? How can

Re: [PR] [1.8.x] Build: Bump Parquet from 1.15.0 to 1.15.1 (#12749) [iceberg]

2025-04-15 Thread via GitHub
manuzhang commented on PR #12767: URL: https://github.com/apache/iceberg/pull/12767#issuecomment-2808142028 > If we do, please remove the version numbers from the LICENSE and NOTICE. @jbonofre WDYT?

Re: [PR] Spark: prefix SparkTable with 'iceberg' to clearly identify Iceberg table [iceberg]

2025-04-15 Thread via GitHub
cgpoh commented on PR #12543: URL: https://github.com/apache/iceberg/pull/12543#issuecomment-2808141309 Thanks @manuzhang for the clarification. Curious, when will there be cases of non-iceberg tables?

Re: [PR] spec: Variant lower/upper bounds [iceberg]

2025-04-15 Thread via GitHub
aihuaxu commented on PR #12658: URL: https://github.com/apache/iceberg/pull/12658#issuecomment-2808132998 > > @aihuaxu and @rdblue is there a reason we need to explicitly restrict the lower/upper bounds to shredded fields? I would think that the stats pruning would be useful for any field t

Re: [PR] feat: add json serde for table metadata [iceberg-cpp]

2025-04-15 Thread via GitHub
wgtmac commented on code in PR #75: URL: https://github.com/apache/iceberg-cpp/pull/75#discussion_r2045935217 ## src/iceberg/json_internal.h: ## @@ -74,44 +76,32 @@ Result> SortOrderFromJson(const nlohmann::json& json) /// /// \param[in] schema The Iceberg schema to convert.

Re: [PR] Spark 3.5: Use ProcedureInput for MigrateTableProcedure. [iceberg]

2025-04-15 Thread via GitHub
manuzhang commented on code in PR #12782: URL: https://github.com/apache/iceberg/pull/12782#discussion_r2045934410 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/MigrateTableProcedure.java: ## @@ -107,12 +105,10 @@ public InternalRow[] call(InternalRow arg

Re: [PR] Spark 3.5: Use ProcedureInput for SnapshotTableProcedure. [iceberg]

2025-04-15 Thread via GitHub
manuzhang commented on code in PR #12783: URL: https://github.com/apache/iceberg/pull/12783#discussion_r2045934050 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/SnapshotTableProcedure.java: ## @@ -104,11 +106,9 @@ public InternalRow[] call(InternalRow arg

Re: [PR] Build: Specify -XX:-OmitStackTraceInFastThrow in tests [iceberg]

2025-04-15 Thread via GitHub
ebyhr commented on code in PR #12806: URL: https://github.com/apache/iceberg/pull/12806#discussion_r2045903617 ## build.gradle: ## @@ -414,6 +414,7 @@ project(':iceberg-data') { useJUnitPlatform() // Only for TestSplitScan as of Gradle 5.0+ maxHeapSize '1500m' +

Re: [PR] feat: add json serde for table metadata [iceberg-cpp]

2025-04-15 Thread via GitHub
zhjwpku commented on code in PR #75: URL: https://github.com/apache/iceberg-cpp/pull/75#discussion_r2045922052 ## src/iceberg/json_internal.h: ## @@ -74,44 +76,32 @@ Result> SortOrderFromJson(const nlohmann::json& json) /// /// \param[in] schema The Iceberg schema to convert.

Re: [PR] Spark: prefix SparkTable with 'iceberg' to clearly identify Iceberg table [iceberg]

2025-04-15 Thread via GitHub
manuzhang commented on PR #12543: URL: https://github.com/apache/iceberg/pull/12543#issuecomment-2808120693 @cgpoh I mean that even with this change, it looks like there are cases where non-Iceberg tables can be taken as Iceberg tables.

Re: [PR] Spark: prefix SparkTable with 'iceberg' to clearly identify Iceberg table [iceberg]

2025-04-15 Thread via GitHub
cgpoh commented on PR #12543: URL: https://github.com/apache/iceberg/pull/12543#issuecomment-2808117493 > @cgpoh The problem is that existing Iceberg users are relying on the current implementation of `toString`, changing it to align with Delta would probably break it for the existing users

Re: [PR] Spark: prefix SparkTable with 'iceberg' to clearly identify Iceberg table [iceberg]

2025-04-15 Thread via GitHub
cgpoh commented on PR #12543: URL: https://github.com/apache/iceberg/pull/12543#issuecomment-2808115017 @manuzhang , from DataHub [code](https://github.com/datahub-project/datahub/blob/eb1cd7f38c25ca9c782142d09823eaefd87bbd44/metadata-integration/java/acryl-spark-lineage/src/main/java/datahu

Re: [PR] Build: Specify -XX:-OmitStackTraceInFastThrow in tests [iceberg]

2025-04-15 Thread via GitHub
ebyhr commented on PR #12806: URL: https://github.com/apache/iceberg/pull/12806#issuecomment-2808102126 CI hit #11047

[PR] Fixed force_virtual_addressing problem [iceberg-python]

2025-04-15 Thread via GitHub
helmiazizm opened a new pull request, #1923: URL: https://github.com/apache/iceberg-python/pull/1923 # Rationale for this change This fix changed the behavior for both `_oss_fs` and `_s3_fs` to be able to parse `s3.force-virtual-addressing` correctly. # Are

Re: [PR] fix(catalog/rest): Fix concurrency bug in REST catalog request signing [iceberg-go]

2025-04-15 Thread via GitHub
jhump commented on code in PR #384: URL: https://github.com/apache/iceberg-go/pull/384#discussion_r2045891208 ## catalog/rest/rest.go: ## @@ -221,12 +222,12 @@ func (s *sessionTransport) RoundTrip(r *http.Request) (*http.Response, error) { return

Re: [PR] fix(catalog/rest): Fix concurrency bug in REST catalog request signing [iceberg-go]

2025-04-15 Thread via GitHub
zeroshade commented on PR #384: URL: https://github.com/apache/iceberg-go/pull/384#issuecomment-2808048353 Perfect! I'll take a look

Re: [PR] fix(catalog/rest): Fix concurrency bug in REST catalog request signing [iceberg-go]

2025-04-15 Thread via GitHub
jhump commented on PR #384: URL: https://github.com/apache/iceberg-go/pull/384#issuecomment-2808047190 > You should be able to inject credentials with the WithAwsConfig option for constructing the catalog, right? Yep. Though this wasn't actually wired up. So I wired it up in 80b3f50b

Re: [PR] fix(catalog/rest): Fix concurrency bug in REST catalog request signing [iceberg-go]

2025-04-15 Thread via GitHub
jhump commented on PR #384: URL: https://github.com/apache/iceberg-go/pull/384#issuecomment-2808046336 I've added a test. The test does not actually verify anything and passes, even if there's a concurrency bug, unless the race detector is enabled. With the race detector enabled, it did fai

Re: [PR] Core: Fix deprecated FileSystem.isDirectory warning and remove redundant code [iceberg]

2025-04-15 Thread via GitHub
ebyhr commented on PR #12805: URL: https://github.com/apache/iceberg/pull/12805#issuecomment-2807948737 CI hit #12676

[PR] Core: Fix deprecated FileSystem.isDirectory warning and remove redundant code [iceberg]

2025-04-15 Thread via GitHub
ebyhr opened a new pull request, #12805: URL: https://github.com/apache/iceberg/pull/12805 (no comment)

Re: [PR] Introduce datafusion engine for sqllogictests. [iceberg-rust]

2025-04-15 Thread via GitHub
jonathanc-n commented on code in PR #1215: URL: https://github.com/apache/iceberg-rust/pull/1215#discussion_r2045784398 ## crates/sqllogictest/src/engine/datafusion.rs: ## @@ -0,0 +1,67 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic

Re: [PR] docs: fix typo in docstrings [iceberg-rust]

2025-04-15 Thread via GitHub
liurenjie1024 merged PR #1219: URL: https://github.com/apache/iceberg-rust/pull/1219

Re: [PR] Spark 3.4: Migrate integration test to JUnit5 [iceberg]

2025-04-15 Thread via GitHub
huaxingao commented on PR #12796: URL: https://github.com/apache/iceberg/pull/12796#issuecomment-2807839272 Merged. Thanks @nastra for the PR! Also thanks @ebyhr for reviewing!

Re: [PR] Spark 3.4: Migrate integration test to JUnit5 [iceberg]

2025-04-15 Thread via GitHub
huaxingao merged PR #12796: URL: https://github.com/apache/iceberg/pull/12796

Re: [I] Document table properties [iceberg-python]

2025-04-15 Thread via GitHub
github-actions[bot] commented on issue #1231: URL: https://github.com/apache/iceberg-python/issues/1231#issuecomment-2807836623 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity

Re: [I] write.metadata.metrics.max-inferred-column-defaults doesn't respect nested columns [iceberg]

2025-04-15 Thread via GitHub
github-actions[bot] commented on issue #11253: URL: https://github.com/apache/iceberg/issues/11253#issuecomment-2807832831 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] write.metadata.metrics.max-inferred-column-defaults doesn't respect nested columns [iceberg]

2025-04-15 Thread via GitHub
github-actions[bot] closed issue #11253: write.metadata.metrics.max-inferred-column-defaults doesn't respect nested columns URL: https://github.com/apache/iceberg/issues/11253

Re: [I] Snapshot chain getting broken - data incorrectly removed [iceberg]

2025-04-15 Thread via GitHub
github-actions[bot] commented on issue #11243: URL: https://github.com/apache/iceberg/issues/11243#issuecomment-2807832782 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Add User Interface to Iceberg based lakehouse [iceberg]

2025-04-15 Thread via GitHub
github-actions[bot] closed issue #10980: Add User Interface to Iceberg based lakehouse URL: https://github.com/apache/iceberg/issues/10980

Re: [I] Inexplainable behavior for SQLCatalog with Postgres and MinIO [iceberg]

2025-04-15 Thread via GitHub
github-actions[bot] commented on issue #11250: URL: https://github.com/apache/iceberg/issues/11250#issuecomment-2807832804 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Inexplainable behavior for SQLCatalog with Postgres and MinIO [iceberg]

2025-04-15 Thread via GitHub
github-actions[bot] closed issue #11250: Inexplainable behavior for SQLCatalog with Postgres and MinIO URL: https://github.com/apache/iceberg/issues/11250

Re: [I] Snapshot chain getting broken - data incorrectly removed [iceberg]

2025-04-15 Thread via GitHub
github-actions[bot] closed issue #11243: Snapshot chain getting broken - data incorrectly removed URL: https://github.com/apache/iceberg/issues/11243

Re: [I] Disaster Recovery Options for AWS Athena/Iceberg Integration [iceberg]

2025-04-15 Thread via GitHub
github-actions[bot] closed issue #6619: Disaster Recovery Options for AWS Athena/Iceberg Integration URL: https://github.com/apache/iceberg/issues/6619

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2045752422 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -174,6 +174,13 @@ private TableProperties() {} public static final String PARQUET_BLOOM_FI

[I] Cannot query iceberg tables through thrift server with odbc, but maintenance procedures work fine [iceberg]

2025-04-15 Thread via GitHub
fuzing opened a new issue, #12804: URL: https://github.com/apache/iceberg/issues/12804 ### Query engine spark 3.5 ### Question I'm having difficulty querying iceberg tables using thriftserver with odbc. There's no issue calling procedures such as system.rewrite_data

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2045745717 ## parquet/src/test/java/org/apache/iceberg/parquet/TestParquet.java: ## @@ -219,6 +221,50 @@ public void testTwoLevelList() throws IOException { assertThat(

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2045743493 ## parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java: ## @@ -477,6 +513,9 @@ private static class Context { private final int bloomFilterMaxByt

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2045743113 ## parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java: ## @@ -306,33 +308,29 @@ private WriteBuilder createContextFunc( return this; } +

Re: [PR] Spec: Update row lineage requirements for upgrading tables [iceberg]

2025-04-15 Thread via GitHub
danielcweeks commented on code in PR #12781: URL: https://github.com/apache/iceberg/pull/12781#discussion_r2045730155 ## format/spec.md: ## @@ -418,7 +416,7 @@ Engines may model operations as deleting/inserting rows or as modifications to r This example demonstrates how `_ro

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2045731814 ## docs/docs/configuration.md: ## @@ -52,6 +52,8 @@ Iceberg tables support table properties to configure table behavior, like the de | write.parquet.bloom-filte

Re: [PR] Spark 3.5: Use ProcedureInput for MigrateTableProcedure. [iceberg]

2025-04-15 Thread via GitHub
slfan1989 commented on PR #12782: URL: https://github.com/apache/iceberg/pull/12782#issuecomment-2807771479 @manuzhang Could you take some time to review this PR? Thank you very much!

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
huaxingao commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2045696395 ## parquet/src/test/java/org/apache/iceberg/parquet/TestParquet.java: ## @@ -219,6 +221,50 @@ public void testTwoLevelList() throws IOException { assertThat(re

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
huaxingao commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2045689382 ## parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java: ## @@ -477,6 +513,9 @@ private static class Context { private final int bloomFilterMaxBytes

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
dramaticlly commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2045632526 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -174,6 +174,13 @@ private TableProperties() {} public static final String PARQUET_BLOOM_FI

[I] Dynamodb write/commit support [iceberg-python]

2025-04-15 Thread via GitHub
mbtx2 opened a new issue, #1921: URL: https://github.com/apache/iceberg-python/issues/1921 ### Feature Request / Improvement It looks like there is no support yet for write operations with dynamodb catalog. Are there any plans on the roadmap to add this functionality?

[PR] Adds support for creating a GlueCatalog with own client [iceberg-python]

2025-04-15 Thread via GitHub
rchowell opened a new pull request, #1920: URL: https://github.com/apache/iceberg-python/pull/1920 Closes #1910 # Rationale for this change When working with the GlueCatalog, I may already have a GlueClient that I've instantiated from elsewhere, and perhaps wish to keep. This

Re: [PR] fix: Don't use avro.DefaultSchemaCache [iceberg-go]

2025-04-15 Thread via GitHub
jhump commented on code in PR #385: URL: https://github.com/apache/iceberg-go/pull/385#discussion_r2045597490 ## manifest.go: ## @@ -450,16 +450,13 @@ func fetchManifestEntries(m ManifestFile, fs iceio.IO, discardDeleted bool) ([]M } defer f.Close() - de

Re: [PR] fix: Don't use avro.DefaultSchemaCache [iceberg-go]

2025-04-15 Thread via GitHub
jhump commented on PR #385: URL: https://github.com/apache/iceberg-go/pull/385#issuecomment-2807615906 @zeroshade, I added a test case, but just for manifest lists. Let me know what you think. This test would fail on main. If one removes the newly added options (and let it use the gl

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-15 Thread via GitHub
amogh-jahagirdar commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2045584975 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ExtendedDataSourceV2Strategy.scala: ## @@ -148,9 +151,29 @@ case

Re: [I] PyIceberg-Core: Push down Parquet reading to Iceberg-Rust [iceberg-rust]

2025-04-15 Thread via GitHub
ion-elgreco commented on issue #1144: URL: https://github.com/apache/iceberg-rust/issues/1144#issuecomment-2807591131 This can be reasonably straightforward, depending on how you want to handle predicates: as SQL strings or as expressions. A lazy record batch iterator can be done with this

Re: [I] Unable to run show command on dataframe on iceberg format data [iceberg]

2025-04-15 Thread via GitHub
RussellSpitzer commented on issue #12802: URL: https://github.com/apache/iceberg/issues/12802#issuecomment-2807572543 That's still using HadoopFileIO in the trace. Maybe switch to config("spark.sql.catalog.local.io-impl") instead of spark.conf.set

Re: [I] Add support for pyarrow DurationType [iceberg-python]

2025-04-15 Thread via GitHub
0x26res commented on issue #1900: URL: https://github.com/apache/iceberg-python/issues/1900#issuecomment-2807570132 I guess in python a `datetime.timedelta` (aka duration) is like a `datetime.time`, except a timedelta value can be negative and be greater than a day. In pyarrow, the
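The distinction drawn in this comment can be checked with the standard library alone. The following is a minimal sketch, not taken from the thread; the helper name `time_of_day_is_valid` is hypothetical and exists only for illustration.

```python
import datetime

# A timedelta may be negative and may exceed 24 hours.
negative = datetime.timedelta(hours=-3)
longer_than_a_day = datetime.timedelta(hours=30)


def time_of_day_is_valid(hour: int) -> bool:
    """Return True if `hour` is accepted by datetime.time (hypothetical helper)."""
    try:
        datetime.time(hour=hour)
        return True
    except ValueError:
        # datetime.time only accepts hours in the range 0..23.
        return False
```

Under this sketch, `datetime.time` rejects values outside a single day, while `datetime.timedelta` accepts both negative and multi-day spans.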

Re: [I] Unable to run show command on dataframe on iceberg format data [iceberg]

2025-04-15 Thread via GitHub
knowxyz commented on issue #12802: URL: https://github.com/apache/iceberg/issues/12802#issuecomment-2807565338 Error Text file [icebergerror.txt](https://github.com/user-attachments/files/19765876/icebergerror.txt) https://github.com/user-attachments/assets/649db579-63ab-4426-

Re: [I] Support creating a GlueCatalog with own client. [iceberg-python]

2025-04-15 Thread via GitHub
rchowell commented on issue #1910: URL: https://github.com/apache/iceberg-python/issues/1910#issuecomment-2807531072 I'll give it a go 👍

Re: [I] Unable to run show command on dataframe on iceberg format data [iceberg]

2025-04-15 Thread via GitHub
RussellSpitzer commented on issue #12802: URL: https://github.com/apache/iceberg/issues/12802#issuecomment-2807521025 What error? It shouldn't be touching any hadoop classes now unless the config wasn't set properly.

Re: [I] Unable to run show command on dataframe on iceberg format data [iceberg]

2025-04-15 Thread via GitHub
knowxyz commented on issue #12802: URL: https://github.com/apache/iceberg/issues/12802#issuecomment-2807511348 I tried using spark.conf.set("spark.sql.catalog.local.io-impl", "org.apache.iceberg.azure.adlsv2.ADLSFileIO") but got the same error.

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
huaxiangsun commented on PR #12771: URL: https://github.com/apache/iceberg/pull/12771#issuecomment-2807455534 @szehon-ho @dramaticlly @aokolnychyi @flyrain I'd appreciate it if you could review the PR. Thanks.

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
huaxiangsun commented on PR #12771: URL: https://github.com/apache/iceberg/pull/12771#issuecomment-2807446263 @pvary I uploaded a new patch to address the review comments. Would you mind taking another look? Thanks.

[PR] docs: fix typo in docstrings [iceberg-rust]

2025-04-15 Thread via GitHub
floscha opened a new pull request, #1219: URL: https://github.com/apache/iceberg-rust/pull/1219 Simply fixes a typo in some docstrings within the *file_io.rs* file.

Re: [PR] chore: update avro-cpp for fixes from the upstream [iceberg-cpp]

2025-04-15 Thread via GitHub
Fokko merged PR #76: URL: https://github.com/apache/iceberg-cpp/pull/76

Re: [PR] feat: snapshot serde [iceberg-cpp]

2025-04-15 Thread via GitHub
Fokko commented on PR #74: URL: https://github.com/apache/iceberg-cpp/pull/74#issuecomment-2807429010 Thanks for working on this @zhjwpku, and thanks for the review @lidavidm, @yingcai-cy and @wgtmac 🙌

Re: [PR] feat: snapshot serde [iceberg-cpp]

2025-04-15 Thread via GitHub
Fokko merged PR #74: URL: https://github.com/apache/iceberg-cpp/pull/74

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2045428396 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkParquetWriter.java: ## @@ -151,4 +152,27 @@ public void testFpp() throws IOException, No

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-15 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2045427291 ## parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java: ## @@ -419,9 +445,15 @@ public FileAppender build() throws IOException { .with

Re: [I] Unable to run show command on dataframe on iceberg format data [iceberg]

2025-04-15 Thread via GitHub
RussellSpitzer commented on issue #12802: URL: https://github.com/apache/iceberg/issues/12802#issuecomment-2807422641 ``` spark.conf.set("spark.sql.catalog.spark_catalog.io-impl", "org.apache.iceberg.azure.adlsv2.ADLSFileIO") ``` Sets the FileIO for the catalog named "spark_catalog"
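The point that the `io-impl` property is scoped to a catalog name can be sketched without a Spark installation. The helper `catalog_io_conf` below is hypothetical and only builds the property key; it is not part of Spark or Iceberg. The catalog names `spark_catalog` and `local` and the `ADLSFileIO` class name are the ones quoted in the thread.

```python
def catalog_io_conf(catalog_name: str, io_impl: str) -> dict[str, str]:
    """Build the catalog-scoped property that selects a FileIO (hypothetical helper)."""
    if not catalog_name:
        raise ValueError("catalog_name must be non-empty")
    # The catalog name is part of the key, so each catalog is configured separately.
    return {f"spark.sql.catalog.{catalog_name}.io-impl": io_impl}


ADLS_FILE_IO = "org.apache.iceberg.azure.adlsv2.ADLSFileIO"

# Setting the property for "spark_catalog" produces a different key than
# setting it for a catalog named "local".
conf_for_spark_catalog = catalog_io_conf("spark_catalog", ADLS_FILE_IO)
conf_for_local = catalog_io_conf("local", ADLS_FILE_IO)
```

Each key/value pair could then be passed to `SparkSession.builder.config(key, value)` before the session is created, which is the approach suggested elsewhere in this thread in place of `spark.conf.set`.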

Re: [I] Unable to run show command on dataframe on iceberg format data [iceberg]

2025-04-15 Thread via GitHub
knowxyz commented on issue #12802: URL: https://github.com/apache/iceberg/issues/12802#issuecomment-2807400456 # Databricks notebook source from pyspark.sql import SparkSession # Azure Storage Account Details container_name = "csv" mount_point = "/mnt/icebergdata9" # Choose a moun

Re: [PR] ci(catalog): Enable Glue RenameTable test [iceberg-go]

2025-04-15 Thread via GitHub
zeroshade merged PR #383: URL: https://github.com/apache/iceberg-go/pull/383

Re: [PR] feat: glue table creation with some docs on testing [iceberg-go]

2025-04-15 Thread via GitHub
zeroshade closed pull request #59: feat: glue table creation with some docs on testing URL: https://github.com/apache/iceberg-go/pull/59

Re: [PR] Feat: Add Azure support [iceberg-go]

2025-04-15 Thread via GitHub
zeroshade commented on PR #278: URL: https://github.com/apache/iceberg-go/pull/278#issuecomment-2807329679 Azure support was added in #313

Re: [PR] feat: glue table creation with some docs on testing [iceberg-go]

2025-04-15 Thread via GitHub
zeroshade commented on PR #59: URL: https://github.com/apache/iceberg-go/pull/59#issuecomment-2807330674 glue table creation was added by a different PR, so this can be closed

Re: [PR] Feat: Add Azure support [iceberg-go]

2025-04-15 Thread via GitHub
zeroshade closed pull request #278: Feat: Add Azure support URL: https://github.com/apache/iceberg-go/pull/278

Re: [PR] Fix for metadata entries table for MOR tables containing Delete Files. [iceberg-python]

2025-04-15 Thread via GitHub
guptaakashdeep commented on PR #1902: URL: https://github.com/apache/iceberg-python/pull/1902#issuecomment-2807327843 @Fokko Alright done.

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-15 Thread via GitHub
RussellSpitzer commented on code in PR #12494: URL: https://github.com/apache/iceberg/pull/12494#discussion_r2045335206 ## .github/workflows/java-ci.yml: ## @@ -95,7 +95,7 @@ jobs: runs-on: ubuntu-22.04 strategy: matrix: -jvm: [11, 17, 21] Review Commen

Re: [I] Pyiceberg 9.0 not adding files with hour(timestamp) partitions [iceberg-python]

2025-04-15 Thread via GitHub
Fokko commented on issue #1917: URL: https://github.com/apache/iceberg-python/issues/1917#issuecomment-2807283855 Thanks @MrDerecho for raising this. I checked, and the numbers map to the same hour: ![Image](https://github.com/user-attachments/assets/3f6fa780-48de-4657-86f5-04174d2b

Re: [PR] Spec: Allow the use of `source-id` in V3 [iceberg]

2025-04-15 Thread via GitHub
RussellSpitzer commented on code in PR #12644: URL: https://github.com/apache/iceberg/pull/12644#discussion_r2045308728 ## format/spec.md: ## @@ -1605,13 +1611,8 @@ All readers are required to read tables with unknown partition transforms, ignor Writing v3 metadata: * Parti

Re: [I] What if identifier_field got deleted or updated in iceberg table? [iceberg-python]

2025-04-15 Thread via GitHub
Fokko commented on issue #1904: URL: https://github.com/apache/iceberg-python/issues/1904#issuecomment-2807219289 Hey @dvnageshpatil, that's an excellent question. Since we have a pretty strict API for updating the schema, this will be caught when you try to do this: ![Image](https:

Re: [I] What if identifier_field got deleted or updated in iceberg table? [iceberg-python]

2025-04-15 Thread via GitHub
Fokko closed issue #1904: What if identifier_field got deleted or updated in iceberg table? URL: https://github.com/apache/iceberg-python/issues/1904

Re: [I] Add support for pyarrow DurationType [iceberg-python]

2025-04-15 Thread via GitHub
Fokko commented on issue #1900: URL: https://github.com/apache/iceberg-python/issues/1900#issuecomment-2807207251 @0x26res Thanks for raising this issue. From what I understand, a `duration` is different from a `time`. Could you elaborate how this would map onto `time`?

Re: [I] Support creating a GlueCatalog with own client. [iceberg-python]

2025-04-15 Thread via GitHub
Fokko commented on issue #1910: URL: https://github.com/apache/iceberg-python/issues/1910#issuecomment-2807203122 @rchowell Thanks for raising this, I think it is reasonable to pass in your own client. Are you interested in raising a PR? Happy to review!

[I] Allow casting from `StringType` to `{FloatType,DoubleType}` [iceberg-python]

2025-04-15 Thread via GitHub
Fokko opened a new issue, #1919: URL: https://github.com/apache/iceberg-python/issues/1919 ### Feature Request / Improvement Would be nice to allow this: ![Image](https://github.com/user-attachments/assets/c6c4ac48-1316-4d91-8663-62181b57c2ae)

Re: [PR] Documented `row_filter` expressions [iceberg-python]

2025-04-15 Thread via GitHub
Fokko commented on code in PR #1862: URL: https://github.com/apache/iceberg-python/pull/1862#discussion_r2045260318 ## mkdocs/docs/expression-dsl.md: ## @@ -0,0 +1,259 @@ + + +# Expression DSL + +The PyIceberg library provides a powerful expression DSL (Domain Specific Language

Re: [PR] Documented `row_filter` expressions [iceberg-python]

2025-04-15 Thread via GitHub
Fokko commented on PR #1862: URL: https://github.com/apache/iceberg-python/pull/1862#issuecomment-2807162341 @norton120 no worries, thanks for following up! 🙌

Re: [PR] Fix for metadata entries table for MOR tables containing Delete Files. [iceberg-python]

2025-04-15 Thread via GitHub
Fokko commented on PR #1902: URL: https://github.com/apache/iceberg-python/pull/1902#issuecomment-2807160265 @guptaakashdeep The CI should be fixed? Can you pull in `main` once more?

Re: [PR] Ignore duckdb test [iceberg-python]

2025-04-15 Thread via GitHub
Fokko commented on PR #1918: URL: https://github.com/apache/iceberg-python/pull/1918#issuecomment-2807146577 > unblocks CI :) Indeed! :D Thanks for the quick review @kevinjqliu 🙌

Re: [PR] Ignore duckdb test [iceberg-python]

2025-04-15 Thread via GitHub
Fokko merged PR #1918: URL: https://github.com/apache/iceberg-python/pull/1918

Re: [PR] Ignore duckdb test [iceberg-python]

2025-04-15 Thread via GitHub
kevinjqliu commented on PR #1918: URL: https://github.com/apache/iceberg-python/pull/1918#issuecomment-2807131676 unblocks CI :)

Re: [PR] CatalogTests: Fix listNamespaces Check, Avoid Reserved Keyword, Allow Configurable Location [iceberg]

2025-04-15 Thread via GitHub
talatuyarer commented on code in PR #12768: URL: https://github.com/apache/iceberg/pull/12768#discussion_r2045212696 ## core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java: ## @@ -1256,37 +1255,37 @@ public void testTableAuth( required(2, "data", Types.S

Re: [PR] feat: update the supported catalog operations with some operation implementation [iceberg-go]

2025-04-15 Thread via GitHub
xuhui-lu commented on code in PR #396: URL: https://github.com/apache/iceberg-go/pull/396#discussion_r2045211100 ## README.md: ## @@ -60,26 +60,28 @@ $ cd iceberg-go/cmd/iceberg && go build . ### Catalog Support -| Operation| REST | Hive | DynamoDB | Glue |
