Re: [PR] Filter rows directly from pa.RecordBatch [iceberg-python]

2025-02-08 Thread via GitHub
gabeiglio commented on PR #1621: URL: https://github.com/apache/iceberg-python/pull/1621#issuecomment-2646111306 Yes, I think it would be better to split these changes into separate PRs, since there are a lot of changes to be made, especially to the tests. (If that's okay, I'll open the other PR for sc

Re: [PR] Build: Bump mkdocs-material from 9.6.1 to 9.6.3 [iceberg]

2025-02-08 Thread via GitHub
Fokko merged PR #12205: URL: https://github.com/apache/iceberg/pull/12205 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apa

Re: [PR] Build: Bump datamodel-code-generator from 0.26.5 to 0.27.2 [iceberg]

2025-02-08 Thread via GitHub
Fokko merged PR #12204: URL: https://github.com/apache/iceberg/pull/12204

Re: [PR] Build: Bump com.gradleup.shadow:shadow-gradle-plugin from 8.3.5 to 8.3.6 [iceberg]

2025-02-08 Thread via GitHub
Fokko merged PR #12210: URL: https://github.com/apache/iceberg/pull/12210

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-02-08 Thread via GitHub
gruuya commented on PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#issuecomment-2646104599 > The existing get batch stream is designed for simple workloads and I'm guessing query engines need to build its own part distribution logic instead. Got it, makes sense. I'm won

Re: [PR] Fix LICENSE and NOTICE for the kafka-connect-runtime distributions [iceberg]

2025-02-08 Thread via GitHub
jbonofre commented on PR #12195: URL: https://github.com/apache/iceberg/pull/12195#issuecomment-2646099280 I added: 1. Cleanup of ASF projects mentions from `NOTICE` 2. Add formatting in `NOTICE`

Re: [PR] Fix LICENSE and NOTICE for the kafka-connect-runtime distributions [iceberg]

2025-02-08 Thread via GitHub
jbonofre commented on code in PR #12195: URL: https://github.com/apache/iceberg/pull/12195#discussion_r1948022697 ## kafka-connect/kafka-connect-runtime/hive/NOTICE: ## @@ -1517,7 +1317,7 @@ The Apache Software Foundation (http://www.apache.org/). ---

Re: [PR] Fix LICENSE and NOTICE for the kafka-connect-runtime distributions [iceberg]

2025-02-08 Thread via GitHub
jbonofre commented on code in PR #12195: URL: https://github.com/apache/iceberg/pull/12195#discussion_r1948019789 ## kafka-connect/kafka-connect-runtime/hive/NOTICE: ## @@ -120,18 +120,7 @@ The Apache Software Foundation (http://www.apache.org/).

Re: [PR] Fix LICENSE and NOTICE for the kafka-connect-runtime distributions [iceberg]

2025-02-08 Thread via GitHub
jbonofre commented on code in PR #12195: URL: https://github.com/apache/iceberg/pull/12195#discussion_r1948019732 ## kafka-connect/kafka-connect-runtime/hive/LICENSE: ## @@ -1079,27 +1103,14 @@ Project URL (from POM): http://java.sun.com/products/jta

Re: [PR] WIP: explore nanoarrow and sparrow [iceberg-cpp]

2025-02-08 Thread via GitHub
wgtmac commented on PR #44: URL: https://github.com/apache/iceberg-cpp/pull/44#issuecomment-2646093255 I have managed to add `sparrow` as a vendored thirdparty dependency to `libiceberg`. However, there are still two issues that break CI: 1. It cannot compile on Windows due to `int128

Re: [I] `py-io-impl` config propagation [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on issue #1589: URL: https://github.com/apache/iceberg-python/issues/1589#issuecomment-2646079484 @bigluck / @smaheshwar-pltr does the explanation make sense? Please LMK if I'm missing something

Re: [I] `py-io-impl` config propagation [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on issue #1589: URL: https://github.com/apache/iceberg-python/issues/1589#issuecomment-2646079224 Created https://github.com/projectnessie/nessie/issues/10363

Re: [I] `py-io-impl` config propagation [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on issue #1589: URL: https://github.com/apache/iceberg-python/issues/1589#issuecomment-2646078141 Thanks for chiming in here! I took some time to look into this. Here's what I found: ### On Catalog Configuration The configuration precedence is described in the

Re: [PR] Kafka Connect: Add table to topics mapping property [iceberg]

2025-02-08 Thread via GitHub
igorvoltaic commented on PR #10422: URL: https://github.com/apache/iceberg/pull/10422#issuecomment-2646072018 Ping

[PR] Build: Bump com.gradleup.shadow:shadow-gradle-plugin from 8.3.5 to 8.3.6 [iceberg]

2025-02-08 Thread via GitHub
dependabot[bot] opened a new pull request, #12210: URL: https://github.com/apache/iceberg/pull/12210 Bumps [com.gradleup.shadow:shadow-gradle-plugin](https://github.com/GradleUp/shadow) from 8.3.5 to 8.3.6. Release notes Sourced from https://github.com/GradleUp/shadow/releases

[PR] Build: Bump software.amazon.awssdk:bom from 2.30.11 to 2.30.16 [iceberg]

2025-02-08 Thread via GitHub
dependabot[bot] opened a new pull request, #12208: URL: https://github.com/apache/iceberg/pull/12208 Bumps software.amazon.awssdk:bom from 2.30.11 to 2.30.16.

[PR] Build: Bump org.apache.httpcomponents.client5:httpclient5 from 5.4.1 to 5.4.2 [iceberg]

2025-02-08 Thread via GitHub
dependabot[bot] opened a new pull request, #12209: URL: https://github.com/apache/iceberg/pull/12209 Bumps [org.apache.httpcomponents.client5:httpclient5](https://github.com/apache/httpcomponents-client) from 5.4.1 to 5.4.2. Changelog Sourced from https://github.com/apache/httpcom

[PR] Build: Bump com.google.cloud:libraries-bom from 26.53.0 to 26.54.0 [iceberg]

2025-02-08 Thread via GitHub
dependabot[bot] opened a new pull request, #12207: URL: https://github.com/apache/iceberg/pull/12207 Bumps [com.google.cloud:libraries-bom](https://github.com/googleapis/java-cloud-bom) from 26.53.0 to 26.54.0. Release notes Sourced from https://github.com/googleapis/java-cloud-bo

[PR] Build: Bump org.xerial:sqlite-jdbc from 3.48.0.0 to 3.49.0.0 [iceberg]

2025-02-08 Thread via GitHub
dependabot[bot] opened a new pull request, #12206: URL: https://github.com/apache/iceberg/pull/12206 Bumps [org.xerial:sqlite-jdbc](https://github.com/xerial/sqlite-jdbc) from 3.48.0.0 to 3.49.0.0. Release notes Sourced from https://github.com/xerial/sqlite-jdbc/releases

[PR] Build: Bump mkdocs-material from 9.6.1 to 9.6.3 [iceberg]

2025-02-08 Thread via GitHub
dependabot[bot] opened a new pull request, #12205: URL: https://github.com/apache/iceberg/pull/12205 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.6.1 to 9.6.3. Release notes Sourced from https://github.com/squidfunk/mkdocs-material/releases

[PR] Build: Bump datamodel-code-generator from 0.26.5 to 0.27.2 [iceberg]

2025-02-08 Thread via GitHub
dependabot[bot] opened a new pull request, #12204: URL: https://github.com/apache/iceberg/pull/12204 Bumps [datamodel-code-generator](https://github.com/koxudaxi/datamodel-code-generator) from 0.26.5 to 0.27.2. Release notes Sourced from https://github.com/koxudaxi/datamodel-code-

Re: [I] [bug] Transaction new requirements handling [iceberg-python]

2025-02-08 Thread via GitHub
HonahX commented on issue #1628: URL: https://github.com/apache/iceberg-python/issues/1628#issuecomment-2646063357 Good catch! That should be "new_requirement" instead

Re: [PR] chore: tweak dependabot to bundle all go mod upgrades into the same PR [iceberg-go]

2025-02-08 Thread via GitHub
zeroshade commented on PR #289: URL: https://github.com/apache/iceberg-go/pull/289#issuecomment-2646049606 Should we use the `patterns` argument so that it bundles these particular dependencies rather than all of them?

Re: [PR] build(deps): bump github.com/uptrace/bun/extra/bundebug from 1.2.8 to 1.2.9 [iceberg-go]

2025-02-08 Thread via GitHub
dependabot[bot] closed pull request #281: build(deps): bump github.com/uptrace/bun/extra/bundebug from 1.2.8 to 1.2.9 URL: https://github.com/apache/iceberg-go/pull/281

Re: [PR] build(deps): bump github.com/uptrace/bun/dialect/mssqldialect from 1.2.8 to 1.2.9 [iceberg-go]

2025-02-08 Thread via GitHub
dependabot[bot] commented on PR #285: URL: https://github.com/apache/iceberg-go/pull/285#issuecomment-2646049271 Looks like github.com/uptrace/bun/dialect/mssqldialect is up-to-date now, so this is no longer needed.

Re: [PR] build(deps): bump github.com/uptrace/bun/dialect/mysqldialect from 1.2.8 to 1.2.9 [iceberg-go]

2025-02-08 Thread via GitHub
dependabot[bot] commented on PR #284: URL: https://github.com/apache/iceberg-go/pull/284#issuecomment-2646049262 Looks like github.com/uptrace/bun/dialect/mysqldialect is up-to-date now, so this is no longer needed.

Re: [PR] build(deps): bump github.com/uptrace/bun/dialect/pgdialect from 1.2.8 to 1.2.9 [iceberg-go]

2025-02-08 Thread via GitHub
dependabot[bot] commented on PR #283: URL: https://github.com/apache/iceberg-go/pull/283#issuecomment-2646049264 Looks like github.com/uptrace/bun/dialect/pgdialect is up-to-date now, so this is no longer needed.

Re: [PR] build(deps): bump github.com/uptrace/bun/extra/bundebug from 1.2.8 to 1.2.9 [iceberg-go]

2025-02-08 Thread via GitHub
dependabot[bot] commented on PR #281: URL: https://github.com/apache/iceberg-go/pull/281#issuecomment-2646049258 Looks like github.com/uptrace/bun/extra/bundebug is up-to-date now, so this is no longer needed.

Re: [PR] build(deps): bump github.com/uptrace/bun/dialect/sqlitedialect from 1.2.8 to 1.2.9 [iceberg-go]

2025-02-08 Thread via GitHub
dependabot[bot] commented on PR #282: URL: https://github.com/apache/iceberg-go/pull/282#issuecomment-2646049279 Looks like github.com/uptrace/bun/dialect/sqlitedialect is up-to-date now, so this is no longer needed.

Re: [PR] build(deps): bump github.com/uptrace/bun/dialect/sqlitedialect from 1.2.8 to 1.2.9 [iceberg-go]

2025-02-08 Thread via GitHub
dependabot[bot] closed pull request #282: build(deps): bump github.com/uptrace/bun/dialect/sqlitedialect from 1.2.8 to 1.2.9 URL: https://github.com/apache/iceberg-go/pull/282

Re: [PR] build(deps): bump github.com/uptrace/bun/dialect/mssqldialect from 1.2.8 to 1.2.9 [iceberg-go]

2025-02-08 Thread via GitHub
dependabot[bot] closed pull request #285: build(deps): bump github.com/uptrace/bun/dialect/mssqldialect from 1.2.8 to 1.2.9 URL: https://github.com/apache/iceberg-go/pull/285

Re: [PR] build(deps): bump github.com/uptrace/bun/dialect/pgdialect from 1.2.8 to 1.2.9 [iceberg-go]

2025-02-08 Thread via GitHub
dependabot[bot] closed pull request #283: build(deps): bump github.com/uptrace/bun/dialect/pgdialect from 1.2.8 to 1.2.9 URL: https://github.com/apache/iceberg-go/pull/283

Re: [PR] build(deps): bump github.com/uptrace/bun/dialect/mysqldialect from 1.2.8 to 1.2.9 [iceberg-go]

2025-02-08 Thread via GitHub
dependabot[bot] closed pull request #284: build(deps): bump github.com/uptrace/bun/dialect/mysqldialect from 1.2.8 to 1.2.9 URL: https://github.com/apache/iceberg-go/pull/284

Re: [PR] chore: upgrade bun and dialects together [iceberg-go]

2025-02-08 Thread via GitHub
zeroshade merged PR #288: URL: https://github.com/apache/iceberg-go/pull/288

Re: [PR] Fix LICENSE and NOTICE for the kafka-connect-runtime distributions [iceberg]

2025-02-08 Thread via GitHub
amogh-jahagirdar commented on code in PR #12195: URL: https://github.com/apache/iceberg/pull/12195#discussion_r1947989141 ## kafka-connect/kafka-connect-runtime/hive/NOTICE: ## @@ -120,18 +120,7 @@ The Apache Software Foundation (http://www.apache.org/).

Re: [PR] Core: Fix divide by zero when adjust split size [iceberg]

2025-02-08 Thread via GitHub
hantangwangd commented on code in PR #12201: URL: https://github.com/apache/iceberg/pull/12201#discussion_r1947986708 ## core/src/main/java/org/apache/iceberg/util/TableScanUtil.java: ## @@ -236,6 +236,9 @@ public static long adjustSplitSize(long scanSize, int parallelism, long

Re: [I] [feat] add missing metadata tables [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on issue #1053: URL: https://github.com/apache/iceberg-python/issues/1053#issuecomment-2646012019 > Is time travel through snapshot_id or timestamp supported for all_* metadata tables? what i mean is that for `all_*` metadata tables, we're essentially doing some

Re: [PR] feat: Support metadata table "Entries" [iceberg-rust]

2025-02-08 Thread via GitHub
rshkv commented on PR #863: URL: https://github.com/apache/iceberg-rust/pull/863#issuecomment-2646010721 > BTW maybe it's easier to discuss if you can push some sample code. Pushed my work-in-progress. Currently failing here: ``` Incorrect datatype for StructArray field \"column_

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2025-02-08 Thread via GitHub
mun1r0b0t commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2646009466 Still waiting on response from maintainer

Re: [I] Add linter for Markdown files [iceberg]

2025-02-08 Thread via GitHub
github-actions[bot] closed issue #10790: Add linter for Markdown files URL: https://github.com/apache/iceberg/issues/10790

Re: [I] Add linter for Markdown files [iceberg]

2025-02-08 Thread via GitHub
github-actions[bot] commented on issue #10790: URL: https://github.com/apache/iceberg/issues/10790#issuecomment-2645994236 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Create AWS Glue table from table JSON [iceberg-python]

2025-02-08 Thread via GitHub
github-actions[bot] commented on issue #1025: URL: https://github.com/apache/iceberg-python/issues/1025#issuecomment-2645995119 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity

Re: [I] flink iceberg may occur duplication when succeed to write datafile and commit but checkpoint fail [iceberg]

2025-02-08 Thread via GitHub
github-actions[bot] commented on issue #10765: URL: https://github.com/apache/iceberg/issues/10765#issuecomment-2645994229 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] flink iceberg may occur duplication when succeed to write datafile and commit but checkpoint fail [iceberg]

2025-02-08 Thread via GitHub
github-actions[bot] closed issue #10765: flink iceberg may occur duplication when succeed to write datafile and commit but checkpoint fail URL: https://github.com/apache/iceberg/issues/10765

Re: [PR] REST: Avoid deprecated execute without HttpClientResponseHandler [iceberg]

2025-02-08 Thread via GitHub
github-actions[bot] commented on PR #11870: URL: https://github.com/apache/iceberg/pull/11870#issuecomment-2645994277 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2025-02-08 Thread via GitHub
github-actions[bot] commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2645994263 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Unable to merge CDC data into snapshot data. java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Long [iceberg]

2025-02-08 Thread via GitHub
github-actions[bot] commented on issue #8333: URL: https://github.com/apache/iceberg/issues/8333#issuecomment-2645994206 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Flink: Fix flaky TestIcebergSourceFailover > testBoundedWithSavepoint [iceberg]

2025-02-08 Thread via GitHub
github-actions[bot] closed issue #10671: Flink: Fix flaky TestIcebergSourceFailover > testBoundedWithSavepoint URL: https://github.com/apache/iceberg/issues/10671

Re: [I] Flink: Fix flaky TestIcebergSourceFailover > testBoundedWithSavepoint [iceberg]

2025-02-08 Thread via GitHub
github-actions[bot] commented on issue #10671: URL: https://github.com/apache/iceberg/issues/10671#issuecomment-2645994223 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] fsspec: Remove `botocore` as a module import [iceberg-python]

2025-02-08 Thread via GitHub
lamramsey commented on PR #1609: URL: https://github.com/apache/iceberg-python/pull/1609#issuecomment-2645972722 Thank you.

Re: [I] [feat] add missing metadata tables [iceberg-python]

2025-02-08 Thread via GitHub
soumya-ghosh commented on issue #1053: URL: https://github.com/apache/iceberg-python/issues/1053#issuecomment-2645961048 For point 1 - will raise a different PR updating docs for these metadata tables. For point 2, 3, 4 - Is time travel through snapshot_id or timestamp supported f

Re: [PR] position_deletes metadata table [iceberg-python]

2025-02-08 Thread via GitHub
amitgilad3 commented on code in PR #1615: URL: https://github.com/apache/iceberg-python/pull/1615#discussion_r1947963484 ## pyiceberg/table/inspect.py: ## @@ -384,6 +384,41 @@ def _get_all_manifests_schema(self) -> "pa.Schema": all_manifests_schema = all_manifests_sche

Re: [PR] Core: Fix divide by zero when adjust split size [iceberg]

2025-02-08 Thread via GitHub
RussellSpitzer commented on code in PR #12201: URL: https://github.com/apache/iceberg/pull/12201#discussion_r1947961340 ## core/src/main/java/org/apache/iceberg/util/TableScanUtil.java: ## @@ -236,6 +236,9 @@ public static long adjustSplitSize(long scanSize, int parallelism, lo

Re: [PR] Core: Fix divide by zero when adjust split size [iceberg]

2025-02-08 Thread via GitHub
RussellSpitzer commented on code in PR #12201: URL: https://github.com/apache/iceberg/pull/12201#discussion_r1947961173 ## core/src/main/java/org/apache/iceberg/util/TableScanUtil.java: ## @@ -236,6 +236,9 @@ public static long adjustSplitSize(long scanSize, int parallelism, lo

Re: [PR] Fix: `SqlCatalog` list_namespaces() should return only sub-namespaces [iceberg-python]

2025-02-08 Thread via GitHub
alessandro-nori commented on PR #1629: URL: https://github.com/apache/iceberg-python/pull/1629#issuecomment-2645938752 @kevinjqliu while working on this I also noticed that `_namespace_exists()` is using an exact comparison instead of `LIKE` for the query. I opened a new issue to track it h

Re: [PR] feat(datafusion): Treat timestamp conversion functions like a cast. [iceberg-rust]

2025-02-08 Thread via GitHub
omerhadari commented on PR #945: URL: https://github.com/apache/iceberg-rust/pull/945#issuecomment-2645926244 > Hey @omerhadari Thanks for working on this. Unfortunately, I'm afraid that this is a pretty complex task. > > When we do like `to_date(ts)` in SQL, that maps to an Iceberg p

[PR] Fix: `SqlCatalog` list_namespaces() should return only sub-namespaces [iceberg-python]

2025-02-08 Thread via GitHub
alessandro-nori opened a new pull request, #1629: URL: https://github.com/apache/iceberg-python/pull/1629 Resolves: https://github.com/apache/iceberg-python/issues/1627

Re: [PR] Custom fileio docs [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on PR #1238: URL: https://github.com/apache/iceberg-python/pull/1238#issuecomment-2645924589 @summermousa-vendia sure! thanks for your help

Re: [I] Support pushdown filters for non-cast date conversion functions (e.g. to_date) [iceberg-rust]

2025-02-08 Thread via GitHub
omerhadari commented on issue #933: URL: https://github.com/apache/iceberg-rust/issues/933#issuecomment-2645922282 Updated the PR according to this set of problems for now. It doesn't solve the entire issue, but I am not sure I feel comfortable with the approach @liurenjie1024 suggested for

Re: [PR] Added support for Polars DataFrame and LazyFarame [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on code in PR #1614: URL: https://github.com/apache/iceberg-python/pull/1614#discussion_r1947932757 ## mkdocs/docs/api.md: ## @@ -1533,3 +1533,111 @@ df.show(2) (Showing first 2 rows) ``` + +### Polars + +PyIceberg interfaces closely with Polars Datafram

Re: [I] DayTransform failure for downcasted timestamp column [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on issue #1619: URL: https://github.com/apache/iceberg-python/issues/1619#issuecomment-2645906911 Currently Iceberg does not have nanosecond support https://py.iceberg.apache.org/configuration/#nanoseconds-support To compensate, we can automatically downcast t

Re: [PR] Filter rows directly from pa.RecordBatch [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on PR #1621: URL: https://github.com/apache/iceberg-python/pull/1621#issuecomment-2645904838 Looks like there's an issue in the CI tests

Re: [PR] Filter rows directly from pa.RecordBatch [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on PR #1621: URL: https://github.com/apache/iceberg-python/pull/1621#issuecomment-2645904565 I believe so. We can also do this in a follow up PR! I just saw that comment during code review

Re: [I] [bug] Transaction new requirements handling [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on issue #1628: URL: https://github.com/apache/iceberg-python/issues/1628#issuecomment-2645896063 cc @Fokko @HonahX wanted to double check with yall

Re: [PR] fsspec: Remove `botocore` as a module import [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu merged PR #1609: URL: https://github.com/apache/iceberg-python/pull/1609

Re: [PR] Support partial deletes [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1947926049 ## pyiceberg/table/__init__.py: ## @@ -292,7 +303,13 @@ def _apply(self, updates: Tuple[TableUpdate, ...], requirements: Tuple[TableRequ requireme

[I] [bug] Transaction new requirements handling [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu opened a new issue, #1628: URL: https://github.com/apache/iceberg-python/issues/1628 ### Apache Iceberg version None ### Please describe the bug 🐞 Context: https://github.com/apache/iceberg-python/pull/569#discussion_r1946187013 https://github.com/ap

Re: [PR] Remove old metadata [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on code in PR #1607: URL: https://github.com/apache/iceberg-python/pull/1607#discussion_r1947923652 ## mkdocs/docs/configuration.md: ## @@ -63,6 +63,7 @@ Iceberg tables support table properties to configure table behavior. | `write.parquet.page-row-limit`

Re: [PR] Add all filles metadata tables [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on code in PR #1626: URL: https://github.com/apache/iceberg-python/pull/1626#discussion_r1947923174 ## pyiceberg/table/inspect.py: ## @@ -523,7 +523,62 @@ def history(self) -> "pa.Table": return pa.Table.from_pylist(history, schema=history_schema)

Re: [I] [feat] add missing metadata tables [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on issue #1053: URL: https://github.com/apache/iceberg-python/issues/1053#issuecomment-2645885731 Thanks for the contribution!! Appreciate it. Before we close out this issue, i want to double check a few things 1. Documentation, all tables are documented at https

Re: [PR] support all_entries in pyiceberg [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on code in PR #1608: URL: https://github.com/apache/iceberg-python/pull/1608#discussion_r1947916711 ## tests/integration/test_inspect_table.py: ## @@ -938,3 +938,111 @@ def test_inspect_all_manifests(spark: SparkSession, session_catalog: Catalog, fo lh

Re: [PR] position_deletes metadata table [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on code in PR #1615: URL: https://github.com/apache/iceberg-python/pull/1615#discussion_r1947913465 ## pyiceberg/table/inspect.py: ## @@ -657,3 +717,19 @@ def all_manifests(self) -> "pa.Table": lambda args: self._generate_manifests_table(*args),

Re: [PR] position_deletes metadata table [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on PR #1615: URL: https://github.com/apache/iceberg-python/pull/1615#issuecomment-2645871771 Also, let's add this new table to the docs as well: https://py.iceberg.apache.org/api/#inspecting-tables

Re: [PR] Core: Fix divide by zero when adjust split size [iceberg]

2025-02-08 Thread via GitHub
hantangwangd commented on code in PR #12201: URL: https://github.com/apache/iceberg/pull/12201#discussion_r1947899223 ## core/src/main/java/org/apache/iceberg/util/TableScanUtil.java: ## @@ -236,6 +236,9 @@ public static long adjustSplitSize(long scanSize, int parallelism, long

Re: [I] Incorrect Output from `list_namespaces` in `SqlCatalog` [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on issue #1627: URL: https://github.com/apache/iceberg-python/issues/1627#issuecomment-2645837770 This is another reason why I raised #813. It'll be great to modify our existing test suite to test behaviors across all catalogs

Re: [I] Incorrect Output from `list_namespaces` in `SqlCatalog` [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on issue #1627: URL: https://github.com/apache/iceberg-python/issues/1627#issuecomment-2645837366 assigned this issue to you. feel free to tag me for review.

Re: [I] Incorrect Output from `list_namespaces` in `SqlCatalog` [iceberg-python]

2025-02-08 Thread via GitHub
kevinjqliu commented on issue #1627: URL: https://github.com/apache/iceberg-python/issues/1627#issuecomment-2645837123 Thanks for raising this issue! Yea, the behavior should align with the java implementation and other catalogs.

[I] Incorrect Output from `list_namespaces` in `SqlCatalog` [iceberg-python]

2025-02-08 Thread via GitHub
alessandro-nori opened a new issue, #1627: URL: https://github.com/apache/iceberg-python/issues/1627 ### Apache Iceberg version main (development) ### Please describe the bug 🐞 When a namespace is provided as a parameter, SqlCatalog.list_namespaces(namespace) returns the

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-02-08 Thread via GitHub
Xuanwo commented on PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#issuecomment-2645818431 > It's our guess that this distinction might arise due to scanning primitives used. JanKaul/iceberg-rust leverages ParquetExec from DataFusion, which is at this point highly optimized,

Re: [PR] Core: Fix divide by zero when adjust split size [iceberg]

2025-02-08 Thread via GitHub
RussellSpitzer commented on code in PR #12201: URL: https://github.com/apache/iceberg/pull/12201#discussion_r1947861228 ## core/src/main/java/org/apache/iceberg/util/TableScanUtil.java: ## @@ -236,6 +236,9 @@ public static long adjustSplitSize(long scanSize, int parallelism, lo

[I] How to Pass two default catalog in spark for iceberg usecase [iceberg]

2025-02-08 Thread via GitHub
AwasthiSomesh opened a new issue, #12203: URL: https://github.com/apache/iceberg/issues/12203 ### Query engine Hi Team, How to Pass two default catalog in spark for iceberg use case below example when we are passing it via , or semicolon its not allowed to create spark

Re: [PR] chore: fix Cargo.lock diff always present after `cargo build` [iceberg-rust]

2025-02-08 Thread via GitHub
VVKot commented on PR #952: URL: https://github.com/apache/iceberg-rust/pull/952#issuecomment-2645714793 @Xuanwo @Fokko OK to merge this? CI is happy

Re: [PR] Accept date in literal [iceberg-python]

2025-02-08 Thread via GitHub
Fokko commented on PR #1618: URL: https://github.com/apache/iceberg-python/pull/1618#issuecomment-2645566072 Thanks @TennyZhuang for adding this, and thanks @kevinjqliu for the review 🙌

Re: [PR] Accept date in literal [iceberg-python]

2025-02-08 Thread via GitHub
Fokko merged PR #1618: URL: https://github.com/apache/iceberg-python/pull/1618

Re: [PR] Accept date in literal [iceberg-python]

2025-02-08 Thread via GitHub
Fokko commented on PR #1618: URL: https://github.com/apache/iceberg-python/pull/1618#issuecomment-2645478647 @TennyZhuang That looks unrelated to your PR. I've restarted the job.

[I] Enhance iceberg-go to Support Nessie API for All Catalog Operations [iceberg-go]

2025-02-08 Thread via GitHub
shubham-tomar opened a new issue, #291: URL: https://github.com/apache/iceberg-go/issues/291 ### Feature Request / Improvement ### Description The current Catalog interface in iceberg-go is designed for the Iceberg REST catalog, but it does not fully align with Project Nessie’s

[PR] Feat/build deletes row selection implementation [iceberg-rust]

2025-02-08 Thread via GitHub
sdd opened a new pull request, #951: URL: https://github.com/apache/iceberg-rust/pull/951 Third part of delete file read support. See https://github.com/apache/iceberg-rust/issues/630

[PR] feat: introduce DeleteFileManager skeleton. Use in ArrowReader [iceberg-rust]

2025-02-08 Thread via GitHub
sdd opened a new pull request, #950: URL: https://github.com/apache/iceberg-rust/pull/950 Second part of delete file read support. See https://github.com/apache/iceberg-rust/issues/630

Re: [I] Add unit tests for ColumnarBatchUtil using mocking [iceberg]

2025-02-08 Thread via GitHub
Monika-Rajendran-97 commented on issue #12054: URL: https://github.com/apache/iceberg/issues/12054#issuecomment-2645351284 Hi @aokolnychyi, I would like to add the tests, if they have not been added already.

Re: [PR] Accept date in literal [iceberg-python]

2025-02-08 Thread via GitHub
TennyZhuang commented on PR #1618: URL: https://github.com/apache/iceberg-python/pull/1618#issuecomment-2645345620 I have no idea why the CI failed again.

Re: [I] [feat] add missing metadata tables [iceberg-python]

2025-02-08 Thread via GitHub
soumya-ghosh commented on issue #1053: URL: https://github.com/apache/iceberg-python/issues/1053#issuecomment-2644902831 @kevinjqliu added PR - https://github.com/apache/iceberg-python/pull/1626 for `all_files`, `all_data_files` and `all_delete_files`. Have implemented them in single PR

Re: [I] [feat] add missing metadata tables [iceberg-python]

2025-02-08 Thread via GitHub
soumya-ghosh commented on issue #1053: URL: https://github.com/apache/iceberg-python/issues/1053#issuecomment-2645040103 @amitgilad3 Right back at you! I see you've raised PRs for the remaining ones, will take a look.

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-02-08 Thread via GitHub
gruuya commented on PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#issuecomment-2645025164 > Hi, thank you @gruuya for working on this. Most changes look good to me. Waiting for @liurenjie1024 to take another look. Thank you for taking a look! > I also think some

Re: [I] [feat] add missing metadata tables [iceberg-python]

2025-02-08 Thread via GitHub
amitgilad3 commented on issue #1053: URL: https://github.com/apache/iceberg-python/issues/1053#issuecomment-2644947259 Awesome work!! @soumya-ghosh - if all goes well, the next release will have all metadata tables accessible from pyiceberg 🚀

[PR] Add all filles metadata tables [iceberg-python]

2025-02-08 Thread via GitHub
soumya-ghosh opened a new pull request, #1626: URL: https://github.com/apache/iceberg-python/pull/1626 - all_files - all_data_files - all_delete_files Refactored code for files metadata for better reusability

Re: [I] Delete Files in Table Scans [iceberg-rust]

2025-02-08 Thread via GitHub
sdd commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2644861873 OK, I have an improved design for loading of delete files in the read phase that I'll share shortly. We introduce a DeleteFileManager, constructed when ArrowReader gets built a

Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

2025-02-08 Thread via GitHub
gruuya commented on code in PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#discussion_r1947560166 ## crates/integration_tests/Cargo.toml: ## @@ -34,5 +34,6 @@ iceberg-catalog-rest = { workspace = true } iceberg-datafusion = { workspace = true } iceberg_test_util

Re: [PR] Core: Fix divide by zero when adjust split size [iceberg]

2025-02-08 Thread via GitHub
hantangwangd commented on code in PR #12201: URL: https://github.com/apache/iceberg/pull/12201#discussion_r1947556326 ## core/src/main/java/org/apache/iceberg/util/TableScanUtil.java: ## @@ -236,8 +236,12 @@ public static long adjustSplitSize(long scanSize, int parallelism, lon

Re: [PR] Docs: add apache amoro(incubating) with iceberg (#11965) [iceberg]

2025-02-08 Thread via GitHub
czy006 commented on code in PR #11966: URL: https://github.com/apache/iceberg/pull/11966#discussion_r1947555133 ## docs/docs/amoro.md: ## @@ -0,0 +1,89 @@ +--- +title: "Apache Amoro" +--- + + +# Apache Amoro With Iceberg + +**[Apache Amoro(incubating)](https://amoro.apache.org)*
