Re: [I] Enhancement of ViewMetadata [iceberg]

2024-01-18 Thread via GitHub
pvary commented on issue #9514: URL: https://github.com/apache/iceberg/issues/9514#issuecomment-1899924625 > @nk1506 which particular methods did you have in mind that would make sense to extract into `IcebergMetadata`? @nastra: When managing tables and views in the catalogs we often

Re: [PR] Spark 3.5: Support specifying filter in RewriteManifestsProcedure [iceberg]

2024-01-18 Thread via GitHub
bknbkn commented on code in PR #9447: URL: https://github.com/apache/iceberg/pull/9447#discussion_r1458505983 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/RewriteManifestsProcedure.java: ## @@ -118,4 +126,15 @@ private InternalRow[] toOutputRows(RewriteM

Re: [I] Add update-statement support in the Flink engine [iceberg]

2024-01-18 Thread via GitHub
pvary commented on issue #9517: URL: https://github.com/apache/iceberg/issues/9517#issuecomment-1899911570 If you are updating based on a table key, then `upsert` mode can help you: https://iceberg.apache.org/docs/latest/flink-writes/#upsert ``` INSERT INTO tableName /*+ OPTIONS(

Re: [PR] feat: add support for catalogs with glue implementation to start [iceberg-go]

2024-01-18 Thread via GitHub
Fokko commented on code in PR #51: URL: https://github.com/apache/iceberg-go/pull/51#discussion_r1458500051 ## catalog/catalog.go: ## @@ -0,0 +1,55 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file

Re: [PR] docs: Add release guide for iceberg-rust [iceberg-rust]

2024-01-18 Thread via GitHub
liurenjie1024 commented on PR #147: URL: https://github.com/apache/iceberg-rust/pull/147#issuecomment-1899908129 cc @Xuanwo Any updates? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-01-18 Thread via GitHub
pvary commented on PR #8907: URL: https://github.com/apache/iceberg/pull/8907#issuecomment-1899904674 > But I don't think we should drag the view support for Hive catalog since REST, Nessie, JDBC already supports it. I am not sure that I would like to support a feature in product

Re: [I] Add instructions on updating `doap.rdf` in the how-to-release guide [iceberg]

2024-01-18 Thread via GitHub
ajantha-bhat commented on issue #9522: URL: https://github.com/apache/iceberg/issues/9522#issuecomment-1899900343 Agree. I was about to raise PR today for docs side about adding this info. But I can wait if anyone wants to contribute this.

[I] Add instructions on updating `doap.rdf` in the how-to-release guide [iceberg]

2024-01-18 Thread via GitHub
Fokko opened a new issue, #9522: URL: https://github.com/apache/iceberg/issues/9522 ### Feature Request / Improvement Thanks for updating it in https://github.com/apache/iceberg/pull/9507, but this should be done as part of the release. I think we should add it to the [instructions](

Re: [I] Create Iceberg Table from pyarrow Schema with no IDs [iceberg-python]

2024-01-18 Thread via GitHub
Fokko commented on issue #278: URL: https://github.com/apache/iceberg-python/issues/278#issuecomment-1899892752 To add some more context. As also mentioned in the earlier conversation, I don't think assigning fresh IDs is safe: https://github.com/apache/iceberg-python/pull/219#discussion_r1

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2024-01-18 Thread via GitHub
pvary commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1458488742 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java: ## @@ -0,0 +1,288 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Build: Bump pyspark from 3.4.2 to 3.5.0 [iceberg-python]

2024-01-18 Thread via GitHub
Fokko commented on PR #283: URL: https://github.com/apache/iceberg-python/pull/283#issuecomment-1899890357 @HonahX Good one! Maybe we should pass that in through an environment variable?

Re: [PR] Spark: Fix flaky TestSparkReaderDeletes tests due to metric not found [iceberg]

2024-01-18 Thread via GitHub
nastra merged PR #9445: URL: https://github.com/apache/iceberg/pull/9445

Re: [I] Flaky test: TestSparkReaderDeletes.testEqualityDeleteWithDeletedColumn [iceberg]

2024-01-18 Thread via GitHub
nastra closed issue #8855: Flaky test: TestSparkReaderDeletes.testEqualityDeleteWithDeletedColumn URL: https://github.com/apache/iceberg/issues/8855

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-18 Thread via GitHub
jbonofre commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1458473153 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -303,6 +287,325 @@ public static Properties filterAndRemovePrefix(Map properties, S return res

Re: [PR] Update ASF DOAP rdf file [iceberg]

2024-01-18 Thread via GitHub
nastra merged PR #9507: URL: https://github.com/apache/iceberg/pull/9507

Re: [PR] Build: Bump actions/upload-artifact from 3 to 4 [iceberg]

2024-01-18 Thread via GitHub
nastra commented on PR #9319: URL: https://github.com/apache/iceberg/pull/9319#issuecomment-1899873790 @dependabot rebase

Re: [PR] Spark: backport #8656 and update docs [iceberg]

2024-01-18 Thread via GitHub
nastra merged PR #9512: URL: https://github.com/apache/iceberg/pull/9512

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-18 Thread via GitHub
ajantha-bhat commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1458470028 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -303,6 +287,325 @@ public static Properties filterAndRemovePrefix(Map properties, S return

[I] Encountered ERROR RewriteDataFilesCommitManager: Cannot commit groups [iceberg]

2024-01-18 Thread via GitHub
a8356555 opened a new issue, #9521: URL: https://github.com/apache/iceberg/issues/9521 ### Query engine Spark ### Question I am currently using Flink to stream data into an Iceberg table. The Flink job writes to the Iceberg table every minute. Due to the presence of too

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-18 Thread via GitHub
ajantha-bhat commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1458469458 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -303,6 +287,325 @@ public static Properties filterAndRemovePrefix(Map properties, S return

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-18 Thread via GitHub
jbonofre commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1458455714 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -303,6 +287,325 @@ public static Properties filterAndRemovePrefix(Map properties, S return res

Re: [PR] Spark: Fix reading 2 level array issue [iceberg]

2024-01-18 Thread via GitHub
mathfool commented on PR #9515: URL: https://github.com/apache/iceberg/pull/9515#issuecomment-1899854105 @nastra Thanks for the comments, since the code in the lib suggested that 2-level array is supported, so it should work as expected.

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-18 Thread via GitHub
jbonofre commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1458453032 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -303,6 +287,325 @@ public static Properties filterAndRemovePrefix(Map properties, S return res

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-18 Thread via GitHub
jbonofre commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1458451545 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcViewOperations.java: ## @@ -0,0 +1,190 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] Spark: backport #8656 and update docs [iceberg]

2024-01-18 Thread via GitHub
ajantha-bhat commented on code in PR #9512: URL: https://github.com/apache/iceberg/pull/9512#discussion_r1458332276 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/CreateChangelogViewProcedure.java: ## @@ -49,8 +49,8 @@ /** * A procedure that creates a v

Re: [I] Caused by: java.net.SocketException: Connection reset [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar commented on issue #9444: URL: https://github.com/apache/iceberg/issues/9444#issuecomment-1899706448 There's this old PR: https://github.com/apache/iceberg/pull/4912

Re: [I] Caused by: java.net.SocketException: Connection reset [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar commented on issue #9444: URL: https://github.com/apache/iceberg/issues/9444#issuecomment-1899697135 Let me see if I can find that discussion

Re: [PR] Spec: add multi-arg transform support [iceberg]

2024-01-18 Thread via GitHub
advancedxy commented on PR #8579: URL: https://github.com/apache/iceberg/pull/8579#issuecomment-1899645844 > First of all, we should evaluate other hash functions apart from Murmur3. Parquet, for instance, uses xxHash that is supposed to be much faster > Second, Parquet avoids the modulo

Re: [PR] Core: Fix setting updated parquet compression property [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar commented on code in PR #9503: URL: https://github.com/apache/iceberg/pull/9503#discussion_r1458258796 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -90,10 +90,19 @@ private static Map unreservedProperties(Map rawP private static Map per

Re: [PR] Spark: backport #8656 and update docs [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar commented on code in PR #9512: URL: https://github.com/apache/iceberg/pull/9512#discussion_r1458253122 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/CreateChangelogViewProcedure.java: ## @@ -49,8 +49,8 @@ /** * A procedure that creates

Re: [PR] Build: Bump griffe from 0.39.0 to 0.39.1 [iceberg-python]

2024-01-18 Thread via GitHub
HonahX merged PR #282: URL: https://github.com/apache/iceberg-python/pull/282

Re: [PR] Build: Bump griffe from 0.39.0 to 0.39.1 [iceberg-python]

2024-01-18 Thread via GitHub
HonahX commented on PR #282: URL: https://github.com/apache/iceberg-python/pull/282#issuecomment-1899526932 @dependabot rebase

Re: [PR] Core: rewrite should drop delete files by data sequence number partition wise [iceberg]

2024-01-18 Thread via GitHub
szehon-ho commented on code in PR #9454: URL: https://github.com/apache/iceberg/pull/9454#discussion_r1458190189 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -289,13 +321,38 @@ private void invalidateFilteredCache() { cleanUncommitted(SnapshotP

Re: [PR] Build: Bump pyspark from 3.4.2 to 3.5.0 [iceberg-python]

2024-01-18 Thread via GitHub
HonahX commented on PR #283: URL: https://github.com/apache/iceberg-python/pull/283#issuecomment-1899495834 Mark: we also need to update the iceberg-spark-runtime used here: https://github.com/apache/iceberg-python/blob/8f7927b840594adf44f74adaaea105c4cb241a42/tests/integration/test_write

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-18 Thread via GitHub
ajantha-bhat commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1458173630 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -303,6 +287,325 @@ public static Properties filterAndRemovePrefix(Map properties, S return

Re: [PR] Apply Name mapping [iceberg-python]

2024-01-18 Thread via GitHub
HonahX merged PR #219: URL: https://github.com/apache/iceberg-python/pull/219

Re: [I] Support 'schema.name-mapping.default' Column Projection property [iceberg-python]

2024-01-18 Thread via GitHub
HonahX closed issue #202: Support 'schema.name-mapping.default' Column Projection property URL: https://github.com/apache/iceberg-python/issues/202

Re: [PR] Apply Name mapping [iceberg-python]

2024-01-18 Thread via GitHub
HonahX commented on PR #219: URL: https://github.com/apache/iceberg-python/pull/219#issuecomment-1899489715 All reviews related to "Apply Name Mapping" are resolved. Let's get this in and continue our discussion in https://github.com/apache/iceberg-python/issues/278 😊. Thanks @syun64

Re: [PR] [Bug Fix] all TimeTransforms for falsey values [iceberg-python]

2024-01-18 Thread via GitHub
HonahX merged PR #280: URL: https://github.com/apache/iceberg-python/pull/280

Re: [PR] [Bug Fix] all TimeTransforms for falsey values [iceberg-python]

2024-01-18 Thread via GitHub
HonahX commented on PR #280: URL: https://github.com/apache/iceberg-python/pull/280#issuecomment-1899486620 @syun64 Sorry I was interrupted by other things. Merging!

Re: [PR] Core: Fix setting updated parquet compression property [iceberg]

2024-01-18 Thread via GitHub
manuzhang commented on code in PR #9503: URL: https://github.com/apache/iceberg/pull/9503#discussion_r1458161912 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -90,10 +90,19 @@ private static Map unreservedProperties(Map rawP private static Map persistedP

Re: [PR] Pushed filters to Parquet file on best effort basis in Vectorized Reader [iceberg]

2024-01-18 Thread via GitHub
ajantha-bhat commented on PR #9479: URL: https://github.com/apache/iceberg/pull/9479#issuecomment-1899455430 I think still there are checkstyle failure, revAPI failure (since public method params changed) and a test case failure. Can also locally run `./gradlew clean build -x test -x

Re: [PR] Spec: add multi-arg transform support [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi commented on PR #8579: URL: https://github.com/apache/iceberg/pull/8579#issuecomment-1899444592 @rdblue recently pointed me to the Bloom filter [spec](https://github.com/apache/parquet-format/blob/master/BloomFilter.md) in Parquet. I think it contains a few interesting ideas tha

Re: [PR] Core: Fix setting updated parquet compression property [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar commented on code in PR #9503: URL: https://github.com/apache/iceberg/pull/9503#discussion_r1458123535 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -90,10 +90,19 @@ private static Map unreservedProperties(Map rawP private static Map per

Re: [PR] Core: Fix setting updated parquet compression property [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar commented on code in PR #9503: URL: https://github.com/apache/iceberg/pull/9503#discussion_r1458112966 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -90,10 +90,19 @@ private static Map unreservedProperties(Map rawP private static Map per

Re: [PR] Core: rewrite should drop delete files by data sequence number partition wise [iceberg]

2024-01-18 Thread via GitHub
ajantha-bhat commented on code in PR #9454: URL: https://github.com/apache/iceberg/pull/9454#discussion_r1458121128 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -289,13 +321,38 @@ private void invalidateFilteredCache() { cleanUncommitted(Snapsh

Re: [I] Large number of external java packages are not relocated in iceberg-runtime.jar and iceberg-presto-runtime.jar [iceberg]

2024-01-18 Thread via GitHub
github-actions[bot] commented on issue #168: URL: https://github.com/apache/iceberg/issues/168#issuecomment-1899419888 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. T

Re: [I] Iceberg Table snapshots/manifests using relative path fails to read data [iceberg]

2024-01-18 Thread via GitHub
github-actions[bot] commented on issue #128: URL: https://github.com/apache/iceberg/issues/128#issuecomment-1899419862 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. T

Re: [I] Have tests which test against the shaded runtime artifacts [iceberg]

2024-01-18 Thread via GitHub
github-actions[bot] commented on issue #257: URL: https://github.com/apache/iceberg/issues/257#issuecomment-1899419948 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. T

Re: [I] Allow reading non-optional unions as struct of optional fields [iceberg]

2024-01-18 Thread via GitHub
github-actions[bot] commented on issue #189: URL: https://github.com/apache/iceberg/issues/189#issuecomment-1899419915 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. T

Re: [PR] Core: Fix setting updated parquet compression property [iceberg]

2024-01-18 Thread via GitHub
manuzhang commented on code in PR #9503: URL: https://github.com/apache/iceberg/pull/9503#discussion_r1458104584 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -90,10 +90,19 @@ private static Map unreservedProperties(Map rawP private static Map persistedP

Re: [PR] feat: add support for catalogs with glue implementation to start [iceberg-go]

2024-01-18 Thread via GitHub
wolfeidau commented on PR #51: URL: https://github.com/apache/iceberg-go/pull/51#issuecomment-1899394856 @nastra @zeroshade not sure if either of you saw this, would love some feedback.

Re: [PR] Add UnboundSortOrder [iceberg-rust]

2024-01-18 Thread via GitHub
fqaiser94 commented on PR #115: URL: https://github.com/apache/iceberg-rust/pull/115#issuecomment-1899376623 @liurenjie1024 sorry, I think it might be a while before I get back to it. Feel free to take it over/create a new PR if you'd like, I don't want to be a blocker.

[PR] Update deploy script to use remote branch and add 1.4.3 updates [iceberg]

2024-01-18 Thread via GitHub
bitsondatadev opened a new pull request, #9519: URL: https://github.com/apache/iceberg/pull/9519 (no comment)

[PR] Build: Bump pyspark from 3.4.2 to 3.5.0 [iceberg-python]

2024-01-18 Thread via GitHub
dependabot[bot] opened a new pull request, #283: URL: https://github.com/apache/iceberg-python/pull/283 Bumps [pyspark](https://github.com/apache/spark) from 3.4.2 to 3.5.0. Commits: [ce5ddad](https://github.com/apache/spark/commit/ce5ddad990373636e94071e7cef2f31021add07b) Prepar

[PR] Build: Bump griffe from 0.39.0 to 0.39.1 [iceberg-python]

2024-01-18 Thread via GitHub
dependabot[bot] opened a new pull request, #282: URL: https://github.com/apache/iceberg-python/pull/282 Bumps [griffe](https://github.com/mkdocstrings/griffe) from 0.39.0 to 0.39.1. Release notes: Sourced from [griffe's releases](https://github.com/mkdocstrings/griffe/releases).

Re: [PR] Apply Name mapping [iceberg-python]

2024-01-18 Thread via GitHub
syun64 commented on PR #219: URL: https://github.com/apache/iceberg-python/pull/219#issuecomment-1899240128 Need help merging this in as well :)

Re: [PR] [Bug Fix] all TimeTransforms for falsey values [iceberg-python]

2024-01-18 Thread via GitHub
syun64 commented on PR #280: URL: https://github.com/apache/iceberg-python/pull/280#issuecomment-1899239713 Thank you for the reviews @Fokko @HonahX . Could I ask for your help in merging it in? :)

Re: [PR] Apply Name mapping [iceberg-python]

2024-01-18 Thread via GitHub
HonahX commented on PR #219: URL: https://github.com/apache/iceberg-python/pull/219#issuecomment-1899230673 Thanks for the great work!

Re: [PR] Parquet: Add system config for unsafe Parquet ID fallback. [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi commented on code in PR #9324: URL: https://github.com/apache/iceberg/pull/9324#discussion_r1457974224 ## core/src/main/java/org/apache/iceberg/SystemConfigs.java: ## @@ -72,6 +72,19 @@ private SystemConfigs() {} 8, Integer::parseUnsignedInt);

Re: [PR] Parquet: Add system config for unsafe Parquet ID fallback. [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi commented on code in PR #9324: URL: https://github.com/apache/iceberg/pull/9324#discussion_r1457972686 ## parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java: ## @@ -1119,27 +1120,29 @@ public CloseableIterable build() { ParquetReadOptions optio

Re: [PR] [Bug Fix] all TimeTransforms for falsey values [iceberg-python]

2024-01-18 Thread via GitHub
syun64 commented on PR #280: URL: https://github.com/apache/iceberg-python/pull/280#issuecomment-1899177736 of course!

Re: [PR] [Bug Fix] all TimeTransforms for falsey values [iceberg-python]

2024-01-18 Thread via GitHub
Fokko commented on PR #280: URL: https://github.com/apache/iceberg-python/pull/280#issuecomment-1899176368 @syun64 The other PR is in 👍 Can you rebase?

Re: [PR] Fix the CI [iceberg-python]

2024-01-18 Thread via GitHub
Fokko merged PR #279: URL: https://github.com/apache/iceberg-python/pull/279

Re: [PR] [Bug Fix] all TimeTransforms for falsey values [iceberg-python]

2024-01-18 Thread via GitHub
syun64 commented on PR #280: URL: https://github.com/apache/iceberg-python/pull/280#issuecomment-1899173158 No problem, and thank you! :)

Re: [PR] [Bug Fix] all TimeTransforms for falsey values [iceberg-python]

2024-01-18 Thread via GitHub
Fokko commented on PR #280: URL: https://github.com/apache/iceberg-python/pull/280#issuecomment-1899169134 @syun64 Thanks, let me poke someone to get it in 👍

Re: [PR] [Bug Fix] all TimeTransforms for falsey values [iceberg-python]

2024-01-18 Thread via GitHub
syun64 commented on PR #280: URL: https://github.com/apache/iceberg-python/pull/280#issuecomment-1899161809 CI failure related to: https://github.com/apache/iceberg-python/pull/279

Re: [PR] Spark 3.5: Support specifying filter in RewriteManifestsProcedure [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi commented on code in PR #9447: URL: https://github.com/apache/iceberg/pull/9447#discussion_r1457935454 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/RewriteManifestsProcedure.java: ## @@ -118,4 +126,15 @@ private InternalRow[] toOutputRows(Rew

Re: [PR] Core: rewrite should drop delete files by data sequence number partition wise [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi commented on code in PR #9454: URL: https://github.com/apache/iceberg/pull/9454#discussion_r1457923027 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -289,13 +321,38 @@ private void invalidateFilteredCache() { cleanUncommitted(Snapsho

Re: [PR] Core: rewrite should drop delete files by data sequence number partition wise [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi commented on code in PR #9454: URL: https://github.com/apache/iceberg/pull/9454#discussion_r1457922682 ## core/src/main/java/org/apache/iceberg/ManifestFilterManager.java: ## @@ -289,13 +321,38 @@ private void invalidateFilteredCache() { cleanUncommitted(Snapsho

Re: [PR] Spark: propagate snapshot properties for RewriteDataFiles and RewritePositionDeleteFiles [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi commented on PR #9449: URL: https://github.com/apache/iceberg/pull/9449#issuecomment-1899091569 Thanks, @advancedxy! Thanks for reviewing, @manuzhang @ajantha-bhat!

Re: [PR] Spark: propagate snapshot properties for RewriteDataFiles and RewritePositionDeleteFiles [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi merged PR #9449: URL: https://github.com/apache/iceberg/pull/9449

Re: [PR] Spark: Ensure that partition stats files are considered for GC procedures [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi commented on PR #9284: URL: https://github.com/apache/iceberg/pull/9284#issuecomment-1899058600 Thanks, @ajantha-bhat!

Re: [PR] Spark: Ensure that partition stats files are considered for GC procedures [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi merged PR #9284: URL: https://github.com/apache/iceberg/pull/9284

Re: [I] Ensure that partition stats files are considered for GC procedures [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi closed issue #9336: Ensure that partition stats files are considered for GC procedures URL: https://github.com/apache/iceberg/issues/9336

Re: [PR] Spark: Ensure that partition stats files are considered for GC procedures [iceberg]

2024-01-18 Thread via GitHub
aokolnychyi commented on PR #9284: URL: https://github.com/apache/iceberg/pull/9284#issuecomment-1899057505 Yep, that's fair. Let's keep it as is then.

Re: [PR] Build: Define strict version for Flink / Jackson / Hive2 / Tez 0.8 [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar merged PR #9484: URL: https://github.com/apache/iceberg/pull/9484

Re: [PR] Build: Define strict version for Flink / Jackson / Hive2 / Tez 0.8 [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar commented on PR #9484: URL: https://github.com/apache/iceberg/pull/9484#issuecomment-1899027662 Yeah I think pinning these specific versions for these dependencies is a good idea. Silently pulling in newer dependencies and trying to debug why some test is failing is never f

Re: [PR] API, Core, Spark: Change behavior of fastForward/replace to create the from branch if it does not exist [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar commented on PR #9196: URL: https://github.com/apache/iceberg/pull/9196#issuecomment-1899018685 Yeah that makes sense. The same rationale for changing the procedure behavior could just be applied to changing the API behavior since that's what users would want anyways.

Re: [PR] Spark: Fix reading 2 level array issue [iceberg]

2024-01-18 Thread via GitHub
nastra commented on PR #9515: URL: https://github.com/apache/iceberg/pull/9515#issuecomment-1898924695 Just FYI, the [spec](https://iceberg.apache.org/spec/#parquet) mentions that lists in Parquet must use the 3-level representation

Re: [PR] support python 3.12 [iceberg-python]

2024-01-18 Thread via GitHub
cclauss commented on code in PR #254: URL: https://github.com/apache/iceberg-python/pull/254#discussion_r145887 ## pyproject.toml: ## @@ -71,8 +71,8 @@ adlfs = { version = ">=2023.1.0,<2024.1.0", optional = true } gcsfs = { version = ">=2023.1.0,<2024.1.0", optional = true

Re: [PR] Core: Fix setting updated parquet compression property [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar commented on code in PR #9503: URL: https://github.com/apache/iceberg/pull/9503#discussion_r1457704960 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -90,10 +90,19 @@ private static Map unreservedProperties(Map rawP private static Map per

Re: [PR] Spark: Add distribution mode not respected for CTAS/RTAS before 3.5.0 [iceberg]

2024-01-18 Thread via GitHub
nastra merged PR #9439: URL: https://github.com/apache/iceberg/pull/9439

Re: [I] Document that distribution modes do not work with DataFrameWriterV2 API Create statements [Documentation] [iceberg]

2024-01-18 Thread via GitHub
nastra closed issue #8887: Document that distribution modes do not work with DataFrameWriterV2 API Create statements [Documentation] URL: https://github.com/apache/iceberg/issues/8887

Re: [PR] Spark: Add distribution mode not respected for CTAS/RTAS before 3.5.0 [iceberg]

2024-01-18 Thread via GitHub
nastra commented on PR #9439: URL: https://github.com/apache/iceberg/pull/9439#issuecomment-1898834698 thanks @manuzhang for getting this done

Re: [I] Cannot write nullable values to non-null column in the Iceberg Table [iceberg]

2024-01-18 Thread via GitHub
nastra commented on issue #9488: URL: https://github.com/apache/iceberg/issues/9488#issuecomment-1898787154 Sorry it wasn't clear from the description what the goal was and I overlooked the usage of `spark.sql.iceberg.check-nullability`. It's difficult to tell why it doesn't work with

Re: [PR] Core: Remove deprecated method from BaseMetadataTable [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar merged PR #9298: URL: https://github.com/apache/iceberg/pull/9298

Re: [PR] Core: Fix setting updated parquet compression property [iceberg]

2024-01-18 Thread via GitHub
manuzhang commented on code in PR #9503: URL: https://github.com/apache/iceberg/pull/9503#discussion_r1457640453 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -90,10 +90,19 @@ private static Map unreservedProperties(Map rawP private static Map persistedP

Re: [I] Enhancement of ViewMetadata [iceberg]

2024-01-18 Thread via GitHub
nk1506 commented on issue #9514: URL: https://github.com/apache/iceberg/issues/9514#issuecomment-1898718003 cc: @pvary

Re: [PR] Core: Fix setting updated parquet compression property [iceberg]

2024-01-18 Thread via GitHub
amogh-jahagirdar commented on code in PR #9503: URL: https://github.com/apache/iceberg/pull/9503#discussion_r1457616434 ## core/src/test/java/org/apache/iceberg/TestTableMetadata.java: ## @@ -1729,6 +1729,46 @@ public void testNoTrailingLocationSlash() { meta.location()

Re: [PR] Apply Name mapping, new_schema_for_table [iceberg-python]

2024-01-18 Thread via GitHub
syun64 commented on code in PR #219: URL: https://github.com/apache/iceberg-python/pull/219#discussion_r1457609164 ## pyiceberg/io/pyarrow.py: ## @@ -733,42 +854,178 @@ def _get_field_id(field: pa.Field) -> Optional[int]: ) -class _ConvertToIceberg(PyArrowSchemaVisitor[

Re: [PR] Spark: Add distribution mode not respected for CTAS/RTAS before 3.5.0 [iceberg]

2024-01-18 Thread via GitHub
nastra commented on code in PR #9439: URL: https://github.com/apache/iceberg/pull/9439#discussion_r1457608002 ## spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/sql/TestCreateTableAsSelect.java: ## @@ -102,6 +104,18 @@ public void testPartitionedCTAS() { sql("SE

[I] Create Iceberg Table from pyarrow Schema with no IDs [iceberg-python]

2024-01-18 Thread via GitHub
syun64 opened a new issue, #278: URL: https://github.com/apache/iceberg-python/issues/278 ### Feature Request / Improvement I see three ways a user would want to create an Iceberg table: 1. Completely manual - by specifying the schema, field by field 2. By inferring the schema fr
