Re: [I] PyIceberg Cookbook [iceberg-python]

2024-10-19 Thread via GitHub
shiv-io commented on issue #1201: URL: https://github.com/apache/iceberg-python/issues/1201#issuecomment-2424052259 @kevinjqliu are you accepting contributions for this cookbook yet? Happy to help if so! -- This is an automated message from the Apache Git Service. To respond to the messa

Re: [PR] Custom fileio docs [iceberg-python]

2024-10-19 Thread via GitHub
kevinjqliu commented on code in PR #1238: URL: https://github.com/apache/iceberg-python/pull/1238#discussion_r1807441718 ## mkdocs/docs/configuration.md: ## @@ -47,6 +55,8 @@ Iceberg tables support table properties to configure table behavior. | `commit.manifest.target-size-by

Re: [PR] Feature: Write to branches [iceberg-python]

2024-10-19 Thread via GitHub
kevinjqliu commented on PR #941: URL: https://github.com/apache/iceberg-python/pull/941#issuecomment-2424130438 Thanks for the contribution! I'll take a look. As I recall, adding support for branches is complicated since we need to consider different edge cases. -- This is an automated m

Re: [PR] feat: Add support for YYYYMMDD date formats [iceberg-python]

2024-10-19 Thread via GitHub
omkenge commented on PR #1234: URL: https://github.com/apache/iceberg-python/pull/1234#issuecomment-2424103667 Hey @kevinjqliu, sorry for the late reply. As for use cases: many legacy systems and industries (e.g., government sectors) store dates in compact MMDD format

Re: [PR] feat: Add support for YYYYMMDD date formats [iceberg-python]

2024-10-19 Thread via GitHub
kevinjqliu commented on PR #1234: URL: https://github.com/apache/iceberg-python/pull/1234#issuecomment-2424128401 Thanks for the context! My opinion is that it's best not to add the `MMDD` format to the pyiceberg library. Here is my reasoning. - I think date parsing can be do

Re: [PR] docs/configuration.md: Documented table properties (#1231) [iceberg-python]

2024-10-19 Thread via GitHub
kevinjqliu commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2424118781 Hi @mths1, Thanks for the feedback. You're right, `write.target-file-size-bytes` does not represent the resulting file's size on disk. It's based on the size of the in-memory ar

Re: [PR] feat: Add support for YYYYMMDD date formats [iceberg-python]

2024-10-19 Thread via GitHub
omkenge commented on PR #1234: URL: https://github.com/apache/iceberg-python/pull/1234#issuecomment-2424140033 Thank you for your feedback! I understand your concerns. OK, then let's keep it simple; we can close this PR with your final comment ... -- This is an automated message from

Re: [PR] feat: Add support for YYYYMMDD date formats [iceberg-python]

2024-10-19 Thread via GitHub
omkenge commented on PR #1234: URL: https://github.com/apache/iceberg-python/pull/1234#issuecomment-2424141706 I also have some suggestions on other parts and will create a PR as soon as possible. Please share your thoughts on that as well. Thank you for your time. -- This

Re: [PR] Spec: Fix table of content generation [iceberg]

2024-10-19 Thread via GitHub
rdblue commented on code in PR #11067: URL: https://github.com/apache/iceberg/pull/11067#discussion_r1807491293 ## format/spec.md: ## @@ -158,27 +158,27 @@ Readers should be more permissive because v1 metadata files are allowed in v2 ta Readers may be more strict for metadat

Re: [PR] Docs: Fix incorrect wget command in Flink documentation [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9483: URL: https://github.com/apache/iceberg/pull/9483#issuecomment-2424318735 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] API, Core: Add scan planning apis to REST Catalog [iceberg]

2024-10-19 Thread via GitHub
amogh-jahagirdar commented on code in PR #11180: URL: https://github.com/apache/iceberg/pull/11180#discussion_r1807542082 ## core/src/test/java/org/apache/iceberg/TestBase.java: ## @@ -63,7 +63,7 @@ public class TestBase { public static final PartitionSpec SPEC = Parti

Re: [PR] feat: Add support for YYYYMMDD date formats [iceberg-python]

2024-10-19 Thread via GitHub
omkenge closed pull request #1234: feat: Add support for MMDD date formats URL: https://github.com/apache/iceberg-python/pull/1234 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

Re: [PR] feat: Add support for YYYYMMDD date formats [iceberg-python]

2024-10-19 Thread via GitHub
kevinjqliu commented on PR #1234: URL: https://github.com/apache/iceberg-python/pull/1234#issuecomment-2424162564 @omkenge do you mind creating an issue first? We can have a discussion and allow others to chime in before starting the PR -- This is an automated message from the Apache Git

Re: [PR] Spark 3.5: Fix testDeleteFileThenMetadataDelete failure due to table not refreshed [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9551: Spark 3.5: Fix testDeleteFileThenMetadataDelete failure due to table not refreshed URL: https://github.com/apache/iceberg/pull/9551 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub an

Re: [PR] Hive: Refactor hive-table commit operation to be used for other operations like view [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9461: URL: https://github.com/apache/iceberg/pull/9461#issuecomment-2424318564 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [I] Concurrent writes failures [iceberg-python]

2024-10-19 Thread via GitHub
reinthal commented on issue #1084: URL: https://github.com/apache/iceberg-python/issues/1084#issuecomment-2423927862 Here's some code that worked for me ```python def append_to_table_with_retry(pa_df: pa.Table, table_name: str, catalog: Catalog) -> None: """Appends
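
The preview above is cut off, so the commenter's actual function body is not shown here. As a rough illustration of the general idea only (retrying a write that fails because another writer committed first), here is a minimal sketch. The helper name `run_with_retry`, its parameters, and the backoff values are all assumptions made for this example and are not taken from the thread or from PyIceberg.

```python
import random
import time
from typing import Callable, Tuple, Type


def run_with_retry(
    operation: Callable[[], None],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `operation`, retrying on the given errors; return the attempts used.

    Hypothetical helper: name, signature, and defaults are illustrative only.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            operation()
            return attempt
        except retry_on:
            if attempt == max_attempts:
                # Out of attempts: let the last conflict propagate to the caller.
                raise
            # Exponential backoff with jitter so competing writers spread out.
            sleep(base_delay_s * (2 ** (attempt - 1)) * (1 + random.random()))
    # Only reached when the loop never ran, i.e. max_attempts < 1.
    raise ValueError("max_attempts must be at least 1")
```

With PyIceberg, one plausible use (again an assumption, not the commenter's code) is to pass an `operation` that reloads the table and appends on every attempt, for example calling `catalog.load_table(table_name).append(pa_df)`, and to retry on `pyiceberg.exceptions.CommitFailedException`. Reloading inside the operation matters because each retry should commit against the latest table metadata.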

[PR] Build: Bump mkdocs-macros-plugin from 1.2.0 to 1.3.6 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] opened a new pull request, #11357: URL: https://github.com/apache/iceberg/pull/11357 Bumps [mkdocs-macros-plugin](https://github.com/fralau/mkdocs_macros_plugin) from 1.2.0 to 1.3.6. Changelog Sourced from https://github.com/fralau/mkdocs-macros-plugin/blob/master/C

[PR] Build: Bump datamodel-code-generator from 0.26.1 to 0.26.2 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] opened a new pull request, #11356: URL: https://github.com/apache/iceberg/pull/11356 Bumps [datamodel-code-generator](https://github.com/koxudaxi/datamodel-code-generator) from 0.26.1 to 0.26.2. Release notes Sourced from https://github.com/koxudaxi/datamodel-code-

Re: [PR] Build: Bump mkdocs-material from 9.5.39 to 9.5.40 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] closed pull request #11309: Build: Bump mkdocs-material from 9.5.39 to 9.5.40 URL: https://github.com/apache/iceberg/pull/11309 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

[PR] Build: Bump com.google.cloud:libraries-bom from 26.48.0 to 26.49.0 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] opened a new pull request, #11363: URL: https://github.com/apache/iceberg/pull/11363 Bumps [com.google.cloud:libraries-bom](https://github.com/googleapis/java-cloud-bom) from 26.48.0 to 26.49.0. Release notes Sourced from https://github.com/googleapis/java-cloud-bo

Re: [PR] Build: Bump mkdocs-macros-plugin from 1.2.0 to 1.3.5 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] commented on PR #11310: URL: https://github.com/apache/iceberg/pull/11310#issuecomment-2424571107 Superseded by #11357. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Build: Bump mkdocs-macros-plugin from 1.2.0 to 1.3.5 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] closed pull request #11310: Build: Bump mkdocs-macros-plugin from 1.2.0 to 1.3.5 URL: https://github.com/apache/iceberg/pull/11310 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[PR] Build: Bump mkdocs-material from 9.5.39 to 9.5.41 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] opened a new pull request, #11358: URL: https://github.com/apache/iceberg/pull/11358 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.39 to 9.5.41. Release notes Sourced from https://github.com/squidfunk/mkdocs-material/releases mkdoc

Re: [PR] Build: Bump mkdocs-material from 9.5.39 to 9.5.40 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] commented on PR #11309: URL: https://github.com/apache/iceberg/pull/11309#issuecomment-2424571132 Superseded by #11358. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[PR] Build: Bump com.palantir.baseline:gradle-baseline-java from 5.69.0 to 5.72.0 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] opened a new pull request, #11362: URL: https://github.com/apache/iceberg/pull/11362 Bumps [com.palantir.baseline:gradle-baseline-java](https://github.com/palantir/gradle-baseline) from 5.69.0 to 5.72.0. Release notes Sourced from https://github.com/palantir/gradle

[PR] Build: Bump software.amazon.awssdk:bom from 2.28.21 to 2.28.26 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] opened a new pull request, #11359: URL: https://github.com/apache/iceberg/pull/11359 Bumps software.amazon.awssdk:bom from 2.28.21 to 2.28.26. [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=soft

[PR] Build: Bump com.google.errorprone:error_prone_annotations from 2.33.0 to 2.34.0 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] opened a new pull request, #11360: URL: https://github.com/apache/iceberg/pull/11360 Bumps [com.google.errorprone:error_prone_annotations](https://github.com/google/error-prone) from 2.33.0 to 2.34.0. Release notes Sourced from https://github.com/google/error-prone

[PR] Build: Bump calcite from 1.10.0 to 1.38.0 [iceberg]

2024-10-19 Thread via GitHub
dependabot[bot] opened a new pull request, #11361: URL: https://github.com/apache/iceberg/pull/11361 Bumps `calcite` from 1.10.0 to 1.38.0. Updates `org.apache.calcite:calcite-core` from 1.10.0 to 1.38.0 Commits https://github.com/apache/calcite/commit/e5e7faeff5985bc1b234214

Re: [I] Implement incremental update using commit stats (SnapshotSummary) [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed issue #8461: Implement incremental update using commit stats (SnapshotSummary) URL: https://github.com/apache/iceberg/issues/8461 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] Implement incremental update using commit stats (SnapshotSummary) [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on issue #8461: URL: https://github.com/apache/iceberg/issues/8461#issuecomment-2424317210 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [PR] Spark 3.5: Support specifying filter in RewriteManifestsProcedure [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9447: URL: https://github.com/apache/iceberg/pull/9447#issuecomment-2424318521 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] AWS: Add Option to don't write non current columns in glue schema closes #7584 [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9420: URL: https://github.com/apache/iceberg/pull/9420#issuecomment-2424318504 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Build: Extract Gradle version from `gradle-wrapper.properties` [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9448: URL: https://github.com/apache/iceberg/pull/9448#issuecomment-2424318529 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] API, Core: Add Schema#withUpdatedDoc and View#updateColumnDoc APIs [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9414: API, Core: Add Schema#withUpdatedDoc and View#updateColumnDoc APIs URL: https://github.com/apache/iceberg/pull/9414 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

Re: [PR] Spark 3.5: Support specifying filter in RewriteManifestsProcedure [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9447: Spark 3.5: Support specifying filter in RewriteManifestsProcedure URL: https://github.com/apache/iceberg/pull/9447 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abo

Re: [PR] Pushed filters to Parquet file on best effort basis in Vectorized Reader [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9479: URL: https://github.com/apache/iceberg/pull/9479#issuecomment-2424318606 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Spark: Fix reading 2 level array issue [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9515: URL: https://github.com/apache/iceberg/pull/9515#issuecomment-2424319453 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] API, Core: Add Schema#withUpdatedDoc and View#updateColumnDoc APIs [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9414: URL: https://github.com/apache/iceberg/pull/9414#issuecomment-2424318493 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Hive: Refactor hive-table commit operation to be used for other operations like view [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9461: Hive: Refactor hive-table commit operation to be used for other operations like view URL: https://github.com/apache/iceberg/pull/9461 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] Feature: Write to branches [iceberg-python]

2024-10-19 Thread via GitHub
vinjai commented on PR #941: URL: https://github.com/apache/iceberg-python/pull/941#issuecomment-2424158737 I have mostly tried to cover all edge cases. The idea is that the branch is just another iceberg table where the snapshots append independently of the main branch. I also agr

[PR] feat: Implement list_views Method and __is_view Utility Function [iceberg-python]

2024-10-19 Thread via GitHub
omkenge opened a new pull request, #1239: URL: https://github.com/apache/iceberg-python/pull/1239 - Implemented the `list_views` method to retrieve a list of views from the specified Glue database. - Added a utility method `__is_view` to check if a table is a view based on its metadata.

Re: [PR] Spec: Support geo type [iceberg]

2024-10-19 Thread via GitHub
szehon-ho commented on code in PR #10981: URL: https://github.com/apache/iceberg/pull/10981#discussion_r1807496639 ## format/spec.md: ## @@ -483,6 +485,8 @@ Notes: 2. For `float` and `double`, the value `-0.0` must precede `+0.0`, as in the IEEE 754 `totalOrder` predicate. NaN

Re: [PR] Spec: Fix table of content generation [iceberg]

2024-10-19 Thread via GitHub
rdblue commented on PR #11067: URL: https://github.com/apache/iceberg/pull/11067#issuecomment-2424245333 I think this is fine. @danielcweeks is the organization what you want? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub an

[PR] Snapshot `summary` map must have `operation` key [iceberg]

2024-10-19 Thread via GitHub
kevinjqliu opened a new pull request, #11354: URL: https://github.com/apache/iceberg/pull/11354 This PR adds an additional constraint in the `SnapshotParser` and also includes tests to verify `Snapshot`'s `summary` field is optional in V1 but required in V2. See https://iceberg.apache.o

Re: [PR] Spark 3.5: Fix testDeleteFileThenMetadataDelete failure due to table not refreshed [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9551: URL: https://github.com/apache/iceberg/pull/9551#issuecomment-2424319486 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Spark: Support min/max/count push down for Identity partition columns [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9457: URL: https://github.com/apache/iceberg/pull/9457#issuecomment-2424318549 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Spark: Fix SparkTable to use name and effective snapshotID for comparing [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9455: URL: https://github.com/apache/iceberg/pull/9455#issuecomment-2424318541 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Spark: Fix reading 2 level array issue [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9515: Spark: Fix reading 2 level array issue URL: https://github.com/apache/iceberg/pull/9515 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

Re: [PR] Spark 3.4: Cleanup the code branch for merge distribution mode conf which is no longer needed [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9561: URL: https://github.com/apache/iceberg/pull/9561#issuecomment-2424319504 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Build: Extract Gradle version from `gradle-wrapper.properties` [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9448: Build: Extract Gradle version from `gradle-wrapper.properties` URL: https://github.com/apache/iceberg/pull/9448 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] Core: Fix setting updated parquet compression property [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] commented on PR #9503: URL: https://github.com/apache/iceberg/pull/9503#issuecomment-2424319115 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Pushed filters to Parquet file on best effort basis in Vectorized Reader [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9479: Pushed filters to Parquet file on best effort basis in Vectorized Reader URL: https://github.com/apache/iceberg/pull/9479 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] Spark 3.4: Cleanup the code branch for merge distribution mode conf which is no longer needed [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9561: Spark 3.4: Cleanup the code branch for merge distribution mode conf which is no longer needed URL: https://github.com/apache/iceberg/pull/9561 -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

Re: [PR] Docs: Fix incorrect wget command in Flink documentation [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9483: Docs: Fix incorrect wget command in Flink documentation URL: https://github.com/apache/iceberg/pull/9483 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

Re: [PR] Core: Fix setting updated parquet compression property [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9503: Core: Fix setting updated parquet compression property URL: https://github.com/apache/iceberg/pull/9503 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] Spark: Fix SparkTable to use name and effective snapshotID for comparing [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9455: Spark: Fix SparkTable to use name and effective snapshotID for comparing URL: https://github.com/apache/iceberg/pull/9455 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] Spark: Support min/max/count push down for Identity partition columns [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9457: Spark: Support min/max/count push down for Identity partition columns URL: https://github.com/apache/iceberg/pull/9457 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] AWS: Add Option to don't write non current columns in glue schema closes #7584 [iceberg]

2024-10-19 Thread via GitHub
github-actions[bot] closed pull request #9420: AWS: Add Option to don't write non current columns in glue schema closes #7584 URL: https://github.com/apache/iceberg/pull/9420 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and us

Re: [I] [Feature] Provide Nightly Build to PyPi [iceberg-python]

2024-10-19 Thread via GitHub
djouallah commented on issue #872: URL: https://github.com/apache/iceberg-python/issues/872#issuecomment-2424388455 Any news on this? Currently pyiceberg is broken with Polaris, and I would like to use the latest update that fixes it. -- This is an automated message from the Apache Git Service.