[PR] Build: Bump datamodel-code-generator from 0.25.8 to 0.25.9 [iceberg]

2024-08-10 Thread via GitHub
dependabot[bot] opened a new pull request, #10917: URL: https://github.com/apache/iceberg/pull/10917 Bumps [datamodel-code-generator](https://github.com/koxudaxi/datamodel-code-generator) from 0.25.8 to 0.25.9. Release notes Sourced from https://github.com/koxudaxi/datamodel-code-

[PR] Build: Bump com.google.errorprone:error_prone_annotations from 2.29.2 to 2.30.0 [iceberg]

2024-08-10 Thread via GitHub
dependabot[bot] opened a new pull request, #10915: URL: https://github.com/apache/iceberg/pull/10915 Bumps [com.google.errorprone:error_prone_annotations](https://github.com/google/error-prone) from 2.29.2 to 2.30.0. Release notes Sourced from https://github.com/google/error-prone

[PR] Build: Bump org.apache.commons:commons-compress from 1.26.2 to 1.27.0 [iceberg]

2024-08-10 Thread via GitHub
dependabot[bot] opened a new pull request, #10914: URL: https://github.com/apache/iceberg/pull/10914 Bumps org.apache.commons:commons-compress from 1.26.2 to 1.27.0.

[PR] Build: Bump com.google.cloud:libraries-bom from 26.43.0 to 26.44.0 [iceberg]

2024-08-10 Thread via GitHub
dependabot[bot] opened a new pull request, #10916: URL: https://github.com/apache/iceberg/pull/10916 Bumps [com.google.cloud:libraries-bom](https://github.com/googleapis/java-cloud-bom) from 26.43.0 to 26.44.0. Release notes Sourced from https://github.com/googleapis/java-cloud-bo

[PR] Build: Bump software.amazon.awssdk:bom from 2.26.29 to 2.27.2 [iceberg]

2024-08-10 Thread via GitHub
dependabot[bot] opened a new pull request, #10913: URL: https://github.com/apache/iceberg/pull/10913 Bumps software.amazon.awssdk:bom from 2.26.29 to 2.27.2.

[PR] Build: Bump org.awaitility:awaitility from 4.2.1 to 4.2.2 [iceberg]

2024-08-10 Thread via GitHub
dependabot[bot] opened a new pull request, #10912: URL: https://github.com/apache/iceberg/pull/10912 Bumps [org.awaitility:awaitility](https://github.com/awaitility/awaitility) from 4.2.1 to 4.2.2. Changelog Sourced from https://github.com/awaitility/awaitility/blob/master/changelo

[PR] Build: Bump org.xerial.snappy:snappy-java from 1.1.10.5 to 1.1.10.6 [iceberg]

2024-08-10 Thread via GitHub
dependabot[bot] opened a new pull request, #10911: URL: https://github.com/apache/iceberg/pull/10911 Bumps [org.xerial.snappy:snappy-java](https://github.com/xerial/snappy-java) from 1.1.10.5 to 1.1.10.6. Release notes Sourced from https://github.com/xerial/snappy-java/releases

[PR] Build: Bump nessie from 0.94.4 to 0.95.0 [iceberg]

2024-08-10 Thread via GitHub
dependabot[bot] opened a new pull request, #10910: URL: https://github.com/apache/iceberg/pull/10910 Bumps `nessie` from 0.94.4 to 0.95.0. Updates `org.projectnessie.nessie:nessie-client` from 0.94.4 to 0.95.0 Updates `org.projectnessie.nessie:nessie-jaxrs-testextension` from 0.94.4

[I] Support row filter & column masking in REST spec [iceberg]

2024-08-10 Thread via GitHub
shohamyamin opened a new issue, #10909: URL: https://github.com/apache/iceberg/issues/10909 ### Feature Request / Improvement ### Summary: We would like to request the addition of a new feature in the Iceberg REST catalog that would allow the catalog to return a row filter expressi

Re: [PR] Core, API: Add support for renaming tags [iceberg]

2024-08-10 Thread via GitHub
github-actions[bot] commented on PR #4936: URL: https://github.com/apache/iceberg/pull/4936#issuecomment-2282323091 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Can we specify the zorder field when creating a table? [iceberg]

2024-08-10 Thread via GitHub
github-actions[bot] commented on issue #4927: URL: https://github.com/apache/iceberg/issues/4927#issuecomment-2282323082 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs.

Re: [PR] AWS: use pre-created IAM role for AssumeRole related integration tests [iceberg]

2024-08-10 Thread via GitHub
github-actions[bot] commented on PR #4909: URL: https://github.com/apache/iceberg/pull/4909#issuecomment-2282323076 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink: new sink base on the unified sink API [iceberg]

2024-08-10 Thread via GitHub
github-actions[bot] commented on PR #4904: URL: https://github.com/apache/iceberg/pull/4904#issuecomment-2282323060 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] NotImplementedError: Parquet writer option(s) ['write.parquet.row-group-size-bytes'] not implemented [iceberg-python]

2024-08-10 Thread via GitHub
djouallah commented on issue #1013: URL: https://github.com/apache/iceberg-python/issues/1013#issuecomment-2282312521 Ah, I see, thank you. For some reason it was the catalog that added all those properties. All good. -- This is an automated message from the Apache Git Service. To respo

[I] Fields with mixed datatypes [iceberg-python]

2024-08-10 Thread via GitHub
jayceslesar opened a new issue, #1037: URL: https://github.com/apache/iceberg-python/issues/1037 ### Question How would I go about using a field with mixed datatypes? Is that recommended/possible? I am a fan of tall-tidy data and am wondering how to properly go about the following?

Re: [I] NotImplementedError: Parquet writer option(s) ['write.parquet.row-group-size-bytes'] not implemented [iceberg-python]

2024-08-10 Thread via GitHub
sungwy commented on issue #1013: URL: https://github.com/apache/iceberg-python/issues/1013#issuecomment-2282181839 Hi @djouallah - could you try using the property `write.parquet.row-group-limit` instead? Unfortunately `write.parquet.row-group-size-bytes` isn't a supported property in PyIc

Re: [I] Prevent `add_files` from adding a file that's already referenced by the Iceberg Table [iceberg-python]

2024-08-10 Thread via GitHub
amitgilad3 commented on issue #998: URL: https://github.com/apache/iceberg-python/issues/998#issuecomment-2281990423 Hey @sungwy - I just created my first PR, #1036. I would really appreciate your review and any suggestions, including whether I chose the wrong place to implement my checks.

[PR] prevent adding duplicate files [iceberg-python]

2024-08-10 Thread via GitHub
amitgilad3 opened a new pull request, #1036: URL: https://github.com/apache/iceberg-python/pull/1036 This resolves #998, where duplicate files are added with the `add_files` method. It handles 2 cases: 1. Files list is not unique 2. One of the files added is already referenced by current
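
The two cases named in this pull request description can be illustrated with a minimal plain-Python sketch. This is not the code from PR #1036 and does not use the PyIceberg API: the function names `find_duplicate_paths` and `check_files_to_add`, and the idea of passing the already-referenced paths as a set, are assumptions made for illustration only.

```python
from collections import Counter
from typing import Iterable, List, Set


def find_duplicate_paths(file_paths: Iterable[str]) -> List[str]:
    """Return paths that occur more than once, in first-seen order."""
    counts = Counter(file_paths)
    return [path for path, n in counts.items() if n > 1]


def check_files_to_add(file_paths: List[str], referenced_paths: Set[str]) -> None:
    """Hypothetical pre-check covering both cases; raises ValueError on a violation.

    `referenced_paths` stands in for the set of data file paths the table
    already references (how that set is obtained is outside this sketch).
    """
    # Case 1: the list of files to add is not unique.
    duplicates = find_duplicate_paths(file_paths)
    if duplicates:
        raise ValueError(f"File paths must be unique, duplicates: {duplicates}")

    # Case 2: a file to add is already referenced by the table.
    already_referenced = [p for p in file_paths if p in referenced_paths]
    if already_referenced:
        raise ValueError(f"Files already referenced by table: {already_referenced}")
```

Under these assumptions, calling `check_files_to_add(["a.parquet", "a.parquet"], set())` raises `ValueError` for the first case, and `check_files_to_add(["a.parquet"], {"a.parquet"})` raises it for the second.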

Re: [I] Performance question for to_arrow, to_pandas, to_duckdb [iceberg-python]

2024-08-10 Thread via GitHub
jkleinkauff commented on issue #1032: URL: https://github.com/apache/iceberg-python/issues/1032#issuecomment-2281787859 Hi @kevinjqliu thank you for your time! Those are my findings: I've included a read_parquet method from awswrangler. Don't know why, but it's by far the fast

Re: [PR] Support convert orc timestamptz [iceberg]

2024-08-10 Thread via GitHub
tanvn commented on PR #9905: URL: https://github.com/apache/iceberg/pull/9905#issuecomment-2281371223 @ming95 Could you take another look at the comments when you have time? 🙇

Re: [I] Plan file scan task according scan file size. [iceberg-rust]

2024-08-10 Thread via GitHub
rahull-p commented on issue #128: URL: https://github.com/apache/iceberg-rust/issues/128#issuecomment-2280596314 Can I work on this? Additionally, is the expectation here to add multiple FileScanTask entries based on the split size and the scan file size?

Re: [I] NotImplementedError: Parquet writer option(s) ['write.parquet.row-group-size-bytes'] not implemented [iceberg-python]

2024-08-10 Thread via GitHub
djouallah commented on issue #1013: URL: https://github.com/apache/iceberg-python/issues/1013#issuecomment-2280495129 same error with 0.7.1 rc1 ?

Re: [PR] HA HMS support [iceberg-python]

2024-08-10 Thread via GitHub
awdavidson commented on code in PR #752: URL: https://github.com/apache/iceberg-python/pull/752#discussion_r1712607393 ## tests/catalog/test_hive.py: ## @@ -1195,3 +1195,40 @@ def test_hive_wait_for_lock() -> None: with pytest.raises(WaitingForLockException): catal