Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
Fokko commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560794468 Just to double check, with dropping the offending column, I was assuming that you would mutate an existing spec. But I think after going to V2, we should rewrite it into a new spec (that

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
advancedxy commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560757725 > > For any relatively new clients (supporting the v2 format), it should produce specs with field id included. > > It should indeed, but you cannot guarantee that, and it is not enforc

[PR] REST: Avoid deprecated execute without HttpClientResponseHandler [iceberg]

2024-12-23 Thread via GitHub
ebyhr opened a new pull request, #11870: URL: https://github.com/apache/iceberg/pull/11870 https://hc.apache.org/httpcomponents-client-5.4.x/current/apidocs/org/apache/hc/client5/http/classic/HttpClient.html#execute-org.apache.hc.core5.http.ClassicHttpRequest- > Deprecated. It is stro

Re: [PR] Core, Spark: Avoid deprecated methods in Guava Files [iceberg]

2024-12-23 Thread via GitHub
Fokko commented on code in PR #11865: URL: https://github.com/apache/iceberg/pull/11865#discussion_r189645 ## core/src/jmh/java/org/apache/iceberg/ManifestWriteBenchmark.java: ## @@ -96,7 +95,8 @@ public int getFormatVersion() { @Benchmark @Threads(1) public void wr

[PR] Core: Replace deprecated Schema.toString with SchemaFormatter [iceberg]

2024-12-23 Thread via GitHub
ebyhr opened a new pull request, #11867: URL: https://github.com/apache/iceberg/pull/11867 The method is deprecated: https://avro.apache.org/docs/1.12.0/api/java/org/apache/avro/Schema.html#toString(boolean) > Deprecated. Use SchemaFormatter.format(Schema) instead, using the format j

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
Fokko commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560713627 > For any relatively new clients (supporting the v2 format), it should produce specs with field id included. It should indeed, but you cannot guarantee that, and it is not enforced by the

Re: [I] HiveTableOperations may incorrectly consider a successful commit as failed [iceberg]

2024-12-23 Thread via GitHub
lirui-apache commented on issue #11866: URL: https://github.com/apache/iceberg/issues/11866#issuecomment-2560714213 @pvary What do you think about the issue? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

[I] HiveTableOperations may incorrectly consider a successful commit as failed [iceberg]

2024-12-23 Thread via GitHub
lirui-apache opened a new issue, #11866: URL: https://github.com/apache/iceberg/issues/11866 ### Apache Iceberg version 1.4.3 ### Query engine Spark ### Please describe the bug 🐞 We are using `NoLock` for committing, and we recently hit an issue when HiveTa

Re: [PR] Integrate Test Framework [iceberg-cpp]

2024-12-23 Thread via GitHub
zhjwpku commented on code in PR #13: URL: https://github.com/apache/iceberg-cpp/pull/13#discussion_r1896438330 ## cmake_modules/FindGTestAlt.cmake: ## @@ -0,0 +1,28 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See t

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
ConeyLiu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2560708280 > I like that idea. Could you elaborate on that? Yes, the following is what I implemented in the internal repo: ```python def plan_scan_tasks( files: Iterable[Fi

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
advancedxy commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560708474 > So, we cannot alter existing partition specs. Even after upgrading to V2, the metadata is still in V1. The relevant part of the spec: I think we should be able to evolve the

Re: [PR] Core: check catalog and schema for JdbcCatalog initialization [iceberg]

2024-12-23 Thread via GitHub
liangyouze commented on PR #11864: URL: https://github.com/apache/iceberg/pull/11864#issuecomment-2560706820 > Did you confirm the existing PR #11427? #11427 mentioned using SQL to check whether a table exists, but it seems more appropriate to use JDBC's native semantic

Re: [PR] Core: check catalog and schema for JdbcCatalog initialization [iceberg]

2024-12-23 Thread via GitHub
liangyouze commented on code in PR #11864: URL: https://github.com/apache/iceberg/pull/11864#discussion_r1896430906 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java: ## @@ -162,8 +162,8 @@ private void initializeCatalogTables() { DatabaseMetaData dbMet

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
Fokko commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560684459 @advancedxy Sorry for ignoring comment 1, I had to think about that one a bit: > upgrade the v1 table to v2 and then remove the void transform in the old spec and produces a new on

Re: [PR] Core: check catalog and schema for JdbcCatalog initialization [iceberg]

2024-12-23 Thread via GitHub
ebyhr commented on code in PR #11864: URL: https://github.com/apache/iceberg/pull/11864#discussion_r1896351505 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java: ## @@ -162,8 +162,8 @@ private void initializeCatalogTables() { DatabaseMetaData dbMeta = c

Re: [PR] Bump jinja2 from 3.1.4 to 3.1.5 [iceberg-python]

2024-12-23 Thread via GitHub
Fokko merged PR #1467: URL: https://github.com/apache/iceberg-python/pull/1467

Re: [PR] Tests: Set PySpark driver host to `localhost` [iceberg-python]

2024-12-23 Thread via GitHub
Fokko commented on PR #1466: URL: https://github.com/apache/iceberg-python/pull/1466#issuecomment-2560668984 @smaheshwar-pltr thanks for raising this. I haven't seen this before either. Can you check if your local hostname is configured correctly?

Re: [PR] Bump mypy-boto3-glue from 1.35.80 to 1.35.87 [iceberg-python]

2024-12-23 Thread via GitHub
Fokko merged PR #1468: URL: https://github.com/apache/iceberg-python/pull/1468

Re: [PR] ci: add sccache to speed up ci build [iceberg-rust]

2024-12-23 Thread via GitHub
liurenjie1024 commented on PR #824: URL: https://github.com/apache/iceberg-rust/pull/824#issuecomment-2560621001 > > I'm not a big fan of checking in Cargo.lock as it's an antipattern for library > > Just FYI that it's not considered an antipattern any more https://blog.rust-lang.org

Re: [PR] refactor: Remove spawn and channel inside arrow reader [iceberg-rust]

2024-12-23 Thread via GitHub
liurenjie1024 merged PR #806: URL: https://github.com/apache/iceberg-rust/pull/806

Re: [PR] feat: add s3tables catalog [iceberg-rust]

2024-12-23 Thread via GitHub
Xuanwo commented on code in PR #807: URL: https://github.com/apache/iceberg-rust/pull/807#discussion_r1896346166 ## crates/catalog/s3tables/src/catalog.rs: ## @@ -0,0 +1,620 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [I] fix: resolve cyclical dev-dependency in `iceberg` [iceberg-rust]

2024-12-23 Thread via GitHub
Xuanwo commented on issue #835: URL: https://github.com/apache/iceberg-rust/issues/835#issuecomment-2560596097 Let me help fix it.

Re: [PR] chore: update download link to 0.4.0 [iceberg-rust]

2024-12-23 Thread via GitHub
Xuanwo merged PR #836: URL: https://github.com/apache/iceberg-rust/pull/836

Re: [PR] Core: Don't reset snapshotLog in `TableMetadata.removeRef` method [iceberg]

2024-12-23 Thread via GitHub
ebyhr commented on PR #11779: URL: https://github.com/apache/iceberg/pull/11779#issuecomment-2560579572 @amogh-jahagirdar Thanks for your review. Addressed comments.

[I] fix: resolve cyclical dev-dependency in `iceberg` [iceberg-rust]

2024-12-23 Thread via GitHub
sungwy opened a new issue, #835: URL: https://github.com/apache/iceberg-rust/issues/835 The github workflow for publishing Iceberg rust crates to cargo has failed: https://github.com/apache/iceberg-rust/actions/runs/12475862491/job/34819847073 ``` Run cargo publish --all-fea

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
jiakai-li commented on PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#issuecomment-2560549414 This PR is ready for review now. Thanks very much and merry christmas!

[PR] Core: check catalog and schema for JdbcCatalog initialization [iceberg]

2024-12-23 Thread via GitHub
liangyouze opened a new pull request, #11864: URL: https://github.com/apache/iceberg/pull/11864 The Iceberg metadata table may not be created automatically in some cases; see [#11423](https://github.com/apache/iceberg/issues/11423) and [#11862](https://github.com/apache/iceberg/issues/11862). In t

Re: [I] Cannot create DBMS Table automatically when JdbcCatalog initialize [iceberg]

2024-12-23 Thread via GitHub
liangyouze closed issue #11862: Cannot create DBMS Table automatically when JdbcCatalog initialize URL: https://github.com/apache/iceberg/issues/11862

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
advancedxy commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560529630 > > Allow removing the partition source field in void transform in this PR. I think it requires extra work and we may need to keep the input type for void transform, which might lead

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
advancedxy commented on code in PR #11842: URL: https://github.com/apache/iceberg/pull/11842#discussion_r1896278567 ## core/src/main/java/org/apache/iceberg/SchemaUpdate.java: ## @@ -533,6 +534,25 @@ private static Schema applyChanges( } } +if (base != null) {

Re: [I] Writing an arrow table with date64 unsupported [iceberg-python]

2024-12-23 Thread via GitHub
github-actions[bot] commented on issue #830: URL: https://github.com/apache/iceberg-python/issues/830#issuecomment-2560461252 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity oc

Re: [I] Getting schema of the all tables instead of actual data [iceberg]

2024-12-23 Thread via GitHub
github-actions[bot] closed issue #10483: Getting schema of the all tables instead of actual data URL: https://github.com/apache/iceberg/issues/10483

Re: [I] Getting schema of the all tables instead of actual data [iceberg]

2024-12-23 Thread via GitHub
github-actions[bot] commented on issue #10483: URL: https://github.com/apache/iceberg/issues/10483#issuecomment-2560459174 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#issuecomment-2560428546 Thanks a lot for this explanation and suggestion @kevinjqliu! It sounds good. Had some time so I've made this change so tests pass - using `make_compatible_name` as a

Re: [I] Cannot create DBMS Table automatically when JdbcCatalog initialize [iceberg]

2024-12-23 Thread via GitHub
ebyhr commented on issue #11862: URL: https://github.com/apache/iceberg/issues/11862#issuecomment-2560392322 It seems there is an existing issue. #11423

[PR] Bump mypy-boto3-glue from 1.35.80 to 1.35.87 [iceberg-python]

2024-12-23 Thread via GitHub
dependabot[bot] opened a new pull request, #1468: URL: https://github.com/apache/iceberg-python/pull/1468 Bumps [mypy-boto3-glue](https://github.com/youtype/mypy_boto3_builder) from 1.35.80 to 1.35.87. Commits See full diff in https://github.com/youtype/mypy_boto3_builder/commi

[PR] Bump jinja2 from 3.1.4 to 3.1.5 [iceberg-python]

2024-12-23 Thread via GitHub
dependabot[bot] opened a new pull request, #1467: URL: https://github.com/apache/iceberg-python/pull/1467 Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5. Release notes Sourced from [jinja2's releases](https://github.com/pallets/jinja/releases). 3.1.5 T

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896150261 ## pyiceberg/io/pyarrow.py: ## @@ -377,6 +377,12 @@ def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> FileSyste if force_vi

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
jiakai-li commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896132680 ## pyiceberg/io/pyarrow.py: ## @@ -377,6 +377,12 @@ def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> FileSyste if force_vir

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
jiakai-li commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896130636 ## tests/io/test_pyarrow.py: ## @@ -360,10 +360,11 @@ def test_pyarrow_s3_session_properties() -> None: **UNIFIED_AWS_SESSION_PROPERTIES, } -

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
jiakai-li commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896126623 ## pyiceberg/io/pyarrow.py: ## @@ -1508,7 +1512,7 @@ def _record_batches_from_scan_tasks_and_deletes( if self._limit is not None and total_row_cou

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896125712 ## pyiceberg/io/pyarrow.py: ## @@ -1508,7 +1512,7 @@ def _record_batches_from_scan_tasks_and_deletes( if self._limit is not None and total_row_co

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896124443 ## tests/io/test_pyarrow.py: ## Review Comment: I think testing `PyArrowFileIO.fs_by_scheme` is good enough. in the unit test, maybe mention https://git

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
jiakai-li commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896119481 ## tests/io/test_pyarrow.py: ## Review Comment: Yes, I thought about that as well, essentially I tried to set up another service like `minio-ap-southeast

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
Fokko commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560242110 > Allow removing the partition source field in void transform in this PR. I think it requires extra work and we may need to keep the input type for void transform, which might lead to a t

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896072489 ## pyiceberg/io/pyarrow.py: ## @@ -377,6 +377,12 @@ def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> FileSyste if force_vi

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#issuecomment-2560234773 > According to what I found here seems fsspec doesn't have the same issue as pyarrow. So I guess we can leave it? Wow, that's interesting, I didn't know about that. I like t

Re: [PR] Core: Don't reset snapshotLog in `TableMetadata.removeRef` method [iceberg]

2024-12-23 Thread via GitHub
amogh-jahagirdar commented on code in PR #11779: URL: https://github.com/apache/iceberg/pull/11779#discussion_r1896074073 ## core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java: ## @@ -2418,6 +2421,34 @@ public void testPaginationForListTables(int numberOfItems) {

Re: [PR] feat: search current working directory for config file [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on code in PR #1464: URL: https://github.com/apache/iceberg-python/pull/1464#discussion_r1896047664 ## tests/utils/test_config.py: ## @@ -93,3 +94,61 @@ def test_from_configuration_files_get_typed_value(tmp_path_factory: pytest.TempP assert Config().

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2560194912 > Another option would be to provide a plan_util to support plan tasks like the Java-side implementation. That's interesting, I like that too. "util" suggests that its opti

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
anuragmantri commented on code in PR #11842: URL: https://github.com/apache/iceberg/pull/11842#discussion_r1896047271 ## core/src/main/java/org/apache/iceberg/SchemaUpdate.java: ## @@ -533,6 +534,25 @@ private static Schema applyChanges( } } +if (base != null)

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
anuragmantri commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560179927 Thanks for your review @advancedxy. My concern is that this is a big behavior change. Partition columns cannot be dropped anymore. I will start a thread in dev channel to see what

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#issuecomment-2560171094 Thanks for the PR! I've dug into the test failure a bit. Here's what I found. There's a subtle difference between `PartitionKey.partition` and `DataFile.partition`. In mos

Re: [PR] Tests: Set PySpark driver host to `localhost` [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on PR #1466: URL: https://github.com/apache/iceberg-python/pull/1466#issuecomment-2560159143 Interesting, that's the first time I've seen this issue. Do you have a remote dev environment? The typical setup is running the integration test Docker containers on local lapt

Re: [PR] Support Location Providers [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1895864213 ## pyiceberg/io/pyarrow.py: ## @@ -2622,13 +2631,15 @@ def _dataframe_to_data_files( property_name=TableProperties.WRITE_TARGET_FILE_SIZE_BYTES,

Re: [I] how to grant s3 temp permissions when using pyiceberg? [iceberg-python]

2024-12-23 Thread via GitHub
jayceslesar commented on issue #1463: URL: https://github.com/apache/iceberg-python/issues/1463#issuecomment-2560088682 My org uses SSO and we use IAM for cloud runtimes but for local runtimes something like the following: ```sh aws configure export-credentials --profile YOUR_PROFILE

Re: [PR] Table Scan Delete File Handling: Positional and Equality Delete Support [iceberg-rust]

2024-12-23 Thread via GitHub
sdd commented on PR #652: URL: https://github.com/apache/iceberg-rust/pull/652#issuecomment-2560086936 FAO @liurenjie1024, @Xuanwo, @Fokko: I've finished refactoring this and after a few rounds I'm happier with the design of the `DeleteFileIndex` and how it is interacted with in the s

Re: [PR] Tests: Set PySpark driver host to `localhost` [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on PR #1466: URL: https://github.com/apache/iceberg-python/pull/1466#issuecomment-2559955607 Not sure if useful. @kevinjqliu, mind taking a peek?

Re: [PR] fix: allow nullable field of equality delete writer [iceberg-rust]

2024-12-23 Thread via GitHub
ZENOTME commented on PR #834: URL: https://github.com/apache/iceberg-rust/pull/834#issuecomment-2559958020 cc @liurenjie1024 @Fokko @Xuanwo @sdd

[PR] fix: allow nullable field of equality delete writer [iceberg-rust]

2024-12-23 Thread via GitHub
ZENOTME opened a new pull request, #834: URL: https://github.com/apache/iceberg-rust/pull/834 According to the doc fixed in https://github.com/apache/iceberg/pull/8981, the equality delete writer can have an optional field id. This PR fixes this.

[PR] Tests: Set PySpark driver host to `localhost` [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr opened a new pull request, #1466: URL: https://github.com/apache/iceberg-python/pull/1466 This let me run integration tests locally. Before, I was getting ``` py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContex

Re: [PR] Integrate Test Framework [iceberg-cpp]

2024-12-23 Thread via GitHub
wgtmac commented on code in PR #13: URL: https://github.com/apache/iceberg-cpp/pull/13#discussion_r1895873224 ## cmake_modules/FindGTestAlt.cmake: ## @@ -0,0 +1,28 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See th

Re: [PR] Spark: Adding simple custom partition sort order option to RewriteManifests Spark Action [iceberg]

2024-12-23 Thread via GitHub
zachdisc commented on PR #9731: URL: https://github.com/apache/iceberg/pull/9731#issuecomment-2559933098 Please let me know the best way to deal with merge conflicts. I thought to rebase and get everything back in sync with the main branch, but that looks like the wrong flow here. I can clo

Re: [I] Support LocationProviders like the Java Iceberg Reference Implementaiton [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on issue #861: URL: https://github.com/apache/iceberg-python/issues/861#issuecomment-2559908252 Great! I've put up https://github.com/apache/iceberg-python/pull/1452 that should address this

Re: [PR] Open-API: Fix compilation errors in generated Java classes due to mismatched return types [iceberg]

2024-12-23 Thread via GitHub
ajantha-bhat commented on code in PR #11806: URL: https://github.com/apache/iceberg/pull/11806#discussion_r1895867017 ## open-api/rest-catalog-open-api.py: ## @@ -981,8 +966,33 @@ class ValueMap(BaseModel): ) +class ContentFile(BaseModel): +content: ContentEnum +

Re: [PR] Support Location Providers [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#issuecomment-2559905325 @Fokko, think this is ready for review now! I've implemented this for write codepaths - `add_files` seems like it should just add the files specified without transfor

Re: [PR] Build: Fix ignoring `.asf.yaml` in PR [iceberg]

2024-12-23 Thread via GitHub
amogh-jahagirdar merged PR #11860: URL: https://github.com/apache/iceberg/pull/11860

Re: [PR] Open-API: Fix compilation errors in generated Java classes due to mismatched return types [iceberg]

2024-12-23 Thread via GitHub
ajantha-bhat commented on code in PR #11806: URL: https://github.com/apache/iceberg/pull/11806#discussion_r1895830121 ## open-api/rest-catalog-open-api.yaml: ## @@ -4372,6 +4399,39 @@ components: allOf: - $ref: '#/components/schemas/Expression' +Con

[PR] Feat/update sort order [iceberg-python]

2024-12-23 Thread via GitHub
JasperHG90 opened a new pull request, #1465: URL: https://github.com/apache/iceberg-python/pull/1465 (no comment)

Re: [PR] Spark: Adding simple custom partition sort order option to RewriteManifests Spark Action [iceberg]

2024-12-23 Thread via GitHub
zachdisc commented on PR #9731: URL: https://github.com/apache/iceberg/pull/9731#issuecomment-2559885046 I rebased locally and resolved merge conflicts and addressed Russel's ask to remove the function-based rewrite sorting.

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
Fokko commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2559871997 Just to add some context: > Currently, PyIceberg's read path assumes to be run on a single node machine. This assumption is embedded in the way we plan and execute the read pa

Re: [PR] Bump actions/checkout from 3 to 4 [iceberg-cpp]

2024-12-23 Thread via GitHub
Fokko merged PR #19: URL: https://github.com/apache/iceberg-cpp/pull/19

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#issuecomment-2559759825 Done, @kevinjqliu. Fails due to https://github.com/apache/iceberg-python/pull/1457#discussion_r1894689633 but will think over it. FYI, am away for a little bit now s

Re: [PR] Bump actions/setup-python from 3 to 5 [iceberg-cpp]

2024-12-23 Thread via GitHub
Fokko merged PR #18: URL: https://github.com/apache/iceberg-cpp/pull/18

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on code in PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#discussion_r1895780803 ## tests/table/test_partitioning.py: ## @@ -118,6 +119,27 @@ def test_deserialize_partition_spec() -> None: ) +def test_partition_spec_to_path()

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on code in PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#discussion_r1895779686 ## tests/table/test_partitioning.py: ## @@ -118,6 +119,27 @@ def test_deserialize_partition_spec() -> None: ) +def test_partition_spec_to_path()

[PR] Bump actions/checkout from 3 to 4 [iceberg-cpp]

2024-12-23 Thread via GitHub
dependabot[bot] opened a new pull request, #19: URL: https://github.com/apache/iceberg-cpp/pull/19 Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4. Release notes Sourced from [actions/checkout's releases](https://github.com/actions/checkout/releases).

[PR] Bump actions/setup-python from 3 to 5 [iceberg-cpp]

2024-12-23 Thread via GitHub
dependabot[bot] opened a new pull request, #18: URL: https://github.com/apache/iceberg-cpp/pull/18 Bumps [actions/setup-python](https://github.com/actions/setup-python) from 3 to 5. Release notes Sourced from https://github.com/actions/setup-python/releases actions/setup-python's

Re: [PR] Add pre-commit config [iceberg-cpp]

2024-12-23 Thread via GitHub
Fokko merged PR #16: URL: https://github.com/apache/iceberg-cpp/pull/16

Re: [PR] Table Scan Delete File Handling: Positional and Equality Delete Support [iceberg-rust]

2024-12-23 Thread via GitHub
sdd commented on code in PR #652: URL: https://github.com/apache/iceberg-rust/pull/652#discussion_r1895665053 ## crates/iceberg/src/arrow/reader.rs: ## @@ -176,6 +188,350 @@ impl ArrowReader { return Ok(rx.boxed()); } +// retrieve all delete files concurrentl

Re: [PR] Table Scan Delete File Handling: Positional and Equality Delete Support [iceberg-rust]

2024-12-23 Thread via GitHub
sdd commented on code in PR #652: URL: https://github.com/apache/iceberg-rust/pull/652#discussion_r1895664861 ## crates/iceberg/src/spec/delete_file.rs: ## @@ -0,0 +1,780 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

Re: [PR] Table Scan Delete File Handling: Positional and Equality Delete Support [iceberg-rust]

2024-12-23 Thread via GitHub
sdd commented on code in PR #652: URL: https://github.com/apache/iceberg-rust/pull/652#discussion_r1895664169 ## crates/iceberg/src/spec/delete_file.rs: ## @@ -0,0 +1,780 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

Re: [PR] Add pre-commit config [iceberg-cpp]

2024-12-23 Thread via GitHub
Fokko commented on PR #16: URL: https://github.com/apache/iceberg-cpp/pull/16#issuecomment-2559523819 @zhjwpku no action needed, let me get this in

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
ConeyLiu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2559498484 @kevinjqliu, thanks for the summary and the great proposal. Another option would be to provide a `plan_util` to support plan tasks like the Java-side implementation.

Re: [PR] Add pre-commit config [iceberg-cpp]

2024-12-23 Thread via GitHub
zhjwpku commented on PR #16: URL: https://github.com/apache/iceberg-cpp/pull/16#issuecomment-2559428937 > > Cool, I've raised an issue: https://issues.apache.org/jira/browse/INFRA-26378 > > It got approved :) Hi @Fokko, I see the comment that the pre-commit/action@3.0.1 has bee

[I] Cannot create DBMS Table automatically when JdbcCatalog initialize [iceberg]

2024-12-23 Thread via GitHub
liangyouze opened a new issue, #11862: URL: https://github.com/apache/iceberg/issues/11862 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine None ### Please describe the bug 🐞 When JdbcCatalog initialize, it will globally search whether `ic

[PR] feat: add insert support for iceberg-datafusion [iceberg-rust]

2024-12-23 Thread via GitHub
ZENOTME opened a new pull request, #833: URL: https://github.com/apache/iceberg-rust/pull/833 (no comment)