Re: [PR] Core: check catalog and schema for JdbcCatalog initialization [iceberg]

2024-12-23 Thread via GitHub
liangyouze commented on PR #11864: URL: https://github.com/apache/iceberg/pull/11864#issuecomment-2560706820 > Did you confirm the existing PR #11427? #11427 mentioned using SQL to check whether a table exists, but it seems more appropriate to use JDBC's native semantic -- This i

[I] HiveTableOperations may incorrectly consider a successful commit as failed [iceberg]

2024-12-23 Thread via GitHub
lirui-apache opened a new issue, #11866: URL: https://github.com/apache/iceberg/issues/11866 ### Apache Iceberg version 1.4.3 ### Query engine Spark ### Please describe the bug 🐞 We are using `NoLock` for committing, and we recently hit an issue when HiveTa

Re: [I] HiveTableOperations may incorrectly consider a successful commit as failed [iceberg]

2024-12-23 Thread via GitHub
lirui-apache commented on issue #11866: URL: https://github.com/apache/iceberg/issues/11866#issuecomment-2560714213 @pvary What do you think about the issue? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
Fokko commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560713627 > For any relatively new clients (those supporting the v2 format), it should produce specs with field IDs included. It should indeed, but you cannot guarantee that, and it is not enforced by the

[PR] Core: Replace deprecated Schema.toString with SchemaFormatter [iceberg]

2024-12-23 Thread via GitHub
ebyhr opened a new pull request, #11867: URL: https://github.com/apache/iceberg/pull/11867 The method is deprecated: https://avro.apache.org/docs/1.12.0/api/java/org/apache/avro/Schema.html#toString(boolean) > Deprecated. Use SchemaFormatter.format(Schema) instead, using the format j

Re: [PR] Core: check catalog and schema for JdbcCatalog initialization [iceberg]

2024-12-23 Thread via GitHub
ebyhr commented on code in PR #11864: URL: https://github.com/apache/iceberg/pull/11864#discussion_r1896351505 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java: ## @@ -162,8 +162,8 @@ private void initializeCatalogTables() { DatabaseMetaData dbMeta = c

Re: [PR] Core: check catalog and schema for JdbcCatalog initialization [iceberg]

2024-12-23 Thread via GitHub
liangyouze commented on code in PR #11864: URL: https://github.com/apache/iceberg/pull/11864#discussion_r1896430906 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java: ## @@ -162,8 +162,8 @@ private void initializeCatalogTables() { DatabaseMetaData dbMet

Re: [PR] chore: update download link to 0.4.0 [iceberg-rust]

2024-12-23 Thread via GitHub
Xuanwo merged PR #836: URL: https://github.com/apache/iceberg-rust/pull/836

Re: [PR] feat: add s3tables catalog [iceberg-rust]

2024-12-23 Thread via GitHub
Xuanwo commented on code in PR #807: URL: https://github.com/apache/iceberg-rust/pull/807#discussion_r1896346166 ## crates/catalog/s3tables/src/catalog.rs: ## @@ -0,0 +1,620 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [I] fix: resolve cyclical dev-dependency in `iceberg` [iceberg-rust]

2024-12-23 Thread via GitHub
Xuanwo commented on issue #835: URL: https://github.com/apache/iceberg-rust/issues/835#issuecomment-2560596097 Let me help fix it.

Re: [PR] refactor: Remove spawn and channel inside arrow reader [iceberg-rust]

2024-12-23 Thread via GitHub
liurenjie1024 merged PR #806: URL: https://github.com/apache/iceberg-rust/pull/806

Re: [PR] ci: add sccache to speed up ci build [iceberg-rust]

2024-12-23 Thread via GitHub
liurenjie1024 commented on PR #824: URL: https://github.com/apache/iceberg-rust/pull/824#issuecomment-2560621001 > > I'm not a big fan of checking in Cargo.lock as it's an antipattern for library > > Just FYI that it's not considered an antipattern any more https://blog.rust-lang.org

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
Fokko commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560684459 @advancedxy Sorry for ignoring comment 1, I had to think about that one a bit: > upgrade the v1 table to v2 and then remove the void transform in the old spec and produce a new on

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
Fokko commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560794468 Just to double check, with dropping the offending column, I was assuming that you would mutate an existing spec. But I think after going to V2, we should rewrite it into a new spec (that

Re: [PR] Integrate Test Framework [iceberg-cpp]

2024-12-23 Thread via GitHub
zhjwpku commented on code in PR #13: URL: https://github.com/apache/iceberg-cpp/pull/13#discussion_r1896438330 ## cmake_modules/FindGTestAlt.cmake: ## @@ -0,0 +1,28 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See t

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
advancedxy commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560708474 > So, we cannot alter existing partition specs. Even after upgrading to V2, the metadata is still in V1. The relevant part of the spec: I think we should be able to evolve the

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
ConeyLiu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2560708280 > I like that idea. Could you elaborate on that? Yes, the following is what I implemented in the internal repo: ```python def plan_scan_tasks( files: Iterable[Fi

Re: [PR] Tests: Set PySpark driver host to `localhost` [iceberg-python]

2024-12-23 Thread via GitHub
Fokko commented on PR #1466: URL: https://github.com/apache/iceberg-python/pull/1466#issuecomment-2560668984 @smaheshwar-pltr thanks for raising this. I haven't seen this before either. Can you check if your local hostname is configured correctly?

Re: [PR] Core, Spark: Avoid deprecated methods in Guava Files [iceberg]

2024-12-23 Thread via GitHub
Fokko commented on code in PR #11865: URL: https://github.com/apache/iceberg/pull/11865#discussion_r189645 ## core/src/jmh/java/org/apache/iceberg/ManifestWriteBenchmark.java: ## @@ -96,7 +95,8 @@ public int getFormatVersion() { @Benchmark @Threads(1) public void wr

[PR] REST: Avoid deprecated execute without HttpClientResponseHandler [iceberg]

2024-12-23 Thread via GitHub
ebyhr opened a new pull request, #11870: URL: https://github.com/apache/iceberg/pull/11870 https://hc.apache.org/httpcomponents-client-5.4.x/current/apidocs/org/apache/hc/client5/http/classic/HttpClient.html#execute-org.apache.hc.core5.http.ClassicHttpRequest- > Deprecated. It is stro

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
advancedxy commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560757725 > > For any relatively new clients (those supporting the v2 format), it should produce specs with field IDs included. > > It should indeed, but you cannot guarantee that, and it is not enforc

Re: [PR] Bump mypy-boto3-glue from 1.35.80 to 1.35.87 [iceberg-python]

2024-12-23 Thread via GitHub
Fokko merged PR #1468: URL: https://github.com/apache/iceberg-python/pull/1468

Re: [PR] Bump jinja2 from 3.1.4 to 3.1.5 [iceberg-python]

2024-12-23 Thread via GitHub
Fokko merged PR #1467: URL: https://github.com/apache/iceberg-python/pull/1467

[PR] feat: add insert support for iceberg-datafusion [iceberg-rust]

2024-12-23 Thread via GitHub
ZENOTME opened a new pull request, #833: URL: https://github.com/apache/iceberg-rust/pull/833 (no comment)

Re: [PR] Add pre-commit config [iceberg-cpp]

2024-12-23 Thread via GitHub
zhjwpku commented on PR #16: URL: https://github.com/apache/iceberg-cpp/pull/16#issuecomment-2559428937 > > Cool, I've raised an issue: https://issues.apache.org/jira/browse/INFRA-26378 > > It got approved :) Hi @Fokko, I see the comment that the pre-commit/action@3.0.1 has bee

[I] Cannot create DBMS Table automatically when JdbcCatalog initialize [iceberg]

2024-12-23 Thread via GitHub
liangyouze opened a new issue, #11862: URL: https://github.com/apache/iceberg/issues/11862 ### Apache Iceberg version 1.7.1 (latest release) ### Query engine None ### Please describe the bug 🐞 When JdbcCatalog initializes, it will globally search whether `ic

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on code in PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#discussion_r1895780803 ## tests/table/test_partitioning.py: ## @@ -118,6 +119,27 @@ def test_deserialize_partition_spec() -> None: ) +def test_partition_spec_to_path()

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on code in PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#discussion_r1895779686 ## tests/table/test_partitioning.py: ## @@ -118,6 +119,27 @@ def test_deserialize_partition_spec() -> None: ) +def test_partition_spec_to_path()

Re: [PR] Bump actions/setup-python from 3 to 5 [iceberg-cpp]

2024-12-23 Thread via GitHub
Fokko merged PR #18: URL: https://github.com/apache/iceberg-cpp/pull/18

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#issuecomment-2559759825 Done, @kevinjqliu. Fails due to https://github.com/apache/iceberg-python/pull/1457#discussion_r1894689633 but will think over it. FYI, am away for a little bit now s

Re: [PR] Bump actions/checkout from 3 to 4 [iceberg-cpp]

2024-12-23 Thread via GitHub
Fokko merged PR #19: URL: https://github.com/apache/iceberg-cpp/pull/19

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
ConeyLiu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2559498484 @kevinjqliu, thanks for the summary and the great proposal. Another option would be to provide a `plan_util` to support plan tasks like the Java-side implementation.

[PR] Bump actions/checkout from 3 to 4 [iceberg-cpp]

2024-12-23 Thread via GitHub
dependabot[bot] opened a new pull request, #19: URL: https://github.com/apache/iceberg-cpp/pull/19 Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4. Release notes Sourced from [actions/checkout's releases](https://github.com/actions/checkout/releases).

[PR] Tests: Set PySpark driver host to `localhost` [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr opened a new pull request, #1466: URL: https://github.com/apache/iceberg-python/pull/1466 This let me run integration tests locally. Before, I was getting ``` py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContex

[PR] fix: allow nullable field of equality delete writer [iceberg-rust]

2024-12-23 Thread via GitHub
ZENOTME opened a new pull request, #834: URL: https://github.com/apache/iceberg-rust/pull/834 According to the doc fixed in https://github.com/apache/iceberg/pull/8981, the equality delete writer can have an optional field id. This PR fixes this.

Re: [PR] fix: allow nullable field of equality delete writer [iceberg-rust]

2024-12-23 Thread via GitHub
ZENOTME commented on PR #834: URL: https://github.com/apache/iceberg-rust/pull/834#issuecomment-2559958020 cc @liurenjie1024 @Fokko @Xuanwo @sdd

Re: [PR] Tests: Set PySpark driver host to `localhost` [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on PR #1466: URL: https://github.com/apache/iceberg-python/pull/1466#issuecomment-2559955607 Not sure if useful. @kevinjqliu, mind taking a peek?

Re: [PR] Add pre-commit config [iceberg-cpp]

2024-12-23 Thread via GitHub
Fokko commented on PR #16: URL: https://github.com/apache/iceberg-cpp/pull/16#issuecomment-2559523819 @zhjwpku no action needed, let me get this in

Re: [PR] Support Location Providers [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#issuecomment-2559905325 @Fokko, think this is ready for review now! I've implemented this for write codepaths - `add_files` seems like it should just add the files specified without transfor

Re: [PR] Open-API: Fix compilation errors in generated Java classes due to mismatched return types [iceberg]

2024-12-23 Thread via GitHub
ajantha-bhat commented on code in PR #11806: URL: https://github.com/apache/iceberg/pull/11806#discussion_r1895867017 ## open-api/rest-catalog-open-api.py: ## @@ -981,8 +966,33 @@ class ValueMap(BaseModel): ) +class ContentFile(BaseModel): +content: ContentEnum +

Re: [I] Support LocationProviders like the Java Iceberg Reference Implementaiton [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on issue #861: URL: https://github.com/apache/iceberg-python/issues/861#issuecomment-2559908252 Great! I've put up https://github.com/apache/iceberg-python/pull/1452 that should address this

Re: [PR] Spark: Adding simple custom partition sort order option to RewriteManifests Spark Action [iceberg]

2024-12-23 Thread via GitHub
zachdisc commented on PR #9731: URL: https://github.com/apache/iceberg/pull/9731#issuecomment-2559933098 Please let me know the best way to deal with merge conflicts. I thought to rebase and get everything back in sync with the main branch, but that looks like the wrong flow here. I can clo

Re: [PR] Integrate Test Framework [iceberg-cpp]

2024-12-23 Thread via GitHub
wgtmac commented on code in PR #13: URL: https://github.com/apache/iceberg-cpp/pull/13#discussion_r1895873224 ## cmake_modules/FindGTestAlt.cmake: ## @@ -0,0 +1,28 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See th

Re: [PR] Table Scan Delete File Handling: Positional and Equality Delete Support [iceberg-rust]

2024-12-23 Thread via GitHub
sdd commented on code in PR #652: URL: https://github.com/apache/iceberg-rust/pull/652#discussion_r1895664169 ## crates/iceberg/src/spec/delete_file.rs: ## @@ -0,0 +1,780 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

Re: [PR] Table Scan Delete File Handling: Positional and Equality Delete Support [iceberg-rust]

2024-12-23 Thread via GitHub
sdd commented on code in PR #652: URL: https://github.com/apache/iceberg-rust/pull/652#discussion_r1895665053 ## crates/iceberg/src/arrow/reader.rs: ## @@ -176,6 +188,350 @@ impl ArrowReader { return Ok(rx.boxed()); } +// retrieve all delete files concurrentl

Re: [PR] Table Scan Delete File Handling: Positional and Equality Delete Support [iceberg-rust]

2024-12-23 Thread via GitHub
sdd commented on code in PR #652: URL: https://github.com/apache/iceberg-rust/pull/652#discussion_r1895664861 ## crates/iceberg/src/spec/delete_file.rs: ## @@ -0,0 +1,780 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

Re: [PR] Spark: Adding simple custom partition sort order option to RewriteManifests Spark Action [iceberg]

2024-12-23 Thread via GitHub
zachdisc commented on PR #9731: URL: https://github.com/apache/iceberg/pull/9731#issuecomment-2559885046 I rebased locally and resolved merge conflicts and addressed Russel's ask to remove the function-based rewrite sorting.

[PR] Feat/update sort order [iceberg-python]

2024-12-23 Thread via GitHub
JasperHG90 opened a new pull request, #1465: URL: https://github.com/apache/iceberg-python/pull/1465 (no comment)

Re: [PR] Support Location Providers [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1895864213 ## pyiceberg/io/pyarrow.py: ## @@ -2622,13 +2631,15 @@ def _dataframe_to_data_files( property_name=TableProperties.WRITE_TARGET_FILE_SIZE_BYTES,

Re: [PR] Open-API: Fix compilation errors in generated Java classes due to mismatched return types [iceberg]

2024-12-23 Thread via GitHub
ajantha-bhat commented on code in PR #11806: URL: https://github.com/apache/iceberg/pull/11806#discussion_r1895830121 ## open-api/rest-catalog-open-api.yaml: ## @@ -4372,6 +4399,39 @@ components: allOf: - $ref: '#/components/schemas/Expression' +Con

Re: [PR] Build: Fix ignoring `.asf.yaml` in PR [iceberg]

2024-12-23 Thread via GitHub
amogh-jahagirdar merged PR #11860: URL: https://github.com/apache/iceberg/pull/11860

[PR] Bump actions/setup-python from 3 to 5 [iceberg-cpp]

2024-12-23 Thread via GitHub
dependabot[bot] opened a new pull request, #18: URL: https://github.com/apache/iceberg-cpp/pull/18 Bumps [actions/setup-python](https://github.com/actions/setup-python) from 3 to 5. Release notes Sourced from https://github.com/actions/setup-python/releases actions/setup-python's

Re: [PR] Add pre-commit config [iceberg-cpp]

2024-12-23 Thread via GitHub
Fokko merged PR #16: URL: https://github.com/apache/iceberg-cpp/pull/16

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
Fokko commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2559871997 Just to add some context: > Currently, PyIceberg's read path assumes to be run on a single node machine. This assumption is embedded in the way we plan and execute the read pa

Re: [PR] Tests: Set PySpark driver host to `localhost` [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on PR #1466: URL: https://github.com/apache/iceberg-python/pull/1466#issuecomment-2560159143 Interesting, that's the first time I've seen this issue. Do you have a remote dev environment? The typical setup is running the integration test docker containers on local lapt

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#issuecomment-2560171094 Thanks for the PR! I've dug into the test failure a bit. Here's what I found. There's a subtle difference between `PartitionKey.partition` and `DataFile.partition`. In mos

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
anuragmantri commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560179927 Thanks for your review @advancedxy. My concern is that this is a big behavior change. Partition columns cannot be dropped anymore. I will start a thread in dev channel to see what

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
anuragmantri commented on code in PR #11842: URL: https://github.com/apache/iceberg/pull/11842#discussion_r1896047271 ## core/src/main/java/org/apache/iceberg/SchemaUpdate.java: ## @@ -533,6 +534,25 @@ private static Schema applyChanges( } } +if (base != null)

Re: [PR] Add plan tasks for TableScan [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on PR #1427: URL: https://github.com/apache/iceberg-python/pull/1427#issuecomment-2560194912 > Another option would be to provide a plan_util to support plan tasks like the Java-side implementation. That's interesting, I like that too. "util" suggests that it's opti

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
jiakai-li commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896119481 ## tests/io/test_pyarrow.py: ## Review Comment: Yes, I thought about that as well, essentially I tried to set up another service like `minio-ap-southeast

Re: [PR] Table Scan Delete File Handling: Positional and Equality Delete Support [iceberg-rust]

2024-12-23 Thread via GitHub
sdd commented on PR #652: URL: https://github.com/apache/iceberg-rust/pull/652#issuecomment-2560086936 FAO @liurenjie1024, @Xuanwo, @Fokko: I've finished refactoring this and after a few rounds I'm happier with the design of the `DeleteFileIndex` and how it is interacted with in the s

Re: [PR] feat: search current working directory for config file [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on code in PR #1464: URL: https://github.com/apache/iceberg-python/pull/1464#discussion_r1896047664 ## tests/utils/test_config.py: ## @@ -93,3 +94,61 @@ def test_from_configuration_files_get_typed_value(tmp_path_factory: pytest.TempP assert Config().

Re: [PR] Core: Don't reset snapshotLog in `TableMetadata.removeRef` method [iceberg]

2024-12-23 Thread via GitHub
amogh-jahagirdar commented on code in PR #11779: URL: https://github.com/apache/iceberg/pull/11779#discussion_r1896074073 ## core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java: ## @@ -2418,6 +2421,34 @@ public void testPaginationForListTables(int numberOfItems) {

Re: [I] how to grant s3 temp permissions when using pyiceberg? [iceberg-python]

2024-12-23 Thread via GitHub
jayceslesar commented on issue #1463: URL: https://github.com/apache/iceberg-python/issues/1463#issuecomment-2560088682 My org uses SSO and we use IAM for cloud runtimes but for local runtimes something like the following: ```sh aws configure export-credentials --profile YOUR_PROFILE

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896072489 ## pyiceberg/io/pyarrow.py: ## @@ -377,6 +377,12 @@ def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> FileSyste if force_vi

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#issuecomment-2560234773 > According to what I found here seems fsspec doesn't have the same issue as pyarrow. So I guess we can leave it? Wow, that's interesting, I didn't know about that. I like t

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
Fokko commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560242110 > Allow removing the partition source field in void transform in this PR. I think it requires extra work and we may need to keep the input type for void transform, which might lead to a t

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896125712 ## pyiceberg/io/pyarrow.py: ## @@ -1508,7 +1512,7 @@ def _record_batches_from_scan_tasks_and_deletes( if self._limit is not None and total_row_co

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896124443 ## tests/io/test_pyarrow.py: ## Review Comment: I think testing `PyArrowFileIO.fs_by_scheme` is good enough. in the unit test, maybe mention https://git

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
jiakai-li commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896126623 ## pyiceberg/io/pyarrow.py: ## @@ -1508,7 +1512,7 @@ def _record_batches_from_scan_tasks_and_deletes( if self._limit is not None and total_row_cou

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
jiakai-li commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896130636 ## tests/io/test_pyarrow.py: ## @@ -360,10 +360,11 @@ def test_pyarrow_s3_session_properties() -> None: **UNIFIED_AWS_SESSION_PROPERTIES, } -

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
jiakai-li commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896132680 ## pyiceberg/io/pyarrow.py: ## @@ -377,6 +377,12 @@ def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> FileSyste if force_vir

[PR] Bump jinja2 from 3.1.4 to 3.1.5 [iceberg-python]

2024-12-23 Thread via GitHub
dependabot[bot] opened a new pull request, #1467: URL: https://github.com/apache/iceberg-python/pull/1467 Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5. Release notes Sourced from [jinja2's releases](https://github.com/pallets/jinja/releases). 3.1.5 T

[PR] Bump mypy-boto3-glue from 1.35.80 to 1.35.87 [iceberg-python]

2024-12-23 Thread via GitHub
dependabot[bot] opened a new pull request, #1468: URL: https://github.com/apache/iceberg-python/pull/1468 Bumps [mypy-boto3-glue](https://github.com/youtype/mypy_boto3_builder) from 1.35.80 to 1.35.87. Commits See full diff in https://github.com/youtype/mypy_boto3_builder/commi

Re: [I] Cannot create DBMS Table automatically when JdbcCatalog initialize [iceberg]

2024-12-23 Thread via GitHub
ebyhr commented on issue #11862: URL: https://github.com/apache/iceberg/issues/11862#issuecomment-2560392322 It seems there is an existing issue. #11423

Re: [I] Getting schema of the all tables instead of actual data [iceberg]

2024-12-23 Thread via GitHub
github-actions[bot] closed issue #10483: Getting schema of the all tables instead of actual data URL: https://github.com/apache/iceberg/issues/10483

Re: [I] Getting schema of the all tables instead of actual data [iceberg]

2024-12-23 Thread via GitHub
github-actions[bot] commented on issue #10483: URL: https://github.com/apache/iceberg/issues/10483#issuecomment-2560459174 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] Writing an arrow table with date64 unsupported [iceberg-python]

2024-12-23 Thread via GitHub
github-actions[bot] commented on issue #830: URL: https://github.com/apache/iceberg-python/issues/830#issuecomment-2560461252 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity oc

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
advancedxy commented on code in PR #11842: URL: https://github.com/apache/iceberg/pull/11842#discussion_r1896278567 ## core/src/main/java/org/apache/iceberg/SchemaUpdate.java: ## @@ -533,6 +534,25 @@ private static Schema applyChanges( } } +if (base != null) {

Re: [I] Cannot create DBMS Table automatically when JdbcCatalog initialize [iceberg]

2024-12-23 Thread via GitHub
liangyouze closed issue #11862: Cannot create DBMS Table automatically when JdbcCatalog initialize URL: https://github.com/apache/iceberg/issues/11862 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[PR] Core: check catalog and schema for JdbcCatalog initialization [iceberg]

2024-12-23 Thread via GitHub
liangyouze opened a new pull request, #11864: URL: https://github.com/apache/iceberg/pull/11864 The Iceberg metadata table may not be created automatically in some cases; see [#11423](https://github.com/apache/iceberg/issues/11423) and [#11862](https://github.com/apache/iceberg/issues/11862). In t

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2024-12-23 Thread via GitHub
advancedxy commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2560529630 > > Allow removing the partition source field in void transform in this PR. I think it requires extra work and we may need to keep the input type for void transform, which might lead

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
jiakai-li commented on PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#issuecomment-2560549414 This PR is ready for review now. Thanks very much and merry christmas! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [PR] URL-encode partition field names in file locations [iceberg-python]

2024-12-23 Thread via GitHub
smaheshwar-pltr commented on PR #1457: URL: https://github.com/apache/iceberg-python/pull/1457#issuecomment-2560428546 Thanks a lot for this explanation and suggestion @kevinjqliu! It sounds good. Had some time so I've made this change so tests pass - using `make_compatible_name` as a

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2024-12-23 Thread via GitHub
kevinjqliu commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1896150261 ## pyiceberg/io/pyarrow.py: ## @@ -377,6 +377,12 @@ def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> FileSyste if force_vi

[I] fix: resolve cyclical dev-dependency in `iceberg` [iceberg-rust]

2024-12-23 Thread via GitHub
sungwy opened a new issue, #835: URL: https://github.com/apache/iceberg-rust/issues/835 The github workflow for publishing Iceberg rust crates to cargo has failed: https://github.com/apache/iceberg-rust/actions/runs/12475862491/job/34819847073 ``` Run cargo publish --all-fea

Re: [PR] Core: Don't reset snapshotLog in `TableMetadata.removeRef` method [iceberg]

2024-12-23 Thread via GitHub
ebyhr commented on PR #11779: URL: https://github.com/apache/iceberg/pull/11779#issuecomment-2560579572 @amogh-jahagirdar Thanks for your review. Addressed comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the