Re: [PR] ci: Add workflow for publish [iceberg-rust]

2024-02-22 Thread via GitHub
martin-g commented on code in PR #218: URL: https://github.com/apache/iceberg-rust/pull/218#discussion_r1498872362 ## .github/workflows/publish.yml: ## @@ -0,0 +1,55 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See

Re: [I] Improve read times and reduce size of metadata.json by storing schemas in external files [iceberg]

2024-02-22 Thread via GitHub
liurenjie1024 commented on issue #9734: URL: https://github.com/apache/iceberg/issues/9734#issuecomment-1959037762 Thanks for raising this. As another way to improve loading speed, is it possible to compress metadata file? -- This is an automated message from the Apache Git Service. To re

Re: [PR] Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style [iceberg]

2024-02-22 Thread via GitHub
tomtongue commented on code in PR #9760: URL: https://github.com/apache/iceberg/pull/9760#discussion_r1498938153 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCallStatementParser.java: ## @@ -188,25 +184,26 @@ private void checkArg(

Re: [PR] Add workflow for cargo audit [iceberg-rust]

2024-02-22 Thread via GitHub
Fokko merged PR #217: URL: https://github.com/apache/iceberg-rust/pull/217 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [I] Run cargo audit in ci. [iceberg-rust]

2024-02-22 Thread via GitHub
Fokko closed issue #209: Run cargo audit in ci. URL: https://github.com/apache/iceberg-rust/issues/209 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: i

Re: [PR] ci: Add workflow for publish [iceberg-rust]

2024-02-22 Thread via GitHub
Fokko merged PR #218: URL: https://github.com/apache/iceberg-rust/pull/218 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [I] Add CI for dry-run and publish crates [iceberg-rust]

2024-02-22 Thread via GitHub
Fokko closed issue #214: Add CI for dry-run and publish crates URL: https://github.com/apache/iceberg-rust/issues/214 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubsc

Re: [PR] docs: Upload crates [iceberg-rust]

2024-02-22 Thread via GitHub
Fokko closed pull request #211: docs: Upload crates URL: https://github.com/apache/iceberg-rust/pull/211 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] docs: Upload crates [iceberg-rust]

2024-02-22 Thread via GitHub
Fokko commented on PR #211: URL: https://github.com/apache/iceberg-rust/pull/211#issuecomment-1959074184 We can close this in favor of the GH Action :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

Re: [PR] docs: Add basic README for all crates [iceberg-rust]

2024-02-22 Thread via GitHub
Fokko merged PR #215: URL: https://github.com/apache/iceberg-rust/pull/215 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [I] Crates don't have README [iceberg-rust]

2024-02-22 Thread via GitHub
Fokko closed issue #210: Crates don't have README URL: https://github.com/apache/iceberg-rust/issues/210 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] Follow naming convention from Iceberg's Java and Python implementations [iceberg-rust]

2024-02-22 Thread via GitHub
Fokko merged PR #204: URL: https://github.com/apache/iceberg-rust/pull/204 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] chore(deps): Update derive_builder requirement from 0.13.0 to 0.20.0 [iceberg-rust]

2024-02-22 Thread via GitHub
Fokko commented on PR #203: URL: https://github.com/apache/iceberg-rust/pull/203#issuecomment-1959081966 @Xuanwo I'm not following this reasoning. Should we just get this in? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [I] Iceberg the condition function of org.apache.iceberg.expressions, not able to use time-related values as value [iceberg]

2024-02-22 Thread via GitHub
Freedomfirebody commented on issue #9431: URL: https://github.com/apache/iceberg/issues/9431#issuecomment-1959092029 I accept but disagree with the answer to question 1, It should support OffsetDateTime, which is its return from response. That is not true about question 2, After tryin

Re: [PR] Core: Add EnvironmentContext to commit summary [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9273: URL: https://github.com/apache/iceberg/pull/9273#discussion_r1498980055 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java: ## @@ -848,6 +850,19 @@ public void testRewriteWith

Re: [PR] Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9760: URL: https://github.com/apache/iceberg/pull/9760#discussion_r1498984200 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCallStatementParser.java: ## @@ -188,25 +184,26 @@ private void checkArg( if

Re: [PR] Add pagination to open api spec for listing of namespaces, tables, views [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9660: URL: https://github.com/apache/iceberg/pull/9660#discussion_r1498988263 ## open-api/rest-catalog-open-api.yaml: ## @@ -1581,6 +1607,17 @@ components: type: string example: [ "accounting", "tax" ] +PageToken: + desc

Re: [PR] Add pagination to open api spec for listing of namespaces, tables, views [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9660: URL: https://github.com/apache/iceberg/pull/9660#discussion_r1498991946 ## open-api/rest-catalog-open-api.yaml: ## @@ -1581,6 +1607,17 @@ components: type: string example: [ "accounting", "tax" ] +PageToken: + desc

Re: [PR] PartitionKey [iceberg-python]

2024-02-22 Thread via GitHub
Fokko commented on PR #453: URL: https://github.com/apache/iceberg-python/pull/453#issuecomment-1959113816 @jqin61 I wanted to do a second round, but I think you forgot to push? :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

Re: [PR] Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style [iceberg]

2024-02-22 Thread via GitHub
tomtongue commented on code in PR #9760: URL: https://github.com/apache/iceberg/pull/9760#discussion_r1498994264 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCallStatementParser.java: ## @@ -188,25 +184,26 @@ private void checkArg(

Re: [PR] Add pagination to open api spec for listing of namespaces, tables, views [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9660: URL: https://github.com/apache/iceberg/pull/9660#discussion_r1498994751 ## open-api/rest-catalog-open-api.yaml: ## @@ -1581,6 +1607,17 @@ components: type: string example: [ "accounting", "tax" ] +PageToken: + desc

Re: [PR] [WIP] Bin Pack Writes [iceberg-python]

2024-02-22 Thread via GitHub
Fokko commented on code in PR #444: URL: https://github.com/apache/iceberg-python/pull/444#discussion_r1498996906 ## pyiceberg/io/pyarrow.py: ## @@ -1715,53 +1715,65 @@ def fill_parquet_file_metadata( def write_file(table: Table, tasks: Iterator[WriteTask]) -> Iterator[Data

Re: [PR] [WIP] Bin Pack Writes [iceberg-python]

2024-02-22 Thread via GitHub
Fokko commented on code in PR #444: URL: https://github.com/apache/iceberg-python/pull/444#discussion_r1499000219 ## pyiceberg/io/pyarrow.py: ## @@ -1715,53 +1715,65 @@ def fill_parquet_file_metadata( def write_file(table: Table, tasks: Iterator[WriteTask]) -> Iterator[Data

Re: [PR] [WIP] Bin Pack Writes [iceberg-python]

2024-02-22 Thread via GitHub
Fokko commented on code in PR #444: URL: https://github.com/apache/iceberg-python/pull/444#discussion_r1499027177 ## pyiceberg/io/pyarrow.py: ## @@ -1715,53 +1715,65 @@ def fill_parquet_file_metadata( def write_file(table: Table, tasks: Iterator[WriteTask]) -> Iterator[Data

Re: [PR] detect breaking changes [iceberg-python]

2024-02-22 Thread via GitHub
Fokko commented on PR #394: URL: https://github.com/apache/iceberg-python/pull/394#issuecomment-1959164253 Looks like we broke something already 😸 Can we make a list to allow breaking changes? Similar to https://github.com/apache/parquet-mr/blob/d8396086b3e3fefc6829f8640917c3bbde0fa9c4/pom.

Re: [I] Add support for custom header configs in RESTCatalog [iceberg-python]

2024-02-22 Thread via GitHub
Fokko commented on issue #455: URL: https://github.com/apache/iceberg-python/issues/455#issuecomment-1959197528 Hey @geruh that's a great catch. I wasn't aware of that piece being missing. It would be great if you want to raise PR to add this 👍 -- This is an automated message from the Ap

Re: [PR] Flink: Flag Driven Dynamic Loading of Partition Spec From Table During Commits [iceberg]

2024-02-22 Thread via GitHub
adamyasharma2797 commented on PR #9774: URL: https://github.com/apache/iceberg/pull/9774#issuecomment-1959279749 @rdblue @ajantha-bhat Please check this small PR and throw light on the understanding gap if any. -- This is an automated message from the Apache Git Service. To respond to the

[PR] Docs: Fix listings on Release page / Update Multi-engine support [iceberg]

2024-02-22 Thread via GitHub
nastra opened a new pull request, #9775: URL: https://github.com/apache/iceberg/pull/9775 The current list item layout on https://iceberg.apache.org/releases/ is wrongly indented, which is being fixed by this PR. Additionally, this adds Spark 3.5 to the supported engine page. /cc @bi

Re: [I] Tracking issues of iceberg-rust v0.2 [iceberg-rust]

2024-02-22 Thread via GitHub
liurenjie1024 commented on issue #18: URL: https://github.com/apache/iceberg-rust/issues/18#issuecomment-1959421151 Close this issue as 0.2 has been released. Feel free to reopen when necessary. -- This is an automated message from the Apache Git Service. To respond to the message, please l

Re: [I] Tracking issues of iceberg-rust v0.2 [iceberg-rust]

2024-02-22 Thread via GitHub
liurenjie1024 closed issue #18: Tracking issues of iceberg-rust v0.2 URL: https://github.com/apache/iceberg-rust/issues/18 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[PR] Docs: Sync contributing page / refer to website for contributing [iceberg]

2024-02-22 Thread via GitHub
nastra opened a new pull request, #9776: URL: https://github.com/apache/iceberg/pull/9776 We currently have two markdown files that aren't fully in-sync about how to contribute to Iceberg. The one is only in the source tree, but not referenced. The other is the one on the website. I've

Re: [PR] Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9760: URL: https://github.com/apache/iceberg/pull/9760#discussion_r1499256325 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCallStatementParser.java: ## @@ -188,25 +184,26 @@ private void checkArg( if

Re: [PR] Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style [iceberg]

2024-02-22 Thread via GitHub
tomtongue commented on code in PR #9760: URL: https://github.com/apache/iceberg/pull/9760#discussion_r1499265736 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCallStatementParser.java: ## @@ -188,25 +184,26 @@ private void checkArg(

Re: [PR] Infra: Add 1.5.0 to issue template [iceberg]

2024-02-22 Thread via GitHub
ajantha-bhat commented on code in PR #9778: URL: https://github.com/apache/iceberg/pull/9778#discussion_r1499322386 ## .github/ISSUE_TEMPLATE/iceberg_bug_report.yml: ## @@ -38,14 +39,6 @@ body: - "1.2.0" - "1.1.0" - "1.0.0" -- "0.14.1" Review

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9724: URL: https://github.com/apache/iceberg/pull/9724#discussion_r1499375427 ## api/src/main/java/org/apache/iceberg/RemoveDanglingDeletesMode.java: ## @@ -0,0 +1,64 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9724: URL: https://github.com/apache/iceberg/pull/9724#discussion_r1499376818 ## api/src/main/java/org/apache/iceberg/RemoveDanglingDeletesMode.java: ## @@ -0,0 +1,64 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9724: URL: https://github.com/apache/iceberg/pull/9724#discussion_r1499378001 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java: ## @@ -172,11 +192,18 @@ public RewriteDataFiles.Result execute() {

Re: [PR] Core, Spark: Remove dangling deletes as part of RewriteDataFilesAction [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9724: URL: https://github.com/apache/iceberg/pull/9724#discussion_r1499380088 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java: ## @@ -334,6 +344,208 @@ public void testBinPackWithDeletes() {

Re: [PR] Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9760: URL: https://github.com/apache/iceberg/pull/9760#discussion_r1499386214 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMigrateTableProcedure.java: ## @@ -18,51 +18,45 @@ */ package org.apache.iceberg.

Re: [PR] Migrate Procedure sub-classes in spark-extensions to JUnit5 and AssertJ style [iceberg]

2024-02-22 Thread via GitHub
nastra commented on code in PR #9760: URL: https://github.com/apache/iceberg/pull/9760#discussion_r1499387190 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRemoveOrphanFilesProcedure.java: ## @@ -106,7 +96,7 @@ public void testRemoveOrphanF

Re: [PR] Parquet, Arrow: Refactor vectorized reader [iceberg]

2024-02-22 Thread via GitHub
wgtmac commented on PR #9772: URL: https://github.com/apache/iceberg/pull/9772#issuecomment-1959648527 I plan to resolve https://github.com/apache/iceberg/issues/7162 by adding vectorized readers for all v2 encodings. This is the 1st patch. Would you mind taking a look? @rdblue @nastra @Fok

Re: [PR] detect breaking changes [iceberg-python]

2024-02-22 Thread via GitHub
syun64 commented on PR #394: URL: https://github.com/apache/iceberg-python/pull/394#issuecomment-1959670410 > Looks like we broke something already 😸 Can we make a list to allow breaking changes? Similar to https://github.com/apache/parquet-mr/blob/d8396086b3e3fefc6829f8640917c3bbde0fa9c4/p

[PR] Docs: Sync specs to site via symlinks [iceberg]

2024-02-22 Thread via GitHub
manuzhang opened a new pull request, #9779: URL: https://github.com/apache/iceberg/pull/9779 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-

Re: [PR] Docs: Fix listings on Release page / Update Multi-engine support [iceberg]

2024-02-22 Thread via GitHub
Fokko merged PR #9775: URL: https://github.com/apache/iceberg/pull/9775 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

Re: [PR] Docs: Fix listings on Release page / Update Multi-engine support [iceberg]

2024-02-22 Thread via GitHub
Fokko commented on PR #9775: URL: https://github.com/apache/iceberg/pull/9775#issuecomment-1959716808 Thanks for fixing this @nastra -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] PartitionKey [iceberg-python]

2024-02-22 Thread via GitHub
jqin61 commented on PR #453: URL: https://github.com/apache/iceberg-python/pull/453#issuecomment-1959848004 > @jqin61 I wanted to do a second round, but I think you forgot to push? :) Hi Fokko sorry for the delayed push of the fixes. It took a little time to think through how to use t

Re: [PR] Infra: Don't run Delta Conversion CI on changes to site folder [iceberg]

2024-02-22 Thread via GitHub
nastra merged PR #9780: URL: https://github.com/apache/iceberg/pull/9780 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apac

Re: [PR] Spark: Adding simple custom partition sort order option to RewriteManifests Spark Action [iceberg]

2024-02-22 Thread via GitHub
jackye1995 commented on code in PR #9731: URL: https://github.com/apache/iceberg/pull/9731#discussion_r1499588047 ## api/src/main/java/org/apache/iceberg/actions/RewriteManifests.java: ## @@ -44,6 +45,16 @@ public interface RewriteManifests */ RewriteManifests rewriteIf(P

Re: [I] "CREATE TABLE (REPLACE TABLE) ... AS SELECT" Support [iceberg-python]

2024-02-22 Thread via GitHub
Fokko commented on issue #438: URL: https://github.com/apache/iceberg-python/issues/438#issuecomment-1959898488 Duplicate of https://github.com/apache/iceberg-python/issues/281 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

Re: [I] "CREATE TABLE (REPLACE TABLE) ... AS SELECT" Support [iceberg-python]

2024-02-22 Thread via GitHub
syun64 closed issue #438: "CREATE TABLE (REPLACE TABLE) ... AS SELECT" Support URL: https://github.com/apache/iceberg-python/issues/438 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

Re: [I] "CREATE TABLE (REPLACE TABLE) ... AS SELECT" Support [iceberg-python]

2024-02-22 Thread via GitHub
syun64 commented on issue #438: URL: https://github.com/apache/iceberg-python/issues/438#issuecomment-1959943274 > Duplicate of #281 This was my bad attempt at decoupling the introduction of REPLACE TABLE support from the discussion of how we should support `... AS SELECT` semantic -

Re: [I] REPLACE TABLE Support [iceberg-python]

2024-02-22 Thread via GitHub
syun64 commented on issue #281: URL: https://github.com/apache/iceberg-python/issues/281#issuecomment-1959949307 There's a PR in progress that will introduce 'REPLACE TABLE' support, but I don't think we've come to a consensus yet on how we would want to support 'AS SELECT' semantics in PyI

Re: [I] Issue with 'writeTo' [iceberg]

2024-02-22 Thread via GitHub
jessiedanwang closed issue #9766: Issue with 'writeTo' URL: https://github.com/apache/iceberg/issues/9766 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail

Re: [I] Issue with 'writeTo' [iceberg]

2024-02-22 Thread via GitHub
jessiedanwang commented on issue #9766: URL: https://github.com/apache/iceberg/issues/9766#issuecomment-1960025819 It turns out that there is a bug in our code, after that's fixed, we have no problem with 'writeTo' or 'merge into'. Thanks -- This is an automated message from the Apache Gi

[PR] Core: RowDelta should conflict with RewriteFiles operation [iceberg]

2024-02-22 Thread via GitHub
boroknagyz opened a new pull request, #9781: URL: https://github.com/apache/iceberg/pull/9781 RewriteFiles might be used to compact a table, e.g. it can replace lots of small files with a few big files. Concurrent RowDelta (DELETE/ UPDATE) operations must fail, because they can put the tabl

Re: [PR] PartitionKey [iceberg-python]

2024-02-22 Thread via GitHub
syun64 commented on code in PR #453: URL: https://github.com/apache/iceberg-python/pull/453#discussion_r1499729828 ## pyiceberg/partitioning.py: ## @@ -215,3 +236,54 @@ def assign_fresh_partition_spec_ids(spec: PartitionSpec, old_schema: Schema, fre ) )

[PR] Add Pagination To List Apis [iceberg]

2024-02-22 Thread via GitHub
rahil-c opened a new pull request, #9782: URL: https://github.com/apache/iceberg/pull/9782 Implemented pagination in list apis based on the spec: https://github.com/apache/iceberg/pull/9660 @jackye1995 @nastra @danielcweeks @geruh ## Testing * Ran gradle build * Manual test

Re: [PR] Dynamically support Spark native engine in Iceberg [iceberg]

2024-02-22 Thread via GitHub
aokolnychyi commented on PR #9721: URL: https://github.com/apache/iceberg/pull/9721#issuecomment-1960107350 @huaxingao, we can open up some utilities on Iceberg side, if needed. Unfortunately, the logic will be fairly coupled any way. I kind of hope we can offset some of the duplicating by

Re: [PR] Core: Avro writers use BlockingBinaryEncoder to enable array/map size calculations. [iceberg]

2024-02-22 Thread via GitHub
aokolnychyi commented on PR #8625: URL: https://github.com/apache/iceberg/pull/8625#issuecomment-1960120064 @Fokko, aren't we using `DataFileWriter` from Avro for Iceberg metadata? Yeah, I fully support the idea, it is just my preliminary analysis showed it would have no effect on currently

Re: [PR] Spark 3.5: Add max allowed failed commits to RewriteDataFiles when partial progress is enabled [iceberg]

2024-02-22 Thread via GitHub
aokolnychyi commented on code in PR #9611: URL: https://github.com/apache/iceberg/pull/9611#discussion_r1499825432 ## api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java: ## @@ -52,6 +52,13 @@ public interface RewriteDataFiles int PARTIAL_PROGRESS_MAX_COMMITS

[I] Make the OAuth2 request scope configurable [iceberg-python]

2024-02-22 Thread via GitHub
flyrain opened a new issue, #462: URL: https://github.com/apache/iceberg-python/issues/462 ### Feature Request / Improvement [OAuth2 Scope](https://oauth.net/2/scope/) is a mechanism to limit an application's access to a user's account. It can also be used to ask for more information

Re: [I] Support Table Migration features from Spark [iceberg-python]

2024-02-22 Thread via GitHub
syun64 commented on issue #354: URL: https://github.com/apache/iceberg-python/issues/354#issuecomment-1960143413 Hey @kevinjqliu thank you for raising this. I'm interested in writing up the PyIceberg implementation for [add_files](https://iceberg.apache.org/docs/latest/spark-procedures/#add

Re: [I] Make the OAuth2 request scope configurable [iceberg-python]

2024-02-22 Thread via GitHub
syun64 commented on issue #462: URL: https://github.com/apache/iceberg-python/issues/462#issuecomment-1960161562 Hi @flyrain thanks for raising this. I just took a look into the [Java code](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/rest/RESTSessionCata

Re: [PR] Add Pagination To List Apis [iceberg]

2024-02-22 Thread via GitHub
geruh commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1499864124 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -490,12 +522,29 @@ public void createNamespace( @Override public List listNamespaces(

Re: [I] Make the OAuth2 request scope configurable [iceberg-python]

2024-02-22 Thread via GitHub
flyrain commented on issue #462: URL: https://github.com/apache/iceberg-python/issues/462#issuecomment-1960183527 Thanks for confirming, @syun64 . Glad it was taken care of in Java. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

Re: [PR] Add pagination to open api spec for listing of namespaces, tables, views [iceberg]

2024-02-22 Thread via GitHub
geruh commented on code in PR #9660: URL: https://github.com/apache/iceberg/pull/9660#discussion_r1499882109 ## open-api/rest-catalog-open-api.yaml: ## @@ -1482,6 +1490,24 @@ components: explode: false example: "vended-credentials,remote-signing" +page-token:

Re: [PR] Add Pagination To List Apis [iceberg]

2024-02-22 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1499894821 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -490,12 +522,29 @@ public void createNamespace( @Override public List listNamespace

[PR] Deprecate DynamoDB Catalog to Reduce Catalog Scope [iceberg]

2024-02-22 Thread via GitHub
geruh opened a new pull request, #9783: URL: https://github.com/apache/iceberg/pull/9783 As discussed in the [community sync](https://youtu.be/uAQVGd5zV4I?si=cj0xpfprgvJIGYIm&t=1323), we want to reduce the scope of supported catalogs in Iceberg. There are currently many options, which crea

Re: [PR] PartitionKey [iceberg-python]

2024-02-22 Thread via GitHub
jqin61 commented on code in PR #453: URL: https://github.com/apache/iceberg-python/pull/453#discussion_r1499922547 ## pyiceberg/partitioning.py: ## @@ -215,3 +236,54 @@ def assign_fresh_partition_spec_ids(spec: PartitionSpec, old_schema: Schema, fre ) )

Re: [PR] Spark 3.5: Add max allowed failed commits to RewriteDataFiles when partial progress is enabled [iceberg]

2024-02-22 Thread via GitHub
RussellSpitzer commented on code in PR #9611: URL: https://github.com/apache/iceberg/pull/9611#discussion_r1499923578 ## api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java: ## @@ -52,6 +52,13 @@ public interface RewriteDataFiles int PARTIAL_PROGRESS_MAX_COMM

Re: [PR] Spark 3.5: Add max allowed failed commits to RewriteDataFiles when partial progress is enabled [iceberg]

2024-02-22 Thread via GitHub
RussellSpitzer commented on code in PR #9611: URL: https://github.com/apache/iceberg/pull/9611#discussion_r1499925187 ## api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java: ## @@ -52,6 +52,13 @@ public interface RewriteDataFiles int PARTIAL_PROGRESS_MAX_COMM

Re: [PR] Add Pagination To List Apis [iceberg]

2024-02-22 Thread via GitHub
geruh commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1499953848 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -224,6 +229,12 @@ public void initialize(String name, Map unresolved) { clien

Re: [PR] Add Pagination To List Apis [iceberg]

2024-02-22 Thread via GitHub
jackye1995 commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1499965680 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -224,6 +229,12 @@ public void initialize(String name, Map unresolved) {

Re: [PR] Add Pagination To List Apis [iceberg]

2024-02-22 Thread via GitHub
jackye1995 commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1499966845 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -224,6 +229,12 @@ public void initialize(String name, Map unresolved) {

Re: [PR] Add Pagination To List Apis [iceberg]

2024-02-22 Thread via GitHub
jackye1995 commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1499968208 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -1045,6 +1094,27 @@ public void commitTransaction(SessionContext context, List commits)

Re: [PR] Deprecate DynamoDB Catalog to Reduce Catalog Scope [iceberg]

2024-02-22 Thread via GitHub
geruh commented on code in PR #9783: URL: https://github.com/apache/iceberg/pull/9783#discussion_r1499969849 ## aws/src/main/java/org/apache/iceberg/aws/dynamodb/DynamoDbCatalog.java: ## @@ -84,7 +84,12 @@ import software.amazon.awssdk.services.dynamodb.model.TransactWriteItem

Re: [PR] Core: Avro writers use BlockingBinaryEncoder to enable array/map size calculations. [iceberg]

2024-02-22 Thread via GitHub
Fokko commented on PR #8625: URL: https://github.com/apache/iceberg/pull/8625#issuecomment-1960379373 @aokolnychyi Hmm, I did some quick checks and that seems to be correct. I'm pretty sure that it was using the code because I was seeing exceptions and differences in the benchmarks. Let me

Re: [PR] Spark: Adding simple custom partition sort order option to RewriteManifests Spark Action [iceberg]

2024-02-22 Thread via GitHub
jackye1995 commented on PR #9731: URL: https://github.com/apache/iceberg/pull/9731#issuecomment-1960392226 Looks like the CI test failed, @zachdisc could you take a look and fix? Overall, we have the 2 options based on this thread: https://github.com/apache/iceberg/pull/9731#discussio

Re: [PR] Deprecate DynamoDB Catalog to Reduce Catalog Scope [iceberg]

2024-02-22 Thread via GitHub
amogh-jahagirdar commented on code in PR #9783: URL: https://github.com/apache/iceberg/pull/9783#discussion_r1500025586 ## aws/src/main/java/org/apache/iceberg/aws/dynamodb/DynamoDbCatalog.java: ## @@ -84,7 +84,12 @@ import software.amazon.awssdk.services.dynamodb.model.Transa

[I] Support OAuth2 Client credential flow [iceberg-python]

2024-02-22 Thread via GitHub
flyrain opened a new issue, #463: URL: https://github.com/apache/iceberg-python/issues/463 ### Feature Request / Improvement It seems PyIceberg only supports OAuth2 [token exchange flow](https://datatracker.ietf.org/doc/html/rfc8693#name-token-exchange-request-and-). Here is the toke

Re: [I] Support OAuth2 Client credential flow [iceberg-python]

2024-02-22 Thread via GitHub
danielcweeks commented on issue #463: URL: https://github.com/apache/iceberg-python/issues/463#issuecomment-1960467569 The client credential flow is already implemented: [see here](https://github.com/apache/iceberg-python/blob/82d88920aaa7cba8593e93b5df9f3a87b13c88da/pyiceberg/catalog/rest.p

Re: [I] Support OAuth2 Client credential flow [iceberg-python]

2024-02-22 Thread via GitHub
flyrain commented on issue #463: URL: https://github.com/apache/iceberg-python/issues/463#issuecomment-1960485657 The client credential flow doesn't return a field `issued_token_type`. PyIceberg failed at validation: ``` ValidationError: 1 validation error for TokenResponse issued_

Re: [I] Support OAuth2 Client credential flow [iceberg-python]

2024-02-22 Thread via GitHub
danielcweeks commented on issue #463: URL: https://github.com/apache/iceberg-python/issues/463#issuecomment-1960499113 According to the [RFC](https://datatracker.ietf.org/doc/html/rfc6749#section-4.4.3) it should be `token_type`, but we currently only support `bearer`. The exchange does wo

[I] Support ID Tokens in Rest Catalog [iceberg-python]

2024-02-22 Thread via GitHub
flyrain opened a new issue, #464: URL: https://github.com/apache/iceberg-python/issues/464 ### Feature Request / Improvement [ID Tokens](https://auth0.com/docs/secure/tokens/id-tokens) are commonly used in scenarios where a server combines authentication and authorization in a single

Re: [I] Support OAuth2 Client credential flow [iceberg-python]

2024-02-22 Thread via GitHub
flyrain commented on issue #463: URL: https://github.com/apache/iceberg-python/issues/463#issuecomment-1960507583 `token_type` is required by both flows in the RFC, so it is fine for it to be mandatory for both. The problem is `issued_token_type`. The token exchange flow [RFC](https://datatracker.ie
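
For readers following issue #463, the distinction discussed above can be sketched roughly as follows. This is a minimal illustration and not PyIceberg's actual model: `TokenResponseSketch` and `parse_token_response` are hypothetical names, and the sketch assumes only what the thread states, namely that `token_type` is mandatory in both flows while a client credentials response may omit `issued_token_type`.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class TokenResponseSketch:
    """Hypothetical stand-in for a token response model (not PyIceberg's class)."""

    access_token: str
    token_type: str
    # Returned by the token exchange flow; a client credentials response
    # may omit it, so it is optional in this sketch.
    issued_token_type: Optional[str] = None
    expires_in: Optional[int] = None


def parse_token_response(payload: Mapping[str, Any]) -> TokenResponseSketch:
    """Validate only the fields that both flows require."""
    missing = [key for key in ("access_token", "token_type") if key not in payload]
    if missing:
        raise ValueError(f"token response is missing required field(s): {missing}")
    return TokenResponseSketch(
        access_token=payload["access_token"],
        token_type=payload["token_type"],
        issued_token_type=payload.get("issued_token_type"),
        expires_in=payload.get("expires_in"),
    )
```

With this shape, a client credentials response that carries only `access_token` and `token_type` parses without a validation error, while a token exchange response keeps its `issued_token_type` value.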

Re: [PR] Add Pagination To List Apis [iceberg]

2024-02-22 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1500065202 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -1045,6 +1094,27 @@ public void commitTransaction(SessionContext context, List commits)

Re: [I] Support ID Tokens in Rest Catalog [iceberg-python]

2024-02-22 Thread via GitHub
syun64 commented on issue #464: URL: https://github.com/apache/iceberg-python/issues/464#issuecomment-1960526232 My understanding is that when a backend client is talking to an API server, we should only support Client Credentials Flow or the direct use of access tokens. We are validating t

Re: [I] Add an iceberg-mr-runtime module [iceberg]

2024-02-22 Thread via GitHub
github-actions[bot] commented on issue #890: URL: https://github.com/apache/iceberg/issues/890#issuecomment-1960546858 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Git

Re: [I] Add an iceberg-mr-runtime module [iceberg]

2024-02-22 Thread via GitHub
github-actions[bot] closed issue #890: Add an iceberg-mr-runtime module URL: https://github.com/apache/iceberg/issues/890 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

Re: [I] upsertion/deletion for tables with primary key [iceberg]

2024-02-22 Thread via GitHub
github-actions[bot] closed issue #893: upsertion/deletion for tables with primary key URL: https://github.com/apache/iceberg/issues/893 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

Re: [I] upsertion/deletion for tables with primary key [iceberg]

2024-02-22 Thread via GitHub
github-actions[bot] commented on issue #893: URL: https://github.com/apache/iceberg/issues/893#issuecomment-1960546881 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Git

Re: [I] Support notification on table writes [iceberg]

2024-02-22 Thread via GitHub
github-actions[bot] closed issue #926: Support notification on table writes URL: https://github.com/apache/iceberg/issues/926 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [I] Support notification on table writes [iceberg]

2024-02-22 Thread via GitHub
github-actions[bot] commented on issue #926: URL: https://github.com/apache/iceberg/issues/926#issuecomment-1960546904 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Git
