Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2025-02-11 Thread via GitHub
anuragmantri commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2652867208 The PR is superseded by https://github.com/apache/iceberg/pull/11868 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

Re: [PR] [WIP] Ignore UnknownType in General Parquet Writer [iceberg]

2025-02-11 Thread via GitHub
HonahX commented on code in PR #12177: URL: https://github.com/apache/iceberg/pull/12177#discussion_r1952099384 ## parquet/src/main/java/org/apache/iceberg/parquet/TypeToMessageType.java: ## @@ -71,6 +89,10 @@ public GroupType struct(StructType struct, Type.Repetition repetitio

Re: [I] Update Arrow deps once they release a version containing `RowSelection::union` [iceberg-rust]

2025-02-11 Thread via GitHub
Xuanwo closed issue #605: Update Arrow deps once they release a version containing `RowSelection::union` URL: https://github.com/apache/iceberg-rust/issues/605

Re: [PR] chore: use RowSelection::union from arrow-rs [iceberg-rust]

2025-02-11 Thread via GitHub
Xuanwo merged PR #953: URL: https://github.com/apache/iceberg-rust/pull/953

Re: [PR] Materialized View Spec [iceberg]

2025-02-11 Thread via GitHub
JanKaul commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1952094747 ## format/view-spec.md: ## @@ -160,6 +179,56 @@ Each entry in `version-log` is a struct with the following fields: | _required_ | `timestamp-ms` | Timestamp when t

Re: [PR] [WIP] Ignore UnknownType in General Parquet Writer [iceberg]

2025-02-11 Thread via GitHub
HonahX commented on code in PR #12177: URL: https://github.com/apache/iceberg/pull/12177#discussion_r1952090213 ## parquet/src/main/java/org/apache/iceberg/parquet/TypeToMessageType.java: ## @@ -56,6 +56,10 @@ public class TypeToMessageType { LogicalTypeAnnotation.timesta

Re: [PR] Spec: Allow Equality Deletes with Row Lineage and Define Behavior [iceberg]

2025-02-11 Thread via GitHub
singhpk234 commented on code in PR #12230: URL: https://github.com/apache/iceberg/pull/12230#discussion_r1952049522 ## format/spec.md: ## @@ -392,7 +392,7 @@ In v3 and later, an Iceberg table can track row lineage fields for all newly cre These fields are assigned and update

Re: [PR] Spec: Allow Equality Deletes with Row Lineage and Define Behavior [iceberg]

2025-02-11 Thread via GitHub
pvary commented on code in PR #12230: URL: https://github.com/apache/iceberg/pull/12230#discussion_r1952054468 ## format/spec.md: ## @@ -392,7 +392,7 @@ In v3 and later, an Iceberg table can track row lineage fields for all newly cre These fields are assigned and updated by

Re: [PR] Spec: Allow Equality Deletes with Row Lineage and Define Behavior [iceberg]

2025-02-11 Thread via GitHub
pvary commented on code in PR #12230: URL: https://github.com/apache/iceberg/pull/12230#discussion_r1952013431 ## format/spec.md: ## @@ -1766,4 +1766,4 @@ The Geometry and Geography class hierarchy and its Well-known text (WKT) and Wel Points are always defined by the coordi

[I] Support commit retrie [iceberg-rust]

2025-02-11 Thread via GitHub
ZENOTME opened a new issue, #964: URL: https://github.com/apache/iceberg-rust/issues/964 I would like to separate this task into multiple steps: 1. [ ] Identify the RetryableCommitError type. We can introduce a new `ErrorKind::RetryableCommitError` to abstract kinds of catalog
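The first step above, a dedicated retryable error kind, can be illustrated with a minimal sketch. It is written in Python only for brevity and is not iceberg-rust code: the `RetryableCommitError` class stands in for the proposed `ErrorKind::RetryableCommitError`, and `commit_with_retries`, its parameters, and the exponential backoff policy are all assumptions, since the remaining steps of the issue are cut off in this listing.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableCommitError(Exception):
    """Hypothetical error: the commit lost a race and may safely be retried."""


def commit_with_retries(
    attempt_commit: Callable[[], T],
    max_retries: int = 4,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call attempt_commit until it succeeds or the retries are exhausted."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    for attempt in range(max_retries + 1):
        try:
            return attempt_commit()
        except RetryableCommitError:
            if attempt == max_retries:
                raise  # out of retries: surface the last failure
            # Assumed policy: exponential backoff (base, 2*base, 4*base, ...)
            sleep(base_delay_s * (2 ** attempt))
    raise AssertionError("unreachable")
```

Only the retryable error is caught, so any other failure propagates immediately; that separation is the point of giving retryable commit failures their own kind.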

Re: [I] [feature] Table Scan should take into account the table's sort order [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on issue #1637: URL: https://github.com/apache/iceberg-python/issues/1637#issuecomment-2652677058 hey @iyad-f sure thing. Iceberg has the concept of sort order https://iceberg.apache.org/spec/#sorting An Iceberg table can declare the data is sorted in certain

Re: [PR] Core: Fix numeric overflow of timestamp nano literal [iceberg]

2025-02-11 Thread via GitHub
ebyhr commented on code in PR #11775: URL: https://github.com/apache/iceberg/pull/11775#discussion_r1951940306 ## api/src/main/java/org/apache/iceberg/expressions/Literals.java: ## @@ -300,8 +300,7 @@ public Literal to(Type type) { case TIMESTAMP: return (Li

Re: [PR] feat: search current working directory for config file [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on PR #1464: URL: https://github.com/apache/iceberg-python/pull/1464#issuecomment-2652655967 rebased off main and fixed the tests. Thanks @IndexSeek for the contribution and @Fokko for the review :)

Re: [I] .pyiceberg.yaml config files should be loaded from current dir instead of home folder [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu closed issue #1333: .pyiceberg.yaml config files should be loaded from current dir instead of home folder URL: https://github.com/apache/iceberg-python/issues/1333

Re: [PR] feat: search current working directory for config file [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu merged PR #1464: URL: https://github.com/apache/iceberg-python/pull/1464

Re: [PR] feat: Add `StrictMetricsEvaluator` [iceberg-rust]

2025-02-11 Thread via GitHub
jonathanc-n commented on PR #963: URL: https://github.com/apache/iceberg-rust/pull/963#issuecomment-2652648861 Tests will be added tomorrow.

Re: [PR] Clean up old metadata [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on PR #1607: URL: https://github.com/apache/iceberg-python/pull/1607#issuecomment-2652647676 Thanks @kaushiksrini I applied the simple test changes via github. Thanks @Fokko for the review

[PR] feat: Add `StrictMetricsEvaluator` [iceberg-rust]

2025-02-11 Thread via GitHub
jonathanc-n opened a new pull request, #963: URL: https://github.com/apache/iceberg-rust/pull/963 Added `StrictMetricsEvaluator`.

Re: [I] Remove old metadata files [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu closed issue #1199: Remove old metadata files URL: https://github.com/apache/iceberg-python/issues/1199

Re: [PR] Clean up old metadata [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu merged PR #1607: URL: https://github.com/apache/iceberg-python/pull/1607

Re: [PR] Clean up old metadata [iceberg-python]

2025-02-11 Thread via GitHub
kaushiksrini commented on PR #1607: URL: https://github.com/apache/iceberg-python/pull/1607#issuecomment-2652631187 @Fokko thanks for the review! and @kevinjqliu thanks for making the changes!

Re: [PR] Clean up old metadata [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on code in PR #1607: URL: https://github.com/apache/iceberg-python/pull/1607#discussion_r1951956541 ## tests/catalog/test_sql.py: ## @@ -1613,3 +1614,50 @@ def test_merge_manifests_local_file_system(catalog: SqlCatalog, arrow_table_with tbl.append(

Re: [PR] partitioned write support [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on PR #353: URL: https://github.com/apache/iceberg-python/pull/353#issuecomment-2652609001 hey @sungwy @jqin61 just wanted to double check that this PR is no longer relevant. I believe all components of partitioned write support have already been merged

Re: [PR] Feat/add support kerberize hivemetastore [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on code in PR #1634: URL: https://github.com/apache/iceberg-python/pull/1634#discussion_r1951938200 ## pyproject.toml: ## @@ -80,6 +80,8 @@ sqlalchemy = { version = "^2.0.18", optional = true } getdaft = { version = ">=0.2.12", optional = true } cachetools

Re: [PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-02-11 Thread via GitHub
dramaticlly commented on PR #12228: URL: https://github.com/apache/iceberg/pull/12228#issuecomment-2652589821 [Java CI Failure](https://github.com/apache/iceberg/actions/runs/13275773190/job/37064995172?pr=12228) is timing out on concurrent fast append and seems unrelated to the change.

Re: [PR] Feat/add support kerberize hivemetastore [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on PR #1652: URL: https://github.com/apache/iceberg-python/pull/1652#issuecomment-2652569173 wrong repo

Re: [PR] Feat/add support kerberize hivemetastore [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu closed pull request #1652: Feat/add support kerberize hivemetastore URL: https://github.com/apache/iceberg-python/pull/1652

Re: [PR] feat(catalog/glue): Use awscfg from catalog creation when loading data from glue [iceberg-go]

2025-02-11 Thread via GitHub
curtisr7 commented on code in PR #286: URL: https://github.com/apache/iceberg-go/pull/286#discussion_r1951888034 ## utils/context.go: ## @@ -0,0 +1,20 @@ +package utils + +import ( + "context" + + "github.com/aws/aws-sdk-go-v2/aws" +) + +type awsctxkey struct{} + +fu

Re: [I] java.lang.ClassNotFoundException: org.apache.iceberg.spark.actions.ManifestFileBeanBeanInfo [iceberg]

2025-02-11 Thread via GitHub
melin commented on issue #12231: URL: https://github.com/apache/iceberg/issues/12231#issuecomment-2652542489 [yarn.txt](https://github.com/user-attachments/files/18761360/yarn.txt)

[I] java.lang.ClassNotFoundException: org.apache.iceberg.spark.actions.ManifestFileBeanBeanInfo [iceberg]

2025-02-11 Thread via GitHub
melin opened a new issue, #12231: URL: https://github.com/apache/iceberg/issues/12231 ### Apache Iceberg version None ### Query engine None ### Please describe the bug 🐞 ``` dfs://master:8020/user/superior/spark-jobserver/tempJars/laOAlRtRLJogYzjQPkBLkp0

Re: [I] Improve ThreadPools for graceful shutdown [iceberg]

2025-02-11 Thread via GitHub
ochanism commented on issue #12220: URL: https://github.com/apache/iceberg/issues/12220#issuecomment-2652510589 @pvary Oh, thanks for the information! That's very helpful to understand the current status. Here are my issue details. - error stack trace ``` server error: Er

Re: [PR] feat: Add existing parquet files [iceberg-rust]

2025-02-11 Thread via GitHub
jonathanc-n commented on PR #960: URL: https://github.com/apache/iceberg-rust/pull/960#issuecomment-2652435373 I have changed the data file builder and reimplemented the original. I couldn't change the parameter passed to `to_data_file_builder` as the ParquetWriter returns the unparsed meta

Re: [PR] Spec: Allow Equality Deletes with Row Lineage and Define Behavior [iceberg]

2025-02-11 Thread via GitHub
stevenzwu commented on code in PR #12230: URL: https://github.com/apache/iceberg/pull/12230#discussion_r1951840824 ## format/spec.md: ## @@ -392,7 +392,7 @@ In v3 and later, an Iceberg table can track row lineage fields for all newly cre These fields are assigned and updated

Re: [PR] Spec: Update partition stats for V3 [iceberg]

2025-02-11 Thread via GitHub
stevenzwu commented on code in PR #12098: URL: https://github.com/apache/iceberg/pull/12098#discussion_r1951829171 ## format/spec.md: ## @@ -927,20 +927,21 @@ These rows must be sorted (in ascending manner with NULL FIRST) by `partition` f The schema of the partition statist

Re: [PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-02-11 Thread via GitHub
dramaticlly closed pull request #12228: Core,Api: Add overwrite option when register external table to catalog URL: https://github.com/apache/iceberg/pull/12228

Re: [PR] Core: Adjust Jackson settings to handle large metadata json [iceberg]

2025-02-11 Thread via GitHub
bryanck commented on PR #12224: URL: https://github.com/apache/iceberg/pull/12224#issuecomment-2652364540 > [for my understanding] I thought we had a way to lazy load metadata in REST, the complete metadata parsing would only be required at the time of commit ? Are all the tables write heav

Re: [I] Consolidate catalog behavior [iceberg-python]

2025-02-11 Thread via GitHub
github-actions[bot] commented on issue #813: URL: https://github.com/apache/iceberg-python/issues/813#issuecomment-2652346865 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity oc

Re: [I] Configure timestamp downcast programmatically [iceberg-python]

2025-02-11 Thread via GitHub
github-actions[bot] closed issue #960: Configure timestamp downcast programmatically URL: https://github.com/apache/iceberg-python/issues/960

Re: [PR] Build: Bump mkdocstrings-python from 1.14.6 to 1.15.0 [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu merged PR #1649: URL: https://github.com/apache/iceberg-python/pull/1649

Re: [I] Configure timestamp downcast programmatically [iceberg-python]

2025-02-11 Thread via GitHub
github-actions[bot] commented on issue #960: URL: https://github.com/apache/iceberg-python/issues/960#issuecomment-2652346846 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Kafka Connect: Include design docs [iceberg]

2025-02-11 Thread via GitHub
github-actions[bot] commented on issue #10841: URL: https://github.com/apache/iceberg/issues/10841#issuecomment-2652343862 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Kafka Connect: Include design docs [iceberg]

2025-02-11 Thread via GitHub
github-actions[bot] closed issue #10841: Kafka Connect: Include design docs URL: https://github.com/apache/iceberg/issues/10841

Re: [I] Rate limiting feature for structured streaming [iceberg]

2025-02-11 Thread via GitHub
singhpk234 commented on issue #7885: URL: https://github.com/apache/iceberg/issues/7885#issuecomment-2652333168 yes, that's true @wypoon, presently it's a limitation of the initial implementation: I didn't want to block it on opening a file and reading it partially. As there we
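The limitation described in this comment, that a file is never opened and read partially, can be illustrated with a small sketch of file-granular micro-batch planning. This is a hedged illustration, not Iceberg's implementation: the function `plan_micro_batch`, its parameter names, and the exact admission policy (the first pending file is always admitted, later files only while both limits hold) are assumptions made for the example.

```python
from typing import List, Sequence


def plan_micro_batch(
    file_row_counts: Sequence[int], max_rows: int, max_files: int
) -> List[int]:
    """Return the indexes of the pending files admitted to the next micro-batch.

    Assumed policy: files are never split, the first pending file is always
    admitted, and later files are admitted only while both limits hold.
    """
    admitted: List[int] = []
    rows = 0
    for index, row_count in enumerate(file_row_counts):
        if len(admitted) >= max_files:
            break  # file-count limit reached
        if admitted and rows + row_count > max_rows:
            break  # adding this whole file would exceed the row limit
        admitted.append(index)
        rows += row_count
    return admitted
```

Under this assumed policy, a first file holding more rows than the row limit still goes into the batch whole, so the row limit behaves as a soft cap rather than a hard one.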

Re: [PR] Build: Bump mkdocstrings-python from 1.14.6 to 1.15.0 [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on PR #1649: URL: https://github.com/apache/iceberg-python/pull/1649#issuecomment-2652322297 @dependabot rebase

Re: [PR] Build: Bump griffe from 1.5.6 to 1.5.7 [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu merged PR #1647: URL: https://github.com/apache/iceberg-python/pull/1647

Re: [PR] Spec: Update partition stats for V3 [iceberg]

2025-02-11 Thread via GitHub
aokolnychyi commented on code in PR #12098: URL: https://github.com/apache/iceberg/pull/12098#discussion_r1951754710 ## format/spec.md: ## @@ -927,20 +927,21 @@ These rows must be sorted (in ascending manner with NULL FIRST) by `partition` f The schema of the partition stati

Re: [I] Add unit tests for ColumnarBatchUtil using mocking [iceberg]

2025-02-11 Thread via GitHub
anuragmantri commented on issue #12054: URL: https://github.com/apache/iceberg/issues/12054#issuecomment-2652300353 Hi @ManasiRN @Monika-Rajendran-97 - Thanks for your interest. I have started working on this, but I don't wish to block progress. Please feel free to submit your patches. If

Re: [PR] feat: Make some REST methods public [iceberg-rust]

2025-02-11 Thread via GitHub
peasee commented on PR #922: URL: https://github.com/apache/iceberg-rust/pull/922#issuecomment-2652299260 Thanks for your response! I ended up taking a different direction that does not use Iceberg, so I'll be closing this PR.

Re: [PR] feat: Make some REST methods public [iceberg-rust]

2025-02-11 Thread via GitHub
peasee closed pull request #922: feat: Make some REST methods public URL: https://github.com/apache/iceberg-rust/pull/922

Re: [I] Rate limiting feature for structured streaming [iceberg]

2025-02-11 Thread via GitHub
wypoon commented on issue #7885: URL: https://github.com/apache/iceberg/issues/7885#issuecomment-2652290719 @singhpk234 for my understanding, can you please confirm or refute the following -- Suppose streaming-max-rows-per-micro-batch = 1000 and streaming-max-files-per-micro-batch > 1. S

Re: [PR] Build: Bump mkdocs-autorefs from 1.3.0 to 1.3.1 [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu merged PR #1650: URL: https://github.com/apache/iceberg-python/pull/1650

Re: [I] Add unit tests for ColumnarBatchUtil using mocking [iceberg]

2025-02-11 Thread via GitHub
ManasiRN commented on issue #12054: URL: https://github.com/apache/iceberg/issues/12054#issuecomment-2652282252 Hi @anuragmantri, I’d like to contribute to this issue by adding unit tests for ColumnarBatchUtil using mocking. Let me know if you have any specific considerations or if you’ve a

Re: [PR] Spec: Typo - missing be [iceberg]

2025-02-11 Thread via GitHub
RussellSpitzer merged PR #12229: URL: https://github.com/apache/iceberg/pull/12229

Re: [PR] Spec: Typo - missing be [iceberg]

2025-02-11 Thread via GitHub
szehon-ho commented on PR #12229: URL: https://github.com/apache/iceberg/pull/12229#issuecomment-2652270632 Thanks @RussellSpitzer !

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951735621 ## pyiceberg/table/__init__.py: ## @@ -1086,6 +1094,78 @@ def name_mapping(self) -> Optional[NameMapping]: """Return the table's field-id NameMapping

Re: [PR] Build: Bump coverage from 7.6.11 to 7.6.12 [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu merged PR #1648: URL: https://github.com/apache/iceberg-python/pull/1648

Re: [PR] Build: Bump cython from 3.0.11 to 3.0.12 [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu merged PR #1646: URL: https://github.com/apache/iceberg-python/pull/1646

Re: [PR] Upgrade `cryptography` dependency to v44.0.1 [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on code in PR #1651: URL: https://github.com/apache/iceberg-python/pull/1651#discussion_r1951725286 ## pyproject.toml: ## @@ -49,7 +49,7 @@ include = [ ] [tool.poetry.dependencies] -python = "^3.9, !=3.9.7" +python = "^3.9.2, !=3.9.7" Review Comment:

[PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-02-11 Thread via GitHub
dramaticlly opened a new pull request, #12228: URL: https://github.com/apache/iceberg/pull/12228 This PR adds a new register-table with overwrite option on Catalog interface to allow overwrite table metadata of an existing Iceberg table. The overwrite is achieved via `TableOperations.commit

Re: [PR] Add support for `write.metadata.path` [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on code in PR #1642: URL: https://github.com/apache/iceberg-python/pull/1642#discussion_r1951707200 ## pyiceberg/table/__init__.py: ## @@ -1212,6 +1213,23 @@ def to_daft(self) -> daft.DataFrame: return daft.read_iceberg(self) +@staticmethod

Re: [PR] Add support for `write.metadata.path` [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on code in PR #1642: URL: https://github.com/apache/iceberg-python/pull/1642#discussion_r1951696715 ## pyiceberg/table/update/snapshot.py: ## @@ -84,14 +84,14 @@ from pyiceberg.table import Transaction -def _new_manifest_path(location: str, num: int

[PR] Build: Bump mkdocs-autorefs from 1.3.0 to 1.3.1 [iceberg-python]

2025-02-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1650: URL: https://github.com/apache/iceberg-python/pull/1650 Bumps [mkdocs-autorefs](https://github.com/mkdocstrings/autorefs) from 1.3.0 to 1.3.1. Release notes Sourced from https://github.com/mkdocstrings/autorefs/releases: mkdocs-aut

[PR] Build: Bump coverage from 7.6.11 to 7.6.12 [iceberg-python]

2025-02-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1648: URL: https://github.com/apache/iceberg-python/pull/1648 Bumps [coverage](https://github.com/nedbat/coveragepy) from 7.6.11 to 7.6.12. Changelog Sourced from https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst: coverage's c

Re: [PR] Docs: Add documentation for Rate limiting in Spark Structured Streaming [iceberg]

2025-02-11 Thread via GitHub
singhpk234 commented on code in PR #12217: URL: https://github.com/apache/iceberg/pull/12217#discussion_r1951710433 ## docs/docs/spark-configuration.md: ## @@ -155,16 +155,18 @@ spark.read .table("catalog.db.table") ``` -| Spark option| Default | Descri

[PR] Build: Bump mkdocstrings-python from 1.14.6 to 1.15.0 [iceberg-python]

2025-02-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1649: URL: https://github.com/apache/iceberg-python/pull/1649 Bumps [mkdocstrings-python](https://github.com/mkdocstrings/python) from 1.14.6 to 1.15.0. Release notes Sourced from https://github.com/mkdocstrings/python/releases: mkdocstr

[PR] Build: Bump griffe from 1.5.6 to 1.5.7 [iceberg-python]

2025-02-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1647: URL: https://github.com/apache/iceberg-python/pull/1647 Bumps [griffe](https://github.com/mkdocstrings/griffe) from 1.5.6 to 1.5.7. Release notes Sourced from [griffe's releases](https://github.com/mkdocstrings/griffe/releases).

[PR] Build: Bump cython from 3.0.11 to 3.0.12 [iceberg-python]

2025-02-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1646: URL: https://github.com/apache/iceberg-python/pull/1646 Bumps [cython](https://github.com/cython/cython) from 3.0.11 to 3.0.12. Changelog Sourced from [cython's changelog](https://github.com/cython/cython/blob/master/CHANGES.rst).

Re: [PR] Core: Adjust Jackson settings to handle large metadata json [iceberg]

2025-02-11 Thread via GitHub
stevenzwu commented on PR #12224: URL: https://github.com/apache/iceberg/pull/12224#issuecomment-265384 @bryanck thanks for the experimentation with canonicalization. do you have any micro/jmh benchmark for the parser performance? if yes, maybe it would be useful to add it to the Iceber

Re: [PR] Add support for `write.metadata.path` [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on PR #1642: URL: https://github.com/apache/iceberg-python/pull/1642#issuecomment-2652211414 cc @Fokko @smaheshwar-pltr

Re: [PR] API, Core: Support default values in UpdateSchema [iceberg]

2025-02-11 Thread via GitHub
rdblue commented on code in PR #12211: URL: https://github.com/apache/iceberg/pull/12211#discussion_r1951676633 ## api/src/main/java/org/apache/iceberg/UpdateSchema.java: ## @@ -125,16 +185,23 @@ default UpdateSchema addColumn(String parent, String name, Type type) { * @par

Re: [PR] API, Core: Support default values in UpdateSchema [iceberg]

2025-02-11 Thread via GitHub
rdblue commented on code in PR #12211: URL: https://github.com/apache/iceberg/pull/12211#discussion_r1951673332 ## api/src/main/java/org/apache/iceberg/UpdateSchema.java: ## @@ -280,6 +410,30 @@ default UpdateSchema updateColumn(String name, Type.PrimitiveType newType, Strin

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
marcoaanogueira commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951669632 ## pyiceberg/table/__init__.py: ## @@ -1086,6 +1094,78 @@ def name_mapping(self) -> Optional[NameMapping]: """Return the table's field-id NameMa

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2652066676 @Fokko - I saw the only thing that failed in the CI was the docs build. Any idea why and if it's something I can fix?

Re: [PR] API, Core: Support default values in UpdateSchema [iceberg]

2025-02-11 Thread via GitHub
rdblue commented on code in PR #12211: URL: https://github.com/apache/iceberg/pull/12211#discussion_r1951668597 ## api/src/main/java/org/apache/iceberg/UpdateSchema.java: ## @@ -67,24 +70,52 @@ default UpdateSchema addColumn(String name, Type type) { } /** - * Add a ne

Re: [PR] API, Core: Support default values in UpdateSchema [iceberg]

2025-02-11 Thread via GitHub
rdblue commented on code in PR #12211: URL: https://github.com/apache/iceberg/pull/12211#discussion_r1951665086 ## core/src/main/java/org/apache/iceberg/SchemaUpdate.java: ## @@ -322,17 +303,45 @@ public UpdateSchema updateColumnDoc(String name, String doc) { // merge wi

[PR] API: Deprecate NestedType.of in favor of builder [iceberg]

2025-02-11 Thread via GitHub
rdblue opened a new pull request, #12227: URL: https://github.com/apache/iceberg/pull/12227 This is a follow up to #12211. While adding support for default values in `UpdateSchema`, many of the changes were to use the `NestedField` builder's copy constructor, `from(NestedField)`, so that fi

Re: [PR] Core: Adjust Jackson settings to handle large metadata json [iceberg]

2025-02-11 Thread via GitHub
bryanck commented on PR #12224: URL: https://github.com/apache/iceberg/pull/12224#issuecomment-2652018035 I switched back to the original change, to just disable intern and the hash collision check. Disabling canonicalization can impact performance significantly.

Re: [PR] Update documentation / add missing Iceberg table read properties [iceberg]

2025-02-11 Thread via GitHub
wypoon commented on code in PR #12163: URL: https://github.com/apache/iceberg/pull/12163#discussion_r1951566070 ## docs/docs/configuration.md: ## @@ -32,10 +32,14 @@ Iceberg tables support table properties to configure table behavior, like the de | read.split.metadata-target-s

Re: [PR] Docs: Add documentation for Rate limiting in Spark Structured Streaming [iceberg]

2025-02-11 Thread via GitHub
wypoon commented on code in PR #12217: URL: https://github.com/apache/iceberg/pull/12217#discussion_r1951565008 ## docs/docs/spark-configuration.md: ## @@ -155,16 +155,18 @@ spark.read .table("catalog.db.table") ``` -| Spark option| Default | Descriptio

Re: [PR] Docs: Add documentation for Rate limiting in Spark Structured Streaming [iceberg]

2025-02-11 Thread via GitHub
wypoon commented on PR #12217: URL: https://github.com/apache/iceberg/pull/12217#issuecomment-2651985358 I agree that it would be good to add this to the documentation! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

Re: [PR] Docs: Add documentation for Rate limiting in Spark Structured Streaming [iceberg]

2025-02-11 Thread via GitHub
wypoon commented on code in PR #12217: URL: https://github.com/apache/iceberg/pull/12217#discussion_r1951540998 ## docs/docs/spark-configuration.md: ## @@ -155,16 +155,18 @@ spark.read .table("catalog.db.table") ``` -| Spark option| Default | Descriptio

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951530296 ## package-lock.json: ## @@ -0,0 +1,1420 @@ +{ Review Comment: removed that file; it was created by npm when I ran "make lint" -- This is an automat

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2651967387 @Fokko - thank you for all the help on this. I'm hoping this is finally it. Make lint ran locally for me with no errors -- This is an automated message from the Apache Git

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951530493 ## pyproject.toml: ## @@ -1183,6 +1184,766 @@ ignore_missing_imports = true module = "tenacity.*" ignore_missing_imports = true +[[tool.mypy.overrides]]

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951530027 ## package.json: ## @@ -0,0 +1,5 @@ +{ Review Comment: removed that file -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] feat(catalog/glue): Use awscfg from catalog creation when loading data from glue [iceberg-go]

2025-02-11 Thread via GitHub
zeroshade commented on code in PR #286: URL: https://github.com/apache/iceberg-go/pull/286#discussion_r1951376224 ## utils/context.go: ## @@ -0,0 +1,20 @@ +package utils + +import ( + "context" + + "github.com/aws/aws-sdk-go-v2/aws" +) + +type awsctxkey struct{} + +f

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951523089 ## pyiceberg/table/upsert_util.py: ## @@ -0,0 +1,131 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951521796 ## pyiceberg/table/upsert_util.py: ## @@ -0,0 +1,131 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951518221 ## pyiceberg/table/upsert_util.py: ## @@ -0,0 +1,131 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951480372 ## package.json: ## @@ -0,0 +1,5 @@ +{ Review Comment: We should leave out this one as well, then we can also revert the changes in the `.gitignore` -- Th

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951479991 ## package-lock.json: ## @@ -0,0 +1,1420 @@ +{ Review Comment: I don't think we should commit this one -- This is an automated message from the Apache Git

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951478851 ## pyproject.toml: ## @@ -1183,6 +1184,766 @@ ignore_missing_imports = true module = "tenacity.*" ignore_missing_imports = true +[[tool.mypy.overrides]] +modul

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951477462 ## pyiceberg/table/upsert_util.py: ## @@ -0,0 +1,131 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. S

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951471685 ## pyiceberg/table/upsert_util.py: ## @@ -0,0 +1,131 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. S

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951471241 ## pyiceberg/table/upsert_util.py: ## @@ -0,0 +1,131 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. S
