Re: [PR] Clean up old metadata [iceberg-python]

2025-02-11 Thread via GitHub
kaushiksrini commented on PR #1607: URL: https://github.com/apache/iceberg-python/pull/1607#issuecomment-2652631187 @Fokko thanks for the review! and @kevinjqliu thanks for making the changes! -- This is an automated message from the Apache Git Service. To respond to the message, please lo

Re: [I] Remove old metadata files [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu closed issue #1199: Remove old metadata files URL: https://github.com/apache/iceberg-python/issues/1199

Re: [PR] Clean up old metadata [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu merged PR #1607: URL: https://github.com/apache/iceberg-python/pull/1607

[PR] feat: Add `StrictMetricsEvaluator` [iceberg-rust]

2025-02-11 Thread via GitHub
jonathanc-n opened a new pull request, #963: URL: https://github.com/apache/iceberg-rust/pull/963 Added `StrictMetricsEvaluator`.

Re: [PR] Clean up old metadata [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on PR #1607: URL: https://github.com/apache/iceberg-python/pull/1607#issuecomment-2652647676 Thanks @kaushiksrini I applied the simple test changes via github. Thanks @Fokko for the review

Re: [PR] feat: Add `StrictMetricsEvaluator` [iceberg-rust]

2025-02-11 Thread via GitHub
jonathanc-n commented on PR #963: URL: https://github.com/apache/iceberg-rust/pull/963#issuecomment-2652648861 Tests will be added tomorrow.

Re: [PR] feat: search current working directory for config file [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu merged PR #1464: URL: https://github.com/apache/iceberg-python/pull/1464

Re: [I] .pyiceberg.yaml config files should be loaded from current dir instead of home folder [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu closed issue #1333: .pyiceberg.yaml config files should be loaded from current dir instead of home folder URL: https://github.com/apache/iceberg-python/issues/1333

Re: [PR] feat: search current working directory for config file [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on PR #1464: URL: https://github.com/apache/iceberg-python/pull/1464#issuecomment-2652655967 rebased off main and fixed the tests. Thanks @IndexSeek for the contribution and @Fokko for the review :)

Re: [PR] Core: Fix numeric overflow of timestamp nano literal [iceberg]

2025-02-11 Thread via GitHub
ebyhr commented on code in PR #11775: URL: https://github.com/apache/iceberg/pull/11775#discussion_r1951940306 ## api/src/main/java/org/apache/iceberg/expressions/Literals.java: ## @@ -300,8 +300,7 @@ public Literal to(Type type) { case TIMESTAMP: return (Li

Re: [I] [feature] Table Scan should take into account the table's sort order [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on issue #1637: URL: https://github.com/apache/iceberg-python/issues/1637#issuecomment-2652677058 hey @iyad-f sure thing. Iceberg has the concept of sort order https://iceberg.apache.org/spec/#sorting An Iceberg table can declare the data is sorted in certain

[I] Support commit retries [iceberg-rust]

2025-02-11 Thread via GitHub
ZENOTME opened a new issue, #964: URL: https://github.com/apache/iceberg-rust/issues/964 I would like to separate this task into multiple steps: 1. [ ] Identify the RetryableCommitError type. We can introduce a new `ErrorKind::RetryableCommitError` to abstract kinds of catalog

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951479991 ## package-lock.json: ## @@ -0,0 +1,1420 @@ +{ Review Comment: I don't think we should commit this one

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951478851 ## pyproject.toml: ## @@ -1183,6 +1184,766 @@ ignore_missing_imports = true module = "tenacity.*" ignore_missing_imports = true +[[tool.mypy.overrides]] +modul

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
Fokko commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951480372 ## package.json: ## @@ -0,0 +1,5 @@ +{ Review Comment: We should leave out this one as well, then we can also revert the changes in the `.gitignore`

Re: [PR] feat(catalog/rest): Add support for view related operations [iceberg-go]

2025-02-11 Thread via GitHub
zeroshade commented on code in PR #290: URL: https://github.com/apache/iceberg-go/pull/290#discussion_r1951420833 ## catalog/rest/rest.go: ## @@ -989,3 +990,66 @@ func (r *Catalog) CheckTableExists(ctx context.Context, identifier table.Identif } return true, nil

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
Fokko commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2651806164 @mattmartin14 Thanks, let me dig into the remaining issues and see what's going on 👍

Re: [I] Create table format version constants [iceberg-python]

2025-02-11 Thread via GitHub
Fokko commented on issue #851: URL: https://github.com/apache/iceberg-python/issues/851#issuecomment-2651851976 Sorry for being late here, my mailbox is overflowing a bit. The earlier example of: ```python DATA_FILE_TYPE: Dict[int, StructType] ``` is wrong, and s

Re: [PR] Core: Adjust Jackson settings to handle large metadata json [iceberg]

2025-02-11 Thread via GitHub
bryanck commented on PR #12224: URL: https://github.com/apache/iceberg/pull/12224#issuecomment-2651863061 We just need to disable `FAIL_ON_SYMBOL_HASH_OVERFLOW`, that will also disable canonicalization, according to the docs. I made that change.

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951521796 ## pyiceberg/table/upsert_util.py: ## @@ -0,0 +1,131 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951523089 ## pyiceberg/table/upsert_util.py: ## @@ -0,0 +1,131 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

Re: [PR] feat(catalog/glue): Use awscfg from catalog creation when loading data from glue [iceberg-go]

2025-02-11 Thread via GitHub
zeroshade commented on code in PR #286: URL: https://github.com/apache/iceberg-go/pull/286#discussion_r1951376224 ## utils/context.go: ## @@ -0,0 +1,20 @@ +package utils + +import ( + "context" + + "github.com/aws/aws-sdk-go-v2/aws" +) + +type awsctxkey struct{} + +f

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951530493 ## pyproject.toml: ## @@ -1183,6 +1184,766 @@ ignore_missing_imports = true module = "tenacity.*" ignore_missing_imports = true +[[tool.mypy.overrides]]

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951530027 ## package.json: ## @@ -0,0 +1,5 @@ +{ Review Comment: removed that file

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951530296 ## package-lock.json: ## @@ -0,0 +1,1420 @@ +{ Review Comment: removed that file; it was created by npm when i ran "make lint"

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2651967387 @Fokko - thank you for all the help on this. I'm hoping this is finally it. Make lint ran locally for me with no errors

Re: [PR] build(deps): bump the gomod_updates group with 4 updates [iceberg-go]

2025-02-11 Thread via GitHub
zeroshade merged PR #300: URL: https://github.com/apache/iceberg-go/pull/300

Re: [PR] Core: Adjust Jackson settings to handle large metadata json [iceberg]

2025-02-11 Thread via GitHub
stevenzwu commented on PR #12224: URL: https://github.com/apache/iceberg/pull/12224#issuecomment-2651821329 > Canonicalization can help when field names are reused within a single metadata file, so that seemed helpful still. canonicalization lifecycle is scoped to a single metadata fi

Re: [PR] Core: Adjust Jackson settings to handle large metadata json [iceberg]

2025-02-11 Thread via GitHub
stevenzwu commented on PR #12224: URL: https://github.com/apache/iceberg/pull/12224#issuecomment-2651667891 > Given partition summary field names and other snapshot properties are often not reused across different metadata, the interning causes more harm than good. @bryanck I didn't

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951518221 ## pyiceberg/table/upsert_util.py: ## @@ -0,0 +1,131 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

Re: [PR] Docs: Add documentation for Rate limiting in Spark Structured Streaming [iceberg]

2025-02-11 Thread via GitHub
wypoon commented on code in PR #12217: URL: https://github.com/apache/iceberg/pull/12217#discussion_r1951540998 ## docs/docs/spark-configuration.md: ## @@ -155,16 +155,18 @@ spark.read .table("catalog.db.table") ``` -| Spark option| Default | Descriptio

Re: [PR] Docs: Add documentation for Rate limiting in Spark Structured Streaming [iceberg]

2025-02-11 Thread via GitHub
wypoon commented on PR #12217: URL: https://github.com/apache/iceberg/pull/12217#issuecomment-2651985358 I agree that it would be good to add this to the documentation!

Re: [PR] Docs: Add documentation for Rate limiting in Spark Structured Streaming [iceberg]

2025-02-11 Thread via GitHub
wypoon commented on code in PR #12217: URL: https://github.com/apache/iceberg/pull/12217#discussion_r1951565008 ## docs/docs/spark-configuration.md: ## @@ -155,16 +155,18 @@ spark.read .table("catalog.db.table") ``` -| Spark option| Default | Descriptio

Re: [PR] Update documentation / add missing Iceberg table read properties [iceberg]

2025-02-11 Thread via GitHub
wypoon commented on code in PR #12163: URL: https://github.com/apache/iceberg/pull/12163#discussion_r1951566070 ## docs/docs/configuration.md: ## @@ -32,10 +32,14 @@ Iceberg tables support table properties to configure table behavior, like the de | read.split.metadata-target-s

Re: [PR] API, Core: Support default values in UpdateSchema [iceberg]

2025-02-11 Thread via GitHub
rdblue commented on code in PR #12211: URL: https://github.com/apache/iceberg/pull/12211#discussion_r1951665086 ## core/src/main/java/org/apache/iceberg/SchemaUpdate.java: ## @@ -322,17 +303,45 @@ public UpdateSchema updateColumnDoc(String name, String doc) { // merge wi

Re: [PR] API, Core: Support default values in UpdateSchema [iceberg]

2025-02-11 Thread via GitHub
rdblue commented on code in PR #12211: URL: https://github.com/apache/iceberg/pull/12211#discussion_r1951668597 ## api/src/main/java/org/apache/iceberg/UpdateSchema.java: ## @@ -67,24 +70,52 @@ default UpdateSchema addColumn(String name, Type type) { } /** - * Add a ne

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
mattmartin14 commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2652066676 @Fokko - I saw the only thing that failed in the CI was the docs build. Any idea why and if it's something I can fix?

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
marcoaanogueira commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951669632 ## pyiceberg/table/__init__.py: ## @@ -1086,6 +1094,78 @@ def name_mapping(self) -> Optional[NameMapping]: """Return the table's field-id NameMa

[PR] Build: Bump coverage from 7.6.11 to 7.6.12 [iceberg-python]

2025-02-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1648: URL: https://github.com/apache/iceberg-python/pull/1648 Bumps [coverage](https://github.com/nedbat/coveragepy) from 7.6.11 to 7.6.12. Changelog Sourced from https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst coverage's c

[PR] Build: Bump mkdocs-autorefs from 1.3.0 to 1.3.1 [iceberg-python]

2025-02-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1650: URL: https://github.com/apache/iceberg-python/pull/1650 Bumps [mkdocs-autorefs](https://github.com/mkdocstrings/autorefs) from 1.3.0 to 1.3.1. Release notes Sourced from https://github.com/mkdocstrings/autorefs/releases mkdocs-aut

Re: [PR] Add support for `write.metadata.path` [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on code in PR #1642: URL: https://github.com/apache/iceberg-python/pull/1642#discussion_r1951696715 ## pyiceberg/table/update/snapshot.py: ## @@ -84,14 +84,14 @@ from pyiceberg.table import Transaction -def _new_manifest_path(location: str, num: int

[PR] Build: Bump mkdocstrings-python from 1.14.6 to 1.15.0 [iceberg-python]

2025-02-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1649: URL: https://github.com/apache/iceberg-python/pull/1649 Bumps [mkdocstrings-python](https://github.com/mkdocstrings/python) from 1.14.6 to 1.15.0. Release notes Sourced from https://github.com/mkdocstrings/python/releases mkdocstr

Re: [PR] Docs: Add documentation for Rate limiting in Spark Structured Streaming [iceberg]

2025-02-11 Thread via GitHub
singhpk234 commented on code in PR #12217: URL: https://github.com/apache/iceberg/pull/12217#discussion_r1951710433 ## docs/docs/spark-configuration.md: ## @@ -155,16 +155,18 @@ spark.read .table("catalog.db.table") ``` -| Spark option| Default | Descri

Re: [PR] Build: Bump coverage from 7.6.11 to 7.6.12 [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu merged PR #1648: URL: https://github.com/apache/iceberg-python/pull/1648

Re: [PR] Spec: Update partition stats for V3 [iceberg]

2025-02-11 Thread via GitHub
aokolnychyi commented on code in PR #12098: URL: https://github.com/apache/iceberg/pull/12098#discussion_r1951754710 ## format/spec.md: ## @@ -927,20 +927,21 @@ These rows must be sorted (in ascending manner with NULL FIRST) by `partition` f The schema of the partition stati

Re: [PR] feat: Add existing parquet files [iceberg-rust]

2025-02-11 Thread via GitHub
jonathanc-n commented on PR #960: URL: https://github.com/apache/iceberg-rust/pull/960#issuecomment-2652435373 I have changed the data file builder and reimplemented the original. I couldn't change the parameter passed to `to_data_file_builder` as the ParquetWriter returns the unparsed meta

[PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-02-11 Thread via GitHub
dramaticlly opened a new pull request, #12228: URL: https://github.com/apache/iceberg/pull/12228 This PR adds a new register-table with overwrite option on Catalog interface to allow overwrite table metadata of an existing Iceberg table. The overwrite is achieved via `TableOperations.commit

Re: [PR] Upgrade `cryptography` dependency to v44.0.1 [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on code in PR #1651: URL: https://github.com/apache/iceberg-python/pull/1651#discussion_r1951725286 ## pyproject.toml: ## @@ -49,7 +49,7 @@ include = [ ] [tool.poetry.dependencies] -python = "^3.9, !=3.9.7" +python = "^3.9.2, !=3.9.7" Review Comment:

Re: [PR] Add support for `write.metadata.path` [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on code in PR #1642: URL: https://github.com/apache/iceberg-python/pull/1642#discussion_r1951707200 ## pyiceberg/table/__init__.py: ## @@ -1212,6 +1213,23 @@ def to_daft(self) -> daft.DataFrame: return daft.read_iceberg(self) +@staticmethod

Re: [PR] Build: Bump cython from 3.0.11 to 3.0.12 [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu merged PR #1646: URL: https://github.com/apache/iceberg-python/pull/1646

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on code in PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1951735621 ## pyiceberg/table/__init__.py: ## @@ -1086,6 +1094,78 @@ def name_mapping(self) -> Optional[NameMapping]: """Return the table's field-id NameMapping

Re: [PR] Spec: Typo - missing be [iceberg]

2025-02-11 Thread via GitHub
szehon-ho commented on PR #12229: URL: https://github.com/apache/iceberg/pull/12229#issuecomment-2652270632 Thanks @RussellSpitzer !

Re: [PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-02-11 Thread via GitHub
dramaticlly closed pull request #12228: Core,Api: Add overwrite option when register external table to catalog URL: https://github.com/apache/iceberg/pull/12228

Re: [I] java.lang.ClassNotFoundException: org.apache.iceberg.spark.actions.ManifestFileBeanBeanInfo [iceberg]

2025-02-11 Thread via GitHub
melin commented on issue #12231: URL: https://github.com/apache/iceberg/issues/12231#issuecomment-2652542489 [yarn.txt](https://github.com/user-attachments/files/18761360/yarn.txt)

Re: [PR] feat(catalog/glue): Use awscfg from catalog creation when loading data from glue [iceberg-go]

2025-02-11 Thread via GitHub
curtisr7 commented on code in PR #286: URL: https://github.com/apache/iceberg-go/pull/286#discussion_r1951888034 ## utils/context.go: ## @@ -0,0 +1,20 @@ +package utils + +import ( + "context" + + "github.com/aws/aws-sdk-go-v2/aws" +) + +type awsctxkey struct{} + +fu

Re: [PR] Feat/add support kerberize hivemetastore [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu closed pull request #1652: Feat/add support kerberize hivemetastore URL: https://github.com/apache/iceberg-python/pull/1652

Re: [PR] Feat/add support kerberize hivemetastore [iceberg-python]

2025-02-11 Thread via GitHub
kevinjqliu commented on PR #1652: URL: https://github.com/apache/iceberg-python/pull/1652#issuecomment-2652569173 wrong repo

Re: [PR] [WIP] Ignore UnknownType in General Parquet Writer [iceberg]

2025-02-11 Thread via GitHub
HonahX commented on code in PR #12177: URL: https://github.com/apache/iceberg/pull/12177#discussion_r1952099384 ## parquet/src/main/java/org/apache/iceberg/parquet/TypeToMessageType.java: ## @@ -71,6 +89,10 @@ public GroupType struct(StructType struct, Type.Repetition repetitio

Re: [PR] Core: Prevent dropping column which is referenced by active partition specs [iceberg]

2025-02-11 Thread via GitHub
anuragmantri commented on PR #11842: URL: https://github.com/apache/iceberg/pull/11842#issuecomment-2652867208 The PR is superseded by https://github.com/apache/iceberg/pull/11868
