Re: [PR] Core: Exclude unexpected namespaces JdbcCatalog.listNamespaces [iceberg]

2024-07-11 Thread via GitHub
Fokko commented on code in PR #10498: URL: https://github.com/apache/iceberg/pull/10498#discussion_r1675399327 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java: ## @@ -466,6 +466,16 @@ public List listNamespaces(Namespace namespace) throws NoSuchNamespac

Re: [PR] Core: Exclude unexpected namespaces JdbcCatalog.listNamespaces [iceberg]

2024-07-11 Thread via GitHub
Fokko commented on code in PR #10498: URL: https://github.com/apache/iceberg/pull/10498#discussion_r1675399047 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java: ## @@ -466,6 +466,16 @@ public List listNamespaces(Namespace namespace) throws NoSuchNamespac

[PR] Glue endpoint config variable, continue #530 [iceberg-python]

2024-07-11 Thread via GitHub
HonahX opened a new pull request, #920: URL: https://github.com/apache/iceberg-python/pull/920 This PR builds on @sebpretzer's great work in https://github.com/apache/iceberg-python/pull/530. It rebases on main and adds docs for the Glue catalog properties. @sebpretzer I hope you don’t mind

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-11 Thread via GitHub
jbonofre commented on PR #10680: URL: https://github.com/apache/iceberg/pull/10680#issuecomment-2224915621 It looks good to me indeed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] mr:Fix issues 10639 [iceberg]

2024-07-11 Thread via GitHub
lurnagao-dahua commented on PR #10661: URL: https://github.com/apache/iceberg/pull/10661#issuecomment-2224897427 Could you please take a look when you have time, @pvary? I would be very grateful.

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-11 Thread via GitHub
ajantha-bhat commented on PR #10680: URL: https://github.com/apache/iceberg/pull/10680#issuecomment-2224858353 Closing and reopening PR to trigger CI build. The current failure is unrelated.

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-11 Thread via GitHub
ajantha-bhat closed pull request #10680: Core: Fix NPE during conflict handling of NULL partitions URL: https://github.com/apache/iceberg/pull/10680

Re: [I] Bucket name getting appended to minIO service name [iceberg-python]

2024-07-11 Thread via GitHub
kevinjqliu commented on issue #908: URL: https://github.com/apache/iceberg-python/issues/908#issuecomment-2224802452 Likely an issue with path-style vs virtual-hosted-style s3 access https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#virtual-hosted-style-access
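
The difference between the two addressing styles mentioned above can be sketched as follows. This is only an illustration: the helper names, the `http://minio:9000` endpoint, and the `warehouse` bucket are assumptions, not values taken from the issue. It shows how virtual-hosted-style addressing moves the bucket name into the hostname, which would fail to resolve if the object store is only reachable under its plain service name.

```python
from urllib.parse import urlsplit


def path_style_url(endpoint: str, bucket: str, key: str) -> str:
    """Path-style: the bucket is the first path segment, the host is unchanged."""
    parts = urlsplit(endpoint)
    return f"{parts.scheme}://{parts.netloc}/{bucket}/{key}"


def virtual_hosted_style_url(endpoint: str, bucket: str, key: str) -> str:
    """Virtual-hosted-style: the bucket is prepended to the hostname."""
    parts = urlsplit(endpoint)
    return f"{parts.scheme}://{bucket}.{parts.netloc}/{key}"


# Hypothetical endpoint and bucket, for illustration only.
endpoint = "http://minio:9000"
print(path_style_url(endpoint, "warehouse", "data/file.parquet"))
print(virtual_hosted_style_url(endpoint, "warehouse", "data/file.parquet"))
```

With the virtual-hosted form, the client has to resolve `warehouse.minio` rather than `minio`, which matches the kind of host lookup failure described in this thread.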

Re: [PR] Bump coverage from 7.5.4 to 7.6.0 [iceberg-python]

2024-07-11 Thread via GitHub
HonahX merged PR #917: URL: https://github.com/apache/iceberg-python/pull/917

Re: [I] Bucket name getting appended to minIO service name [iceberg-python]

2024-07-11 Thread via GitHub
ArijitSinghEDA commented on issue #908: URL: https://github.com/apache/iceberg-python/issues/908#issuecomment-2224784281 @kevinjqliu yes, I agree. As I said before, it is only in the REST server that the bucket name gets prefixed to the MinIO service name, due to which it is unable to make a

Re: [I] Bucket name getting appended to minIO service name [iceberg-python]

2024-07-11 Thread via GitHub
kevinjqliu commented on issue #908: URL: https://github.com/apache/iceberg-python/issues/908#issuecomment-2224740046 @ArijitSinghEDA Something I noticed about the error message ``` pyiceberg.exceptions.ServerError: SdkClientException: Received an UnknownHostException when attemptin

Re: [I] Bucket name getting appended to minIO service name [iceberg-python]

2024-07-11 Thread via GitHub
ArijitSinghEDA commented on issue #908: URL: https://github.com/apache/iceberg-python/issues/908#issuecomment-2224625065 Hi @kevinjqliu The example shared is very insightful, but my issue is that I already have a MinIO service running for other tasks as well, and I want to access this M

Re: [PR] Flink: Backport #10565 to v1.18 and v1.19 [iceberg]

2024-07-11 Thread via GitHub
fengjiajie commented on PR #10676: URL: https://github.com/apache/iceberg/pull/10676#issuecomment-2224377733 Thanks for the review @pvary

Re: [PR] Docs: Add note on write distribution change when adding local order [iceberg]

2024-07-11 Thread via GitHub
manuzhang commented on code in PR #10647: URL: https://github.com/apache/iceberg/pull/10647#discussion_r1675111335 ## docs/docs/spark-ddl.md: ## @@ -425,6 +427,9 @@ To order within each task, not across tasks, use `LOCALLY ORDERED BY`: ALTER TABLE prod.db.sample WRITE LOCALLY

Re: [PR] Core: use bulk delete when removing old metadata.json files [iceberg]

2024-07-11 Thread via GitHub
amogh-jahagirdar merged PR #10679: URL: https://github.com/apache/iceberg/pull/10679

Re: [PR] Core: use bulk delete when removing old metadata.json files [iceberg]

2024-07-11 Thread via GitHub
amogh-jahagirdar commented on PR #10679: URL: https://github.com/apache/iceberg/pull/10679#issuecomment-2224356987 Thanks @dramaticlly and thanks @RussellSpitzer for reviewing!

Re: [PR] Core: use bulk delete when removing old metadata.json files [iceberg]

2024-07-11 Thread via GitHub
amogh-jahagirdar commented on PR #10679: URL: https://github.com/apache/iceberg/pull/10679#issuecomment-2224356744 Yeah I double checked, CatalogTests#testMetadataFileLocationsRemovalAfterCommit will exercise both paths by virtue of testing different catalogs with different FileIO configur

Re: [I] [🐞] Collection of a few bugs [iceberg-python]

2024-07-11 Thread via GitHub
kevinjqliu commented on issue #864: URL: https://github.com/apache/iceberg-python/issues/864#issuecomment-2224355480 I don't think the REST server API is at fault here since this request corresponds with the [`update_table` function](https://github.com/kevinjqliu/iceberg-rest-catalog/blob/7c

Re: [I] [🐞] Collection of a few bugs [iceberg-python]

2024-07-11 Thread via GitHub
kevinjqliu commented on issue #864: URL: https://github.com/apache/iceberg-python/issues/864#issuecomment-2224351547 @HonahX Ok so I've found a way to reproduce this. Requires a bit of a setup. I found this while working on https://github.com/kevinjqliu/iceberg-rest-catalog and testing it

Re: [I] [Bug] Load the proper AWS credential for glue/dynamodb catalog [iceberg-python]

2024-07-11 Thread via GitHub
kevinjqliu commented on issue #892: URL: https://github.com/apache/iceberg-python/issues/892#issuecomment-2224334703 Another consideration is that there can be 2 separate s3 credentials. One to pass to glue/dynamo. Another for the catalog itself.

Re: [I] Create table properties does not support boolean value [iceberg-python]

2024-07-11 Thread via GitHub
kevinjqliu commented on issue #895: URL: https://github.com/apache/iceberg-python/issues/895#issuecomment-2224326839 Created #919 to track the fix

[I] Set properties boolean value to lowercase string [iceberg-python]

2024-07-11 Thread via GitHub
kevinjqliu opened a new issue, #919: URL: https://github.com/apache/iceberg-python/issues/919 ### Apache Iceberg version None ### Please describe the bug 🐞 ## Issue As shown in #895, setting a property to python's boolean value (`True`/`False`) will end up saving the
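
A minimal sketch of the behaviour this issue describes, assuming a hypothetical helper named `property_as_string` (this is not PyIceberg's actual code or the eventual fix). Python's `str(True)` yields `"True"` with a capital letter, so a reader that compares against lowercase `"true"` would not recognise the value; normalising booleans before stringifying avoids that.

```python
def property_as_string(value: object) -> str:
    """Hypothetical helper: stringify a table property value.

    Booleans are lowercased so they match the "true"/"false" spelling;
    every other value is converted with plain str().
    """
    if isinstance(value, bool):
        return str(value).lower()
    return str(value)


# Plain str() keeps Python's capitalised spelling.
print(str(True))
# The helper normalises booleans and leaves other values alone.
print(property_as_string(True), property_as_string(False), property_as_string(5))
```

The `isinstance(value, bool)` check comes first on purpose: `bool` is a subclass of `int` in Python, so checking it explicitly keeps booleans from being treated as ordinary numbers.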

Re: [I] Refactor `file_io_s3_test.rs` to reuse container for test. [iceberg-rust]

2024-07-11 Thread via GitHub
liurenjie1024 commented on issue #453: URL: https://github.com/apache/iceberg-rust/issues/453#issuecomment-2224299105 > I'll take a crack at this Thanks!

Re: [PR] Docs: Add note on write distribution change when adding local order [iceberg]

2024-07-11 Thread via GitHub
szehon-ho commented on code in PR #10647: URL: https://github.com/apache/iceberg/pull/10647#discussion_r1675009241 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestSetWriteDistributionAndOrdering.java: ## @@ -213,6 +213,26 @@ public void testS

Re: [PR] Docs: Add note on write distribution change when adding local order [iceberg]

2024-07-11 Thread via GitHub
szehon-ho commented on code in PR #10647: URL: https://github.com/apache/iceberg/pull/10647#discussion_r1674895594 ## docs/docs/spark-ddl.md: ## @@ -245,7 +246,8 @@ ALTER TABLE prod.db.sample ADD COLUMN points.value.b int; ``` -Note: Altering a map 'key' column by adding col

Re: [PR] #9073 Junit 4 tests switched to JUnit 5 [iceberg]

2024-07-11 Thread via GitHub
igoradulian closed pull request #9793: #9073 Junit 4 tests switched to JUnit 5 URL: https://github.com/apache/iceberg/pull/9793

Re: [I] Create table format version constants [iceberg-python]

2024-07-11 Thread via GitHub
jayceslesar commented on issue #851: URL: https://github.com/apache/iceberg-python/issues/851#issuecomment-222423 Could define a `BaseVersion` class or something that downstream classes could inherit from in order to dish out versions of functions as expected/store constants? Might be t

Re: [I] [Bug] Load the proper AWS credential for glue/dynamodb catalog [iceberg-python]

2024-07-11 Thread via GitHub
jayceslesar commented on issue #892: URL: https://github.com/apache/iceberg-python/issues/892#issuecomment-2224262949 seems like a misnomer for dynamo/glue to expect something called `s3.some-aws-access-key`?

Re: [PR] mr:Fix issues 10639 [iceberg]

2024-07-11 Thread via GitHub
lurnagao-dahua commented on code in PR #10661: URL: https://github.com/apache/iceberg/pull/10661#discussion_r1674927325 ## mr/src/test/java/org/apache/iceberg/mr/TestIcebergInputFormats.java: ## @@ -381,6 +386,46 @@ public void testCustomCatalog() throws IOException { testI

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-11 Thread via GitHub
karuppayya commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1674793152 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -175,7 +181,25 @@ public Statistics estimateStatistics() { protected Sta

Re: [PR] mr:Fix issues 10639 [iceberg]

2024-07-11 Thread via GitHub
lurnagao-dahua commented on code in PR #10661: URL: https://github.com/apache/iceberg/pull/10661#discussion_r1674923132 ## mr/src/test/java/org/apache/iceberg/mr/TestIcebergInputFormats.java: ## @@ -381,6 +386,56 @@ public void testCustomCatalog() throws IOException { testI

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
syun64 commented on code in PR #910: URL: https://github.com/apache/iceberg-python/pull/910#discussion_r1674913965 ## tests/io/test_pyarrow.py: ## @@ -1798,3 +1799,35 @@ def test_identity_partition_on_multi_columns() -> None: ("n_legs", "ascending"), ("

[PR] Deprecate to_requested_schema [iceberg-python]

2024-07-11 Thread via GitHub
syun64 opened a new pull request, #918: URL: https://github.com/apache/iceberg-python/pull/918 Following up on a discussion on: https://github.com/apache/iceberg-python/pull/910#discussion_r1674879351 - The intended usage of `to_requested_schema` is to support our internal functions

Re: [PR] Check if schema is compatible in `add_files` API [iceberg-python]

2024-07-11 Thread via GitHub
HonahX merged PR #907: URL: https://github.com/apache/iceberg-python/pull/907

Re: [I] Optional Schema Check for `add_files` [iceberg-python]

2024-07-11 Thread via GitHub
HonahX closed issue #869: Optional Schema Check for `add_files` URL: https://github.com/apache/iceberg-python/issues/869

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
HonahX commented on code in PR #910: URL: https://github.com/apache/iceberg-python/pull/910#discussion_r1674879351 ## tests/io/test_pyarrow.py: ## @@ -1798,3 +1799,35 @@ def test_identity_partition_on_multi_columns() -> None: ("n_legs", "ascending"), ("

Re: [PR] Add row-level operation benchmarks [iceberg]

2024-07-11 Thread via GitHub
kazuyukitanimura commented on PR #10687: URL: https://github.com/apache/iceberg/pull/10687#issuecomment-2224197361 cc @aokolnychyi @sunchao @RussellSpitzer @rdblue @szehon-ho @flyrain @dbtsai

Re: [I] Refactor `file_io_s3_test.rs` to reuse container for test. [iceberg-rust]

2024-07-11 Thread via GitHub
fqaiser94 commented on issue #453: URL: https://github.com/apache/iceberg-rust/issues/453#issuecomment-2224096526 I'll take a crack at this

Re: [I] Create table properties does not support boolean value [iceberg-python]

2024-07-11 Thread via GitHub
kevinjqliu commented on issue #895: URL: https://github.com/apache/iceberg-python/issues/895#issuecomment-2224093275 Ah, if you pass in `True` (boolean value), this function will turn it into `"True"` https://github.com/kevinjqliu/iceberg-python/blob/428b894aacb107547ba41433b4b6f37e1ad19

Re: [I] Create table properties does not support boolean value [iceberg-python]

2024-07-11 Thread via GitHub
kevinjqliu commented on issue #895: URL: https://github.com/apache/iceberg-python/issues/895#issuecomment-2224091530 > This code also ignores the bloom filter (although passing directly "True" as string). Would it be a case sensitive issue? Likely, can you set it to lowercase and see?

Re: [I] Bucket name getting appended to minIO service name [iceberg-python]

2024-07-11 Thread via GitHub
kevinjqliu commented on issue #908: URL: https://github.com/apache/iceberg-python/issues/908#issuecomment-2224090002 What does your `docker-compose.yaml` look like? It's likely a configuration issue. I'd suggest starting with a known working docker configuration (such as https://github.com/

[PR] Bump coverage from 7.5.4 to 7.6.0 [iceberg-python]

2024-07-11 Thread via GitHub
dependabot[bot] opened a new pull request, #917: URL: https://github.com/apache/iceberg-python/pull/917 Bumps [coverage](https://github.com/nedbat/coveragepy) from 7.5.4 to 7.6.0. Changelog Sourced from https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst coverage's chang

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
syun64 commented on code in PR #910: URL: https://github.com/apache/iceberg-python/pull/910#discussion_r1674797591 ## pyiceberg/io/pyarrow.py: ## @@ -1296,31 +1297,49 @@ def to_requested_schema( class ArrowProjectionVisitor(SchemaWithPartnerVisitor[pa.Array, Optional[pa.Ar

Re: [PR] [Docs] Add examples for DataFrame branch writes [iceberg]

2024-07-11 Thread via GitHub
szehon-ho commented on code in PR #10644: URL: https://github.com/apache/iceberg/pull/10644#discussion_r1674784383 ## docs/docs/spark-writes.md: ## @@ -332,6 +332,30 @@ The writer must enable the `mergeSchema` option. ```scala data.writeTo("prod.db.sample").option("mergeSchema

Re: [I] Support writing to a branch [iceberg-python]

2024-07-11 Thread via GitHub
kevinjqliu commented on issue #306: URL: https://github.com/apache/iceberg-python/issues/306#issuecomment-2224059233 @vinjai yes! please go ahead.

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
Fokko commented on code in PR #910: URL: https://github.com/apache/iceberg-python/pull/910#discussion_r1674710445 ## pyiceberg/io/pyarrow.py: ## @@ -1320,7 +1321,16 @@ def _cast_if_needed(self, field: NestedField, values: pa.Array) -> pa.Array: and pa.types

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-11 Thread via GitHub
szehon-ho commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1674693714 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkScan.java: ## @@ -97,6 +117,36 @@ public static Object[][] parameters() { }; }

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
syun64 commented on code in PR #910: URL: https://github.com/apache/iceberg-python/pull/910#discussion_r1674691317 ## pyiceberg/io/pyarrow.py: ## @@ -1320,7 +1321,16 @@ def _cast_if_needed(self, field: NestedField, values: pa.Array) -> pa.Array: and pa.type

Re: [PR] Support Spark Column Stats [iceberg]

2024-07-11 Thread via GitHub
singhpk234 commented on code in PR #10659: URL: https://github.com/apache/iceberg/pull/10659#discussion_r1674683253 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -175,7 +181,25 @@ public Statistics estimateStatistics() { protected Sta

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
syun64 commented on code in PR #910: URL: https://github.com/apache/iceberg-python/pull/910#discussion_r1674689843 ## pyiceberg/io/pyarrow.py: ## @@ -1320,7 +1321,16 @@ def _cast_if_needed(self, field: NestedField, values: pa.Array) -> pa.Array: and pa.type

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
Fokko commented on code in PR #910: URL: https://github.com/apache/iceberg-python/pull/910#discussion_r1674686353 ## pyiceberg/io/pyarrow.py: ## @@ -1320,7 +1321,16 @@ def _cast_if_needed(self, field: NestedField, values: pa.Array) -> pa.Array: and pa.types

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
Fokko commented on code in PR #910: URL: https://github.com/apache/iceberg-python/pull/910#discussion_r1674684198 ## pyiceberg/io/pyarrow.py: ## @@ -1320,7 +1321,16 @@ def _cast_if_needed(self, field: NestedField, values: pa.Array) -> pa.Array: and pa.types

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
Fokko commented on code in PR #910: URL: https://github.com/apache/iceberg-python/pull/910#discussion_r1674682639 ## pyiceberg/io/pyarrow.py: ## @@ -1320,7 +1321,16 @@ def _cast_if_needed(self, field: NestedField, values: pa.Array) -> pa.Array: and pa.types

Re: [PR] Check if schema is compatible in `add_files` API [iceberg-python]

2024-07-11 Thread via GitHub
Fokko commented on code in PR #907: URL: https://github.com/apache/iceberg-python/pull/907#discussion_r1674607135 ## pyiceberg/io/pyarrow.py: ## @@ -166,6 +166,7 @@ ONE_MEGABYTE = 1024 * 1024 BUFFER_SIZE = "buffer-size" + Review Comment: ```suggestion ```

Re: [PR] Check if schema is compatible in `add_files` API [iceberg-python]

2024-07-11 Thread via GitHub
Fokko commented on code in PR #907: URL: https://github.com/apache/iceberg-python/pull/907#discussion_r1674603748 ## pyiceberg/io/pyarrow.py: ## @@ -2026,6 +2072,8 @@ def parquet_files_to_data_files(io: FileIO, table_metadata: TableMetadata, file_ f"Cannot add

Re: [PR] mr:Fix issues 10639 [iceberg]

2024-07-11 Thread via GitHub
deniskuzZ commented on code in PR #10661: URL: https://github.com/apache/iceberg/pull/10661#discussion_r1674568873 ## mr/src/test/java/org/apache/iceberg/mr/TestIcebergInputFormats.java: ## @@ -381,6 +386,56 @@ public void testCustomCatalog() throws IOException { testInputF

Re: [PR] mr:Fix issues 10639 [iceberg]

2024-07-11 Thread via GitHub
lurnagao-dahua commented on code in PR #10661: URL: https://github.com/apache/iceberg/pull/10661#discussion_r1674565642 ## mr/src/test/java/org/apache/iceberg/mr/TestIcebergInputFormats.java: ## @@ -381,6 +386,56 @@ public void testCustomCatalog() throws IOException { testI

Re: [PR] mr:Fix issues 10639 [iceberg]

2024-07-11 Thread via GitHub
deniskuzZ commented on code in PR #10661: URL: https://github.com/apache/iceberg/pull/10661#discussion_r1674558387 ## mr/src/test/java/org/apache/iceberg/mr/TestIcebergInputFormats.java: ## @@ -381,6 +386,56 @@ public void testCustomCatalog() throws IOException { testInputF

Re: [PR] mr:Fix issues 10639 [iceberg]

2024-07-11 Thread via GitHub
deniskuzZ commented on code in PR #10661: URL: https://github.com/apache/iceberg/pull/10661#discussion_r1674554724 ## mr/src/test/java/org/apache/iceberg/mr/TestIcebergInputFormats.java: ## @@ -381,6 +386,46 @@ public void testCustomCatalog() throws IOException { testInputF

Re: [PR] mr:Fix issues 10639 [iceberg]

2024-07-11 Thread via GitHub
deniskuzZ commented on PR #10661: URL: https://github.com/apache/iceberg/pull/10661#issuecomment-2223715282 LGTM +1

Re: [PR] OpenAPI: Deprecate `oauth/tokens` endpoint [iceberg]

2024-07-11 Thread via GitHub
snazy commented on PR #10603: URL: https://github.com/apache/iceberg/pull/10603#issuecomment-2223703737 > Can you try rebase to see if it fixes the CI? CI looking good

Re: [PR] Upgrade to Gradle 8.9 [iceberg]

2024-07-11 Thread via GitHub
Fokko commented on PR #10686: URL: https://github.com/apache/iceberg/pull/10686#issuecomment-2223698571 Thanks @jbonofre for bumping Gradle here, and thanks @nastra, @ajantha-bhat and @snazy for the prompt review 🚀

Re: [I] Upgrade to Gradle 8.9 [iceberg]

2024-07-11 Thread via GitHub
Fokko closed issue #10685: Upgrade to Gradle 8.9 URL: https://github.com/apache/iceberg/issues/10685

Re: [PR] Upgrade to Gradle 8.9 [iceberg]

2024-07-11 Thread via GitHub
Fokko merged PR #10686: URL: https://github.com/apache/iceberg/pull/10686

Re: [PR] Core: Expose table incremental scan for appends API in SerializableTable [iceberg]

2024-07-11 Thread via GitHub
deniskuzZ commented on code in PR #10682: URL: https://github.com/apache/iceberg/pull/10682#discussion_r1674506153 ## core/src/main/java/org/apache/iceberg/SerializableTable.java: ## @@ -278,6 +278,10 @@ public TableScan newScan() { return lazyTable().newScan(); } + p

Re: [PR] Core: Expose table incremental scan for appends API in SerializableTable [iceberg]

2024-07-11 Thread via GitHub
deniskuzZ commented on code in PR #10682: URL: https://github.com/apache/iceberg/pull/10682#discussion_r1674500091 ## core/src/main/java/org/apache/iceberg/SerializableTable.java: ## @@ -278,6 +278,10 @@ public TableScan newScan() { return lazyTable().newScan(); } + p

Re: [I] Running MERGE INTO with more than one WHEN condition fails if the number of columns in the target table is > 321 [iceberg]

2024-07-11 Thread via GitHub
andreaschiappacasse commented on issue #10294: URL: https://github.com/apache/iceberg/issues/10294#issuecomment-2223558105 @krishan711 we ended up using spark instead of athena to do the upsert/delete operation. It is still very unfortunate because it is much more expensive and adds some co

Re: [PR] Upgrade to Gradle 8.9 [iceberg]

2024-07-11 Thread via GitHub
jbonofre commented on PR #10686: URL: https://github.com/apache/iceberg/pull/10686#issuecomment-2223435767 @Fokko @nastra thanks in advance gentlemen ! 😄

Re: [PR] Upgrade to Gradle 8.9 [iceberg]

2024-07-11 Thread via GitHub
jbonofre commented on PR #10686: URL: https://github.com/apache/iceberg/pull/10686#issuecomment-2223435089 This PR closes #10686

Re: [I] Upgrade to Gradle 8.9 [iceberg]

2024-07-11 Thread via GitHub
jbonofre commented on issue #10685: URL: https://github.com/apache/iceberg/issues/10685#issuecomment-2223424034 @Fokko @nastra I'm working on the Gradle update PR. Would you mind assigning this ticket to me and setting the milestone to 1.6.0? Thanks!

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
syun64 commented on code in PR #910: URL: https://github.com/apache/iceberg-python/pull/910#discussion_r1674351468 ## pyiceberg/io/pyarrow.py: ## @@ -1320,7 +1321,16 @@ def _cast_if_needed(self, field: NestedField, values: pa.Array) -> pa.Array: and pa.type

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
syun64 commented on PR #910: URL: https://github.com/apache/iceberg-python/pull/910#issuecomment-2223408958 @Fokko @HonahX - thank you for your reviews. I've updated the integration test to make the [check more comprehensive](https://github.com/apache/iceberg-python/pull/910/files#diff-7f3d

Re: [PR] support PyArrow timestamptz with Etc/UTC [iceberg-python]

2024-07-11 Thread via GitHub
syun64 commented on code in PR #910: URL: https://github.com/apache/iceberg-python/pull/910#discussion_r1674334862 ## pyiceberg/table/__init__.py: ## @@ -528,10 +528,6 @@ def append(self, df: pa.Table, snapshot_properties: Dict[str, str] = EMPTY_DICT) )

Re: [PR] Core: use bulk delete when removing old metadata.json files [iceberg]

2024-07-11 Thread via GitHub
dramaticlly commented on PR #10679: URL: https://github.com/apache/iceberg/pull/10679#issuecomment-2223373221 > I think this is fine, I mentioned this on another similar PR though so I thought I would note it here as well. We need to make sure our test case now runs with Filesystems that both

Re: [PR] OpenAPI: Deprecate `oauth/tokens` endpoint [iceberg]

2024-07-11 Thread via GitHub
danielcweeks commented on code in PR #10603: URL: https://github.com/apache/iceberg/pull/10603#discussion_r1674279099 ## open-api/rest-catalog-open-api.yaml: ## @@ -134,9 +134,22 @@ paths: post: tags: - OAuth2 API - summary: Get a token using an OAuth2

Re: [PR] OpenAPI: Deprecate `oauth/tokens` endpoint [iceberg]

2024-07-11 Thread via GitHub
jbonofre commented on code in PR #10603: URL: https://github.com/apache/iceberg/pull/10603#discussion_r1674277456 ## open-api/rest-catalog-open-api.yaml: ## @@ -134,9 +134,17 @@ paths: post: tags: - OAuth2 API - summary: Get a token using an OAuth2 flow

Re: [PR] OpenAPI: Deprecate `oauth/tokens` endpoint [iceberg]

2024-07-11 Thread via GitHub
danielcweeks commented on code in PR #10603: URL: https://github.com/apache/iceberg/pull/10603#discussion_r1674264095 ## open-api/rest-catalog-open-api.yaml: ## @@ -134,9 +134,17 @@ paths: post: tags: - OAuth2 API - summary: Get a token using an OAuth2

Re: [PR] API: add resultSchema() method to StructTransform [iceberg]

2024-07-11 Thread via GitHub
stevenzwu commented on PR #10496: URL: https://github.com/apache/iceberg/pull/10496#issuecomment-2223275495 hmm. the `TestSparkDataFile` failed after this change. ``` Caused by: org.apache.iceberg.exceptions.ValidationException: Invalid schema: multiple fields for name ts: 9 and 9

Re: [PR] OpenAPI: Deprecate `oauth/tokens` endpoint [iceberg]

2024-07-11 Thread via GitHub
jackye1995 commented on PR #10603: URL: https://github.com/apache/iceberg/pull/10603#issuecomment-2223257678 Can you try rebase to see if it fixes the CI?

Re: [PR] Kafka Connect: Commit coordination [iceberg]

2024-07-11 Thread via GitHub
danielcweeks merged PR #10351: URL: https://github.com/apache/iceberg/pull/10351

Re: [PR] Kafka Connect: Commit coordination [iceberg]

2024-07-11 Thread via GitHub
danielcweeks commented on PR #10351: URL: https://github.com/apache/iceberg/pull/10351#issuecomment-2223231870 Thanks @bryanck and @fqaiser94. It's really great to get this one in.

Re: [PR] Check if schema is compatible in `add_files` API [iceberg-python]

2024-07-11 Thread via GitHub
syun64 commented on code in PR #907: URL: https://github.com/apache/iceberg-python/pull/907#discussion_r1674162322 ## pyiceberg/io/pyarrow.py: ## @@ -2026,6 +2072,8 @@ def parquet_files_to_data_files(io: FileIO, table_metadata: TableMetadata, file_ f"Cannot add

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-07-11 Thread via GitHub
epgif commented on PR #9008: URL: https://github.com/apache/iceberg/pull/9008#issuecomment-2223123048 > overall this LGTM once comments in `TestBucketing` have been addressed I've addressed these. Thanks!

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-07-11 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1674135937 ## api/src/test/java/org/apache/iceberg/transforms/TestBucketing.java: ## @@ -165,6 +159,68 @@ public void testLong() { .isEqualTo(hashBytes(buffer.array()));

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-07-11 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1674135249 ## api/src/test/java/org/apache/iceberg/transforms/TestBucketing.java: ## @@ -165,6 +159,68 @@ public void testLong() { .isEqualTo(hashBytes(buffer.array()));

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-07-11 Thread via GitHub
epgif commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1674134865 ## api/src/test/java/org/apache/iceberg/transforms/TestBucketing.java: ## @@ -112,12 +112,6 @@ public void testSpecValues() { .as("Spec example: hash(2017-11-16T

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-11 Thread via GitHub
boroknagyz commented on code in PR #10680: URL: https://github.com/apache/iceberg/pull/10680#discussion_r1674098559 ## core/src/main/java/org/apache/iceberg/util/PartitionSet.java: ## @@ -200,7 +200,8 @@ public String toString() { StringBuilder partitionStringBuilder

Re: [PR] OpenAPI: Deprecate `oauth/tokens` endpoint [iceberg]

2024-07-11 Thread via GitHub
snazy commented on PR #10603: URL: https://github.com/apache/iceberg/pull/10603#issuecomment-2223065233 I've updated the PR to mention "2.0". The CI failures look unrelated, but I don't have the power to rerun those.

Re: [PR] Update checkstyle definition [iceberg]

2024-07-11 Thread via GitHub
attilakreiner commented on PR #10681: URL: https://github.com/apache/iceberg/pull/10681#issuecomment-2223052922 Merged in the `main` branch and fixed `ConstantName`.

Re: [PR] Will not be merged: Spark timestamptz discrepancy [iceberg-python]

2024-07-11 Thread via GitHub
syun64 commented on PR #915: URL: https://github.com/apache/iceberg-python/pull/915#issuecomment-2223035079 Here's the dataframe schema that represents the Iceberg tables that are loaded through: 1. Spark Iceberg 2. PyIceberg Spark Iceberg loads both timestamptz and timestamp ty

[I] Detecting duplicates in the Flink Data Stream API [iceberg]

2024-07-11 Thread via GitHub
lkokhreidze opened a new issue, #10683: URL: https://github.com/apache/iceberg/issues/10683 ### Query engine Flink ### Question Hi, I was wondering if there's a way we could detect if ongoing batch written to the Iceberg table would perform the upsert? Context:

Re: [PR] Bump mypy-boto3-glue from 1.34.136 to 1.34.143 [iceberg-python]

2024-07-11 Thread via GitHub
Fokko merged PR #912: URL: https://github.com/apache/iceberg-python/pull/912

[PR] Spark timestamptz discrepancy [iceberg-python]

2024-07-11 Thread via GitHub
syun64 opened a new pull request, #915: URL: https://github.com/apache/iceberg-python/pull/915 I ran into this issue when I was trying to improve our tests for verifying write integrity for the timestamp types following our recent implementations for supporting inputs of other precisions an

Re: [PR] Core: Expose table incremental scan for appends API in SerializableTable [iceberg]

2024-07-11 Thread via GitHub
nastra commented on code in PR #10682: URL: https://github.com/apache/iceberg/pull/10682#discussion_r1674049469 ## core/src/main/java/org/apache/iceberg/SerializableTable.java: ## @@ -278,6 +278,10 @@ public TableScan newScan() { return lazyTable().newScan(); } + publ

Re: [I] iceberg-aws-bundle jar includes org.slf4j.LoggerFactory [iceberg]

2024-07-11 Thread via GitHub
bryanck closed issue #10534: iceberg-aws-bundle jar includes org.slf4j.LoggerFactory URL: https://github.com/apache/iceberg/issues/10534

Re: [PR] Core: Fix NPE during conflict handling of NULL partitions [iceberg]

2024-07-11 Thread via GitHub
deniskuzZ commented on code in PR #10680: URL: https://github.com/apache/iceberg/pull/10680#discussion_r1674048472 ## core/src/main/java/org/apache/iceberg/util/PartitionSet.java: ## @@ -200,7 +200,8 @@ public String toString() { StringBuilder partitionStringBuilder =

Re: [PR] Build: don't include slf4j-api in bundled JARs [iceberg]

2024-07-11 Thread via GitHub
bryanck merged PR #10665: URL: https://github.com/apache/iceberg/pull/10665

Re: [PR] Core: Expose table incremental scan for appends API in SerializableTable [iceberg]

2024-07-11 Thread via GitHub
nastra commented on PR #10682: URL: https://github.com/apache/iceberg/pull/10682#issuecomment-981956 @deniskuzZ can you add the missing `@Override` please?
