Re: [PR] Bin-pack Writes Operation into multiple parquet files, and parallelize writing `WriteTask`s [iceberg-python]

2024-03-05 Thread via GitHub
HonahX commented on code in PR #444: URL: https://github.com/apache/iceberg-python/pull/444#discussion_r1513978231 ## pyiceberg/io/pyarrow.py: ## @@ -1717,54 +1717,65 @@ def fill_parquet_file_metadata( def write_file(table: Table, tasks: Iterator[WriteTask], file_schema: O

Re: [PR] feat: add always_true and always_false predicate [iceberg-rust]

2024-03-05 Thread via GitHub
Dysprosium0626 commented on PR #227: URL: https://github.com/apache/iceberg-rust/pull/227#issuecomment-1980224024 @liurenjie1024 Hi, this PR is ready for review. Thx~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] Core: Only validate the current partition specs [iceberg]

2024-03-05 Thread via GitHub
hashhar commented on PR #5707: URL: https://github.com/apache/iceberg/pull/5707#issuecomment-1980199024 after this change while you can drop partition columns such tables become broken if they hadn't been optimized before dropping the column. Basically see https://github.com/apache/iceberg/

Re: [I] `ALTER TABLE ... DROP COLUMN` allows dropping a column used by old PartitionSpecs [iceberg]

2024-03-05 Thread via GitHub
hashhar commented on issue #4563: URL: https://github.com/apache/iceberg/issues/4563#issuecomment-1980198119 🤦 I didn't realise this was issue under `apache/iceberg` and not `trinodb/trino`. Sorry for the Trino specific discussion. I now realise that the issue is exactly asking for wh

Re: [I] Library public api isolation and import decoupling [iceberg-python]

2024-03-05 Thread via GitHub
HonahX commented on issue #499: URL: https://github.com/apache/iceberg-python/issues/499#issuecomment-1980189490 @ndrluis Thanks for reporting this! I think we should fix these to import from the right module. Do you want to work on this? -- This is an automated message from the Apache Gi

Re: [I] `ALTER TABLE ... DROP COLUMN` allows dropping a column used by old PartitionSpecs [iceberg]

2024-03-05 Thread via GitHub
hashhar commented on issue #4563: URL: https://github.com/apache/iceberg/issues/4563#issuecomment-1980186447 cc: @rdblue / @RussellSpitzer Probably Spark should also disallow dropping partition column if it's referenced in the live table files. Here's `files` metadata table AFTER dro

Re: [I] `ALTER TABLE ... DROP COLUMN` allows dropping a column used by old PartitionSpecs [iceberg]

2024-03-05 Thread via GitHub
hashhar commented on issue #4563: URL: https://github.com/apache/iceberg/issues/4563#issuecomment-1980182977 hmmm, looks like BEFORE dropping a partition column I need to `optimize` the table otherwise subsequent reads fail (both in Spark and Trino). Here's a sample sequence of steps:

Re: [I] `ALTER TABLE ... DROP COLUMN` allows dropping a column used by old PartitionSpecs [iceberg]

2024-03-05 Thread via GitHub
hashhar commented on issue #4563: URL: https://github.com/apache/iceberg/issues/4563#issuecomment-1980169111 Iceberg 1.1.0 had https://github.com/apache/iceberg/commit/3b65cca8b5122157d112f232a8e110be93a740e5 which seems to address something. Can we now allow this on Trino? @alexjo21

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-05 Thread via GitHub
nk1506 commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1513864795 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -269,6 +284,160 @@ public void renameTable(TableIdentifier from, TableIdentifier origina

Re: [I] Add hive metastore catalog support [iceberg-rust]

2024-03-05 Thread via GitHub
Xuanwo commented on issue #113: URL: https://github.com/apache/iceberg-rust/issues/113#issuecomment-1980137973 > I setup the basis test-infra for the hms-catalog. In the next couple of days I will try to work on / finish the implementation of the catalog. That's great! -- This is a

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-05 Thread via GitHub
nk1506 commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1513851518 ## core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java: ## @@ -309,65 +304,20 @@ protected enum CommitStatus { * @return Commit Status of Success

Re: [I] Add hive metastore catalog support [iceberg-rust]

2024-03-05 Thread via GitHub
marvinlanhenke commented on issue #113: URL: https://github.com/apache/iceberg-rust/issues/113#issuecomment-1980128393 @Xuanwo Yes, I thought the same - mocking the HMS would not be beneficial. I setup the basis test-infra for the hms-catalog. In the next couple of days I will try to

Re: [I] hive catalog drop table XX purge not delete hdfs path [iceberg]

2024-03-05 Thread via GitHub
RussellSpitzer commented on issue #9869: URL: https://github.com/apache/iceberg/issues/9869#issuecomment-1980120089 No it also removes data files, the only exception is files which are not referenced by the Iceberg table. -- This is an automated message from the Apache Git Service. To res

Re: [PR] [WIP][Discussion]CreateTableTransaction Implementation [iceberg-python]

2024-03-05 Thread via GitHub
HonahX commented on code in PR #498: URL: https://github.com/apache/iceberg-python/pull/498#discussion_r1513787586 ## pyiceberg/catalog/glue.py: ## @@ -437,46 +471,69 @@ def _commit_table(self, table_request: CommitTableRequest) -> CommitTableRespons ) databas

Re: [I] hive catalog drop table XX purge not delete hdfs path [iceberg]

2024-03-05 Thread via GitHub
nqvuong1998 commented on issue #9869: URL: https://github.com/apache/iceberg/issues/9869#issuecomment-1980057905 In my case, when using PURGE, Iceberg only deletes metadata files. Do I need any conf to be able to delete data files? -- This is an automated message from the Apache Git Serv

Re: [PR] Clarify pagination description [iceberg]

2024-03-05 Thread via GitHub
danielcweeks commented on code in PR #9872: URL: https://github.com/apache/iceberg/pull/9872#discussion_r1513777158 ## open-api/rest-catalog-open-api.yaml: ## @@ -1610,13 +1610,19 @@ components: PageToken: description: -An opaque token which allows clients

Re: [PR] Clarify pagination description [iceberg]

2024-03-05 Thread via GitHub
danielcweeks commented on code in PR #9872: URL: https://github.com/apache/iceberg/pull/9872#discussion_r1513776780 ## open-api/rest-catalog-open-api.yaml: ## @@ -1610,13 +1610,19 @@ components: PageToken: description: -An opaque token which allows clients

Re: [PR] Clarify pagination description [iceberg]

2024-03-05 Thread via GitHub
danielcweeks commented on code in PR #9872: URL: https://github.com/apache/iceberg/pull/9872#discussion_r1513775889 ## open-api/rest-catalog-open-api.yaml: ## @@ -1610,13 +1610,19 @@ components: PageToken: description: -An opaque token which allows clients

Re: [PR] Clarify pagination description [iceberg]

2024-03-05 Thread via GitHub
danielcweeks commented on code in PR #9872: URL: https://github.com/apache/iceberg/pull/9872#discussion_r1513775619 ## open-api/rest-catalog-open-api.yaml: ## @@ -1610,13 +1610,19 @@ components: PageToken: description: -An opaque token which allows clients

Re: [PR] Docs: Fix links to internal files [iceberg]

2024-03-05 Thread via GitHub
manuzhang commented on code in PR #9819: URL: https://github.com/apache/iceberg/pull/9819#discussion_r1513757865 ## docs/docs/flink.md: ## @@ -24,20 +24,20 @@ Apache Iceberg supports both [Apache Flink](https://flink.apache.org/)'s DataStr | Feature support

Re: [PR] [WIP][Discussion]CreateTableTransaction Implementation [iceberg-python]

2024-03-05 Thread via GitHub
syun64 commented on code in PR #498: URL: https://github.com/apache/iceberg-python/pull/498#discussion_r1513742498 ## pyiceberg/catalog/glue.py: ## @@ -437,46 +471,69 @@ def _commit_table(self, table_request: CommitTableRequest) -> CommitTableRespons ) databas

Re: [PR] [WIP][Discussion]CreateTableTransaction Implementation [iceberg-python]

2024-03-05 Thread via GitHub
HonahX commented on code in PR #498: URL: https://github.com/apache/iceberg-python/pull/498#discussion_r1513718061 ## pyiceberg/catalog/glue.py: ## @@ -358,6 +365,33 @@ def _get_glue_table(self, database_name: str, table_name: str) -> TableTypeDef: except self.glue.exc

Re: [I] Support ID Tokens in Rest Catalog [iceberg-python]

2024-03-05 Thread via GitHub
syun64 closed issue #464: Support ID Tokens in Rest Catalog URL: https://github.com/apache/iceberg-python/issues/464 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscr

Re: [I] Support ID Tokens in Rest Catalog [iceberg-python]

2024-03-05 Thread via GitHub
syun64 commented on issue #464: URL: https://github.com/apache/iceberg-python/issues/464#issuecomment-1979940847 Awesome. Thanks for raising this issue in the first place and walking the community through this discussion! -- This is an automated message from the Apache Git Service. To res

Re: [PR] [WIP][Discussion]CreateTableTransaction Implementation [iceberg-python]

2024-03-05 Thread via GitHub
HonahX commented on code in PR #498: URL: https://github.com/apache/iceberg-python/pull/498#discussion_r1513709599 ## pyiceberg/table/__init__.py: ## @@ -753,6 +807,143 @@ def update_table_metadata(base_metadata: TableMetadata, updates: Tuple[TableUpda return new_metadata.

Re: [PR] [Bug Fix] cast None `current-snapshot-id` as -1 for Backwards Compatibility [iceberg-python]

2024-03-05 Thread via GitHub
HonahX merged PR #473: URL: https://github.com/apache/iceberg-python/pull/473 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

Re: [PR] [Bug Fix] cast None `current-snapshot-id` as -1 for Backwards Compatibility [iceberg-python]

2024-03-05 Thread via GitHub
HonahX commented on PR #473: URL: https://github.com/apache/iceberg-python/pull/473#issuecomment-1979926180 Hi @Fokko @syun64 > I feel a bit awkward adding it to the write-options because the current list of write-options map to pyarrow parquet properties. I have a similar feeling

Re: [PR] feat: add `UnboundPredicate::negate()` [iceberg-rust]

2024-03-05 Thread via GitHub
liurenjie1024 commented on PR #228: URL: https://github.com/apache/iceberg-rust/pull/228#issuecomment-1979916413 cc @Xuanwo @ZENOTME @Fokko PTAL -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [PR] feat: add `UnboundPredicate::negate()` [iceberg-rust]

2024-03-05 Thread via GitHub
liurenjie1024 commented on code in PR #228: URL: https://github.com/apache/iceberg-rust/pull/228#discussion_r1513696354 ## crates/iceberg/src/expr/mod.rs: ## @@ -30,7 +30,7 @@ pub use predicate::*; /// The discriminant of this enum is used for determining the type of the opera

Re: [PR] [Bug Fix] cast None `current-snapshot-id` as -1 for Backwards Compatibility [iceberg-python]

2024-03-05 Thread via GitHub
syun64 commented on PR #473: URL: https://github.com/apache/iceberg-python/pull/473#issuecomment-1979882949 Shall we merge this in if the place in the documentation (Write options vs Backward Compatibility) is the only subject for debate? We can always update the docs. Would appreciate it i

Re: [I] Support ID Tokens in Rest Catalog [iceberg-python]

2024-03-05 Thread via GitHub
flyrain commented on issue #464: URL: https://github.com/apache/iceberg-python/issues/464#issuecomment-1979866702 Yeah, it is normally an anti-pattern to use id-token for resource servers. For example, the id token will carry all audiences that the client has, which could be misused, e.g.,

Re: [I] Support Parquet modular encryption [iceberg]

2024-03-05 Thread via GitHub
github-actions[bot] commented on issue #1413: URL: https://github.com/apache/iceberg/issues/1413#issuecomment-1979847605 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] Consider a builder in TableMetadata [iceberg]

2024-03-05 Thread via GitHub
github-actions[bot] commented on issue #1412: URL: https://github.com/apache/iceberg/issues/1412#issuecomment-1979847579 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] Support Parquet modular encryption [iceberg]

2024-03-05 Thread via GitHub
github-actions[bot] closed issue #1413: Support Parquet modular encryption URL: https://github.com/apache/iceberg/issues/1413 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [I] Consider a builder in TableMetadata [iceberg]

2024-03-05 Thread via GitHub
github-actions[bot] closed issue #1412: Consider a builder in TableMetadata URL: https://github.com/apache/iceberg/issues/1412 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. T

Re: [I] Flink: Improvements for iceberg sink connector. [iceberg]

2024-03-05 Thread via GitHub
github-actions[bot] closed issue #1403: Flink: Improvements for iceberg sink connector. URL: https://github.com/apache/iceberg/issues/1403 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [I] Flink: Improvements for iceberg sink connector. [iceberg]

2024-03-05 Thread via GitHub
github-actions[bot] commented on issue #1403: URL: https://github.com/apache/iceberg/issues/1403#issuecomment-1979847544 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] No retries on snapshot commit on eventual consistent file system [iceberg]

2024-03-05 Thread via GitHub
github-actions[bot] closed issue #1398: No retries on snapshot commit on eventual consistent file system URL: https://github.com/apache/iceberg/issues/1398 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] No retries on snapshot commit on eventual consistent file system [iceberg]

2024-03-05 Thread via GitHub
github-actions[bot] commented on issue #1398: URL: https://github.com/apache/iceberg/issues/1398#issuecomment-1979847511 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [PR] Flink: Bump minor versions [iceberg]

2024-03-05 Thread via GitHub
ajantha-bhat commented on PR #9875: URL: https://github.com/apache/iceberg/pull/9875#issuecomment-1979833138 cc: @stevenzwu, @pvary -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [I] Support ID Tokens in Rest Catalog [iceberg-python]

2024-03-05 Thread via GitHub
syun64 commented on issue #464: URL: https://github.com/apache/iceberg-python/issues/464#issuecomment-1979774474 @flyrain do you have any thoughts on this? Does this issue require more discussion, or could we close this with the understanding that ID tokens should not be used against a reso

[I] append() fails with DataframeWriterV2's writeTo api [iceberg]

2024-03-05 Thread via GitHub
mkavinashkumar opened a new issue, #9874: URL: https://github.com/apache/iceberg/issues/9874 ### Apache Iceberg version 1.4.3 (latest release) ### Query engine Spark ### Please describe the bug 🐞 I get an error when I try to append data using the writeTo api

Re: [PR] Docs: Add Daft into Iceberg documentation [iceberg]

2024-03-05 Thread via GitHub
bitsondatadev commented on PR #9836: URL: https://github.com/apache/iceberg/pull/9836#issuecomment-1979739317 @nastra, I tend to agree with @jaychia on this one. I don't want to split up the documentation any more than necessary. Any compute engine that runs on Iceberg, I want to I

Re: [PR] Sql catalog [iceberg-rust]

2024-03-05 Thread via GitHub
odysa commented on code in PR #229: URL: https://github.com/apache/iceberg-rust/pull/229#discussion_r1513340583 ## crates/catalog/sql/src/catalog.rs: ## @@ -0,0 +1,397 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

[I] Library public api isolation and import decoupling [iceberg-python]

2024-03-05 Thread via GitHub
ndrluis opened a new issue, #499: URL: https://github.com/apache/iceberg-python/issues/499 ### Feature Request / Improvement Hello, when I was trying to solve the #497 issue, I noticed that we are exposing the private API and we have some imports through modules. For example,

Re: [PR] Sql catalog [iceberg-rust]

2024-03-05 Thread via GitHub
martin-g commented on code in PR #229: URL: https://github.com/apache/iceberg-rust/pull/229#discussion_r1513532822 ## crates/catalog/sql/src/error.rs: ## @@ -0,0 +1,27 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

Re: [PR] Sql catalog [iceberg-rust]

2024-03-05 Thread via GitHub
martin-g commented on code in PR #229: URL: https://github.com/apache/iceberg-rust/pull/229#discussion_r1513531369 ## crates/catalog/sql/src/catalog.rs: ## @@ -0,0 +1,397 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

Re: [PR] Sql catalog [iceberg-rust]

2024-03-05 Thread via GitHub
martin-g commented on code in PR #229: URL: https://github.com/apache/iceberg-rust/pull/229#discussion_r1513530747 ## crates/catalog/sql/src/catalog.rs: ## @@ -0,0 +1,397 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

Re: [PR] Make optional oauth configurable [iceberg-python]

2024-03-05 Thread via GitHub
flyrain commented on code in PR #486: URL: https://github.com/apache/iceberg-python/pull/486#discussion_r1513528807 ## pyiceberg/catalog/rest.py: ## @@ -289,12 +291,25 @@ def auth_url(self) -> str: else: return self.url(Endpoints.get_token, prefixed=False)

Re: [I] Import statement unexpectedly executes functions in __init__.py [iceberg-python]

2024-03-05 Thread via GitHub
ndrluis closed issue #497: Import statement unexpectedly executes functions in __init__.py URL: https://github.com/apache/iceberg-python/issues/497 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

Re: [I] Import statement unexpectedly executes functions in __init__.py [iceberg-python]

2024-03-05 Thread via GitHub
ndrluis commented on issue #497: URL: https://github.com/apache/iceberg-python/issues/497#issuecomment-1979673270 I found the problem, it's related to Python 3.9.7. I updated to 3.9.12 and fixed the error. From the project's perspective, I believe that the isolation of the public API

Re: [I] Import statement unexpectedly executes functions in __init__.py [iceberg-python]

2024-03-05 Thread via GitHub
ndrluis commented on issue #497: URL: https://github.com/apache/iceberg-python/issues/497#issuecomment-1979663033 I was wrong about my hypothesis, but I believe that it is something occurring during the import process because the stacktrace error shows that the import is executing certain o

Re: [I] Support OAuth2 Client credential flow [iceberg-python]

2024-03-05 Thread via GitHub
flyrain commented on issue #463: URL: https://github.com/apache/iceberg-python/issues/463#issuecomment-1979651924 Closed. Thanks @danielcweeks and @syun64 ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [I] Support OAuth2 Client credential flow [iceberg-python]

2024-03-05 Thread via GitHub
flyrain closed issue #463: Support OAuth2 Client credential flow URL: https://github.com/apache/iceberg-python/issues/463 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

Re: [PR] Docs: Add Daft into Iceberg documentation [iceberg]

2024-03-05 Thread via GitHub
jaychia commented on PR #9836: URL: https://github.com/apache/iceberg/pull/9836#issuecomment-1979626594 > Given that this mainly targets pyiceberg, I think this doc should live in https://github.com/apache/iceberg-python/ Hello! The Daft query engine integrates with `pyiceberg` curren

Re: [PR] Dynamically support Spark native engine in Iceberg [iceberg]

2024-03-05 Thread via GitHub
huaxingao commented on PR #9721: URL: https://github.com/apache/iceberg/pull/9721#issuecomment-1979578692 cc @aokolnychyi @rdblue @RussellSpitzer @flyrain Here is the Comet/iceberg integration [PR](https://github.com/apache/iceberg/pull/9841). Please take a look when you have time. Tha

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-05 Thread via GitHub
szehon-ho commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1513414672 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -222,53 +230,188 @@ public boolean dropTable(TableIdentifier identifier, boolean purg

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-05 Thread via GitHub
szehon-ho commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1513413754 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -269,6 +284,160 @@ public void renameTable(TableIdentifier from, TableIdentifier orig

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-05 Thread via GitHub
szehon-ho commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1513407885 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -222,53 +230,188 @@ public boolean dropTable(TableIdentifier identifier, boolean purg

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-05 Thread via GitHub
szehon-ho commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1513399291 ## core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java: ## @@ -291,7 +286,7 @@ public long newSnapshotId() { }; } - protected enum Comm

Re: [PR] Sql catalog [iceberg-rust]

2024-03-05 Thread via GitHub
sdd commented on code in PR #229: URL: https://github.com/apache/iceberg-rust/pull/229#discussion_r1513373458 ## crates/catalog/sql/src/catalog.rs: ## @@ -0,0 +1,397 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. Se

Re: [PR] Spark 3.5: Fix system function pushdown in CoW row-level commands [iceberg]

2024-03-05 Thread via GitHub
aokolnychyi commented on code in PR #9873: URL: https://github.com/apache/iceberg/pull/9873#discussion_r1513354406 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/functions/BaseScalarFunction.java: ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Spark 3.5: Fix system function pushdown in CoW row-level commands [iceberg]

2024-03-05 Thread via GitHub
aokolnychyi commented on code in PR #9873: URL: https://github.com/apache/iceberg/pull/9873#discussion_r1513352554 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestSystemFunctionPushDownInRowLevelOperations.java: ## @@ -0,0 +1,348 @@ +/* + * L

Re: [PR] Spark 3.5: Fix system function pushdown in CoW row-level commands [iceberg]

2024-03-05 Thread via GitHub
aokolnychyi commented on code in PR #9873: URL: https://github.com/apache/iceberg/pull/9873#discussion_r1513351770 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/SparkPlanUtil.java: ## @@ -53,6 +58,49 @@ private static SparkPlan actualPlan(Spark

Re: [PR] Spark 3.5: Fix system function pushdown in CoW row-level commands [iceberg]

2024-03-05 Thread via GitHub
aokolnychyi commented on code in PR #9873: URL: https://github.com/apache/iceberg/pull/9873#discussion_r1513350467 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceStaticInvoke.scala: ## @@ -40,21 +48,39 @@ import org.apache.spark.sql.

Re: [PR] Spark 3.5: Fix system function pushdown in CoW row-level commands [iceberg]

2024-03-05 Thread via GitHub
aokolnychyi commented on code in PR #9873: URL: https://github.com/apache/iceberg/pull/9873#discussion_r1513349734 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceStaticInvoke.scala: ## @@ -40,21 +48,39 @@ import org.apache.spark.sql.

Re: [PR] Spark 3.5: Fix system function pushdown in CoW row-level commands [iceberg]

2024-03-05 Thread via GitHub
aokolnychyi commented on code in PR #9873: URL: https://github.com/apache/iceberg/pull/9873#discussion_r1513348794 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceStaticInvoke.scala: ## @@ -40,21 +48,39 @@ import org.apache.spark.sql.

[PR] Spark 3.5: Fix system function pushdown in CoW row-level commands [iceberg]

2024-03-05 Thread via GitHub
aokolnychyi opened a new pull request, #9873: URL: https://github.com/apache/iceberg/pull/9873 This PR fixes the system function predicate pushdown in CoW MERGE operations and adds tests for other use cases. Prior to this PR, we did not cover `ReplaceData` and `Join` nodes that are particip

Re: [PR] Spark: Adding simple custom partition sort order option to RewriteManifests Spark Action [iceberg]

2024-03-05 Thread via GitHub
zachdisc commented on PR #9731: URL: https://github.com/apache/iceberg/pull/9731#issuecomment-1979434799 > I see that's a good point, using the UDF function definitely is less performant than sorting native columns, in that case we can keep it as is, unless anyone has a way to fix the perf.

Re: [PR] Clarify pagination description [iceberg]

2024-03-05 Thread via GitHub
rdblue commented on code in PR #9872: URL: https://github.com/apache/iceberg/pull/9872#discussion_r1513332111 ## open-api/rest-catalog-open-api.yaml: ## @@ -1610,13 +1610,19 @@ components: PageToken: description: -An opaque token which allows clients to mak

Re: [PR] Clarify pagination description [iceberg]

2024-03-05 Thread via GitHub
rdblue commented on code in PR #9872: URL: https://github.com/apache/iceberg/pull/9872#discussion_r1513331183 ## open-api/rest-catalog-open-api.yaml: ## @@ -1610,13 +1610,19 @@ components: PageToken: description: -An opaque token which allows clients to mak

Re: [PR] Clarify pagination description [iceberg]

2024-03-05 Thread via GitHub
danielcweeks commented on code in PR #9872: URL: https://github.com/apache/iceberg/pull/9872#discussion_r1513326436 ## open-api/rest-catalog-open-api.yaml: ## @@ -1610,13 +1610,19 @@ components: PageToken: description: -An opaque token which allows clients

Re: [PR] Clarify pagination description [iceberg]

2024-03-05 Thread via GitHub
rdblue commented on code in PR #9872: URL: https://github.com/apache/iceberg/pull/9872#discussion_r1513322269 ## open-api/rest-catalog-open-api.yaml: ## @@ -1610,13 +1610,19 @@ components: PageToken: description: -An opaque token which allows clients to mak

Re: [PR] Clarify pagination description [iceberg]

2024-03-05 Thread via GitHub
rdblue commented on code in PR #9872: URL: https://github.com/apache/iceberg/pull/9872#discussion_r1513321631 ## open-api/rest-catalog-open-api.yaml: ## @@ -1610,13 +1610,19 @@ components: PageToken: description: -An opaque token which allows clients to mak

Re: [PR] Build: Don't publish iceberg-open-api module [iceberg]

2024-03-05 Thread via GitHub
danielcweeks merged PR #9871: URL: https://github.com/apache/iceberg/pull/9871 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceber

[PR] Clarify pagination description [iceberg]

2024-03-05 Thread via GitHub
danielcweeks opened a new pull request, #9872: URL: https://github.com/apache/iceberg/pull/9872 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

Re: [PR] [WIP][Discussion]CreateTableTransaction Implementation [iceberg-python]

2024-03-05 Thread via GitHub
syun64 commented on code in PR #498: URL: https://github.com/apache/iceberg-python/pull/498#discussion_r1513259769 ## pyiceberg/table/__init__.py: ## @@ -753,6 +807,143 @@ def update_table_metadata(base_metadata: TableMetadata, updates: Tuple[TableUpda return new_metadata.

Re: [PR] [WIP][Discussion]CreateTableTransaction Implementation [iceberg-python]

2024-03-05 Thread via GitHub
syun64 commented on code in PR #498: URL: https://github.com/apache/iceberg-python/pull/498#discussion_r1513264836 ## pyiceberg/catalog/glue.py: ## @@ -437,46 +471,69 @@ def _commit_table(self, table_request: CommitTableRequest) -> CommitTableRespons ) databas

Re: [PR] Fix pagination spec description [iceberg]

2024-03-05 Thread via GitHub
jackye1995 commented on PR #9866: URL: https://github.com/apache/iceberg/pull/9866#issuecomment-1979323286 I think we still need to figure out `GET /tables?pageToken` vs `GET /tables?pageToken=` or both. As I said in the old PR, in https://www.youtube.com/watch?v=uAQVGd5zV4I, startin

Re: [PR] Build: Don't publish iceberg-open-api module [iceberg]

2024-03-05 Thread via GitHub
nastra commented on code in PR #9871: URL: https://github.com/apache/iceberg/pull/9871#discussion_r1513237163 ## deploy.gradle: ## @@ -22,6 +22,11 @@ if (project.hasProperty('release') && jdkVersion != '8') { } subprojects { + if (it.name == 'iceberg-open-api') { Review Co

Re: [PR] [WIP][Discussion]CreateTableTransaction Implementation [iceberg-python]

2024-03-05 Thread via GitHub
syun64 commented on code in PR #498: URL: https://github.com/apache/iceberg-python/pull/498#discussion_r1513218389 ## pyiceberg/catalog/glue.py: ## @@ -358,6 +365,33 @@ def _get_glue_table(self, database_name: str, table_name: str) -> TableTypeDef: except self.glue.exc

Re: [PR] Open-api: update prefix param description [iceberg]

2024-03-05 Thread via GitHub
ajantha-bhat commented on code in PR #9870: URL: https://github.com/apache/iceberg/pull/9870#discussion_r1513198334 ## open-api/rest-catalog-open-api.yaml: ## @@ -1444,7 +1444,7 @@ components: schema: type: string required: true - description: An opti

Re: [I] Support OAuth2 Client credential flow [iceberg-python]

2024-03-05 Thread via GitHub
syun64 commented on issue #463: URL: https://github.com/apache/iceberg-python/issues/463#issuecomment-1979175057 Given that we've confirmed that Client Credentials flow is already supported, and we've fixed the current model to make attributes optional, is this issue good to close @flyrain

Re: [PR] Spark: Adding simple custom partition sort order option to RewriteManifests Spark Action [iceberg]

2024-03-05 Thread via GitHub
zachdisc commented on code in PR #9731: URL: https://github.com/apache/iceberg/pull/9731#discussion_r1513132291 ## api/src/main/java/org/apache/iceberg/actions/RewriteManifests.java: ## @@ -44,6 +47,39 @@ public interface RewriteManifests */ RewriteManifests rewriteIf(Pre

Re: [PR] [Bug Fix] cast None `current-snapshot-id` as -1 for Backwards Compatibility [iceberg-python]

2024-03-05 Thread via GitHub
syun64 commented on PR #473: URL: https://github.com/apache/iceberg-python/pull/473#issuecomment-1979164521 > This looks great, one minor suggestion: > > Could you add the `legacy-current-snapshot-id` key to the write options table as well: https://github.com/apache/iceberg-python/bl

Re: [PR] Flink: Made IcebergFilesCommitter work with single phase commit [iceberg]

2024-03-05 Thread via GitHub
mudit-97 commented on PR #9694: URL: https://github.com/apache/iceberg/pull/9694#issuecomment-1979163696 sure, I already have a thread there, let me add you also there

Re: [PR] Flink: Made IcebergFilesCommitter work with single phase commit [iceberg]

2024-03-05 Thread via GitHub
stevenzwu commented on PR #9694: URL: https://github.com/apache/iceberg/pull/9694#issuecomment-1979161305 @mudit-97 next community sync meeting might be focused on materialized view. the other option to get broader feedback is to start a discussion thread on dev@, which tends to get more at

Re: [PR] Build: Don't publish iceberg-open-api module [iceberg]

2024-03-05 Thread via GitHub
ajantha-bhat commented on code in PR #9871: URL: https://github.com/apache/iceberg/pull/9871#discussion_r1513107056 ## deploy.gradle: ## @@ -22,6 +22,11 @@ if (project.hasProperty('release') && jdkVersion != '8') { } subprojects { + if (it.name == 'iceberg-open-api') { Rev

Re: [PR] Spark: Adding simple custom partition sort order option to RewriteManifests Spark Action [iceberg]

2024-03-05 Thread via GitHub
jackye1995 commented on PR #9731: URL: https://github.com/apache/iceberg/pull/9731#issuecomment-1979126601 > I don't think it yields quite as performant a result unfortunately. I see that's a good point, using the UDF function definitely is less performant than sorting native columns

Re: [I] Support deletion in Apache Flink [iceberg]

2024-03-05 Thread via GitHub
AjayChitumalla commented on issue #8718: URL: https://github.com/apache/iceberg/issues/8718#issuecomment-1979125075 > Is this for a V2 table? I have seen deleting rows working using V2 table, Java code with the stream API, but I yet to try out SQL. Can you share a reference for perfor

[PR] Build: Don't publish iceberg-open-api module [iceberg]

2024-03-05 Thread via GitHub
nastra opened a new pull request, #9871: URL: https://github.com/apache/iceberg/pull/9871 (no comment)

Re: [PR] Improve the InMemory Catalog Implementation [iceberg-python]

2024-03-05 Thread via GitHub
kevinjqliu commented on PR #289: URL: https://github.com/apache/iceberg-python/pull/289#issuecomment-1979115277 Thanks for the suggestions, @Fokko

Re: [PR] Spark: Adding simple custom partition sort order option to RewriteManifests Spark Action [iceberg]

2024-03-05 Thread via GitHub
jackye1995 commented on code in PR #9731: URL: https://github.com/apache/iceberg/pull/9731#discussion_r1513085638 ## api/src/main/java/org/apache/iceberg/actions/RewriteManifests.java: ## @@ -44,6 +47,39 @@ public interface RewriteManifests */ RewriteManifests rewriteIf(P

Re: [PR] Docs: Fix links to internal files [iceberg]

2024-03-05 Thread via GitHub
nastra commented on code in PR #9819: URL: https://github.com/apache/iceberg/pull/9819#discussion_r1513086680 ## docs/docs/flink.md: ## @@ -24,20 +24,20 @@ Apache Iceberg supports both [Apache Flink](https://flink.apache.org/)'s DataStr | Feature support
