Re: [PR] Hive: Arrange common part of the code for Iceberg View. [iceberg]

2024-03-25 Thread via GitHub
nastra merged PR #10001: URL: https://github.com/apache/iceberg/pull/10001 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-25 Thread via GitHub
nastra commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1538667986 ## open-api/rest-catalog-open-api.yaml: ## @@ -537,6 +537,113 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-25 Thread via GitHub
nastra commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1528543356 ## open-api/rest-catalog-open-api.yaml: ## @@ -537,6 +537,113 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/nam

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-25 Thread via GitHub
nk1506 commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1538653347 ## core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java: ## @@ -309,65 +304,20 @@ protected enum CommitStatus { * @return Commit Status of Success

[PR] Migrate WAP and Metrics in Core to JUnit5 [iceberg]

2024-03-25 Thread via GitHub
tomtongue opened a new pull request, #10039: URL: https://github.com/apache/iceberg/pull/10039 Migrate the following test classes in iceberg-core to JUnit 5 and AssertJ style for https://github.com/apache/iceberg/issues/9085. ## Current Progress - [x] `TestWapWorkflow` - [x] `T

Re: [PR] feat: Make OAuth token server configurable [iceberg-rust]

2024-03-25 Thread via GitHub
whynick1 commented on PR #305: URL: https://github.com/apache/iceberg-rust/pull/305#issuecomment-2019452515 @flyrain @liurenjie1024 Please take a look. Thanks.

Re: [PR] feat: Support customized header in Rest catalog client [iceberg-rust]

2024-03-25 Thread via GitHub
whynick1 commented on PR #306: URL: https://github.com/apache/iceberg-rust/pull/306#issuecomment-2019455618 @flyrain @liurenjie1024 Please take a look. Thanks.

Re: [I] Cannot create table if location/endpoint is s3 on a "secure" Minio server [iceberg-python]

2024-03-25 Thread via GitHub
thinkORo commented on issue #540: URL: https://github.com/apache/iceberg-python/issues/540#issuecomment-2019454045 Hi Kevin, should I close this issue? I think everything is documented, isn't it? Thank you for your support. The only thing that I still need is a data compaction. But I

Re: [PR] feat: Make OAuth token server configurable [iceberg-rust]

2024-03-25 Thread via GitHub
whynick1 commented on code in PR #305: URL: https://github.com/apache/iceberg-rust/pull/305#discussion_r1538627909 ## crates/catalog/rest/src/catalog.rs: ## @@ -956,6 +964,39 @@ mod tests { ); } +#[tokio::test] +async fn test_oauth_with_auth_url() { +

Re: [PR] feat: Support customized header in Rest catalog client [iceberg-rust]

2024-03-25 Thread via GitHub
whynick1 commented on code in PR #306: URL: https://github.com/apache/iceberg-rust/pull/306#discussion_r1538626859 ## crates/catalog/rest/src/catalog.rs: ## @@ -956,6 +983,68 @@ mod tests { ); } +#[tokio::test] +async fn test_get_default_headers() { +

Re: [PR] feat: Make OAuth token server configurable [iceberg-rust]

2024-03-25 Thread via GitHub
whynick1 commented on code in PR #305: URL: https://github.com/apache/iceberg-rust/pull/305#discussion_r1538627197 ## crates/catalog/rest/src/catalog.rs: ## @@ -866,8 +870,12 @@ mod tests { } async fn create_oauth_mock(server: &mut ServerGuard) -> Mock { +cre

Re: [PR] feat: Support customized header in Rest catalog client [iceberg-rust]

2024-03-25 Thread via GitHub
whynick1 commented on code in PR #306: URL: https://github.com/apache/iceberg-rust/pull/306#discussion_r1538626596 ## crates/catalog/rest/src/catalog.rs: ## @@ -130,6 +130,33 @@ impl RestCatalogConfig { ); } +for (key, value) in self.props.iter()

Re: [PR] Manifest list encryption [iceberg]

2024-03-25 Thread via GitHub
ggershinsky commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1538620497 ## api/src/main/java/org/apache/iceberg/Snapshot.java: ## @@ -162,6 +162,15 @@ default Iterable removedDeleteFiles(FileIO io) { */ String manifestListLocati

Re: [PR] Parquet: Implement column index filter and update row read path to support page skipping [iceberg]

2024-03-25 Thread via GitHub
iflytek-hmwang5 commented on PR #6967: URL: https://github.com/apache/iceberg/pull/6967#issuecomment-2019420545 > Hi @iflytek-hmwang5 , We're waiting for people in the community who are interested in it to review it. What can we do to speed up the process? Looking forward to this feature

Re: [PR] Parquet: Implement column index filter and update row read path to support page skipping [iceberg]

2024-03-25 Thread via GitHub
zhongyujiang commented on PR #6967: URL: https://github.com/apache/iceberg/pull/6967#issuecomment-2019404655 Hi @iflytek-hmwang5 , We're waiting for people in the community who are interested in it to review it.

Re: [PR] Core: Add EnvironmentContext to commit summary [iceberg]

2024-03-25 Thread via GitHub
manuzhang commented on PR #9273: URL: https://github.com/apache/iceberg/pull/9273#issuecomment-2019398747 @nastra how can we move this forward?

Re: [PR] Hive: Arrange common part of the code for Iceberg View. [iceberg]

2024-03-25 Thread via GitHub
nk1506 commented on code in PR #10001: URL: https://github.com/apache/iceberg/pull/10001#discussion_r1537915830 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -304,23 +310,13 @@ protected void doCommit(TableMetadata base, TableMetadata m

Re: [I] Ci: Throttling causes flakyness in link checker [iceberg]

2024-03-25 Thread via GitHub
manuzhang commented on issue #10038: URL: https://github.com/apache/iceberg/issues/10038#issuecomment-2019389582 429 errors can be mitigated with the config `"retryOn429": true` as per https://github.com/tcort/markdown-link-check?tab=readme-ov-file#config-file-format while there are no such opti

Re: [I] Ci: Throttling causes flakyness in link checker [iceberg]

2024-03-25 Thread via GitHub
manuzhang commented on issue #10038: URL: https://github.com/apache/iceberg/issues/10038#issuecomment-2019388084 I'm also seeing 503 error in https://github.com/apache/iceberg/actions/runs/8423700634/job/23065932581?pr=10037 ``` [✖] https://medium.com/@ayushtkn/apache-hive-4-x-wit

[PR] feat: Glue Catalog - namespace operations (2/3) [iceberg-rust]

2024-03-25 Thread via GitHub
marvinlanhenke opened a new pull request, #304: URL: https://github.com/apache/iceberg-rust/pull/304 ### Which issue does this PR close? Partly #249 (Task 2/3) ### Rationale for this change Add support for Glue Catalog, to reach feature parity with other implementations.

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-25 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1538559396 ## open-api/rest-catalog-open-api.yaml: ## @@ -537,6 +537,113 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/na

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-25 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1538554450 ## open-api/rest-catalog-open-api.yaml: ## @@ -537,6 +537,113 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/na

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-25 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1538554648 ## open-api/rest-catalog-open-api.yaml: ## @@ -537,6 +537,113 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/na

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-25 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1538552714 ## open-api/rest-catalog-open-api.yaml: ## @@ -537,6 +537,113 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/na

Re: [I] Implement `PruneColumns` for `Schema`. [iceberg-rust]

2024-03-25 Thread via GitHub
Dysprosium0626 commented on issue #251: URL: https://github.com/apache/iceberg-rust/issues/251#issuecomment-2019320012 Hi @liurenjie1024, thanks for your review. This issue can be closed now.

Re: [PR] Add Pagination To List Apis [iceberg]

2024-03-25 Thread via GitHub
rahil-c commented on PR #9782: URL: https://github.com/apache/iceberg/pull/9782#issuecomment-2019317801 @jackye1995 @danielcweeks @nastra If you could take a look at the revision for this whenever you get a chance, it would be appreciated.

Re: [PR] Add Pagination To List Apis [iceberg]

2024-03-25 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1538538607 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -490,12 +522,29 @@ public void createNamespace( @Override public List listNamespace

Re: [PR] Add Pagination To List Apis [iceberg]

2024-03-25 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1538536990 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -114,6 +114,9 @@ public class RESTSessionCatalog extends BaseViewSessionCatalog private

Re: [PR] On write operation, cast data to Iceberg Table's pyarrow schema [iceberg-python]

2024-03-25 Thread via GitHub
kevinjqliu commented on code in PR #523: URL: https://github.com/apache/iceberg-python/pull/523#discussion_r1538537111 ## pyiceberg/table/__init__.py: ## @@ -1156,7 +1166,9 @@ def overwrite( if len(self.spec().fields) > 0: raise ValueError("Cannot write to

Re: [I] How to load a table with bucket partition [iceberg-python]

2024-03-25 Thread via GitHub
frankliee commented on issue #548: URL: https://github.com/apache/iceberg-python/issues/548#issuecomment-2019248895 I have considered this, `.plan_files()` will get all files, and cannot distinguish which files are in the same bucket. For Iceberg-Spark, there is a helpful system funct

Re: [I] Spark can not delete table metadata and data when drop table [iceberg]

2024-03-25 Thread via GitHub
wanghualei commented on issue #9990: URL: https://github.com/apache/iceberg/issues/9990#issuecomment-2019248320 > > Because it undermines table level integrity > > Iceberg manages table integrity. What can be improved is to offer options to delete directory when users know it's safe t

Re: [PR] Add labeler [iceberg-python]

2024-03-25 Thread via GitHub
HonahX commented on PR #549: URL: https://github.com/apache/iceberg-python/pull/549#issuecomment-2019206065 Shall we also add a workflow that uses these labels in this PR? As suggested in https://github.com/marketplace/actions/labeler `.github/workflows/labeler.yml` ```yml name:

Re: [I] Spark can not delete table metadata and data when drop table [iceberg]

2024-03-25 Thread via GitHub
manuzhang commented on issue #9990: URL: https://github.com/apache/iceberg/issues/9990#issuecomment-2019189367 > Because it undermines table level integrity Iceberg manages table integrity. What can be improved is to offer options to delete directory when users know it's safe to do so.

Re: [I] Can pyiceberg rename iceberg database? [iceberg-python]

2024-03-25 Thread via GitHub
madeirak closed issue #546: Can pyiceberg rename iceberg database? URL: https://github.com/apache/iceberg-python/issues/546

Re: [I] Spark can not delete table metadata and data when drop table [iceberg]

2024-03-25 Thread via GitHub
wanghualei commented on issue #9990: URL: https://github.com/apache/iceberg/issues/9990#issuecomment-2019148264 > For example, I can create a table B with location under that of table A. I don't want to delete table B when dropping table A. If we reason like this, does HadoopCatalog a

Re: [I] Spark can not delete table metadata and data when drop table [iceberg]

2024-03-25 Thread via GitHub
wanghualei commented on issue #9990: URL: https://github.com/apache/iceberg/issues/9990#issuecomment-2019145436 > For example, I can create a table B with location under that of table A. I don't want to delete table B when dropping table A. Generally speaking, this situation should no

Re: [I] Please tidy up the incubator release directories [iceberg]

2024-03-25 Thread via GitHub
sebbASF closed issue #2261: Please tidy up the incubator release directories URL: https://github.com/apache/iceberg/issues/2261

Re: [I] Please tidy up the incubator release directories [iceberg]

2024-03-25 Thread via GitHub
sebbASF commented on issue #2261: URL: https://github.com/apache/iceberg/issues/2261#issuecomment-2019141829 Directories have been deleted

Re: [I] Please tidy up the incubator release directories [iceberg]

2024-03-25 Thread via GitHub
github-actions[bot] commented on issue #2261: URL: https://github.com/apache/iceberg/issues/2261#issuecomment-2019139343 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Custom partition functions [iceberg]

2024-03-25 Thread via GitHub
github-actions[bot] commented on issue #1482: URL: https://github.com/apache/iceberg/issues/1482#issuecomment-2019139023 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Custom partition functions [iceberg]

2024-03-25 Thread via GitHub
github-actions[bot] closed issue #1482: Custom partition functions URL: https://github.com/apache/iceberg/issues/1482

Re: [PR] OpenAPI: Fix additionalProperties for SnapshotSummary [iceberg]

2024-03-25 Thread via GitHub
haizhou-zhao commented on code in PR #9838: URL: https://github.com/apache/iceberg/pull/9838#discussion_r1538384050 ## open-api/rest-catalog-open-api.py: ## @@ -171,7 +171,6 @@ class SortOrder(BaseModel): class Summary(BaseModel): operation: Literal['append', 'replace',

Re: [PR] Core: Calling rewrite_position_delete_files fails on tables with more than 1k columns [iceberg]

2024-03-25 Thread via GitHub
szehon-ho commented on PR #10020: URL: https://github.com/apache/iceberg/pull/10020#issuecomment-2019108301 Actually testing further, this does not work correctly if position delete file actually has 'row' value populated. The problem being, the position delete file is a parquet file with

Re: [PR] core,api: Refactor code with `hasLiveEntries` [iceberg]

2024-03-25 Thread via GitHub
dramaticlly commented on code in PR #9993: URL: https://github.com/apache/iceberg/pull/9993#discussion_r1538353913 ## api/src/main/java/org/apache/iceberg/ManifestFile.java: ## @@ -177,6 +177,15 @@ default boolean hasDeletedFiles() { /** Returns the total number of rows in al

Re: [PR] Core: Add data sequence number as derived column to files metadata table [iceberg]

2024-03-25 Thread via GitHub
dramaticlly commented on code in PR #9813: URL: https://github.com/apache/iceberg/pull/9813#discussion_r1538338541 ## core/src/main/java/org/apache/iceberg/MetadataTableUtils.java: ## @@ -109,4 +117,68 @@ public static Table createMetadataTableInstance( private static String

Re: [PR] Core: Add data sequence number as derived column to files metadata table [iceberg]

2024-03-25 Thread via GitHub
dramaticlly commented on code in PR #9813: URL: https://github.com/apache/iceberg/pull/9813#discussion_r1538335617 ## core/src/main/java/org/apache/iceberg/BaseFilesTable.java: ## @@ -158,14 +176,26 @@ static class ManifestReadTask extends BaseFileScanTask implements DataTask {

Re: [PR] Core: Add data sequence number as derived column to files metadata table [iceberg]

2024-03-25 Thread via GitHub
dramaticlly commented on code in PR #9813: URL: https://github.com/apache/iceberg/pull/9813#discussion_r1538335078 ## core/src/main/java/org/apache/iceberg/BaseFilesTable.java: ## @@ -54,7 +56,23 @@ public Schema schema() { schema = TypeUtil.selectNot(schema, Sets.newHas

Re: [PR] Core, Spark: Fix handling of null binary values when sorting with zorder [iceberg]

2024-03-25 Thread via GitHub
amogh-jahagirdar merged PR #10026: URL: https://github.com/apache/iceberg/pull/10026

Re: [PR] Core, Spark: Fix handling of null binary values when sorting with zorder [iceberg]

2024-03-25 Thread via GitHub
amogh-jahagirdar commented on PR #10026: URL: https://github.com/apache/iceberg/pull/10026#issuecomment-2018998345 Merging, thanks for the reviews @RussellSpitzer @nastra !

Re: [PR] Core, Spark: Fix handling of null binary values when sorting with zorder [iceberg]

2024-03-25 Thread via GitHub
amogh-jahagirdar commented on code in PR #10026: URL: https://github.com/apache/iceberg/pull/10026#discussion_r1538289522 ## spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java: ## @@ -218,6 +218,44 @@ public void test

Re: [I] Bloom filter not properly leveraged when using an OR condition [iceberg]

2024-03-25 Thread via GitHub
amogh-jahagirdar commented on issue #10029: URL: https://github.com/apache/iceberg/issues/10029#issuecomment-2018962330 I've been following this thread and after thinking about the proposed solution and going through the code a bit more, I think @cccs-jc approach is logically sound. This is

Re: [PR] Core, Spark: Fix handling of null binary values when sorting with zorder [iceberg]

2024-03-25 Thread via GitHub
amogh-jahagirdar commented on code in PR #10026: URL: https://github.com/apache/iceberg/pull/10026#discussion_r1538240118 ## spark/v3.4/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java: ## @@ -261,6 +261,44 @@ public void test

Re: [PR] Add Pagination To List Apis [iceberg]

2024-03-25 Thread via GitHub
danielcweeks commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1538216438 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -114,6 +114,9 @@ public class RESTSessionCatalog extends BaseViewSessionCatalog pr

Re: [PR] On write operation, cast data to Iceberg Table's pyarrow schema [iceberg-python]

2024-03-25 Thread via GitHub
kevinjqliu commented on code in PR #523: URL: https://github.com/apache/iceberg-python/pull/523#discussion_r1538212872 ## pyiceberg/table/__init__.py: ## @@ -1156,7 +1166,9 @@ def overwrite( if len(self.spec().fields) > 0: raise ValueError("Cannot write to

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-25 Thread via GitHub
abmo-x commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1538176793 ## core/src/test/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

[I] Ci: Throttling causes flakyness in link checker [iceberg]

2024-03-25 Thread via GitHub
CsengerG opened a new issue, #10038: URL: https://github.com/apache/iceberg/issues/10038 ### Apache Iceberg version 1.5.0 (latest release) ### Query engine None ### Please describe the bug 🐞 When you open a new Iceberg PR, the link checker CI step sometimes

Re: [PR] Aws: Add Iceberg version to UserAgent in S3 requests [iceberg]

2024-03-25 Thread via GitHub
CsengerG commented on PR #9963: URL: https://github.com/apache/iceberg/pull/9963#issuecomment-2018841589 Opened https://github.com/apache/iceberg/issues/10038 to track.

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-25 Thread via GitHub
abmo-x commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1538178042 ## core/src/test/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] On write operation, cast data to Iceberg Table's pyarrow schema [iceberg-python]

2024-03-25 Thread via GitHub
Fokko commented on code in PR #523: URL: https://github.com/apache/iceberg-python/pull/523#discussion_r1538172665 ## pyiceberg/table/__init__.py: ## @@ -1156,7 +1166,9 @@ def overwrite( if len(self.spec().fields) > 0: raise ValueError("Cannot write to parti

Re: [PR] Support CreateTableTransaction in Glue and Rest [iceberg-python]

2024-03-25 Thread via GitHub
syun64 commented on code in PR #498: URL: https://github.com/apache/iceberg-python/pull/498#discussion_r1538156050 ## pyiceberg/catalog/__init__.py: ## @@ -288,6 +291,78 @@ def __init__(self, name: str, **properties: str): def _load_file_io(self, properties: Properties = EM

[PR] fix: HMS Catalog missing properties `fn create_namespace` [iceberg-rust]

2024-03-25 Thread via GitHub
marvinlanhenke opened a new pull request, #303: URL: https://github.com/apache/iceberg-rust/pull/303 closes #302

Re: [PR] Aws: Add Iceberg version to UserAgent in S3 requests [iceberg]

2024-03-25 Thread via GitHub
amogh-jahagirdar merged PR #9963: URL: https://github.com/apache/iceberg/pull/9963

Re: [PR] Aws: Add Iceberg version to UserAgent in S3 requests [iceberg]

2024-03-25 Thread via GitHub
amogh-jahagirdar commented on PR #9963: URL: https://github.com/apache/iceberg/pull/9963#issuecomment-2018700661 Since the failures were throttling errors from another service, I retried and they succeeded. If this is a consistent problem now in our CI, let's create a separate issue to track

Re: [PR] Aws: Add Iceberg version to UserAgent in S3 requests [iceberg]

2024-03-25 Thread via GitHub
nastra commented on PR #9963: URL: https://github.com/apache/iceberg/pull/9963#issuecomment-2018662340 > Looks like the link checker got throttled (HTTP 429s from Medium). Is this a known failure mode? FYI @Fokko

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-25 Thread via GitHub
stevenzwu commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1538061348 ## core/src/test/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-25 Thread via GitHub
szehon-ho commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1538017237 ## core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java: ## @@ -309,65 +304,20 @@ protected enum CommitStatus { * @return Commit Status of Succ

Re: [PR] [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist [iceberg]

2024-03-25 Thread via GitHub
abmo-x commented on code in PR #9998: URL: https://github.com/apache/iceberg/pull/9998#discussion_r1537981941 ## core/src/test/java/org/apache/iceberg/hadoop/HadoopStreamsTest.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

Re: [PR] Spark: Add CopyTable spark action [iceberg]

2024-03-25 Thread via GitHub
jotarada commented on code in PR #10024: URL: https://github.com/apache/iceberg/pull/10024#discussion_r1537942005 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/BaseCopyTableSparkAction.java: ## @@ -0,0 +1,871 @@ +/* + * Licensed to the Apache Software Founda

Re: [I] Consider renaming `data_file` field on FileScanTask to maintain consistency and avoid confusion [iceberg-rust]

2024-03-25 Thread via GitHub
a-agmon commented on issue #299: URL: https://github.com/apache/iceberg-rust/issues/299#issuecomment-2018477011 > > Sorry I don't quite get your meaning, do you mean to use `manifest_entry` for data file, and `deletes` for deletions? > > Yes. Something like that. Not sure if th

Re: [I] Can pyiceberg rename iceberg database? [iceberg-python]

2024-03-25 Thread via GitHub
syun64 commented on issue #546: URL: https://github.com/apache/iceberg-python/issues/546#issuecomment-2018451026 You can extend the same function to renaming the table namespace, granted that the table namespace exists. ``` properties = {##Your catalog properties##} catalog = lo

[PR] Add labeler [iceberg-python]

2024-03-25 Thread via GitHub
Fokko opened a new pull request, #549: URL: https://github.com/apache/iceberg-python/pull/549 To add the ability to filter on specific components. The list of labels is probably not exhaustive, but is a first start.

Re: [I] How to load a table with bucket partition [iceberg-python]

2024-03-25 Thread via GitHub
Fokko commented on issue #548: URL: https://github.com/apache/iceberg-python/issues/548#issuecomment-2018364320 Ah, I misread what you're looking for. Have you considered the `.plan_files()` API where you just get a list of tasks to read?

Re: [PR] Aws: Add Iceberg version to UserAgent in S3 requests [iceberg]

2024-03-25 Thread via GitHub
CsengerG commented on PR #9963: URL: https://github.com/apache/iceberg/pull/9963#issuecomment-2018358125 Looks like the link checker got throttled (HTTP 429s from Medium). Is this a known failure mode?

Re: [I] Can pyiceberg rename iceberg database? [iceberg-python]

2024-03-25 Thread via GitHub
madeirak commented on issue #546: URL: https://github.com/apache/iceberg-python/issues/546#issuecomment-2018270689 thx for reply, I mean table namespace. Is there any function to rename this?

Re: [PR] Migrate other files in Core to JUnit5 [iceberg]

2024-03-25 Thread via GitHub
nastra merged PR #10027: URL: https://github.com/apache/iceberg/pull/10027

Re: [I] rest: refresh the OAuth token when expired [iceberg-rust]

2024-03-25 Thread via GitHub
Xuanwo commented on issue #301: URL: https://github.com/apache/iceberg-rust/issues/301#issuecomment-2018252582 Have fun!

Re: [I] How to load a table with bucket partition [iceberg-python]

2024-03-25 Thread via GitHub
frankliee commented on issue #548: URL: https://github.com/apache/iceberg-python/issues/548#issuecomment-2018198305 > @frankliee PyIceberg will do the filtering automatically, so when you filter on the id column, it will automatically use the bucketing to filter down to the correct bucket:

Re: [PR] Fix race condition on `Table.scan` with `limit` [iceberg-python]

2024-03-25 Thread via GitHub
bigluck commented on PR #545: URL: https://github.com/apache/iceberg-python/pull/545#issuecomment-2018170492 Thanks @kevinjqliu

Re: [I] Consider renaming `data_file` field on FileScanTask to maintain consistency and avoid confusion [iceberg-rust]

2024-03-25 Thread via GitHub
marvinlanhenke commented on issue #299: URL: https://github.com/apache/iceberg-rust/issues/299#issuecomment-2018168538 > Sorry I don't quite get your meaning, do you mean to use `manifest_entry` for data file, and `deletes` for deletions? Yes. Something like that.

Re: [I] rest: refresh the token when expired [iceberg-rust]

2024-03-25 Thread via GitHub
TennyZhuang commented on issue #301: URL: https://github.com/apache/iceberg-rust/issues/301#issuecomment-2018161639 request for self-assign

Re: [I] Can pyiceberg rename iceberg database? [iceberg-python]

2024-03-25 Thread via GitHub
syun64 commented on issue #546: URL: https://github.com/apache/iceberg-python/issues/546#issuecomment-2018152377 Hi @madeirak , by "iceberg database", which one of the following are you referring to? To my knowledge, there are three levels of naming in Iceberg: - catalog - table na

Re: [PR] feat: add transform_literal [iceberg-rust]

2024-03-25 Thread via GitHub
ZENOTME commented on PR #287: URL: https://github.com/apache/iceberg-rust/pull/287#issuecomment-2018117523 Thanks for your review! Have fixed it. @liurenjie1024 @marvinlanhenke

Re: [PR] Docs: Fix inconsistency in branching and tagging scenario [iceberg]

2024-03-25 Thread via GitHub
bitsondatadev commented on PR #9968: URL: https://github.com/apache/iceberg/pull/9968#issuecomment-2018058241 @lawofcycles Thanks for your patience and willingness to help here! I'd like to consider an alternative explanation. I think conflating version with days is a big part of the confus

Re: [I] How to load a table with bucket partition [iceberg-python]

2024-03-25 Thread via GitHub
Fokko commented on issue #548: URL: https://github.com/apache/iceberg-python/issues/548#issuecomment-2018051402 @frankliee PyIceberg will do the filtering automatically, so when you filter on the id column, it will automatically use the bucketing to filter down to the correct bucket:
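
The example that followed in the original comment is cut off in this archive and is not reproduced here. As a rough illustration of why filtering on `id` can narrow a scan to one bucket, the sketch below follows the bucket transform as described in the Iceberg table spec: an `int` is hashed as an 8-byte little-endian long with 32-bit MurmurHash3, and the bucket is `(hash & Integer.MAX_VALUE) % N`. The helper names (`murmur3_x86_32`, `iceberg_hash_int`, `bucket`) are hypothetical and written for this sketch; they are not PyIceberg APIs.

```python
import struct


def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """Plain-Python MurmurHash3 (x86, 32-bit); returns an unsigned 32-bit value."""
    mask = 0xFFFFFFFF
    c1, c2 = 0xCC9E2D51, 0x1B873593

    def rotl(x: int, r: int) -> int:
        return ((x << r) | (x >> (32 - r))) & mask

    h = seed & mask
    n_blocks = len(data) // 4
    # Body: mix each 4-byte little-endian block into the hash state.
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = rotl((k * c1) & mask, 15)
        k = (k * c2) & mask
        h = rotl(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & mask

    # Tail: mix the remaining 1-3 bytes, if any.
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = rotl((k * c1) & mask, 15)
        k = (k * c2) & mask
        h ^= k

    # Finalization (avalanche).
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def iceberg_hash_int(value: int) -> int:
    """Hash an int the way the spec describes: as an 8-byte little-endian long."""
    return murmur3_x86_32(struct.pack("<q", value))


def bucket(value: int, num_buckets: int) -> int:
    """Bucket index for `bucket(num_buckets, column)` partitioning."""
    return (iceberg_hash_int(value) & 0x7FFFFFFF) % num_buckets


if __name__ == "__main__":
    # With `partitioned by (bucket(4, id))`, an equality filter on `id`
    # maps to exactly one of the four buckets, so the others can be skipped.
    wanted_id = 34
    print(f"id = {wanted_id} lives in bucket {bucket(wanted_id, 4)} of 4")
```

In PyIceberg itself this pruning is not written by hand: the filter is passed as `row_filter` to `Table.scan`, and the library derives the matching bucket from the partition spec.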

Re: [I] Consider renaming `data_file` field on FileScanTask to maintain consistency and avoid confusion [iceberg-rust]

2024-03-25 Thread via GitHub
a-agmon commented on issue #299: URL: https://github.com/apache/iceberg-rust/issues/299#issuecomment-2018036775 Hi, please see this PR and let me know what you think: https://github.com/apache/iceberg-rust/pull/300

Re: [PR] Spark: Add CopyTable spark action [iceberg]

2024-03-25 Thread via GitHub
laithalzyoud commented on PR #10024: URL: https://github.com/apache/iceberg/pull/10024#issuecomment-2018030997 > @laithalzyoud thanks for working on this. I just did a very quick high-level review, but will do a more thorough one this week Thanks for taking a look @nastra, I'll stash

[PR] fix: renaming FileScanTask.data_file to data_manifest_entry [iceberg-rust]

2024-03-25 Thread via GitHub
a-agmon opened a new pull request, #300: URL: https://github.com/apache/iceberg-rust/pull/300 resolves: #299 This PR: 1. Renames FileScanTask.data_file to data_manifest_entry (it returns `ManifestEntry`) 2. Exposes data_file() on `ManifestEntry` 3. Changes access to d

Re: [PR] feat: rest client respect prefix prop [iceberg-rust]

2024-03-25 Thread via GitHub
liurenjie1024 commented on PR #297: URL: https://github.com/apache/iceberg-rust/pull/297#issuecomment-2017998890 cc @flyrain PTAL

[I] How to load a table with bucket partitioned [iceberg-python]

2024-03-25 Thread via GitHub
frankliee opened a new issue, #548: URL: https://github.com/apache/iceberg-python/issues/548

### Question

The table DDL is

```
create table test1 (
  id int,
  data int
) using iceberg
partitioned by (bucket(4, id));
```

How to write the row_filter to load o

Re: [I] Consider renaming `data_file` field on FileScanTask to maintain consistency and avoid confusion [iceberg-rust]

2024-03-25 Thread via GitHub
a-agmon commented on issue #299: URL: https://github.com/apache/iceberg-rust/issues/299#issuecomment-2017966111 > Hi, @a-agmon Thanks for raising this. Let me add some background here, the reason the field is named as `data_file` is that, in fact for each `FileScanTask` we have not only dat

Re: [PR] feat: Complete predicate builders for all operators. [iceberg-rust]

2024-03-25 Thread via GitHub
QuakeWang commented on PR #276: URL: https://github.com/apache/iceberg-rust/pull/276#issuecomment-2017945301 > cc @QuakeWang Are you still working on this? Sorry, I've been a bit busy lately, I'll keep following up.

Re: [I] Spark can not delete table metadata and data when drop table [iceberg]

2024-03-25 Thread via GitHub
manuzhang commented on issue #9990: URL: https://github.com/apache/iceberg/issues/9990#issuecomment-2017943260 For example, I can create a table B with location under that of table A. I don't want to delete table B when dropping table A.

Re: [I] Spark can not delete table metadata and data when drop table [iceberg]

2024-03-25 Thread via GitHub
wanghualei commented on issue #9990: URL: https://github.com/apache/iceberg/issues/9990#issuecomment-2017939050 Is it related to object storage? There is no directory relationship.

Re: [I] Spark can not delete table metadata and data when drop table [iceberg]

2024-03-25 Thread via GitHub
wanghualei commented on issue #9990: URL: https://github.com/apache/iceberg/issues/9990#issuecomment-2017924722 @manuzhang > @tomfans If you mean empty table directories are left over, I can confirm that's behavior for `HiveCatalog`. It removes the table record from metastore, and

Re: [PR] feat: Complete predicate builders for all operators. [iceberg-rust]

2024-03-25 Thread via GitHub
liurenjie1024 commented on PR #276: URL: https://github.com/apache/iceberg-rust/pull/276#issuecomment-2017900714 cc @QuakeWang Are you still working on this?

Re: [I] Implement `project` for `Transform`. [iceberg-rust]

2024-03-25 Thread via GitHub
ZENOTME commented on issue #264: URL: https://github.com/apache/iceberg-rust/issues/264#issuecomment-2017900050 > @ZENOTME are you still working on #287 and the proposed interface? just looking for a quick update on this - and huge thanks for your effort on this. Yes. I will update it
