Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-10 Thread via GitHub
pvary commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1519220818 ## core/src/main/java/org/apache/iceberg/BaseMetadata.java: ## @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor

Re: [I] When deleting data from Iceberg tables in Spark, the current approach is to delete all data and then rewrite the new data, which is very wasteful in terms of storage space and computation. How

2024-03-10 Thread via GitHub
c8679724 commented on issue #9891: URL: https://github.com/apache/iceberg/issues/9891#issuecomment-1987694456 > I think that if you set write.delete.mode = merge-on-read in the table configuration, then Spark will do positional deletes; the default for it according to the docs is copy-on-wr

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-10 Thread via GitHub
nk1506 commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1519191056 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveOperationsBase.java: ## @@ -181,4 +264,220 @@ default Table newHmsTable(String hmsTableOwner) { return

Re: [I] When deleting data from Iceberg tables in Spark, the current approach is to delete all data and then rewrite the new data, which is very wasteful in terms of storage space and computation. How

2024-03-10 Thread via GitHub
c8679724 commented on issue #9891: URL: https://github.com/apache/iceberg/issues/9891#issuecomment-1987633789 > > When deleting data from Iceberg tables in Spark > > How do you delete the data? could you provide an example for your code? delete from iceberg.ods.syc_db12005_cf_db

Re: [PR] Hive: Add View support for HIVE catalog [iceberg]

2024-03-10 Thread via GitHub
nk1506 commented on code in PR #9852: URL: https://github.com/apache/iceberg/pull/9852#discussion_r1519153254 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveOperationsBase.java: ## @@ -62,6 +81,57 @@ interface HiveOperationsBase { String table(); + String ca

Re: [PR] feat: Implement binding expression [iceberg-rust]

2024-03-10 Thread via GitHub
Xuanwo commented on code in PR #231: URL: https://github.com/apache/iceberg-rust/pull/231#discussion_r1519137105 ## crates/iceberg/src/expr/predicate.rs: ## @@ -55,6 +60,24 @@ impl LogicalExpression { } } +impl Bind for LogicalExpression +where +T::Bound: Sized, +{ +

Re: [PR] feat: Implement binding expression [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 commented on code in PR #231: URL: https://github.com/apache/iceberg-rust/pull/231#discussion_r1519136619 ## crates/iceberg/src/expr/predicate.rs: ## @@ -55,6 +60,24 @@ impl LogicalExpression { } } +impl Bind for LogicalExpression +where +T::Bound: Size

Re: [PR] [DRAFT]: Adjust site links to absolute from site_url [iceberg]

2024-03-10 Thread via GitHub
manuzhang commented on PR #9887: URL: https://github.com/apache/iceberg/pull/9887#issuecomment-1987595128 Will this create more complexity? Contributors might be confused if we use different link formats in the documentation and the site. How do we link to the site from the docs? -- This is an automated mes

Re: [PR] feat: Implement binding expression [iceberg-rust]

2024-03-10 Thread via GitHub
Xuanwo commented on code in PR #231: URL: https://github.com/apache/iceberg-rust/pull/231#discussion_r1519133419 ## crates/iceberg/src/expr/predicate.rs: ## @@ -406,13 +635,239 @@ mod tests { #[test] fn test_predicate_negate_set() { -let expression = Referen

Re: [PR] feat: Implement binding expression [iceberg-rust]

2024-03-10 Thread via GitHub
Xuanwo commented on code in PR #231: URL: https://github.com/apache/iceberg-rust/pull/231#discussion_r1519132180 ## crates/iceberg/src/expr/predicate.rs: ## @@ -55,6 +60,24 @@ impl LogicalExpression { } } +impl Bind for LogicalExpression +where +T::Bound: Sized, +{ +

Re: [PR] feat: Implement binding expression [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 commented on code in PR #231: URL: https://github.com/apache/iceberg-rust/pull/231#discussion_r1519131858 ## crates/iceberg/src/expr/predicate.rs: ## @@ -406,13 +635,239 @@ mod tests { #[test] fn test_predicate_negate_set() { -let expression =

Re: [PR] feat: Implement binding expression [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 commented on code in PR #231: URL: https://github.com/apache/iceberg-rust/pull/231#discussion_r1519131277 ## crates/iceberg/src/expr/predicate.rs: ## @@ -55,6 +60,24 @@ impl LogicalExpression { } } +impl Bind for LogicalExpression +where +T::Bound: Size

Re: [PR] feat: Implement binding expression [iceberg-rust]

2024-03-10 Thread via GitHub
Xuanwo commented on code in PR #231: URL: https://github.com/apache/iceberg-rust/pull/231#discussion_r1519129192 ## crates/iceberg/src/expr/predicate.rs: ## @@ -406,13 +635,239 @@ mod tests { #[test] fn test_predicate_negate_set() { -let expression = Referen

Re: [PR] [DRAFT]: Create iceberg_docs_improvement issues template. [iceberg]

2024-03-10 Thread via GitHub
manuzhang commented on code in PR #9896: URL: https://github.com/apache/iceberg/pull/9896#discussion_r1519126366 ## .github/ISSUE_TEMPLATE/iceberg_docs_improvement: ## @@ -0,0 +1,69 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor licens

Re: [PR] [DRAFT]: Create iceberg_docs_improvement issues template. [iceberg]

2024-03-10 Thread via GitHub
manuzhang commented on code in PR #9896: URL: https://github.com/apache/iceberg/pull/9896#discussion_r1519124720 ## .github/ISSUE_TEMPLATE/iceberg_docs_improvement: ## @@ -0,0 +1,69 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor licens

Re: [PR] feat: Implement binding expression [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 commented on PR #231: URL: https://github.com/apache/iceberg-rust/pull/231#issuecomment-1987569345 I took another look at the Python code and realized that we should not couple the `not` rewrite with binding, since the `not` rewrite is not as widely used as binding, and we still need a rewr

Re: [I] Add gcs support in FileIO [iceberg-rust]

2024-03-10 Thread via GitHub
Xuanwo commented on issue #239: URL: https://github.com/apache/iceberg-rust/issues/239#issuecomment-1987552227 Let me handle this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

Re: [PR] Docs: Enhance create_changelog_view usage [iceberg]

2024-03-10 Thread via GitHub
manuzhang commented on PR #9889: URL: https://github.com/apache/iceberg/pull/9889#issuecomment-1987545272 @nastra @flyrain please help review this PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] refactor: Make plan_files as asynchronous stream [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 commented on PR #243: URL: https://github.com/apache/iceberg-rust/pull/243#issuecomment-1987512913 cc @Xuanwo @Fokko PTAL -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] refactor: Make plan_files as asynchronous stream [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 commented on code in PR #243: URL: https://github.com/apache/iceberg-rust/pull/243#discussion_r1519071696 ## crates/iceberg/src/scan.rs: ## @@ -143,37 +144,43 @@ pub type FileScanTaskStream = BoxStream<'static, crate::Result>; impl TableScan { /// Returns a

Re: [PR] [WIP] Add `PartitionEvaluator` to allow filtering of files in a table scan (Issue #152) [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 commented on PR #241: URL: https://github.com/apache/iceberg-rust/pull/241#issuecomment-1987506207 Hi, @sdd Thanks for this PR. The filtering process in fact consists of two steps: 1. Filter manifest files in the manifest list. This step is relatively easy to do, and has no d

Re: [PR] chore: Enable projects. [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 commented on PR #247: URL: https://github.com/apache/iceberg-rust/pull/247#issuecomment-1987444875 cc @Xuanwo @Fokko @ZENOTME PTAL -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

[PR] chore: Enable projects. [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 opened a new pull request, #247: URL: https://github.com/apache/iceberg-rust/pull/247 See discussion in https://github.com/apache/iceberg-rust/issues/113. We have more and more contributors, and I want to enable projects so that we can schedule development better. --

Re: [PR] Clarify pagination description [iceberg]

2024-03-10 Thread via GitHub
danielcweeks closed pull request #9872: Clarify pagination description URL: https://github.com/apache/iceberg/pull/9872 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsub

Re: [PR] Clarify pagination description [iceberg]

2024-03-10 Thread via GitHub
danielcweeks commented on PR #9872: URL: https://github.com/apache/iceberg/pull/9872#issuecomment-1987433194 Closing in favor of #9917 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [I] Make HiveCatalog inheritable for custom IMetaStoreClient [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] commented on issue #1470: URL: https://github.com/apache/iceberg/issues/1470#issuecomment-1987425427 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] make Flink iceberg sink work without checkpoint enabled. [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] commented on issue #1442: URL: https://github.com/apache/iceberg/issues/1442#issuecomment-1987425401 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] any plan for Iceberg Table on S3? [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] closed issue #1468: any plan for Iceberg Table on S3? URL: https://github.com/apache/iceberg/issues/1468 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [I] Uppercased schemas are not readable in Iceberg-mr/ hive [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] closed issue #1445: Uppercased schemas are not readable in Iceberg-mr/ hive URL: https://github.com/apache/iceberg/issues/1445 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [I] make Flink iceberg sink work without checkpoint enabled. [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] closed issue #1442: make Flink iceberg sink work without checkpoint enabled. URL: https://github.com/apache/iceberg/issues/1442 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [I] How to use VectorizedRowBatchIterator [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] commented on issue #1486: URL: https://github.com/apache/iceberg/issues/1486#issuecomment-1987425450 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] How to use VectorizedRowBatchIterator [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] closed issue #1486: How to use VectorizedRowBatchIterator URL: https://github.com/apache/iceberg/issues/1486 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [I] Reconsider handling of spaces in PartitionSpec$partitionToPath [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] commented on issue #1479: URL: https://github.com/apache/iceberg/issues/1479#issuecomment-1987425440 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] Reconsider handling of spaces in PartitionSpec$partitionToPath [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] closed issue #1479: Reconsider handling of spaces in PartitionSpec$partitionToPath URL: https://github.com/apache/iceberg/issues/1479 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [I] Make HiveCatalog inheritable for custom IMetaStoreClient [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] closed issue #1470: Make HiveCatalog inheritable for custom IMetaStoreClient URL: https://github.com/apache/iceberg/issues/1470 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [I] any plan for Iceberg Table on S3? [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] commented on issue #1468: URL: https://github.com/apache/iceberg/issues/1468#issuecomment-1987425418 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [I] Uppercased schemas are not readable in Iceberg-mr/ hive [iceberg]

2024-03-10 Thread via GitHub
github-actions[bot] commented on issue #1445: URL: https://github.com/apache/iceberg/issues/1445#issuecomment-1987425406 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [PR] Clarify pagination description [iceberg]

2024-03-10 Thread via GitHub
rahil-c commented on code in PR #9872: URL: https://github.com/apache/iceberg/pull/9872#discussion_r1519005341 ## open-api/rest-catalog-open-api.yaml: ## @@ -1610,13 +1610,26 @@ components: PageToken: description: -An opaque token which allows clients to ma

Re: [PR] Clarify pagination description [iceberg]

2024-03-10 Thread via GitHub
rahil-c commented on code in PR #9872: URL: https://github.com/apache/iceberg/pull/9872#discussion_r1519004229 ## open-api/rest-catalog-open-api.yaml: ## @@ -1610,13 +1610,26 @@ components: PageToken: description: -An opaque token which allows clients to ma

Re: [PR] Fix pagination spec description [iceberg]

2024-03-10 Thread via GitHub
rahil-c closed pull request #9866: Fix pagination spec description URL: https://github.com/apache/iceberg/pull/9866 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscri

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-10 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518977725 ## open-api/rest-catalog-open-api.yaml: ## @@ -2838,6 +3093,59 @@ components: additionalProperties: type: string +PreplanTableRequest: +

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-10 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518979856 ## open-api/rest-catalog-open-api.yaml: ## @@ -2068,6 +2162,145 @@ components: items: $ref: '#/components/schemas/PartitionStatisticsFile' +

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-10 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518979634 ## open-api/rest-catalog-open-api.yaml: ## @@ -2068,6 +2162,145 @@ components: items: $ref: '#/components/schemas/PartitionStatisticsFile' +

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-03-10 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1518978311 ## open-api/rest-catalog-open-api.yaml: ## @@ -2068,6 +2162,145 @@ components: items: $ref: '#/components/schemas/PartitionStatisticsFile' +

Re: [I] Integrate pyiceberg with Dask [iceberg]

2024-03-10 Thread via GitHub
TomAugspurger commented on issue #5800: URL: https://github.com/apache/iceberg/issues/5800#issuecomment-1987287800 I think the main decision point was around where this should live: dask or pyiceberg. I don't see the private `_file_to_table` in https://github.com/apache/iceberg-pytho

Re: [PR] [Bug Fix] Allow Partition data to be nullable in ManifestEntry [iceberg-python]

2024-03-10 Thread via GitHub
syun64 commented on code in PR #509: URL: https://github.com/apache/iceberg-python/pull/509#discussion_r1518883984 ## pyiceberg/manifest.py: ## @@ -308,6 +308,7 @@ def data_file_with_partition(partition_type: StructType, format_version: Literal field_id=field.field

Re: [I] Add `table_exists` method to the Catalog [iceberg-python]

2024-03-10 Thread via GitHub
jayceslesar commented on issue #507: URL: https://github.com/apache/iceberg-python/issues/507#issuecomment-1987259909 could be similar to the mkdirs api? With `exists_ok` or some param? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] Build: Bump orc from 1.9.2 to 2.0.0 [iceberg]

2024-03-10 Thread via GitHub
manuzhang commented on PR #9913: URL: https://github.com/apache/iceberg/pull/9913#issuecomment-1987249395 It looks like ignoring major version updates in dependabot is not working. @nastra can you help check the dependabot logs? -- This is an automated message from the Apache Git Service. To respon

Re: [I] Discussion: Design of `TableMetadataBuilder`. [iceberg-rust]

2024-03-10 Thread via GitHub
marvinlanhenke commented on issue #232: URL: https://github.com/apache/iceberg-rust/issues/232#issuecomment-1987210644 > [...] and we don't have plan to change it > That's possible, but that means we need to implement the validation logic in two places Makes sense to me. thanks

Re: [I] Discussion: Design of `TableMetadataBuilder`. [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 closed issue #232: Discussion: Design of `TableMetadataBuilder`. URL: https://github.com/apache/iceberg-rust/issues/232 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

Re: [I] Discussion: Design of `TableMetadataBuilder`. [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 commented on issue #232: URL: https://github.com/apache/iceberg-rust/issues/232#issuecomment-1987208369 > what about supporting FormatVersion V1 and V2? Will we simply model V2 and provide all optional fields as well for V1 in order to simplify the struct TableMetadata (like i

Re: [PR] [Bug Fix] Allow Partition data to be nullable in ManifestEntry [iceberg-python]

2024-03-10 Thread via GitHub
Fokko commented on code in PR #509: URL: https://github.com/apache/iceberg-python/pull/509#discussion_r1518833699 ## pyiceberg/manifest.py: ## @@ -308,6 +308,7 @@ def data_file_with_partition(partition_type: StructType, format_version: Literal field_id=field.field_

[PR] Allow fsspec up to 2025.1 [iceberg-python]

2024-03-10 Thread via GitHub
bolkedebruin opened a new pull request, #510: URL: https://github.com/apache/iceberg-python/pull/510 fsspec uses calendar versioning in which case the limit might not even make sense. Bumping it to another year, but you might want to consider removing the upper bound. -- This is an autom

Re: [I] Integrate pyiceberg with Dask [iceberg]

2024-03-10 Thread via GitHub
sam-goodwin commented on issue #5800: URL: https://github.com/apache/iceberg/issues/5800#issuecomment-1987174551 What's the latest on this issue? Also keen to write to Iceberg directly from Dask. -- This is an automated message from the Apache Git Service. To respond to the message, pleas

Re: [I] Discussion: Design of `TableMetadataBuilder`. [iceberg-rust]

2024-03-10 Thread via GitHub
marvinlanhenke commented on issue #232: URL: https://github.com/apache/iceberg-rust/issues/232#issuecomment-1987173619 what about supporting FormatVersion V1 and V2? Will we simply model V2 and provide all optional fields as well for V1 in order to simplify the struct TableMetadata (like it

Re: [I] Read Parquet data file with projection [iceberg-rust]

2024-03-10 Thread via GitHub
viirya commented on issue #244: URL: https://github.com/apache/iceberg-rust/issues/244#issuecomment-1987147807 Thank you @sdd. I will take a look at the doc tomorrow and update the PR accordingly. -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [I] Read Parquet data file with projection [iceberg-rust]

2024-03-10 Thread via GitHub
sdd commented on issue #244: URL: https://github.com/apache/iceberg-rust/issues/244#issuecomment-1987145982 Firstly, it's great to see someone else helping out on this - getting projection and filtering working on reads will unlock the most important (for me anyway 😅) use cases, so thanks f

Re: [I] feat: Add `AlwaysTrue`/`AlwaysFalse` for `Predicate` enum. [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 closed issue #224: feat: Add `AlwaysTrue`/`AlwaysFalse` for `Predicate` enum. URL: https://github.com/apache/iceberg-rust/issues/224 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] feat: add always_true and always_false predicate [iceberg-rust]

2024-03-10 Thread via GitHub
liurenjie1024 closed pull request #227: feat: add always_true and always_false predicate URL: https://github.com/apache/iceberg-rust/pull/227 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec