Re: [I] Add apply interface in transaction [iceberg-rust]

2024-09-03 Thread via GitHub
liurenjie1024 commented on issue #596: URL: https://github.com/apache/iceberg-rust/issues/596#issuecomment-2328031227 Hi, @ZENOTME Could you elaborate on this? I'm kind of confused about the proposal. -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] Table Scan: Add Row Selection Filtering [iceberg-rust]

2024-09-03 Thread via GitHub
liurenjie1024 commented on code in PR #565: URL: https://github.com/apache/iceberg-rust/pull/565#discussion_r1742915490 ## crates/iceberg/src/expr/visitors/page_index_evaluator.rs: ## @@ -0,0 +1,1491 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more c

Re: [I] Use Min, Max, and NumOfNulls from Manifest Files for Spark Column Stats [iceberg]

2024-09-03 Thread via GitHub
huaxingao commented on issue #10791: URL: https://github.com/apache/iceberg/issues/10791#issuecomment-2327991774 I think we could introduce a property that allows users to choose whether to calculate the statistics on the fly.

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-09-03 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2327947629 @PaulLiang1 Thanks! I'll check with my colleague tomorrow to find out where we are in the binary release process.

Re: [PR] Kafka Connect: increase timeout for integration test [iceberg]

2024-09-03 Thread via GitHub
manuzhang commented on PR #11075: URL: https://github.com/apache/iceberg/pull/11075#issuecomment-2327896017 Can we create a separate CI for kafka connect?

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-09-03 Thread via GitHub
PaulLiang1 commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2327877261 hey @huaxingao we are really interested in this feature, just wondering what we can do to help get this integrated?

Re: [PR] Materialized View Spec [iceberg]

2024-09-03 Thread via GitHub
stevenzwu commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1742350128 ## format/view-spec.md: ## @@ -42,12 +42,24 @@ An atomic swap of one view metadata file for another provides the basis for maki Writers create view metadata fil

Re: [PR] Spark: Deprecate SparkAppenderFactory [iceberg]

2024-09-03 Thread via GitHub
ajantha-bhat commented on code in PR #11076: URL: https://github.com/apache/iceberg/pull/11076#discussion_r1742886689 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkAppenderFactory.java: ## @@ -48,6 +48,10 @@ import org.apache.spark.sql.types.StructType;

[PR] Bump cryptography from 43.0.0 to 43.0.1 [iceberg-python]

2024-09-03 Thread via GitHub
dependabot[bot] opened a new pull request, #1130: URL: https://github.com/apache/iceberg-python/pull/1130 Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.0 to 43.0.1. Changelog Sourced from https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst cryptogr

Re: [I] Regression in 0.7.0 due to type coercion from "string" to "large_string" [iceberg-python]

2024-09-03 Thread via GitHub
kevinjqliu commented on issue #1128: URL: https://github.com/apache/iceberg-python/issues/1128#issuecomment-2327686045 As a workaround, you can manually set the table property to force the read path to use the `string` type ``` from pyiceberg.io import PYARROW_USE_LARGE_TY

Re: [I] Regression in 0.7.0 due to type coercion from "string" to "large_string" [iceberg-python]

2024-09-03 Thread via GitHub
kevinjqliu commented on issue #1128: URL: https://github.com/apache/iceberg-python/issues/1128#issuecomment-2327680908 The issue above was mentioned here https://github.com/apache/iceberg-python/pull/986#discussion_r1706662170 On read, pyarrow will use large type as default. It is

Re: [I] Regression in 0.7.0 due to type coercion from "string" to "large_string" [iceberg-python]

2024-09-03 Thread via GitHub
kevinjqliu commented on issue #1128: URL: https://github.com/apache/iceberg-python/issues/1128#issuecomment-2327679275 To summarize, given a table created with `string` type schema and written to with `string` type data, reading the table back returns pyarrow dataframe with `large_string`

Re: [I] How to remove orphan manifest and manifest list file [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on issue #7937: URL: https://github.com/apache/iceberg/issues/7937#issuecomment-2327661686 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Core: reduce scale factor for HadoopFileIOTest prefix tests [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] closed pull request #7047: Core: reduce scale factor for HadoopFileIOTest prefix tests URL: https://github.com/apache/iceberg/pull/7047

Re: [PR] Document available Flink config options. [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] closed pull request #7041: Document available Flink config options. URL: https://github.com/apache/iceberg/pull/7041

Re: [I] Duplicate records with MERGE command [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on issue #7005: URL: https://github.com/apache/iceberg/issues/7005#issuecomment-2327661264 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Build: Upgrade netty-buffer to 4.1.89.Final [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] closed pull request #6986: Build: Upgrade netty-buffer to 4.1.89.Final URL: https://github.com/apache/iceberg/pull/6986

Re: [PR] Push down group by for partition columns [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] closed pull request #6981: Push down group by for partition columns URL: https://github.com/apache/iceberg/pull/6981

Re: [PR] Parquet: Implement column index filter and update row read path to support page skipping [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on PR #6967: URL: https://github.com/apache/iceberg/pull/6967#issuecomment-2327661207 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Core: Add Catalog Transactions API [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] closed pull request #6948: Core: Add Catalog Transactions API URL: https://github.com/apache/iceberg/pull/6948

Re: [PR] Core: Add Catalog Transactions API [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on PR #6948: URL: https://github.com/apache/iceberg/pull/6948#issuecomment-2327661185 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Parquet: Add page filter using page indexes [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] closed pull request #6935: Parquet: Add page filter using page indexes URL: https://github.com/apache/iceberg/pull/6935

Re: [PR] Parquet: Add page filter using page indexes [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on PR #6935: URL: https://github.com/apache/iceberg/pull/6935#issuecomment-2327661157 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Spark 3.2 and 3.3: Use Reblance instead of Repartition for distribution in SparkWrite [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on PR #7932: URL: https://github.com/apache/iceberg/pull/7932#issuecomment-2327661645 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Docs: Improve possible options/parameters for system procedures and usage. [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on issue #7934: URL: https://github.com/apache/iceberg/issues/7934#issuecomment-2327661667 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Read is not working on Iceberg Hive table [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on issue #7924: URL: https://github.com/apache/iceberg/issues/7924#issuecomment-2327661632 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Core: reduce scale factor for HadoopFileIOTest prefix tests [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on PR #7047: URL: https://github.com/apache/iceberg/pull/7047#issuecomment-2327661311 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Document available Flink config options. [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on PR #7041: URL: https://github.com/apache/iceberg/pull/7041#issuecomment-2327661291 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Build: Upgrade netty-buffer to 4.1.89.Final [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on PR #6986: URL: https://github.com/apache/iceberg/pull/6986#issuecomment-2327661253 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Push down group by for partition columns [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] commented on PR #6981: URL: https://github.com/apache/iceberg/pull/6981#issuecomment-2327661225 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Parquet: Implement column index filter and update row read path to support page skipping [iceberg]

2024-09-03 Thread via GitHub
github-actions[bot] closed pull request #6967: Parquet: Implement column index filter and update row read path to support page skipping URL: https://github.com/apache/iceberg/pull/6967

Re: [I] Regression in 0.7.0 due to type coercion from "string" to "large_string" [iceberg-python]

2024-09-03 Thread via GitHub
kevinjqliu commented on issue #1128: URL: https://github.com/apache/iceberg-python/issues/1128#issuecomment-2327597621 The default was changed back to `string` in 0.7.1 (from `large_string` in 0.7.0) Can you test the above again with 0.7.1? See #887 for more info

Re: [I] PartitionSpec.Builder does not support column name case-insensitivity [iceberg]

2024-09-03 Thread via GitHub
sl255051 commented on issue #10668: URL: https://github.com/apache/iceberg/issues/10668#issuecomment-2327544195 This issue has been fixed with the following PR https://github.com/apache/iceberg/pull/10678

Re: [I] PartitionSpec.Builder does not support column name case-insensitivity [iceberg]

2024-09-03 Thread via GitHub
sl255051 closed issue #10668: PartitionSpec.Builder does not support column name case-insensitivity URL: https://github.com/apache/iceberg/issues/10668

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-03 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1742770089 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs" +Pl

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-03 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1742752535 ## open-api/rest-catalog-open-api.yaml: ## @@ -629,7 +887,7 @@ paths: The snapshots to return in the body of the metadata. Setting the value to `all` would

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-03 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1742750950 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs" +Pla

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-03 Thread via GitHub
amogh-jahagirdar commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1742749253 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs"

Re: [I] Remove `InMemoryCatalog` from the test-codebase [iceberg-python]

2024-09-03 Thread via GitHub
kevinjqliu commented on issue #1110: URL: https://github.com/apache/iceberg-python/issues/1110#issuecomment-2327513232 perhaps it's a good idea to alias a new `InMemoryCatalog` implementation using SqlCatalog such as https://github.com/apache/iceberg-python/blob/9857107561d2267813b7c

Re: [PR] Docs: Fix Flink 1.20 support versions [iceberg]

2024-09-03 Thread via GitHub
stevenzwu commented on PR #11065: URL: https://github.com/apache/iceberg/pull/11065#issuecomment-2327440134 thanks @manuzhang for the fix and @pvary for the review

Re: [PR] Docs: Fix Flink 1.20 support versions [iceberg]

2024-09-03 Thread via GitHub
stevenzwu merged PR #11065: URL: https://github.com/apache/iceberg/pull/11065

Re: [I] Flink: multiple sinks for the different iceberg tables in the same job? [iceberg]

2024-09-03 Thread via GitHub
pvary commented on issue #11074: URL: https://github.com/apache/iceberg/issues/11074#issuecomment-2327299266 I don't think this should happen. Could you please create a small unit test to reproduce? Thanks, Peter

Re: [PR] Add metadata tables for `data_files` and `delete_files` [iceberg-python]

2024-09-03 Thread via GitHub
soumya-ghosh commented on code in PR #1066: URL: https://github.com/apache/iceberg-python/pull/1066#discussion_r1742582633 ## tests/integration/test_inspect_table.py: ## @@ -672,126 +672,141 @@ def test_inspect_files( # append more data tbl.append(arrow_table_with_null

Re: [PR] Flink: Fix compile warning [iceberg]

2024-09-03 Thread via GitHub
pvary merged PR #11072: URL: https://github.com/apache/iceberg/pull/11072

Re: [PR] Flink: Fix compile warning [iceberg]

2024-09-03 Thread via GitHub
pvary commented on PR #11072: URL: https://github.com/apache/iceberg/pull/11072#issuecomment-2327260622 Merged to main! Thanks for the fix @ajantha-bhat!

Re: [PR] Add metadata tables for `data_files` and `delete_files` [iceberg-python]

2024-09-03 Thread via GitHub
kevinjqliu commented on code in PR #1066: URL: https://github.com/apache/iceberg-python/pull/1066#discussion_r1742555465 ## tests/integration/test_inspect_table.py: ## @@ -672,126 +672,141 @@ def test_inspect_files( # append more data tbl.append(arrow_table_with_null)

Re: [PR] Initial committer guidelines and requirements for merging [iceberg]

2024-09-03 Thread via GitHub
aokolnychyi commented on PR #10780: URL: https://github.com/apache/iceberg/pull/10780#issuecomment-2327198571 Thanks @emkornfield and everyone who reviewed! Merged per the dev list vote.

Re: [PR] Initial committer guidelines and requirements for merging [iceberg]

2024-09-03 Thread via GitHub
aokolnychyi merged PR #10780: URL: https://github.com/apache/iceberg/pull/10780

Re: [PR] Add metadata tables for `data_files` and `delete_files` [iceberg-python]

2024-09-03 Thread via GitHub
soumya-ghosh commented on code in PR #1066: URL: https://github.com/apache/iceberg-python/pull/1066#discussion_r1742512075 ## tests/integration/test_inspect_table.py: ## @@ -672,126 +672,141 @@ def test_inspect_files( # append more data tbl.append(arrow_table_with_null

Re: [PR] Core: Refactor ZOrderByteUtils [iceberg]

2024-09-03 Thread via GitHub
RussellSpitzer merged PR #10624: URL: https://github.com/apache/iceberg/pull/10624

Re: [PR] Core: Refactor ZOrderByteUtils [iceberg]

2024-09-03 Thread via GitHub
RussellSpitzer commented on PR #10624: URL: https://github.com/apache/iceberg/pull/10624#issuecomment-2327157530 Thanks @ajantha-bhat , Merged

Re: [I] Do not deprecate Botocore Session in upcoming release (0.8) [iceberg-python]

2024-09-03 Thread via GitHub
kevinjqliu commented on issue #1104: URL: https://github.com/apache/iceberg-python/issues/1104#issuecomment-2327150004 @BTheunissen +1, opened #1129 to track this feature. It can be hacky for now. This feature is generally nice to have for the project

Re: [PR] Add metadata tables for `data_files` and `delete_files` [iceberg-python]

2024-09-03 Thread via GitHub
kevinjqliu commented on code in PR #1066: URL: https://github.com/apache/iceberg-python/pull/1066#discussion_r1742483484 ## tests/integration/test_inspect_table.py: ## @@ -672,126 +672,141 @@ def test_inspect_files( # append more data tbl.append(arrow_table_with_null)

Re: [I] CLI list not working [iceberg-python]

2024-09-03 Thread via GitHub
kevinjqliu commented on issue #1122: URL: https://github.com/apache/iceberg-python/issues/1122#issuecomment-2327131543 > I would expect this will not attempt to make any connection calls, but simply print the help message yea same, perhaps it's some setting with the `click` lib

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-03 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1742462669 ## open-api/rest-catalog-open-api.yaml: ## @@ -3647,6 +4080,105 @@ components: type: integer description: "List of equality field IDs" +Pl

Re: [I] Use Min, Max, and NumOfNulls from Manifest Files for Spark Column Stats [iceberg]

2024-09-03 Thread via GitHub
guykhazma commented on issue #10791: URL: https://github.com/apache/iceberg/issues/10791#issuecomment-2327104016 @huaxingao min/max will not stay accurate but still provide valid lower and upper bounds. The issue I am seeing with null counts is that when spark gets for example a combinat

[I] Regression in 0.7.0 due to type coercion from "string" to "large_string" [iceberg-python]

2024-09-03 Thread via GitHub
maxfirman opened a new issue, #1128: URL: https://github.com/apache/iceberg-python/issues/1128 ### Apache Iceberg version 0.7.0 ### Please describe the bug 🐞 There is a regression introduced in version 0.7.0 where arrow tables written with a "string" data type, get ca

Re: [PR] open-api: Fix compile warnings for testFixtures [iceberg]

2024-09-03 Thread via GitHub
danielcweeks commented on PR #11071: URL: https://github.com/apache/iceberg/pull/11071#issuecomment-2327027531 @ajantha-bhat Strange as it may seem, we can use exclusions and not add a new library: ```groovy testImplementation(libs.junit.suite.api) { exclude group: 'or

Re: [I] Use Min, Max, and NumOfNulls from Manifest Files for Spark Column Stats [iceberg]

2024-09-03 Thread via GitHub
huaxingao commented on issue #10791: URL: https://github.com/apache/iceberg/issues/10791#issuecomment-2326996054 If data files are filtered out by the query predicate, the pushed-down min/max/null counts are no longer accurate. Spark takes filter estimation into consideration when calcu

Re: [PR] Core: Add support for `view-default` property in catalog [iceberg]

2024-09-03 Thread via GitHub
singhpk234 commented on code in PR #11064: URL: https://github.com/apache/iceberg/pull/11064#discussion_r1742360146 ## docs/docs/spark-configuration.md: ## @@ -77,6 +77,8 @@ Both catalogs are configured using properties nested under the catalog name. Com | spark.sql.catalog._c

Re: [PR] Materialized View Spec [iceberg]

2024-09-03 Thread via GitHub
stevenzwu commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1742354781 ## format/view-spec.md: ## @@ -158,6 +173,59 @@ Each entry in `version-log` is a struct with the following fields: | _required_ | `timestamp-ms` | Timestamp when

Re: [PR] Add Scan Planning Endpoints to open api spec [iceberg]

2024-09-03 Thread via GitHub
jackye1995 commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1742315149 ## open-api/rest-catalog-open-api.yaml: ## @@ -2774,6 +3062,140 @@ components: additionalProperties: type: string +ScanTasks: + ty

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-09-03 Thread via GitHub
rdblue commented on PR #9008: URL: https://github.com/apache/iceberg/pull/9008#issuecomment-2326856387 Thanks, @jacobmarble and @epgif!

Re: [PR] Materialized View Spec [iceberg]

2024-09-03 Thread via GitHub
bennychow commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1742296539 ## format/view-spec.md: ## @@ -42,12 +42,24 @@ An atomic swap of one view metadata file for another provides the basis for maki Writers create view metadata fil

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-09-03 Thread via GitHub
rdblue merged PR #9008: URL: https://github.com/apache/iceberg/pull/9008

Re: [I] New types can break older Iceberg agents [iceberg]

2024-09-03 Thread via GitHub
rdblue closed issue #10775: New types can break older Iceberg agents URL: https://github.com/apache/iceberg/issues/10775

Re: [I] add type: Timestamp with nanosecond units [iceberg]

2024-09-03 Thread via GitHub
rdblue closed issue #8657: add type: Timestamp with nanosecond units URL: https://github.com/apache/iceberg/issues/8657

Re: [I] Kafka: runtime integration test failure or flaky [iceberg]

2024-09-03 Thread via GitHub
bryanck commented on issue #11046: URL: https://github.com/apache/iceberg/issues/11046#issuecomment-2326851207 I bumped the timeout for initializing Docker containers in https://github.com/apache/iceberg/pull/11075, hopefully that will help.

Re: [PR] API: implement types timestamp_ns and timestamptz_ns [iceberg]

2024-09-03 Thread via GitHub
rdblue commented on code in PR #9008: URL: https://github.com/apache/iceberg/pull/9008#discussion_r1742287758 ## api/src/main/java/org/apache/iceberg/Schema.java: ## @@ -573,4 +575,27 @@ private List reassignIds(List columns, TypeUtil.GetID }); return res.asSt

[PR] Kafka Connect: increase timeout for integration test [iceberg]

2024-09-03 Thread via GitHub
bryanck opened a new pull request, #11075: URL: https://github.com/apache/iceberg/pull/11075 This PR increases the timeout for initializing the docker containers from the default of 1 minute to 2 minutes in the integration tests, to hopefully address https://github.com/apache/iceberg/issue

[I] Flink: multiple sinks for the different iceberg tables in the same job? [iceberg]

2024-09-03 Thread via GitHub
chenwyi2 opened a new issue, #11074: URL: https://github.com/apache/iceberg/issues/11074 ### Query engine flink 1.15 iceberg 1.2.1 ### Question can one source sink to different tables in one task? such as https://github.com/user-attachments/assets/67e529b1

[PR] Bump flask-cors from 4.0.1 to 5.0.0 [iceberg-python]

2024-09-03 Thread via GitHub
dependabot[bot] opened a new pull request, #1127: URL: https://github.com/apache/iceberg-python/pull/1127 Bumps [flask-cors](https://github.com/corydolphin/flask-cors) from 4.0.1 to 5.0.0. Release notes Sourced from https://github.com/corydolphin/flask-cors/releases flask-cors's

[PR] feat: implement IcebergTableProviderFactory for datafusion [iceberg-rust]

2024-09-03 Thread via GitHub
yukkit opened a new pull request, #600: URL: https://github.com/apache/iceberg-rust/pull/600 resolve #586 TODO: supplement unit tests and add validation of data

Re: [PR] Core: Add reference snapshot ID/timestamps to AllEntriesTable and AllManifestsTable [iceberg]

2024-09-03 Thread via GitHub
hsiang-c commented on code in PR #9335: URL: https://github.com/apache/iceberg/pull/9335#discussion_r1742041236 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkMetaDataTable.java: ## @@ -498,11 +506,11 @@ public void testAllFilesUnpartitioned() throws

Re: [PR] Core: Add reference snapshot ID/timestamps to AllEntriesTable and AllManifestsTable [iceberg]

2024-09-03 Thread via GitHub
hsiang-c commented on code in PR #9335: URL: https://github.com/apache/iceberg/pull/9335#discussion_r1742044457 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java: ## @@ -530,12 +530,13 @@ public void testFilesTableTimeTravel

Re: [PR] Core: Add reference snapshot ID/timestamps to AllEntriesTable and AllManifestsTable [iceberg]

2024-09-03 Thread via GitHub
hsiang-c commented on PR #9335: URL: https://github.com/apache/iceberg/pull/9335#issuecomment-2326486292 @szehon-ho @RussellSpitzer Please take a look for me, thanks.

[I] Discussion: Typesafe(r) properties [iceberg-rust]

2024-09-03 Thread via GitHub
c-thiel opened a new issue, #599: URL: https://github.com/apache/iceberg-rust/issues/599 We currently have property names scattered around the crate. If a property needs to be used, it typically needs to be parsed from string first. This is a potentially failing operation that should be don

Re: [I] Use Min, Max, and NumOfNulls from Manifest Files for Spark Column Stats [iceberg]

2024-09-03 Thread via GitHub
guykhazma commented on issue #10791: URL: https://github.com/apache/iceberg/issues/10791#issuecomment-2326424328 @huaxingao @karuppayya @jeesou @aokolnychyi @alexjo2144 @findepi @manishmalhotrawork Continuing the [discussion from the mailing list](https://lists.apache.org/thread/6kyvp5x

Re: [PR] Table Scan: Add Row Selection Filtering [iceberg-rust]

2024-09-03 Thread via GitHub
sdd commented on PR #565: URL: https://github.com/apache/iceberg-rust/pull/565#issuecomment-2326321589 @liurenjie1024 / @Xuanwo - PTAL when you get chance, thanks 👍🏼

Re: [I] Bump crate-ci/typos from 1.24.1 to 1.24.3 [iceberg-rust]

2024-09-03 Thread via GitHub
liurenjie1024 closed issue #597: Bump crate-ci/typos from 1.24.1 to 1.24.3 URL: https://github.com/apache/iceberg-rust/issues/597

Re: [PR] chore: bump crate-ci/typos to 1.24.3 [iceberg-rust]

2024-09-03 Thread via GitHub
liurenjie1024 merged PR #598: URL: https://github.com/apache/iceberg-rust/pull/598

Re: [PR] Make `commit_table` public [iceberg-python]

2024-09-03 Thread via GitHub
sungwy commented on PR #1112: URL: https://github.com/apache/iceberg-python/pull/1112#issuecomment-2326240075 Hi @Fokko Just a heads up - sorry for adding in a merge conflict here. I merged in https://github.com/apache/iceberg-python/pull/820 first because it had been open for quite

Re: [PR] Add drop_view to the rest catalog [iceberg-python]

2024-09-03 Thread via GitHub
sungwy merged PR #820: URL: https://github.com/apache/iceberg-python/pull/820

Re: [I] Support more complex types when reading into arrow record batch. [iceberg-rust]

2024-09-03 Thread via GitHub
sdd commented on issue #405: URL: https://github.com/apache/iceberg-rust/issues/405#issuecomment-2325981291 I'm tackling items 2 and 3 from @liurenjie1024's initial comment on this issue. I've got a design proposal that I'd like to share to get feedback on. It involves creating a post

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-09-03 Thread via GitHub
ajantha-bhat commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1741651006 ## core/src/main/java/org/apache/iceberg/data/PartitionStatsRecord.java: ## @@ -0,0 +1,172 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Flink: Avoid metaspace memory leak by not registering ShutdownHook for ExecutorService in Flink [iceberg]

2024-09-03 Thread via GitHub
fengjiajie commented on code in PR #11073: URL: https://github.com/apache/iceberg/pull/11073#discussion_r1741622214 ## flink/v1.18/flink/src/main/java/org/apache/iceberg/flink/source/FlinkInputFormat.java: ## @@ -93,7 +93,7 @@ public FlinkInputSplit[] createInputSplits(int minNu

[PR] Flink: Avoid metaspace memory leak by not registering ShutdownHook for ExecutorService in Flink [iceberg]

2024-09-03 Thread via GitHub
fengjiajie opened a new pull request, #11073: URL: https://github.com/apache/iceberg/pull/11073 We are currently using Flink's Local Env mode (executing Flink tasks within the local JVM). We are starting many "Flink query Iceberg tasks" sequentially within the lifecycle of a single JVM (whi

Re: [PR] Docs: Fix Flink 1.20 support versions [iceberg]

2024-09-03 Thread via GitHub
manuzhang commented on PR #11065: URL: https://github.com/apache/iceberg/pull/11065#issuecomment-2325764783 @pvary I don't think that's necessary if it's only in nightly docs. https://iceberg.apache.org/docs/nightly/flink-writes/

Re: [I] Review new ImmutablesReferenceEquality error-prone check [iceberg]

2024-09-03 Thread via GitHub
findepi commented on issue #10855: URL: https://github.com/apache/iceberg/issues/10855#issuecomment-2325763757 thanks for your insights @danielhumanmod ! > there are several `DangerousJavaDeserialization` warnings in our code, do we have a plan to do some investigation on that?