Re: [I] Restoring the Flink streaming job from the savepoint might trigger a silent data loss when Kafka source is used [iceberg]

2024-08-06 Thread via GitHub
lkokhreidze commented on issue #10892: URL: https://github.com/apache/iceberg/issues/10892#issuecomment-2272758965 I hope the explanation makes sense and I am not thinking about it in a completely wrong way. If this makes sense, I was wondering, if it would be possible to actually rollba

[I] Restoring the Flink streaming job from the savepoint might trigger a silent data loss when Kafka source is used [iceberg]

2024-08-06 Thread via GitHub
lkokhreidze opened a new issue, #10892: URL: https://github.com/apache/iceberg/issues/10892 ### Apache Iceberg version 1.5.2 ### Query engine Flink ### Please describe the bug 🐞 When Flink's state is restored, Iceberg File Committer gets the max committed c

Re: [I] Add runtime module to enable concurrent load of manifest files. [iceberg-rust]

2024-08-06 Thread via GitHub
liurenjie1024 commented on issue #124: URL: https://github.com/apache/iceberg-rust/issues/124#issuecomment-2272749648 Closed by #373 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [I] FileIO S3: Add support for Assume-Role-Arn and other AWS Client properties [iceberg-rust]

2024-08-06 Thread via GitHub
Xuanwo commented on issue #527: URL: https://github.com/apache/iceberg-rust/issues/527#issuecomment-2272745872 Thanks a lot for release this. - client.assume-role.arn: we can use [role_arn](https://docs.rs/opendal/0.48.0/opendal/services/struct.S3Config.html#structfield.role_arn) -

Re: [PR] OpenAPI: Standardize credentials in loadTable/loadView responses [iceberg]

2024-08-06 Thread via GitHub
c-thiel commented on code in PR #10722: URL: https://github.com/apache/iceberg/pull/10722#discussion_r1706467120 ## open-api/rest-catalog-open-api.yaml: ## @@ -2747,6 +2747,81 @@ components: uuid: type: string +ADLSCredentials: + type: object +

[I] MERGE INTO requires sorting in already sorted iceberg tables [iceberg]

2024-08-06 Thread via GitHub
korbel-jacek opened a new issue, #10891: URL: https://github.com/apache/iceberg/issues/10891 ### Apache Iceberg version 1.4.2 ### Query engine Spark ### Please describe the bug 🐞 Hi, I am trying to MERGE a small iceberg table into a large iceberg table, but

Re: [I] Use `std::thread::available_parallelism` instead of `num_cpus` [iceberg-rust]

2024-08-06 Thread via GitHub
Xuanwo closed issue #525: Use `std::thread::available_parallelism` instead of `num_cpus` URL: https://github.com/apache/iceberg-rust/issues/525 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [PR] refactor: replace num_cpus with thread::available_parallelism [iceberg-rust]

2024-08-06 Thread via GitHub
Xuanwo merged PR #526: URL: https://github.com/apache/iceberg-rust/pull/526 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

Re: [PR] feat: Establish subproject pyiceberg_core [iceberg-rust]

2024-08-06 Thread via GitHub
Xuanwo commented on PR #518: URL: https://github.com/apache/iceberg-rust/pull/518#issuecomment-2272722392 > nit: any strong feelings about `pyiceberg_core_rust` vs `pyiceberg_core`? The folder is named `pyiceberg_core` but the module is named `pyiceberg_core_rust` Hi, there are some

Re: [PR] feat: Establish subproject pyiceberg_core [iceberg-rust]

2024-08-06 Thread via GitHub
Xuanwo commented on code in PR #518: URL: https://github.com/apache/iceberg-rust/pull/518#discussion_r1706451728 ## .github/workflows/bindings_python_ci.yml: ## @@ -0,0 +1,83 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreemen

Re: [PR] feat: Establish subproject pyiceberg_core [iceberg-rust]

2024-08-06 Thread via GitHub
Xuanwo commented on code in PR #518: URL: https://github.com/apache/iceberg-rust/pull/518#discussion_r1706450404 ## bindings/python/Cargo.toml: ## @@ -0,0 +1,35 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the N

[I] FileIO S3: Add support for Assume-Role-Arn and other AWS Client properties [iceberg-rust]

2024-08-06 Thread via GitHub
c-thiel opened a new issue, #527: URL: https://github.com/apache/iceberg-rust/issues/527 Currently FileIO respects: - s3.endpoint - s3.access-key-id - s3.secret-access-key - s3.region It would be great to also support additional client attributes which help with cross-

Re: [PR] Flink: put everything together for range distribution in Flink sink [iceberg]

2024-08-06 Thread via GitHub
stevenzwu commented on code in PR #10859: URL: https://github.com/apache/iceberg/pull/10859#discussion_r1706431358 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/sink/TestFlinkIcebergSinkDistributionMode.java: ## @@ -177,4 +185,288 @@ public void testOverrideWriteC

Re: [PR] Flink: put everything together for range distribution in Flink sink [iceberg]

2024-08-06 Thread via GitHub
stevenzwu commented on code in PR #10859: URL: https://github.com/apache/iceberg/pull/10859#discussion_r1706418907 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java: ## @@ -548,21 +599,46 @@ private DataStream distributeDataStream( }

Re: [PR] Flink: put everything together for range distribution in Flink sink [iceberg]

2024-08-06 Thread via GitHub
stevenzwu commented on code in PR #10859: URL: https://github.com/apache/iceberg/pull/10859#discussion_r1706409470 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java: ## @@ -233,15 +239,56 @@ public Builder flinkConf(ReadableConfig config) { *

Re: [PR] AWS: Implement SupportsRecoveryOperations for S3FileIO [iceberg]

2024-08-06 Thread via GitHub
amogh-jahagirdar commented on code in PR #10721: URL: https://github.com/apache/iceberg/pull/10721#discussion_r1706410889 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3FileIOIntegration.java: ## @@ -106,6 +109,12 @@ public static void beforeClass() { AwsIntegT

Re: [PR] AWS: Implement SupportsRecoveryOperations for S3FileIO [iceberg]

2024-08-06 Thread via GitHub
amogh-jahagirdar commented on PR #10721: URL: https://github.com/apache/iceberg/pull/10721#issuecomment-2272662631 cc @singhpk234 @geruh @rahil-c -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] Flink: put everything together for range distribution in Flink sink [iceberg]

2024-08-06 Thread via GitHub
stevenzwu commented on code in PR #10859: URL: https://github.com/apache/iceberg/pull/10859#discussion_r1706398874 ## docs/docs/flink-writes.md: ## @@ -262,6 +262,91 @@ INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') */ Check out all the options here: [write-opti

[PR] refactor: replace num_cpus with thread::available_parallelism [iceberg-rust]

2024-08-06 Thread via GitHub
SteveLauC opened a new pull request, #526: URL: https://github.com/apache/iceberg-rust/pull/526 ### What does this PR do Replaces `num_cpus::get()` with `std::thread::available_parallelism()`, as discussed in #525. Closes #525 -- This is an automated message from the Apache

Re: [PR] Flink: put everything together for range distribution in Flink sink [iceberg]

2024-08-06 Thread via GitHub
stevenzwu commented on code in PR #10859: URL: https://github.com/apache/iceberg/pull/10859#discussion_r1706396475 ## docs/docs/flink-writes.md: ## @@ -262,6 +262,91 @@ INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') */ Check out all the options here: [write-opti

Re: [PR] Simplify PrimitiveLiteral [iceberg-rust]

2024-08-06 Thread via GitHub
liurenjie1024 commented on code in PR #502: URL: https://github.com/apache/iceberg-rust/pull/502#discussion_r1706386977 ## crates/iceberg/src/spec/values.rs: ## @@ -65,24 +65,14 @@ pub enum PrimitiveLiteral { Float(OrderedFloat), /// Stored as 8-byte little-endian

[I] NotImplementedError: Parquet writer option(s) ['write.parquet.row-group-size-bytes'] not implemented [iceberg-python]

2024-08-06 Thread via GitHub
djouallah opened a new issue, #1013: URL: https://github.com/apache/iceberg-python/issues/1013 ### Apache Iceberg version 0.7.0 (latest release) ### Please describe the bug 🐞 it was working fine, and today, I got this ? using Tabular as a catalog ``` /usr/local/

Re: [I] Use `std::thread::available_parallelism` instead of `num_cpus` [iceberg-rust]

2024-08-06 Thread via GitHub
liurenjie1024 commented on issue #525: URL: https://github.com/apache/iceberg-rust/issues/525#issuecomment-2272571976 > Let me take this, looks pretty straightforward:) Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [PR] Flink: put everything together for range distribution in Flink sink [iceberg]

2024-08-06 Thread via GitHub
stevenzwu commented on code in PR #10859: URL: https://github.com/apache/iceberg/pull/10859#discussion_r1706343923 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/FlinkWriteOptions.java: ## @@ -60,6 +61,14 @@ private FlinkWriteOptions() {} public static final Conf

Re: [I] Use `std::thread::available_parallelism` instead of `num_cpus` [iceberg-rust]

2024-08-06 Thread via GitHub
SteveLauC commented on issue #525: URL: https://github.com/apache/iceberg-rust/issues/525#issuecomment-2272541994 Let me take this, looks pretty straightforward:) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] core: support support move a column with same name after rename column [iceberg]

2024-08-06 Thread via GitHub
FANNG1 closed pull request #10862: core: support support move a column with same name after rename column URL: https://github.com/apache/iceberg/pull/10862 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] Concurrent table scans [iceberg-rust]

2024-08-06 Thread via GitHub
liurenjie1024 commented on code in PR #373: URL: https://github.com/apache/iceberg-rust/pull/373#discussion_r1706317094 ## crates/iceberg/src/scan.rs: ## @@ -55,17 +55,23 @@ pub struct TableScanBuilder<'a> { batch_size: Option, case_sensitive: bool, filter: Option

[I] Parquet write option 'write.parquet.row-group-limit' issue [iceberg-python]

2024-08-06 Thread via GitHub
zhongyujiang opened a new issue, #1012: URL: https://github.com/apache/iceberg-python/issues/1012 ### Apache Iceberg version main (development) ### Please describe the bug 🐞 Hi community, I found this write option in the Python documentation, `write.parquet.row-group-li

Re: [PR] Concurrent table scans [iceberg-rust]

2024-08-06 Thread via GitHub
SteveLauC commented on code in PR #373: URL: https://github.com/apache/iceberg-rust/pull/373#discussion_r1706281350 ## crates/iceberg/src/scan.rs: ## @@ -55,17 +55,23 @@ pub struct TableScanBuilder<'a> { batch_size: Option, case_sensitive: bool, filter: Option, +

Re: [PR] Concurrent table scans [iceberg-rust]

2024-08-06 Thread via GitHub
liurenjie1024 commented on PR #373: URL: https://github.com/apache/iceberg-rust/pull/373#issuecomment-2272420118 > @liurenjie1024 @Xuanwo are we able to merge this now? Done, thanks @sdd for this pr! -- This is an automated message from the Apache Git Service. To respond to the mess

Re: [PR] Concurrent table scans [iceberg-rust]

2024-08-06 Thread via GitHub
liurenjie1024 merged PR #373: URL: https://github.com/apache/iceberg-rust/pull/373 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ic

Re: [I] DOCS: Improve Documentation on Write Support [iceberg-python]

2024-08-06 Thread via GitHub
guitcastro commented on issue #1008: URL: https://github.com/apache/iceberg-python/issues/1008#issuecomment-2272411042 @sungwy When using overwrite how can we compare fields between source and target? like: ``` source.id = target.id and source.updated < target.updated ``` -

Re: [PR] core: support support move a column with same name after rename column [iceberg]

2024-08-06 Thread via GitHub
FANNG1 commented on code in PR #10862: URL: https://github.com/apache/iceberg/pull/10862#discussion_r1706240127 ## core/src/test/java/org/apache/iceberg/TestSchemaUpdate.java: ## @@ -705,13 +705,13 @@ public void testMixedChanges() { .renameColumn("points.x", "X")

Re: [PR] Hive: Pre-check namespace presence for listTables [iceberg]

2024-08-06 Thread via GitHub
haizhou-zhao commented on PR #10845: URL: https://github.com/apache/iceberg/pull/10845#issuecomment-2272383395 @pvary I downloaded the Hive 4.0.0 binary and did a quick test. I believe listing tables on a DB that does not exist in 4.0.0 will also return an empty list, not throwing `UnknownDB

Re: [PR] [Core] Add utility to print an Iceberg table metadata [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4142: URL: https://github.com/apache/iceberg/pull/4142#issuecomment-2272380922 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: bugfix add initial offset when using parquet format [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4162: URL: https://github.com/apache/iceberg/pull/4162#issuecomment-2272380954 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: add GenericFileWriterFactory [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4085: URL: https://github.com/apache/iceberg/pull/4085#issuecomment-2272380832 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: add FanoutEqualityDeleteWriter and FanoutPositionDeleteWriter [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4091: URL: https://github.com/apache/iceberg/pull/4091#issuecomment-2272380848 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: filter manifest with deleted files optimization [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4094: URL: https://github.com/apache/iceberg/pull/4094#issuecomment-2272380874 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] push down aggregation (min/max/count) to iceberg scan [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #4046: URL: https://github.com/apache/iceberg/issues/4046#issuecomment-2272380756 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Core: Add delete file details for PartitionsTable [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4005: URL: https://github.com/apache/iceberg/pull/4005#issuecomment-2272380721 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Support S3 Batch Removal of objects as part of snapshot expiration and removing orphan files [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #4012: URL: https://github.com/apache/iceberg/issues/4012#issuecomment-2272380738 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Core: Use min sequence number on each partition to remove old delete files [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3990: URL: https://github.com/apache/iceberg/pull/3990#issuecomment-2272380666 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink: Support nested projection [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3991: URL: https://github.com/apache/iceberg/pull/3991#issuecomment-2272380685 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: fix it will not be rewritten when only one large file is divided into several target files [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3975: URL: https://github.com/apache/iceberg/pull/3975#issuecomment-2272380623 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Hive: HiveSchemaConverter should use one-based indexing like Iceberg … [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3947: URL: https://github.com/apache/iceberg/pull/3947#issuecomment-2272380523 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] ParquetWriter leaks memory [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #3950: URL: https://github.com/apache/iceberg/issues/3950#issuecomment-2272380545 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Hive: Support 'identifier-field-ids' when creating table in hive [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3912: URL: https://github.com/apache/iceberg/pull/3912#issuecomment-2272380432 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Doc: add a page to explain Iceberg multi-engine support [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3864: URL: https://github.com/apache/iceberg/pull/3864#issuecomment-2272380361 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Spark: Supports partition management in V2 Catalog [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3862: URL: https://github.com/apache/iceberg/pull/3862#issuecomment-2272380339 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Support Iceberg branching for all the snapshot producer operations [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #3896: URL: https://github.com/apache/iceberg/issues/3896#issuecomment-2272380388 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Inserts to the table partitioned by 'fixed' column fail [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #3704: URL: https://github.com/apache/iceberg/issues/3704#issuecomment-2272380270 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [PR] Core: Reading manifetsFiles parallel with ManifestGroup#planFiles [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3742: URL: https://github.com/apache/iceberg/pull/3742#issuecomment-2272380311 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] [SPARK] Make drop namespaces call respect CASCADE and IF EXISTS [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3701: URL: https://github.com/apache/iceberg/pull/3701#issuecomment-2272380203 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Docs: Add Drill to Documentation [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4300: URL: https://github.com/apache/iceberg/pull/4300#issuecomment-2272381179 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Spark: Skip corrupt files in Spark Procedures and Actions [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4325: URL: https://github.com/apache/iceberg/pull/4325#issuecomment-2272381202 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink: Add disk-based insertedRowMap to resolve the OOM while ingesting. [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4298: URL: https://github.com/apache/iceberg/pull/4298#issuecomment-2272381151 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink: Refactor to use the BaseEqualityDeltaWriter. [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4264: URL: https://github.com/apache/iceberg/pull/4264#issuecomment-2272381135 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink: Add support for ResolvedSchema [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4246: URL: https://github.com/apache/iceberg/pull/4246#issuecomment-2272381120 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Support non-optional union types for Avro [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4242: URL: https://github.com/apache/iceberg/pull/4242#issuecomment-2272381104 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] [Spark] Preserves the original table format when migrating from hive to Iceberg #4226 [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4227: URL: https://github.com/apache/iceberg/pull/4227#issuecomment-2272381092 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Docs: Sync back markdown updates which were done directly in iceberg-docs [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4215: URL: https://github.com/apache/iceberg/pull/4215#issuecomment-2272381073 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Remove Orphan Files error... [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] closed issue #4192: Remove Orphan Files error... URL: https://github.com/apache/iceberg/issues/4192 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsub

Re: [I] Remove Orphan Files error... [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #4192: URL: https://github.com/apache/iceberg/issues/4192#issuecomment-2272381046 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [PR] SPEC - Update REST Catalog spec to put examples + type details on catalog config [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4186: URL: https://github.com/apache/iceberg/pull/4186#issuecomment-2272381021 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Flink: Implement catalog Factory interface (#3117) [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4183: URL: https://github.com/apache/iceberg/pull/4183#issuecomment-2272381004 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Doc: Add flink configuration [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4181: URL: https://github.com/apache/iceberg/pull/4181#issuecomment-2272380977 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Delete files not eventually removed if RewriteDataFile run right after delete (when using 'use-starting-sequence-number' default) [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #4127: URL: https://github.com/apache/iceberg/issues/4127#issuecomment-2272380895 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] [Core][Flink][Spark]: Refactor `TaskWriter` implementations [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4132: URL: https://github.com/apache/iceberg/pull/4132#issuecomment-2272380908 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Add VaultKmsClient as an example KMS client implementation [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4080: URL: https://github.com/apache/iceberg/pull/4080#issuecomment-2272380810 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Core: Add InMemory FileIO/Catalog and extract common catalog tests from HadoopCatalog into a common test [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #4054: URL: https://github.com/apache/iceberg/pull/4054#issuecomment-2272380779 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Support Iceberg Metadata storage in a variety of engines [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #3997: URL: https://github.com/apache/iceberg/issues/3997#issuecomment-2272380705 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] :bug: fix Flink Read support for parquet int96 timestamps [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3987: URL: https://github.com/apache/iceberg/pull/3987#issuecomment-2272380644 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Parquet: close parquet writer compressor to avoid leaking the native memory #3950 [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3974: URL: https://github.com/apache/iceberg/pull/3974#issuecomment-2272380603 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Allowing ZOrder Distribution for normal Spark Writes [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #3962: URL: https://github.com/apache/iceberg/issues/3962#issuecomment-2272380589 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Linkage errors in published jars [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #3958: URL: https://github.com/apache/iceberg/issues/3958#issuecomment-2272380571 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Core: Use changed partition to validate file confilct [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3945: URL: https://github.com/apache/iceberg/pull/3945#issuecomment-2272380504 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] [Feature Request] Support for change data capture [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #3941: URL: https://github.com/apache/iceberg/issues/3941#issuecomment-2272380484 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Support timestamp in partition path [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3933: URL: https://github.com/apache/iceberg/pull/3933#issuecomment-2272380455 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] Test: Add unit tests to validate forTable calls setAll with table properties [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3902: URL: https://github.com/apache/iceberg/pull/3902#issuecomment-2272380405 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] [Improvement] ParallelIterable#hasNext submit reading ManifestFile task slowly with DataTableScan#planTasks [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on issue #3741: URL: https://github.com/apache/iceberg/issues/3741#issuecomment-2272380295 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Inserts to the table partitioned by 'fixed' column fail [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] closed issue #3704: Inserts to the table partitioned by 'fixed' column fail URL: https://github.com/apache/iceberg/issues/3704 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] AWS: Add unit tests for GlueCatalog's isValidIdentifier method [iceberg]

2024-08-06 Thread via GitHub
github-actions[bot] commented on PR #3698: URL: https://github.com/apache/iceberg/pull/3698#issuecomment-2272380170 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [PR] API: Define RepairManifests action interface [iceberg]

2024-08-06 Thread via GitHub
szehon-ho commented on PR #10784: URL: https://github.com/apache/iceberg/pull/10784#issuecomment-2272369995 Another one is: https://github.com/apache/iceberg/issues/10535 corrupt manifest file. It may make sense to repair those as well. So in summary: - repairSnapshotMetadata (re

[PR] Bump coverage from 7.6.0 to 7.6.1 [iceberg-python]

2024-08-06 Thread via GitHub
dependabot[bot] opened a new pull request, #1011: URL: https://github.com/apache/iceberg-python/pull/1011 Bumps [coverage](https://github.com/nedbat/coveragepy) from 7.6.0 to 7.6.1. Changelog Sourced from https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst coverage's cha

Re: [PR] Core: V3 Metadata Upgrade Validation and Testing [iceberg]

2024-08-06 Thread via GitHub
amogh-jahagirdar commented on code in PR #10861: URL: https://github.com/apache/iceberg/pull/10861#discussion_r1706189039 ## core/src/test/java/org/apache/iceberg/TestTableMetadata.java: ## @@ -1451,50 +1457,67 @@ public void testCreateV2MetadataThroughTableProperty() {

Re: [PR] Fix list namespace response in rest catalog [iceberg-python]

2024-08-06 Thread via GitHub
kevinjqliu commented on PR #995: URL: https://github.com/apache/iceberg-python/pull/995#issuecomment-2272277647 Thanks @ndrluis! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

Re: [PR] Fix list namespace response in rest catalog [iceberg-python]

2024-08-06 Thread via GitHub
kevinjqliu merged PR #995: URL: https://github.com/apache/iceberg-python/pull/995 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ice

Re: [PR] Support execution in Windows using Local File System and NFS [iceberg-python]

2024-08-06 Thread via GitHub
rfung777 commented on PR #996: URL: https://github.com/apache/iceberg-python/pull/996#issuecomment-2272267033 Thanks for reviewing. New to the Windows integration test in GitHub Actions. I will do a bit of research to see how it is done. -- This is an automated message from the Apache Git

Re: [PR] Exclude Python 3.9.7 due to import error in catalog module [iceberg-python]

2024-08-06 Thread via GitHub
kevinjqliu commented on code in PR #526: URL: https://github.com/apache/iceberg-python/pull/526#discussion_r1706178197 ## pyproject.toml: ## @@ -49,7 +49,7 @@ include = [ ] [tool.poetry.dependencies] -python = "^3.8" +python = ">=3.8,<3.9.7 || >=3.9.8,<4.0" Review Comment:

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-08-06 Thread via GitHub
karuppayya commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1704749308 ## data/src/main/java/org/apache/iceberg/data/PartitionStatsGenerator.java: ## @@ -0,0 +1,128 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] API: Define RepairManifests action interface [iceberg]

2024-08-06 Thread via GitHub
szehon-ho commented on PR #10784: URL: https://github.com/apache/iceberg/pull/10784#issuecomment-2272138130 Hi, @amogh-jahagirdar @RussellSpitzer had a question. I think one motivation was to handle CheckSnapshotIntegrity API proposed in https://github.com/apache/iceberg/pull/10642

Re: [PR] Improve test_version_format() error message for version mismatches [iceberg-python]

2024-08-06 Thread via GitHub
Fokko commented on code in PR #991: URL: https://github.com/apache/iceberg-python/pull/991#discussion_r1706101449 ## tests/test_version.py: ## @@ -25,4 +25,8 @@ def test_version_format() -> None: assert ( __version__ == installed_version -), f"{__version__} <

Re: [PR] Support execution in Windows using Local File System and NFS [iceberg-python]

2024-08-06 Thread via GitHub
Fokko commented on PR #996: URL: https://github.com/apache/iceberg-python/pull/996#issuecomment-2272127267 Thanks for working on this @rfung777 🙌 Ideally, we would also love to have Windows integration tests on Github Actions. WDYT? -- This is an automated message from the Apache

Re: [PR] Exclude Python 3.9.7 due to import error in catalog module [iceberg-python]

2024-08-06 Thread via GitHub
ndrluis commented on PR #526: URL: https://github.com/apache/iceberg-python/pull/526#issuecomment-2272100410 Yes @kevinjqliu! This is an example: https://github.com/user-attachments/assets/f546968e-8d7b-4b8e-9905-fb037dfadcb9 -- This is an automated message from the Apache

Re: [PR] Fix list namespace response in rest catalog [iceberg-python]

2024-08-06 Thread via GitHub
ndrluis commented on PR #995: URL: https://github.com/apache/iceberg-python/pull/995#issuecomment-2272076565 Issue created #1010 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

Re: [PR] Fix list namespace response in rest catalog [iceberg-python]

2024-08-06 Thread via GitHub
ndrluis commented on PR #995: URL: https://github.com/apache/iceberg-python/pull/995#issuecomment-2272064934 > Is there another way to test this behavior? test_rest.py is using mocks. Perhaps there's a way to write an integration test comparing to spark I'll create an issue to set up
