Re: [PR] Kafka Connect: separate CI workflow [iceberg]

2024-09-17 Thread via GitHub
nastra merged PR #11075: URL: https://github.com/apache/iceberg/pull/11075 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [I] Delete Files in Table Scans [iceberg-rust]

2024-09-17 Thread via GitHub
sdd commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2357605851 I'm happy to add the partitioning result to the task. This is useful to the executor node when deciding how to distribute tasks, as it enables the use of a few different strategies, t

Re: [PR] Arrow: add support for null vectors [iceberg]

2024-09-17 Thread via GitHub
nastra commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1764468077 ## arrow/src/test/java/org/apache/iceberg/arrow/vectorized/ArrowReaderTest.java: ## @@ -262,6 +264,120 @@ public void testReadColumnFilter2() throws Exception {

Re: [PR] Arrow: add support for null vectors [iceberg]

2024-09-17 Thread via GitHub
nastra commented on code in PR #10953: URL: https://github.com/apache/iceberg/pull/10953#discussion_r1764467923 ## arrow/src/test/java/org/apache/iceberg/arrow/vectorized/ArrowReaderTest.java: ## @@ -262,6 +264,120 @@ public void testReadColumnFilter2() throws Exception {

Re: [PR] Core: Add explicit JSON parser for LoadTableResponse [iceberg]

2024-09-17 Thread via GitHub
nastra commented on code in PR #11148: URL: https://github.com/apache/iceberg/pull/11148#discussion_r1764428159 ## core/src/main/java/org/apache/iceberg/rest/responses/LoadTableResponse.java: ## @@ -61,7 +62,12 @@ public String metadataLocation() { } public TableMetadata

Re: [I] Mixed usage of snapshotCreationTs, metadataCommitTs & tableAccessTs when using REST Catalog [iceberg]

2024-09-17 Thread via GitHub
nastra closed issue #11103: Mixed usage of snapshotCreationTs, metadataCommitTs & tableAccessTs when using REST Catalog URL: https://github.com/apache/iceberg/issues/11103

Re: [I] Mixed usage of snapshotCreationTs, metadataCommitTs & tableAccessTs when using REST Catalog [iceberg]

2024-09-17 Thread via GitHub
nastra commented on issue #11103: URL: https://github.com/apache/iceberg/issues/11103#issuecomment-2357550633 fixed by https://github.com/apache/iceberg/pull/11151

Re: [PR] Core: Update metadata location without updating lastUpdatedMillis [iceberg]

2024-09-17 Thread via GitHub
nastra merged PR #11151: URL: https://github.com/apache/iceberg/pull/11151

Re: [I] Iceberg to configure AWS S3 configuration with the Hadoop and Hive4 setup is hanging without giving ant error [iceberg]

2024-09-17 Thread via GitHub
AwasthiSomesh commented on issue #11145: URL: https://github.com/apache/iceberg/issues/11145#issuecomment-2357518430 @pvary Thanks a lot for your quick response. I have 2 questions below; could you please help me with your comments? **Q1**. As mentioned in iceberg official docum

Re: [I] Create table format version constants [iceberg-python]

2024-09-17 Thread via GitHub
tanmayrauth commented on issue #851: URL: https://github.com/apache/iceberg-python/issues/851#issuecomment-2357506218 @kevinjqliu I found this TableVersion [declaration already present](https://github.com/apache/iceberg-python/blob/de47590c6ac4f507cb2337c20504a62c484339f9/pyiceberg/typedef.p

Re: [PR] Core: Add reference snapshot ID/timestamps to AllEntriesTable and AllManifestsTable [iceberg]

2024-09-17 Thread via GitHub
hsiang-c commented on code in PR #9335: URL: https://github.com/apache/iceberg/pull/9335#discussion_r1764382306 ## .palantir/revapi.yml: ## @@ -1136,6 +1136,78 @@ acceptedBreaks: new: "method org.apache.iceberg.BaseMetastoreOperations.CommitStatus org.apache.iceberg.Base

[PR] OpenAPI: Add planning-mode to loadTable response [iceberg]

2024-09-17 Thread via GitHub
rahil-c opened a new pull request, #11156: URL: https://github.com/apache/iceberg/pull/11156 Recently in the iceberg community we landed a new set of scan planning apis within the rest spec https://github.com/apache/iceberg/pull/9695. The following spec change in this pr aims to prov

Re: [PR] Core: Move internal struct projection to SupportsIndexProjection [iceberg]

2024-09-17 Thread via GitHub
aokolnychyi commented on code in PR #11132: URL: https://github.com/apache/iceberg/pull/11132#discussion_r1764368384 ## core/src/main/java/org/apache/iceberg/avro/SupportsIndexProjection.java: ## @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] REST: Handle Requests with Page Sizes Exceeding Available Number of Namespaces /Tables/Views [iceberg]

2024-09-17 Thread via GitHub
amogh-jahagirdar commented on PR #11143: URL: https://github.com/apache/iceberg/pull/11143#issuecomment-2357452412 Will take a look with fresh eyes tomorrow morning, thanks for reviewing @singhpk234 @rahil-c . I'll trigger the CI.

Re: [PR] API: Add RemoveUnusedSpecs in Table [iceberg]

2024-09-17 Thread via GitHub
advancedxy commented on PR #10755: URL: https://github.com/apache/iceberg/pull/10755#issuecomment-2357370672 Gently ping @amogh-jahagirdar @rdblue @RussellSpitzer

Re: [I] procedure add_files parallelism > 1 -> NotSerializableException [iceberg]

2024-09-17 Thread via GitHub
manuzhang commented on issue #11147: URL: https://github.com/apache/iceberg/issues/11147#issuecomment-2357361916 @zzeekk Thanks for reporting this bug. I will look into it.

Re: [PR] REST: Handle Requests with Page Sizes Exceeding Available Number of Namespaces /Tables/Views [iceberg]

2024-09-17 Thread via GitHub
rahil-c commented on PR #11143: URL: https://github.com/apache/iceberg/pull/11143#issuecomment-2357353277 Thanks @rcjverhoef for your contribution. @amogh-jahagirdar I was wondering if you can also take a look, and merge this if it looks good to you?

Re: [PR] REST: Handle Requests with Page Sizes Exceeding Available Number of Namespaces /Tables/Views [iceberg]

2024-09-17 Thread via GitHub
rahil-c commented on code in PR #11143: URL: https://github.com/apache/iceberg/pull/11143#discussion_r1764306523 ## core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java: ## @@ -2409,7 +2409,7 @@ public void testPaginationForListTables() { RESTCatalog catalog =

Re: [I] Iceberg spark procedure argument does not support empty map or empty array. [iceberg]

2024-09-17 Thread via GitHub
wForget closed issue #8448: Iceberg spark procedure argument does not support empty map or empty array. URL: https://github.com/apache/iceberg/issues/8448

Re: [PR] Spark 3.4: Supports empty map and empty array expressions [iceberg]

2024-09-17 Thread via GitHub
wForget closed pull request #8449: Spark 3.4: Supports empty map and empty array expressions URL: https://github.com/apache/iceberg/pull/8449

Re: [I] IcebergParseException.getMessage does not show the below line [iceberg]

2024-09-17 Thread via GitHub
wForget closed issue #8462: IcebergParseException.getMessage does not show the below line URL: https://github.com/apache/iceberg/issues/8462

Re: [PR] Core: Avoid NPE when getting updateEvent in FastAppend [iceberg]

2024-09-17 Thread via GitHub
wForget closed pull request #8507: Core: Avoid NPE when getting updateEvent in FastAppend URL: https://github.com/apache/iceberg/pull/8507

Re: [PR] Spark 3.4: Remove unused parameters [iceberg]

2024-09-17 Thread via GitHub
wForget closed pull request #8463: Spark 3.4: Remove unused parameters URL: https://github.com/apache/iceberg/pull/8463

Re: [PR] Test: Add coverage for Output [iceberg-go]

2024-09-17 Thread via GitHub
alex-kar commented on PR #149: URL: https://github.com/apache/iceberg-go/pull/149#issuecomment-2357320736 @zeroshade Please review tests to cover `output`. I added two parameters: one with minimal table metadata and another with all fields set. For comparison, I used `github.com/stretchr/

Re: [PR] REST: Handle Requests with Page Sizes Exceeding Available Number of Namespaces /Tables/Views [iceberg]

2024-09-17 Thread via GitHub
rcjverhoef commented on code in PR #11143: URL: https://github.com/apache/iceberg/pull/11143#discussion_r1764263966 ## core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java: ## @@ -2409,7 +2409,7 @@ public void testPaginationForListTables() { RESTCatalog catalog =

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-09-17 Thread via GitHub
wypoon commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1764249470 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -133,51 +128,124 @@ private static Map computeSnapshotOrdinals(Deque snapsh ret

Re: [PR] Kafka Connect: separate CI workflow [iceberg]

2024-09-17 Thread via GitHub
bryanck commented on code in PR #11075: URL: https://github.com/apache/iceberg/pull/11075#discussion_r1764246337 ## kafka-connect/kafka-connect-runtime/src/integration/java/org/apache/iceberg/connect/TestContext.java: ## @@ -51,6 +52,7 @@ public class TestContext { private Te

Re: [PR] Core: Add reference snapshot ID/timestamps to AllEntriesTable and AllManifestsTable [iceberg]

2024-09-17 Thread via GitHub
hsiang-c commented on code in PR #9335: URL: https://github.com/apache/iceberg/pull/9335#discussion_r1764238812 ## .palantir/revapi.yml: ## @@ -1136,6 +1136,78 @@ acceptedBreaks: new: "method org.apache.iceberg.BaseMetastoreOperations.CommitStatus org.apache.iceberg.Base

Re: [PR] Add Support for Dynamic Overwrite [iceberg-python]

2024-09-17 Thread via GitHub
jqin61 commented on code in PR #931: URL: https://github.com/apache/iceberg-python/pull/931#discussion_r176422 ## tests/integration/test_writes/test_partitioned_writes.py: ## @@ -221,6 +276,98 @@ def test_query_filter_v1_v2_append_null( assert df.where(f"{col} is no

Re: [I] Support S3 Access Points with Access Point to Bucket mapping [iceberg-python]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #452: Support S3 Access Points with Access Point to Bucket mapping URL: https://github.com/apache/iceberg-python/issues/452

Re: [I] Optimize `plan_files` with filter in case whe it is fully evaluated on Iceberg metadata [iceberg-python]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #491: URL: https://github.com/apache/iceberg-python/issues/491#issuecomment-2357239753 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Optimize `plan_files` with filter in case whe it is fully evaluated on Iceberg metadata [iceberg-python]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #491: Optimize `plan_files` with filter in case whe it is fully evaluated on Iceberg metadata URL: https://github.com/apache/iceberg-python/issues/491

Re: [I] Support S3 Access Points with Access Point to Bucket mapping [iceberg-python]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #452: URL: https://github.com/apache/iceberg-python/issues/452#issuecomment-2357239777 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] AWS: Set better defaults for S3 retry behaviour [iceberg]

2024-09-17 Thread via GitHub
ookumuso commented on code in PR #11052: URL: https://github.com/apache/iceberg/pull/11052#discussion_r1764233622 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -393,6 +403,21 @@ public class S3FileIOProperties implements Serializable { */ p

Re: [I] Merge Small File Error [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #7919: URL: https://github.com/apache/iceberg/issues/7919#issuecomment-2357237423 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Core: check location for conflict before creating table [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed pull request #8194: Core: check location for conflict before creating table URL: https://github.com/apache/iceberg/pull/8194

Re: [I] Merge Small File Error [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7919: Merge Small File Error URL: https://github.com/apache/iceberg/issues/7919

Re: [I] Read is not working on Iceberg Hive table [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #7924: URL: https://github.com/apache/iceberg/issues/7924#issuecomment-2357237440 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Docs: Improve possible options/parameters for system procedures and usage. [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7934: Docs: Improve possible options/parameters for system procedures and usage. URL: https://github.com/apache/iceberg/issues/7934

Re: [I] Docs: Improve possible options/parameters for system procedures and usage. [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #7934: URL: https://github.com/apache/iceberg/issues/7934#issuecomment-2357237465 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] [Feature Request] Inspect partitions Metadata for Tables with Many Partitions [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #7892: URL: https://github.com/apache/iceberg/issues/7892#issuecomment-2357237372 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Iceberg requiredNumOfPartitions method [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #7918: URL: https://github.com/apache/iceberg/issues/7918#issuecomment-2357237403 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] [Feature Request] Inspect partitions Metadata for Tables with Many Partitions [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7892: [Feature Request] Inspect partitions Metadata for Tables with Many Partitions URL: https://github.com/apache/iceberg/issues/7892

Re: [I] Data files name collision written by Spark Streaming job after it's restart [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7890: Data files name collision written by Spark Streaming job after it's restart URL: https://github.com/apache/iceberg/issues/7890

Re: [I] missing option in remove_orphan_files (prefix mismatch) [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #7884: URL: https://github.com/apache/iceberg/issues/7884#issuecomment-2357237314 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Partition Filter returns incorrect results for decimal partition columns with trailing 0's [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #7882: URL: https://github.com/apache/iceberg/issues/7882#issuecomment-2357237289 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] missing option in remove_orphan_files (prefix mismatch) [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7884: missing option in remove_orphan_files (prefix mismatch) URL: https://github.com/apache/iceberg/issues/7884

Re: [I] Data files name collision written by Spark Streaming job after it's restart [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #7890: URL: https://github.com/apache/iceberg/issues/7890#issuecomment-2357237339 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Partition Filter returns incorrect results for decimal partition columns with trailing 0's [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7882: Partition Filter returns incorrect results for decimal partition columns with trailing 0's URL: https://github.com/apache/iceberg/issues/7882

Re: [I] DataFrame inconsistency after MERGE operation [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7863: DataFrame inconsistency after MERGE operation URL: https://github.com/apache/iceberg/issues/7863

Re: [I] PartitionSpec field name should be consistent for bucket and trunc in $partitions metadata table [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #7849: URL: https://github.com/apache/iceberg/issues/7849#issuecomment-2357237214 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Core: Add KLL Datasketch and Hive ColumnStatisticsObj as standard blo… [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on PR #8202: URL: https://github.com/apache/iceberg/pull/8202#issuecomment-2357237629 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Core: check location for conflict before creating table [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on PR #8194: URL: https://github.com/apache/iceberg/pull/8194#issuecomment-2357237595 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If y

Re: [PR] Core: Add KLL Datasketch and Hive ColumnStatisticsObj as standard blo… [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed pull request #8202: Core: Add KLL Datasketch and Hive ColumnStatisticsObj as standard blo… URL: https://github.com/apache/iceberg/pull/8202

Re: [I] How to remove orphan manifest and manifest list file [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #7937: URL: https://github.com/apache/iceberg/issues/7937#issuecomment-2357237483 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] How to remove orphan manifest and manifest list file [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7937: How to remove orphan manifest and manifest list file URL: https://github.com/apache/iceberg/issues/7937

Re: [I] Read is not working on Iceberg Hive table [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7924: Read is not working on Iceberg Hive table URL: https://github.com/apache/iceberg/issues/7924

Re: [I] Iceberg requiredNumOfPartitions method [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7918: Iceberg requiredNumOfPartitions method URL: https://github.com/apache/iceberg/issues/7918

Re: [I] delete with clause IN [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7850: delete with clause IN URL: https://github.com/apache/iceberg/issues/7850

Re: [I] delete with clause IN [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] commented on issue #7850: URL: https://github.com/apache/iceberg/issues/7850#issuecomment-2357237240 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] PartitionSpec field name should be consistent for bucket and trunc in $partitions metadata table [iceberg]

2024-09-17 Thread via GitHub
github-actions[bot] closed issue #7849: PartitionSpec field name should be consistent for bucket and trunc in $partitions metadata table URL: https://github.com/apache/iceberg/issues/7849

Re: [PR] fix: fixing tests to work with s3Express [iceberg]

2024-09-17 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1764150636 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -298,8 +298,13 @@ private List deleteBatch(String bucket, Collection keysToDelete) @Overrid

Re: [PR] fix: fixing tests to work with s3Express [iceberg]

2024-09-17 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1764140100 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3Express.java: ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more c

Re: [PR] fix: fixing tests to work with s3Express [iceberg]

2024-09-17 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1764135625 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3FileIOIntegration.java: ## @@ -589,4 +623,13 @@ private void createRandomObjects(String objectPrefix, i

Re: [PR] fix: fixing tests to work with s3Express [iceberg]

2024-09-17 Thread via GitHub
jackye1995 commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1764131672 ## aws/src/integration/java/org/apache/iceberg/aws/AwsIntegTestUtil.java: ## @@ -106,7 +109,7 @@ public static String testMultiRegionAccessPointAlias() { retu

Re: [PR] AWS: Set better defaults for S3 retry behaviour [iceberg]

2024-09-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #11052: URL: https://github.com/apache/iceberg/pull/11052#discussion_r1764104918 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -393,6 +403,21 @@ public class S3FileIOProperties implements Serializable {

[PR] Bump pydantic from 2.9.1 to 2.9.2 [iceberg-python]

2024-09-17 Thread via GitHub
dependabot[bot] opened a new pull request, #1182: URL: https://github.com/apache/iceberg-python/pull/1182 Bumps [pydantic](https://github.com/pydantic/pydantic) from 2.9.1 to 2.9.2. Release notes Sourced from [pydantic's releases](https://github.com/pydantic/pydantic/releases).

[PR] Bump pypa/cibuildwheel from 2.21.0 to 2.21.1 [iceberg-python]

2024-09-17 Thread via GitHub
dependabot[bot] opened a new pull request, #1181: URL: https://github.com/apache/iceberg-python/pull/1181 Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.21.0 to 2.21.1. Release notes Sourced from https://github.com/pypa/cibuildwheel/releases pypa/cibuildwh

Re: [PR] Spark: revert delete procedure [iceberg]

2024-09-17 Thread via GitHub
danielcweeks commented on code in PR #11084: URL: https://github.com/apache/iceberg/pull/11084#discussion_r1764041902 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/RevertDeleteProcedure.java: ## @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software Fo

Re: [PR] Spark: revert delete procedure [iceberg]

2024-09-17 Thread via GitHub
danielcweeks commented on code in PR #11084: URL: https://github.com/apache/iceberg/pull/11084#discussion_r1764036538 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRevertDeleteProcedure.java: ## @@ -0,0 +1,220 @@ +/* + * Licensed to the Apa

Re: [PR] Spark: revert delete procedure [iceberg]

2024-09-17 Thread via GitHub
danielcweeks commented on code in PR #11084: URL: https://github.com/apache/iceberg/pull/11084#discussion_r1764035223 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/RevertDeleteProcedure.java: ## @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software Fo

Re: [PR] API, AWS: Add RetryableInputStream and use that in S3InputStream [iceberg]

2024-09-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #10433: URL: https://github.com/apache/iceberg/pull/10433#discussion_r1764033379 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3InputStream.java: ## @@ -57,6 +64,14 @@ class S3InputStream extends SeekableInputStream implements RangeRea

Re: [PR] API, AWS: Add RetryableInputStream and use that in S3InputStream [iceberg]

2024-09-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #10433: URL: https://github.com/apache/iceberg/pull/10433#discussion_r1764030252 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3InputStream.java: ## @@ -195,14 +230,20 @@ private void openStream() throws IOException { } } - p

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-09-17 Thread via GitHub
stevenzwu commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1763635282 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/AsyncDeleteFiles.java: ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache Software

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-09-17 Thread via GitHub
wypoon commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1763998376 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -133,51 +128,124 @@ private static Map computeSnapshotOrdinals(Deque snapsh ret

Re: [I] to_pandas(), to_arrow() fail because case_sensitive doesn't work if column in row_filter doesn't match the case even if case_sensitive is set to False in scan [iceberg-python]

2024-09-17 Thread via GitHub
kevinjqliu commented on issue #1177: URL: https://github.com/apache/iceberg-python/issues/1177#issuecomment-2356876742 You can test out the main branch by downloading the repo and installing the library in edit mode. Run this command in your `iceberg-python` repo directory: ```

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-09-17 Thread via GitHub
wypoon commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1763979054 ## core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java: ## @@ -133,51 +131,149 @@ private static Map computeSnapshotOrdinals(Deque snapsh ret

Re: [PR] Support changelog scan for table with delete files [iceberg]

2024-09-17 Thread via GitHub
wypoon commented on code in PR #10935: URL: https://github.com/apache/iceberg/pull/10935#discussion_r1763976533 ## core/src/test/java/org/apache/iceberg/TestBaseIncrementalChangelogScan.java: ## @@ -132,6 +131,139 @@ public void testFileDeletes() { assertThat(t1.existingDel

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-09-17 Thread via GitHub
pvary commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1763973114 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/stream/ExpireSnapshots.java: ## @@ -0,0 +1,161 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] REST: Handle Requests with Page Sizes Exceeding Available Number of Namespaces/Tables/Views [iceberg]

2024-09-17 Thread via GitHub
singhpk234 commented on code in PR #11143: URL: https://github.com/apache/iceberg/pull/11143#discussion_r1763972002 ## core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java: ## @@ -2409,7 +2409,7 @@ public void testPaginationForListTables() { RESTCatalog catalog =

Re: [PR] REST: Handle Requests with Page Sizes Exceeding Available Number of Namespaces/Tables/Views [iceberg]

2024-09-17 Thread via GitHub
rahil-c commented on code in PR #11143: URL: https://github.com/apache/iceberg/pull/11143#discussion_r1763957845 ## core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java: ## @@ -2409,7 +2409,7 @@ public void testPaginationForListTables() { RESTCatalog catalog =

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-09-17 Thread via GitHub
pvary commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1763922289 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/stream/TableMaintenance.java: ## @@ -0,0 +1,356 @@ +/* + * Licensed to the Apache Software Found

Re: [I] javax.net.ssl.SSLException: Connection reset on S3 w/ S3FileIO and Apache HTTP client [iceberg]

2024-09-17 Thread via GitHub
SandeepSinghGahir commented on issue #10340: URL: https://github.com/apache/iceberg/issues/10340#issuecomment-2356794643 > @SandeepSinghGahir I'm really surprised that you're hitting this issue so frequently. Is there something specific about this workload that you think might be triggering

Re: [PR] API, AWS: Add RetryableInputStream and use that in S3InputStream [iceberg]

2024-09-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #10433: URL: https://github.com/apache/iceberg/pull/10433#discussion_r1763890714 ## aws/src/test/java/org/apache/iceberg/aws/s3/TestFlakyS3InputStream.java: ## @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] API, AWS: Add RetryableInputStream and use that in S3InputStream [iceberg]

2024-09-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #10433: URL: https://github.com/apache/iceberg/pull/10433#discussion_r1763885839 ## aws/src/test/java/org/apache/iceberg/aws/s3/TestFlakyS3InputStream.java: ## @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-09-17 Thread via GitHub
pvary commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1763828430 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/AsyncDeleteFiles.java: ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-09-17 Thread via GitHub
pvary commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1763858710 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/stream/MaintenanceTaskBuilder.java: ## @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-09-17 Thread via GitHub
pvary commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1763843986 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/stream/MaintenanceTaskBuilder.java: ## @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-09-17 Thread via GitHub
pvary commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1763830142 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/stream/ExpireSnapshots.java: ## @@ -0,0 +1,161 @@ +/* + * Licensed to the Apache Software Founda

Re: [I] to_pandas(), to_arrow() fail because case_sensitive doesn't work if column in row_filter doesn't match the case even if case_sensitive is set to False in scan [iceberg-python]

2024-09-17 Thread via GitHub
leonidmakarovsky commented on issue #1177: URL: https://github.com/apache/iceberg-python/issues/1177#issuecomment-2356756573 Do I need to install the different pyiceberg version to confirm this? On Mon, Sep 16, 2024 at 2:07 PM Kevin Liu ***@***.***> wrote: > thanks for reportin

Re: [PR] Flink: Maintenance - TableManager + ExpireSnapshots [iceberg]

2024-09-17 Thread via GitHub
pvary commented on code in PR #11144: URL: https://github.com/apache/iceberg/pull/11144#discussion_r1763800052 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/stream/ExpireSnapshots.java: ## @@ -0,0 +1,161 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Revert "Cache Manifest files" [iceberg-python]

2024-09-17 Thread via GitHub
sungwy commented on PR #1167: URL: https://github.com/apache/iceberg-python/pull/1167#issuecomment-2356731020 > My suspicion is that this is due to the generators in [`read_manifest_list` ](https://github.com/apache/iceberg-python/blob/de47590c6ac4f507cb2337c20504a62c484339f9/pyiceberg/mani

Re: [PR] Revert "Cache Manifest files" [iceberg-python]

2024-09-17 Thread via GitHub
kevinjqliu commented on PR #1167: URL: https://github.com/apache/iceberg-python/pull/1167#issuecomment-2356727162 @sungwy thanks for following up on this. I added more details in the PR description. My suspicion is that this is due to the generators in [`read_manifest_list` ](https:/

Re: [PR] API, AWS: Add RetryableInputStream and use that in S3InputStream [iceberg]

2024-09-17 Thread via GitHub
danielcweeks commented on code in PR #10433: URL: https://github.com/apache/iceberg/pull/10433#discussion_r1763770630 ## core/src/main/java/org/apache/iceberg/io/RetryableInputStream.java: ## @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [I] Adding RESTCatalog based Spark Smoke Test [iceberg]

2024-09-17 Thread via GitHub
haizhou-zhao commented on issue #11079: URL: https://github.com/apache/iceberg/issues/11079#issuecomment-2356671027 https://github.com/apache/iceberg/issues/11154 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[I] Reference REST Catalog does not validate "to" identifier on rename table [iceberg]

2024-09-17 Thread via GitHub
haizhou-zhao opened a new issue, #11154: URL: https://github.com/apache/iceberg/issues/11154 ### Query engine Spark ### Question # Background Spark will pass `catalog` name to `renameTable` operations as part of its `to` identifier, and if that `catalog` name is not h
