date:20231017

Re: [PR] feat: manifest list writer [iceberg-rust]

2023-10-17 Thread via GitHub

Xuanwo commented on code in PR #76: URL: https://github.com/apache/iceberg-rust/pull/76#discussion_r1363344838 ## crates/iceberg/src/avro/schema.rs: ## @@ -96,7 +98,13 @@ impl SchemaVisitor for SchemaToAvroSchema { _struct: &StructType, Review Comment: Understood. T

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-17 Thread via GitHub

nastra commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1767781293 > Additionally, Iceberg has a UUID type, which seems to be supported in Spark but not in Flink: https://github.com/apache/iceberg/pull/7496 So Spark itself doesn't support UUIDs as a

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-17 Thread via GitHub

fengjiajie commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1767764825 > @fengjiajie: Checked the codepath for the Spark readers and I have 2 questions: > > * What about ORC and Avro files? Don't we have the same issue there? > * Would it worth t

Re: [PR] [1.4.x] AWS: avoid static global credentials provider which doesn't play well with lifecycle management (#8677) [iceberg]

2023-10-17 Thread via GitHub

nastra commented on PR #8843: URL: https://github.com/apache/iceberg/pull/8843#issuecomment-176772 Thanks everyone for the feedback. I'll go ahead and include this in the 1.4.1 RC -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] [1.4.x] AWS: avoid static global credentials provider which doesn't play well with lifecycle management (#8677) [iceberg]

2023-10-17 Thread via GitHub

nastra merged PR #8843: URL: https://github.com/apache/iceberg/pull/8843

Re: [PR] [1.4.x] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

nastra merged PR #8861: URL: https://github.com/apache/iceberg/pull/8861

[PR] fix: Add catalog pages to docs. [iceberg-docs]

2023-10-17 Thread via GitHub

liurenjie1024 opened a new pull request, #284: URL: https://github.com/apache/iceberg-docs/pull/284 Close https://github.com/apache/iceberg/issues/8850

Re: [PR] Flink 1.17: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub

nastra commented on PR #8852: URL: https://github.com/apache/iceberg/pull/8852#issuecomment-1767711923 thanks for reviewing @amogh-jahagirdar and @nk1506. @nk1506 would you be interested in backporting this to older Flink versions?

Re: [PR] Flink 1.17: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub

nastra merged PR #8852: URL: https://github.com/apache/iceberg/pull/8852

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub

liurenjie1024 commented on PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#issuecomment-1767686793 > The PR looks good to me overall, but I'm slightly concerned about the use of `Intointo(literal)`. Will our users have to write code in this manner? Or does it occur internall

Re: [I] Distributed execution of DeleteReachableFilesSparkAction [iceberg]

2023-10-17 Thread via GitHub

tmnd1991 commented on issue #8862: URL: https://github.com/apache/iceberg/issues/8862#issuecomment-1767681093 Hi @RussellSpitzer , yes I mean having the delete operations distributed. I guess it’s difficult because you don’t know how many spark executor cores you might have in any given mom

Re: [PR] Nessie: retain authorship information when creating a namespace [iceberg]

2023-10-17 Thread via GitHub

ajantha-bhat commented on PR #8857: URL: https://github.com/apache/iceberg/pull/8857#issuecomment-1767680941 > It also switches to Nessie API V2 for the commit operation. I didn't see this change (mentioned in PR description). It currently works for both v1 and v2 right?

Re: [PR] Nessie: retain authorship information when creating a namespace [iceberg]

2023-10-17 Thread via GitHub

ajantha-bhat commented on PR #8857: URL: https://github.com/apache/iceberg/pull/8857#issuecomment-1767679787 cc: @snazy, @dimas-b

Re: [PR] Nessie: retain authorship information when creating a namespace [iceberg]

2023-10-17 Thread via GitHub

ajantha-bhat commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1363197313 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,23 +181,38 @@ public IcebergTable table(TableIdentifier tableIdentifier) {

Re: [PR] feat: manifest list writer [iceberg-rust]

2023-10-17 Thread via GitHub

liurenjie1024 commented on code in PR #76: URL: https://github.com/apache/iceberg-rust/pull/76#discussion_r1363213683 ## crates/iceberg/src/avro/schema.rs: ## @@ -96,7 +98,13 @@ impl SchemaVisitor for SchemaToAvroSchema { _struct: &StructType, Review Comment: The `_

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub

ZENOTME commented on code in PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#discussion_r1363200948 ## crates/iceberg/src/spec/values.rs: ## @@ -995,7 +995,7 @@ mod tests { assert_eq!(literal, expected_literal); let mut writer = apache_avro::Write

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-17 Thread via GitHub

pvary commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1767664345 @fengjiajie: Checked the codepath for the Spark readers and I have 2 questions: - What about ORC and Avro files? Don't we have the same issue there? - Would it worth to add the same fi

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub

Xuanwo commented on PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#issuecomment-1767645545 The PR looks good to me overall, but I'm slightly concerned about the use of `Intointo(literal)`. Will our users have to write code in this manner? Or does it occur internally?

Re: [PR] feat: manifest list writer [iceberg-rust]

2023-10-17 Thread via GitHub

Xuanwo commented on code in PR #76: URL: https://github.com/apache/iceberg-rust/pull/76#discussion_r1363171863 ## crates/iceberg/src/avro/schema.rs: ## @@ -96,7 +98,13 @@ impl SchemaVisitor for SchemaToAvroSchema { _struct: &StructType, results: Vec, ) ->

[I] java.lang.IllegalArgumentException: requirement failed while read migrated parquet table [iceberg]

2023-10-17 Thread via GitHub

camper42 opened a new issue, #8863: URL: https://github.com/apache/iceberg/issues/8863 ### Apache Iceberg version 1.4.0 (latest release) ### Query engine Spark ### Please describe the bug 🐞 We have some irrationally partitioned parquet table, which have 3 le

Re: [PR] Add an expireAfterWrite cache eviction policy to CachingCatalog [iceberg]

2023-10-17 Thread via GitHub

zhangminglei commented on code in PR #8844: URL: https://github.com/apache/iceberg/pull/8844#discussion_r1363068688 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestCachingCatalogExpirationAfterWrite.java: ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software

Re: [I] Query fails when executed without filter i.e. aggregate pushdown [iceberg]

2023-10-17 Thread via GitHub

huaxingao commented on issue #8859: URL: https://github.com/apache/iceberg/issues/8859#issuecomment-1767476206 @atifiu Do you have a program that can reproduce the issue?

[PR] Build: Bump urllib3 from 1.26.17 to 1.26.18 [iceberg-python]

2023-10-17 Thread via GitHub

dependabot[bot] opened a new pull request, #84: URL: https://github.com/apache/iceberg-python/pull/84 Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.17 to 1.26.18. Release notes Sourced from [urllib3's releases](https://github.com/urllib3/urllib3/releases). 1.2

Re: [PR] feat: manifest list writer [iceberg-rust]

2023-10-17 Thread via GitHub

barronw commented on code in PR #76: URL: https://github.com/apache/iceberg-rust/pull/76#discussion_r1362965809 ## crates/iceberg/src/spec/manifest_list.rs: ## @@ -940,4 +1025,108 @@ mod test { r#"[{"manifest_path":"s3a://icebergdata/demo/s1/t1/metadata/05ffe08b-81

Re: [I] Add checks in create/alter table if table location conflicts [iceberg]

2023-10-17 Thread via GitHub

github-actions[bot] commented on issue #7238: URL: https://github.com/apache/iceberg/issues/7238#issuecomment-1767384670 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] should we throw the exception when we have no privilege to access the directory in findVersion method [iceberg]

2023-10-17 Thread via GitHub

github-actions[bot] closed issue #7285: should we throw the exception when we have no privilege to access the directory in findVersion method URL: https://github.com/apache/iceberg/issues/7285

Re: [I] should we throw the exception when we have no privilege to access the directory in findVersion method [iceberg]

2023-10-17 Thread via GitHub

github-actions[bot] commented on issue #7285: URL: https://github.com/apache/iceberg/issues/7285#issuecomment-1767384651 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] column define by NestedField.required for orc can still insert null [iceberg]

2023-10-17 Thread via GitHub

github-actions[bot] closed issue #7288: column define by NestedField.required for orc can still insert null URL: https://github.com/apache/iceberg/issues/7288

Re: [I] column define by NestedField.required for orc can still insert null [iceberg]

2023-10-17 Thread via GitHub

github-actions[bot] commented on issue #7288: URL: https://github.com/apache/iceberg/issues/7288#issuecomment-1767384603 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Distributed execution of DeleteReachableFilesSparkAction [iceberg]

2023-10-17 Thread via GitHub

RussellSpitzer commented on issue #8862: URL: https://github.com/apache/iceberg/issues/8862#issuecomment-1767364918 It is distributed in file discovery. Do you mean have the deletes distributed? Previously we didn't want to do this because it's very difficult to control parallelism of delet

Re: [I] DeleteOrphanFiles or ExpireSnapshots outofmemory [iceberg]

2023-10-17 Thread via GitHub

RussellSpitzer commented on issue #3703: URL: https://github.com/apache/iceberg/issues/3703#issuecomment-1767362201 Not for broadcast, for that you just need to disable broadcast join in spark

Re: [I] DeleteOrphanFiles or ExpireSnapshots outofmemory [iceberg]

2023-10-17 Thread via GitHub

RLashofRegas commented on issue #3703: URL: https://github.com/apache/iceberg/issues/3703#issuecomment-1767331436 @RussellSpitzer You mentioned you were working on a patch that might affect this. Should I expect the issue I mentioned above to go away if we upgraded to a more recent version

Re: [PR] Make to_arrow function capable of handling parquet files with sanitized name due to Avro restirction [iceberg-python]

2023-10-17 Thread via GitHub

puchengy commented on PR #83: URL: https://github.com/apache/iceberg-python/pull/83#issuecomment-1767128295 @Fokko Can I get a review? Thanks

Re: [PR] Spark 3.5: Fix specific field values treated as unequal while comparing rows for carry-over removal [iceberg]

2023-10-17 Thread via GitHub

flyrain commented on PR #8799: URL: https://github.com/apache/iceberg/pull/8799#issuecomment-1767090521 Hi @rdblue, yes, we don't have to include it in 1.4.1 if it is behavior change. I'd consider it more a bug fix though. But open for ideas.

[PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar opened a new pull request, #8861: URL: https://github.com/apache/iceberg/pull/8861 Cherry pick #8860 onto the 1.4.x branch for 1.4.1 release

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

rdblue commented on PR #8860: URL: https://github.com/apache/iceberg/pull/8860#issuecomment-1767011681 Merged. Thanks for getting this read, @amogh-jahagirdar! For context, this catches bad metadata written by 1.4.0 and ignores it. This is needed if tables have bad split offsets in or

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

rdblue merged PR #8860: URL: https://github.com/apache/iceberg/pull/8860

Re: [PR] Spark 3.5: Fix specific field values treated as unequal while comparing rows for carry-over removal [iceberg]

2023-10-17 Thread via GitHub

rdblue commented on PR #8799: URL: https://github.com/apache/iceberg/pull/8799#issuecomment-1767003787 What's the argument for including this in 1.4.1? It seems like a behavior change that isn't fixing a regression.

Re: [PR] [1.4.x] AWS: avoid static global credentials provider which doesn't play well with lifecycle management (#8677) [iceberg]

2023-10-17 Thread via GitHub

singhpk234 commented on PR #8843: URL: https://github.com/apache/iceberg/pull/8843#issuecomment-1767000250 > It seems like a behavior change that would only be safe if the behavior is always the same when creating multiple credentials providers. checked the DEFAULT_CREDENTIALS_PROVIDE

Re: [PR] Spec: Add section on `null_value_counts` [iceberg]

2023-10-17 Thread via GitHub

Fokko commented on code in PR #8611: URL: https://github.com/apache/iceberg/pull/8611#discussion_r1362621678 ## format/spec.md: ## @@ -450,6 +451,48 @@ Notes: 2. For `float` and `double`, the value `-0.0` must precede `+0.0`, as in the IEEE 754 `totalOrder` predicate. NaNs are

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

singhpk234 commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362614788 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,16 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { +

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

singhpk234 commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362607781 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,16 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { +

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

singhpk234 commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362607781 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,16 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { +

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362567101 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,16 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362567101 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,16 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362567101 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,16 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] Spark 3.5: Fix specific field values treated as unequal while comparing rows for carry-over removal [iceberg]

2023-10-17 Thread via GitHub

flyrain commented on PR #8799: URL: https://github.com/apache/iceberg/pull/8799#issuecomment-1766912688 cc @aokolnychyi @RussellSpitzer

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-17 Thread via GitHub

stevenzwu commented on PR #8803: URL: https://github.com/apache/iceberg/pull/8803#issuecomment-1766844420 @pvary I think we probably want to push the `copyStatsForColumns` down to ManifestReader. https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/ManifestReade

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

singhpk234 commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362493799 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,16 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { +

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362490912 ## core/src/test/java/org/apache/iceberg/TableTestBase.java: ## @@ -110,7 +110,7 @@ public class TableTestBase { static final DataFile FILE_C = DataF

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362487232 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] [1.4.x] AWS: avoid static global credentials provider which doesn't play well with lifecycle management (#8677) [iceberg]

2023-10-17 Thread via GitHub

stevenzwu commented on PR #8843: URL: https://github.com/apache/iceberg/pull/8843#issuecomment-1766827281 > would only be safe if the behavior is always the same when creating multiple credentials providers. @rdblue I think it is a safe change from a static global singleton to creati

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362487232 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362467796 ## core/src/test/java/org/apache/iceberg/TableTestBase.java: ## @@ -110,7 +110,7 @@ public class TableTestBase { static final DataFile FILE_C = DataF

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362467796 ## core/src/test/java/org/apache/iceberg/TableTestBase.java: ## @@ -110,7 +110,7 @@ public class TableTestBase { static final DataFile FILE_C = DataF

Re: [PR] [1.4.x] AWS: avoid static global credentials provider which doesn't play well with lifecycle management (#8677) [iceberg]

2023-10-17 Thread via GitHub

nastra commented on PR #8843: URL: https://github.com/apache/iceberg/pull/8843#issuecomment-1766801596 > @nastra and @singhpk234, is this safe for a patch release? It seems like a behavior change that would only be safe if the behavior is always the same when creating multiple credentials p

Re: [I] Replace Thread.sleep() usage in test code with Awaitility [iceberg]

2023-10-17 Thread via GitHub

nastra commented on issue #7154: URL: https://github.com/apache/iceberg/issues/7154#issuecomment-1766792091 I've opened https://github.com/apache/iceberg/pull/8853 and https://github.com/apache/iceberg/pull/8852 to give an idea about places that are good candidates to replace with Awaitilit

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362445708 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-17 Thread via GitHub

nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1362441102 ## aws/src/integration/java/org/apache/iceberg/aws/dynamodb/TestDynamoDbLockManager.java: ## @@ -141,11 +142,8 @@ public void testAcquireSingleProcess() throws Exception

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-17 Thread via GitHub

nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1362431071 ## aws/src/integration/java/org/apache/iceberg/aws/TestAssumeRoleAwsClientFactory.java: ## @@ -189,7 +192,17 @@ public void testAssumeRoleS3FileIO() throws Exception {

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-17 Thread via GitHub

nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1362429576 ## aws/src/integration/java/org/apache/iceberg/aws/TestAssumeRoleAwsClientFactory.java: ## @@ -189,7 +192,17 @@ public void testAssumeRoleS3FileIO() throws Exception {

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-17 Thread via GitHub

nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1362427190 ## api/src/test/java/org/apache/iceberg/TestHelpers.java: ## @@ -62,6 +70,54 @@ public static long waitUntilAfter(long timestampMillis) { return current; } +

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-17 Thread via GitHub

nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1362428279 ## api/src/test/java/org/apache/iceberg/metrics/TestDefaultTimer.java: ## @@ -101,14 +103,7 @@ public void closeableTimer() throws InterruptedException { @Test pub

Re: [PR] [1.4.x] Flink: Reverting the default custom partitioner for bucket column (#8848) [iceberg]

2023-10-17 Thread via GitHub

nastra merged PR #8858: URL: https://github.com/apache/iceberg/pull/8858

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362421006 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362421006 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

bryanck commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362419546 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { +//

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

bryanck commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362417563 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { +//

Re: [PR] Python: Add support for Python 3.12 [iceberg-python]

2023-10-17 Thread via GitHub

steinsgateted commented on PR #35: URL: https://github.com/apache/iceberg-python/pull/35#issuecomment-1766760247 @jayceslesar Thank you for the information

Re: [PR] Add an expireAfterWrite cache eviction policy to CachingCatalog [iceberg]

2023-10-17 Thread via GitHub

nastra commented on code in PR #8844: URL: https://github.com/apache/iceberg/pull/8844#discussion_r1362413695 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestCachingCatalogExpirationAfterWrite.java: ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362408168 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,11 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

bryanck commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362407399 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,11 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { +//

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on PR #8860: URL: https://github.com/apache/iceberg/pull/8860#issuecomment-1766749771 cc @bryanck

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-17 Thread via GitHub

stevenzwu commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1362401859 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -177,4 +191,26 @@ default Long fileSequenceNumber() { default F copy(boolean withStats) { retu

Re: [PR] [1.4.x] Core: Do not use a lazy split offset list in manifests (#8834) [iceberg]

2023-10-17 Thread via GitHub

rdblue merged PR #8845: URL: https://github.com/apache/iceberg/pull/8845

Re: [I] Flink: revert the automatic application of custom partitioner for bucketing column with hash distribution [iceberg]

2023-10-17 Thread via GitHub

nastra closed issue #8847: Flink: revert the automatic application of custom partitioner for bucketing column with hash distribution URL: https://github.com/apache/iceberg/issues/8847

Re: [I] Flink: revert the automatic application of custom partitioner for bucketing column with hash distribution [iceberg]

2023-10-17 Thread via GitHub

nastra commented on issue #8847: URL: https://github.com/apache/iceberg/issues/8847#issuecomment-1766722846 Closing this as #8848 has been merged to main and I backported it to 1.4.x in https://github.com/apache/iceberg/pull/8858

[PR] Flink: Reverting the default custom partitioner for bucket column (#8848) [iceberg]

2023-10-17 Thread via GitHub

nastra opened a new pull request, #8858: URL: https://github.com/apache/iceberg/pull/8858 This backports #8848 to 1.4.x

Re: [PR] Flink: Reverting the default custom partitioner for bucket column [iceberg]

2023-10-17 Thread via GitHub

stevenzwu merged PR #8848: URL: https://github.com/apache/iceberg/pull/8848

Re: [PR] Flink 1.17: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub

nastra commented on code in PR #8852: URL: https://github.com/apache/iceberg/pull/8852#discussion_r1362375826 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestStreamingMonitorFunction.java: ## @@ -111,14 +113,19 @@ public void testConsumeWithoutStartSnapsho

Re: [PR] Flink 1.17: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub

nastra commented on code in PR #8852: URL: https://github.com/apache/iceberg/pull/8852#discussion_r1362315314 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceFailover.java: ## @@ -98,9 +98,9 @@ protected List generateRecords(int numRecords, lo

[PR] Nessie: retain authorship information when creating a namespace [iceberg]

2023-10-17 Thread via GitHub

adutra opened a new pull request, #8857: URL: https://github.com/apache/iceberg/pull/8857 This change enhances the process of creating new namespaces by retaining commit authorship information when committing the new namespace. It also switches to Nessie API V2 for the commit operatio

Re: [PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub

nastra commented on code in PR #8851: URL: https://github.com/apache/iceberg/pull/8851#discussion_r1362309604 ## core/src/main/java/org/apache/iceberg/view/BaseView.java: ## @@ -97,4 +98,9 @@ public ReplaceViewVersion replaceVersion() { public UpdateLocation updateLocation()

Re: [PR] Spark 3.5: Use Awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub

nastra merged PR #8853: URL: https://github.com/apache/iceberg/pull/8853

Re: [PR] Spark 3.5: Use Awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub

nastra commented on code in PR #8853: URL: https://github.com/apache/iceberg/pull/8853#discussion_r1362301067 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/SparkSQLExecutionHelper.java: ## @@ -42,28 +45,26 @@ public static String lastExecutedMetricValue(Spark

[I] Improve `All` Metadata Tables with Snapshot Information [iceberg]

2023-10-17 Thread via GitHub

RussellSpitzer opened a new issue, #8856: URL: https://github.com/apache/iceberg/issues/8856 ### Feature Request / Improvement Currently all versions of metadata tables have the exact same schema as their not "all" versions. This is actually not very useful if you are attempting to l

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-17 Thread via GitHub

Fokko commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1362161257 ## format/spec.md: ## @@ -874,6 +878,11 @@ Maps with non-string keys must use an array representation with the `map` logica |**`list`**|`array`|| |**`map`**|`array` of

Re: [PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub

nk1506 commented on code in PR #8851: URL: https://github.com/apache/iceberg/pull/8851#discussion_r1362126880 ## core/src/main/java/org/apache/iceberg/view/BaseView.java: ## @@ -97,4 +98,9 @@ public ReplaceViewVersion replaceVersion() { public UpdateLocation updateLocation()

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar closed pull request #8854: Spark: Fix Fast forward procedure output for non-main branches URL: https://github.com/apache/iceberg/pull/8854

Re: [PR] Flink 1.17: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub

nk1506 commented on code in PR #8852: URL: https://github.com/apache/iceberg/pull/8852#discussion_r1362111563 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestStreamingMonitorFunction.java: ## @@ -111,14 +113,19 @@ public void testConsumeWithoutStartSnapsho

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub

ajantha-bhat commented on PR #8854: URL: https://github.com/apache/iceberg/pull/8854#issuecomment-1766410898 logged the flaky test: #8855

Re: [PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub

ajantha-bhat commented on code in PR #8851: URL: https://github.com/apache/iceberg/pull/8851#discussion_r1362101911 ## api/src/main/java/org/apache/iceberg/view/View.java: ## @@ -111,4 +112,13 @@ default ReplaceViewVersion replaceVersion() { default UpdateLocation updateLocat

[I] Flaky test: TestSparkReaderDeletes.testEqualityDeleteWithDeletedColumn [iceberg]

2023-10-17 Thread via GitHub

ajantha-bhat opened a new issue, #8855: URL: https://github.com/apache/iceberg/issues/8855 TestSparkReaderDeletes > [format = orc, vectorized = false, planningMode = DISTRIBUTED] > testEqualityDeleteWithDeletedColumn PR:8854 Build: https://github.com/apache/iceberg/actions/run

Re: [PR] feat: suport read/write Manifest [iceberg-rust]

2023-10-17 Thread via GitHub

ZENOTME commented on code in PR #79: URL: https://github.com/apache/iceberg-rust/pull/79#discussion_r1362079101 ## crates/iceberg/src/spec/manifest.rs: ## @@ -0,0 +1,671 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

Re: [PR] Core: Avro writers use BlockingBinaryEncoder to enable array/map size calculations. [iceberg]

2023-10-17 Thread via GitHub

rustyconover commented on PR #8625: URL: https://github.com/apache/iceberg/pull/8625#issuecomment-1766378641 Yes it would!

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub

rakesh-das08 commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1362071081 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -77,9 +77,9 @@ public InternalRow[] call(InternalRow a

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub

ajantha-bhat commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1362069529 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -77,9 +77,9 @@ public InternalRow[] call(InternalRow a

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub

amogh-jahagirdar commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1362063516 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -77,9 +77,9 @@ public InternalRow[] call(InternalR

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub

ajantha-bhat commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1362051079 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -77,9 +77,9 @@ public InternalRow[] call(InternalRow a

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub

ajantha-bhat commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1362035790 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestFastForwardBranchProcedure.java: ## @@ -188,4 +188,38 @@ public void testInval
