Re: [I] fast_forward does not work for the first commit in Spark [iceberg]

2023-10-17 Thread via GitHub
ajantha-bhat commented on issue #8849: URL: https://github.com/apache/iceberg/issues/8849#issuecomment-1765789016 Hmm, It is not just the null check addition in the procedure, later it fails because reference MAIN does not exist during replace branch. https://github.com/apache/icebe

Re: [I] doc: The doc for catalog is missing. [iceberg]

2023-10-17 Thread via GitHub
liurenjie1024 commented on issue #8850: URL: https://github.com/apache/iceberg/issues/8850#issuecomment-1765832884 I'll take this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

Re: [I] Implement basic full table scan. [iceberg-rust]

2023-10-17 Thread via GitHub
liurenjie1024 commented on issue #66: URL: https://github.com/apache/iceberg-rust/issues/66#issuecomment-1765838560 Blocked by #79

Re: [PR] feat: Implement Iceberg values [iceberg-rust]

2023-10-17 Thread via GitHub
JanKaul commented on code in PR #20: URL: https://github.com/apache/iceberg-rust/pull/20#discussion_r1361668958 ## crates/iceberg/src/spec/values.rs: ## @@ -0,0 +1,964 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

Re: [I] [BUG] to_arrow conversion does not support iceberg table column name containing slash [iceberg-python]

2023-10-17 Thread via GitHub
Fokko commented on issue #81: URL: https://github.com/apache/iceberg-python/issues/81#issuecomment-1765864606 Great catch @puchengy I wasn't aware of this sanitization behavior. Do you want to write a patch for it?

[PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub
JanKaul opened a new pull request, #80: URL: https://github.com/apache/iceberg-rust/pull/80 This PR fixes a logical bug in the tests for converting `Literal`s to and from `ByteBuf`. The bug is that I wrote the wrong bytes to the avro file.

Re: [PR] feat: Implement Iceberg values [iceberg-rust]

2023-10-17 Thread via GitHub
JanKaul commented on code in PR #20: URL: https://github.com/apache/iceberg-rust/pull/20#discussion_r1361677800 ## crates/iceberg/src/spec/values.rs: ## @@ -0,0 +1,964 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub
Xuanwo commented on code in PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#discussion_r1361682216 ## crates/iceberg/src/spec/values.rs: ## @@ -995,7 +995,7 @@ mod tests { assert_eq!(literal, expected_literal); let mut writer = apache_avro::Writer

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub
JanKaul commented on code in PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#discussion_r1361690259 ## crates/iceberg/src/spec/values.rs: ## @@ -995,7 +995,7 @@ mod tests { assert_eq!(literal, expected_literal); let mut writer = apache_avro::Write

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub
Xuanwo commented on code in PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#discussion_r1361704555 ## crates/iceberg/src/spec/values.rs: ## @@ -995,7 +995,7 @@ mod tests { assert_eq!(literal, expected_literal); let mut writer = apache_avro::Writer

Re: [PR] Add an expireAfterWrite cache eviction policy to CachingCatalog [iceberg]

2023-10-17 Thread via GitHub
zhangminglei commented on code in PR #8844: URL: https://github.com/apache/iceberg/pull/8844#discussion_r1361733500 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestCachingCatalogExpirationAfterWrite.java: ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software

Re: [I] How to read data in the order in which files are commited? [iceberg]

2023-10-17 Thread via GitHub
pvary commented on issue #8802: URL: https://github.com/apache/iceberg/issues/8802#issuecomment-1765938139 > > Sometimes we need to do similar thing in Flink Source, and we ended up creating our own comparator for this which compares Iceberg splits (which are a wrapper above ScanTasks).

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-17 Thread via GitHub
pvary commented on code in PR #8808: URL: https://github.com/apache/iceberg/pull/8808#discussion_r1361742600 ## flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/data/FlinkParquetReaders.java: ## @@ -262,7 +262,11 @@ public ParquetValueReader primitive( switch (pri

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-17 Thread via GitHub
pvary commented on code in PR #8808: URL: https://github.com/apache/iceberg/pull/8808#discussion_r1361742942 ## flink/v1.15/flink/src/test/java/org/apache/iceberg/flink/data/TestFlinkParquetReader.java: ## @@ -81,26 +75,87 @@ public void testTwoLevelList() throws IOException {

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-17 Thread via GitHub
pvary commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1765954472 This is a small change, so it might not be too hard to keep the different Flink version changes in sync, but usually we introduce the changes on the latest Flink, and then create a differen

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub
JanKaul commented on code in PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#discussion_r1361754339 ## crates/iceberg/src/spec/values.rs: ## @@ -995,7 +995,7 @@ mod tests { assert_eq!(literal, expected_literal); let mut writer = apache_avro::Write

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub
ZENOTME commented on PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#issuecomment-1765964478 I'm a little confused. So are these avro bytes different from the [binary encoding in avro spec](https://avro.apache.org/docs/1.11.1/specification/#binary-encoding)? What's the difference between

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub
Xuanwo commented on code in PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#discussion_r1361762750 ## crates/iceberg/src/spec/values.rs: ## @@ -995,7 +995,7 @@ mod tests { assert_eq!(literal, expected_literal); let mut writer = apache_avro::Writer

[PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub
nastra opened a new pull request, #8851: URL: https://github.com/apache/iceberg/pull/8851 similar to https://github.com/apache/iceberg/pull/8800, adding UUID to the `View` API

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub
JanKaul commented on PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#issuecomment-1766011191 I will try to explain it the best way I can. Please correct me if I'm wrong. Iceberg defines its own [binary serialization format](https://iceberg.apache.org/spec/#binary-single-v

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub
JanKaul commented on code in PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#discussion_r1361806053 ## crates/iceberg/src/spec/values.rs: ## @@ -995,7 +995,7 @@ mod tests { assert_eq!(literal, expected_literal); let mut writer = apache_avro::Write

Re: [I] Iceberg table support specified column comments by flinksql create [iceberg]

2023-10-17 Thread via GitHub
372242283 commented on issue #8511: URL: https://github.com/apache/iceberg/issues/8511#issuecomment-1766049159 I also encountered this issue by checking that the field 'COMMENT' in the table 'COLUMNS-V2' in HMS is null and not written in ![image](https://github.com/apache/iceberg/assets/

Re: [PR] Build: Add note about running tests/itests on MacOS [iceberg]

2023-10-17 Thread via GitHub
Fokko commented on PR #8766: URL: https://github.com/apache/iceberg/pull/8766#issuecomment-1766051674 For me it looks like the file was created by the daemon: ``` ➜ ls -lah /var/run/docker.sock lrwxr-xr-x 1 root daemon 46B Oct 16 16:32 /var/run/docker.sock -> /Users/fokkodries

Re: [PR] Build: Add note about running tests/itests on MacOS [iceberg]

2023-10-17 Thread via GitHub
jbonofre commented on PR #8766: URL: https://github.com/apache/iceberg/pull/8766#issuecomment-1766072686 @Fokko I think anyone using docker-desktop on Mac will have the same issue. So it applies for 99% of the MacOS users.

Re: [PR] Build: Add note about running tests/itests on MacOS [iceberg]

2023-10-17 Thread via GitHub
Fokko commented on PR #8766: URL: https://github.com/apache/iceberg/pull/8766#issuecomment-1766080354 @jbonofre I'm running docker desktop as well

Re: [PR] Build: Add note about running tests/itests on MacOS [iceberg]

2023-10-17 Thread via GitHub
Fokko merged PR #8766: URL: https://github.com/apache/iceberg/pull/8766

Re: [PR] Build: Add note about running tests/itests on MacOS [iceberg]

2023-10-17 Thread via GitHub
jbonofre commented on PR #8766: URL: https://github.com/apache/iceberg/pull/8766#issuecomment-1766082002 @Fokko and you don't have the symbolic link ? Did you install docker-desktop from homebrew cask ?

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-17 Thread via GitHub
fengjiajie commented on code in PR #8808: URL: https://github.com/apache/iceberg/pull/8808#discussion_r1361854341 ## flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/data/FlinkParquetReaders.java: ## @@ -262,7 +262,11 @@ public ParquetValueReader primitive( switch

Re: [PR] fix: avro bytes test for Literal [iceberg-rust]

2023-10-17 Thread via GitHub
ZENOTME commented on PR #80: URL: https://github.com/apache/iceberg-rust/pull/80#issuecomment-1766094815 Thanks for your explanation! Totally understanded it!

Re: [PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub
ajantha-bhat commented on code in PR #8851: URL: https://github.com/apache/iceberg/pull/8851#discussion_r1361891676 ## api/src/main/java/org/apache/iceberg/view/View.java: ## @@ -111,4 +112,13 @@ default ReplaceViewVersion replaceVersion() { default UpdateLocation updateLocat

Re: [I] Iceberg table support specified column comments by flinksql create [iceberg]

2023-10-17 Thread via GitHub
huyuanfeng2018 commented on issue #8511: URL: https://github.com/apache/iceberg/issues/8511#issuecomment-1766135578 @stevenzwu In flink1.16 and before, it was impossible to parse the comment in the table creation statement. Starting from 1.17, the flink community completed this part of the

Re: [PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8851: URL: https://github.com/apache/iceberg/pull/8851#discussion_r1361897579 ## api/src/main/java/org/apache/iceberg/view/View.java: ## @@ -111,4 +112,13 @@ default ReplaceViewVersion replaceVersion() { default UpdateLocation updateLocation()

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-17 Thread via GitHub
fengjiajie commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1766144239 > This is a small change, so it might not be too hard to keep the different Flink version changes in sync, but usually we introduce the changes on the latest Flink, and then create a d

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-17 Thread via GitHub
fengjiajie commented on code in PR #8808: URL: https://github.com/apache/iceberg/pull/8808#discussion_r1361899155 ## flink/v1.15/flink/src/test/java/org/apache/iceberg/flink/data/TestFlinkParquetReader.java: ## @@ -81,26 +75,87 @@ public void testTwoLevelList() throws IOExceptio

Re: [I] Iceberg table support specified column comments by flinksql create [iceberg]

2023-10-17 Thread via GitHub
huyuanfeng2018 commented on issue #8511: URL: https://github.com/apache/iceberg/issues/8511#issuecomment-1766177686 > @stevenzwu In flink1.16 and before, it was impossible to parse the comment in the table creation statement. Starting from 1.17, the flink community completed this part of th

Re: [PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub
nk1506 commented on code in PR #8851: URL: https://github.com/apache/iceberg/pull/8851#discussion_r1361819048 ## core/src/main/java/org/apache/iceberg/view/BaseView.java: ## @@ -97,4 +98,9 @@ public ReplaceViewVersion replaceVersion() { public UpdateLocation updateLocation()

Re: [PR] Spark 3.5: Use Awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub
nk1506 commented on code in PR #8853: URL: https://github.com/apache/iceberg/pull/8853#discussion_r1361945348 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/SparkSQLExecutionHelper.java: ## @@ -42,28 +45,26 @@ public static String lastExecutedMetricValue(Spark

Re: [PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #8851: URL: https://github.com/apache/iceberg/pull/8851#discussion_r1361948344 ## api/src/main/java/org/apache/iceberg/view/View.java: ## @@ -111,4 +112,13 @@ default ReplaceViewVersion replaceVersion() { default UpdateLocation updateL

Re: [PR] Spark: Fix Fast forward before/after snapshot output for non-main branches [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1361985873 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -77,9 +77,9 @@ public InternalRow[] call(InternalR

Re: [PR] Spark: Fix Fast forward before/after snapshot output for non-main branches [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar commented on PR #8854: URL: https://github.com/apache/iceberg/pull/8854#issuecomment-1766265102 also cc @rakesh-das08 let me know your thoughts on this fix!

Re: [PR] feat: suport read/write Manifest [iceberg-rust]

2023-10-17 Thread via GitHub
JanKaul commented on code in PR #79: URL: https://github.com/apache/iceberg-rust/pull/79#discussion_r1362007623 ## crates/iceberg/src/spec/manifest.rs: ## @@ -0,0 +1,671 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

Re: [PR] push down min/max/count to iceberg [iceberg]

2023-10-17 Thread via GitHub
atifiu commented on PR #6252: URL: https://github.com/apache/iceberg/pull/6252#issuecomment-1766286164 @amogh-jahagirdar I think I know how these delete files are generated even though copy on write is defined at table level. I have executed the delete from Trino and since it only supports

Re: [PR] Core: Avro writers use BlockingBinaryEncoder to enable array/map size calculations. [iceberg]

2023-10-17 Thread via GitHub
Fokko commented on PR #8625: URL: https://github.com/apache/iceberg/pull/8625#issuecomment-1766306045 I just realized that this would also speed up operations such as snapshot expiration, because we do need to access the manifest files, but don't need to use the metrics.

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub
rakesh-das08 commented on PR #8854: URL: https://github.com/apache/iceberg/pull/8854#issuecomment-1766321748 @amogh-jahagirdar the fix LGTM. Thanks for fixing this.

Re: [PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8851: URL: https://github.com/apache/iceberg/pull/8851#discussion_r1362046505 ## core/src/main/java/org/apache/iceberg/view/BaseView.java: ## @@ -97,4 +98,9 @@ public ReplaceViewVersion replaceVersion() { public UpdateLocation updateLocation()

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub
ajantha-bhat commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1362035790 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestFastForwardBranchProcedure.java: ## @@ -188,4 +188,38 @@ public void testInval

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub
ajantha-bhat commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1362051079 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -77,9 +77,9 @@ public InternalRow[] call(InternalRow a

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1362063516 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -77,9 +77,9 @@ public InternalRow[] call(InternalR

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub
ajantha-bhat commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1362069529 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -77,9 +77,9 @@ public InternalRow[] call(InternalRow a

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub
rakesh-das08 commented on code in PR #8854: URL: https://github.com/apache/iceberg/pull/8854#discussion_r1362071081 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/FastForwardBranchProcedure.java: ## @@ -77,9 +77,9 @@ public InternalRow[] call(InternalRow a

Re: [PR] Core: Avro writers use BlockingBinaryEncoder to enable array/map size calculations. [iceberg]

2023-10-17 Thread via GitHub
rustyconover commented on PR #8625: URL: https://github.com/apache/iceberg/pull/8625#issuecomment-1766378641 Yes it would!

Re: [PR] feat: suport read/write Manifest [iceberg-rust]

2023-10-17 Thread via GitHub
ZENOTME commented on code in PR #79: URL: https://github.com/apache/iceberg-rust/pull/79#discussion_r1362079101 ## crates/iceberg/src/spec/manifest.rs: ## @@ -0,0 +1,671 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

[I] Flaky test: TestSparkReaderDeletes.testEqualityDeleteWithDeletedColumn [iceberg]

2023-10-17 Thread via GitHub
ajantha-bhat opened a new issue, #8855: URL: https://github.com/apache/iceberg/issues/8855 TestSparkReaderDeletes > [format = orc, vectorized = false, planningMode = DISTRIBUTED] > testEqualityDeleteWithDeletedColumn PR:8854 Build: https://github.com/apache/iceberg/actions/run

Re: [PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub
ajantha-bhat commented on code in PR #8851: URL: https://github.com/apache/iceberg/pull/8851#discussion_r1362101911 ## api/src/main/java/org/apache/iceberg/view/View.java: ## @@ -111,4 +112,13 @@ default ReplaceViewVersion replaceVersion() { default UpdateLocation updateLocat

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub
ajantha-bhat commented on PR #8854: URL: https://github.com/apache/iceberg/pull/8854#issuecomment-1766410898 logged the flaky test: #8855

Re: [PR] Flink 1.17: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub
nk1506 commented on code in PR #8852: URL: https://github.com/apache/iceberg/pull/8852#discussion_r1362111563 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestStreamingMonitorFunction.java: ## @@ -111,14 +113,19 @@ public void testConsumeWithoutStartSnapsho

Re: [PR] Spark: Fix Fast forward procedure output for non-main branches [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar closed pull request #8854: Spark: Fix Fast forward procedure output for non-main branches URL: https://github.com/apache/iceberg/pull/8854

Re: [PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub
nk1506 commented on code in PR #8851: URL: https://github.com/apache/iceberg/pull/8851#discussion_r1362126880 ## core/src/main/java/org/apache/iceberg/view/BaseView.java: ## @@ -97,4 +98,9 @@ public ReplaceViewVersion replaceVersion() { public UpdateLocation updateLocation()

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-17 Thread via GitHub
Fokko commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1362161257 ## format/spec.md: ## @@ -874,6 +878,11 @@ Maps with non-string keys must use an array representation with the `map` logica |**`list`**|`array`|| |**`map`**|`array` of

[I] Improve `All` Metadata Tables with Snapshot Information [iceberg]

2023-10-17 Thread via GitHub
RussellSpitzer opened a new issue, #8856: URL: https://github.com/apache/iceberg/issues/8856 ### Feature Request / Improvement Currently all versions of metadata tables have the exact same schema as their not "all" versions. This is actually not very useful if you are attempting to l

Re: [PR] Spark 3.5: Use Awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub
nastra merged PR #8853: URL: https://github.com/apache/iceberg/pull/8853

Re: [PR] Spark 3.5: Use Awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8853: URL: https://github.com/apache/iceberg/pull/8853#discussion_r1362301067 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/SparkSQLExecutionHelper.java: ## @@ -42,28 +45,26 @@ public static String lastExecutedMetricValue(Spark

Re: [PR] API, Core: Add uuid() to View [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8851: URL: https://github.com/apache/iceberg/pull/8851#discussion_r1362309604 ## core/src/main/java/org/apache/iceberg/view/BaseView.java: ## @@ -97,4 +98,9 @@ public ReplaceViewVersion replaceVersion() { public UpdateLocation updateLocation()

[PR] Nessie: retain authorship information when creating a namespace [iceberg]

2023-10-17 Thread via GitHub
adutra opened a new pull request, #8857: URL: https://github.com/apache/iceberg/pull/8857 This change enhances the process of creating new namespaces by retaining commit authorship information when committing the new namespace. It also switches to Nessie API V2 for the commit operatio

Re: [PR] Flink 1.17: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8852: URL: https://github.com/apache/iceberg/pull/8852#discussion_r1362315314 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceFailover.java: ## @@ -98,9 +98,9 @@ protected List generateRecords(int numRecords, lo

Re: [PR] Flink 1.17: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8852: URL: https://github.com/apache/iceberg/pull/8852#discussion_r1362375826 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestStreamingMonitorFunction.java: ## @@ -111,14 +113,19 @@ public void testConsumeWithoutStartSnapsho

Re: [PR] Flink: Reverting the default custom partitioner for bucket column [iceberg]

2023-10-17 Thread via GitHub
stevenzwu merged PR #8848: URL: https://github.com/apache/iceberg/pull/8848

[PR] Flink: Reverting the default custom partitioner for bucket column (#8848) [iceberg]

2023-10-17 Thread via GitHub
nastra opened a new pull request, #8858: URL: https://github.com/apache/iceberg/pull/8858 This backports #8848 to 1.4.x

Re: [I] Flink: revert the automatic application of custom partitioner for bucketing column with hash distribution [iceberg]

2023-10-17 Thread via GitHub
nastra commented on issue #8847: URL: https://github.com/apache/iceberg/issues/8847#issuecomment-1766722846 Closing this as #8848 has been merged to main and I backported it to 1.4.x in https://github.com/apache/iceberg/pull/8858

Re: [I] Flink: revert the automatic application of custom partitioner for bucketing column with hash distribution [iceberg]

2023-10-17 Thread via GitHub
nastra closed issue #8847: Flink: revert the automatic application of custom partitioner for bucketing column with hash distribution URL: https://github.com/apache/iceberg/issues/8847

Re: [PR] [1.4.x] Core: Do not use a lazy split offset list in manifests (#8834) [iceberg]

2023-10-17 Thread via GitHub
rdblue merged PR #8845: URL: https://github.com/apache/iceberg/pull/8845

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-17 Thread via GitHub
stevenzwu commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1362401859 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -177,4 +191,26 @@ default Long fileSequenceNumber() { default F copy(boolean withStats) { retu

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar commented on PR #8860: URL: https://github.com/apache/iceberg/pull/8860#issuecomment-1766749771 cc @bryanck

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub
bryanck commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362407399 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,11 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { +//

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362408168 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,11 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] Add an expireAfterWrite cache eviction policy to CachingCatalog [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8844: URL: https://github.com/apache/iceberg/pull/8844#discussion_r1362413695 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestCachingCatalogExpirationAfterWrite.java: ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] Python: Add support for Python 3.12 [iceberg-python]

2023-10-17 Thread via GitHub
steinsgateted commented on PR #35: URL: https://github.com/apache/iceberg-python/pull/35#issuecomment-1766760247 @jayceslesar Thank you for the information

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub
bryanck commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362417563 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { +//

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub
bryanck commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362419546 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { +//

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362421006 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] [1.4.x] Flink: Reverting the default custom partitioner for bucket column (#8848) [iceberg]

2023-10-17 Thread via GitHub
nastra merged PR #8858: URL: https://github.com/apache/iceberg/pull/8858 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apac

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1362428279 ## api/src/test/java/org/apache/iceberg/metrics/TestDefaultTimer.java: ## @@ -101,14 +103,7 @@ public void closeableTimer() throws InterruptedException { @Test pub

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1362427190 ## api/src/test/java/org/apache/iceberg/TestHelpers.java: ## @@ -62,6 +70,54 @@ public static long waitUntilAfter(long timestampMillis) { return current; } +

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1362429576 ## aws/src/integration/java/org/apache/iceberg/aws/TestAssumeRoleAwsClientFactory.java: ## @@ -189,7 +192,17 @@ public void testAssumeRoleS3FileIO() throws Exception {

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1362431071 ## aws/src/integration/java/org/apache/iceberg/aws/TestAssumeRoleAwsClientFactory.java: ## @@ -189,7 +192,17 @@ public void testAssumeRoleS3FileIO() throws Exception {

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-17 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1362441102 ## aws/src/integration/java/org/apache/iceberg/aws/dynamodb/TestDynamoDbLockManager.java: ## @@ -141,11 +142,8 @@ public void testAcquireSingleProcess() throws Exception

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362445708 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [I] Replace Thread.sleep() usage in test code with Awaitility [iceberg]

2023-10-17 Thread via GitHub
nastra commented on issue #7154: URL: https://github.com/apache/iceberg/issues/7154#issuecomment-1766792091 I've opened https://github.com/apache/iceberg/pull/8853 and https://github.com/apache/iceberg/pull/8852 to give an idea about places that are good candidates to replace with Awaitilit

Re: [PR] [1.4.x] AWS: avoid static global credentials provider which doesn't play well with lifecycle management (#8677) [iceberg]

2023-10-17 Thread via GitHub
nastra commented on PR #8843: URL: https://github.com/apache/iceberg/pull/8843#issuecomment-1766801596 > @nastra and @singhpk234, is this safe for a patch release? It seems like a behavior change that would only be safe if the behavior is always the same when creating multiple credentials p

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362467796 ## core/src/test/java/org/apache/iceberg/TableTestBase.java: ## @@ -110,7 +110,7 @@ public class TableTestBase { static final DataFile FILE_C = DataF

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362487232 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,12 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets()

Re: [PR] [1.4.x] AWS: avoid static global credentials provider which doesn't play well with lifecycle management (#8677) [iceberg]

2023-10-17 Thread via GitHub
stevenzwu commented on PR #8843: URL: https://github.com/apache/iceberg/pull/8843#issuecomment-1766827281 > would only be safe if the behavior is always the same when creating multiple credentials providers. @rdblue I think it is a safe change from a static global singleton to creati

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362490912 ## core/src/test/java/org/apache/iceberg/TableTestBase.java: ## @@ -110,7 +110,7 @@ public class TableTestBase { static final DataFile FILE_C = DataF

Re: [PR] Core: Ignore split offsets when the last split offset is past the file length [iceberg]

2023-10-17 Thread via GitHub
singhpk234 commented on code in PR #8860: URL: https://github.com/apache/iceberg/pull/8860#discussion_r1362493799 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -460,6 +460,16 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { +

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-17 Thread via GitHub
stevenzwu commented on PR #8803: URL: https://github.com/apache/iceberg/pull/8803#issuecomment-1766844420 @pvary I think we probably want to push the `copyStatsForColumns` down to ManifestReader. https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/ManifestReade
