[GitHub] [iceberg] InvisibleProgrammer commented on pull request #6337: Docs: Update Iceberg Hive documentation

2022-12-05 Thread GitBox
InvisibleProgrammer commented on PR #6337: URL: https://github.com/apache/iceberg/pull/6337#issuecomment-1336973794 @pvary , Can I ask for workflow approval? Thank you, Zsolt -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [iceberg] ajantha-bhat commented on pull request #6250: Docs: Remove redundant configuration from spark docs

2022-12-05 Thread GitBox
ajantha-bhat commented on PR #6250: URL: https://github.com/apache/iceberg/pull/6250#issuecomment-1337013120 cc: @pvary

[GitHub] [iceberg] ajantha-bhat commented on pull request #6234: Docs: Remove parent-version-id from the view spec example

2022-12-05 Thread GitBox
ajantha-bhat commented on PR #6234: URL: https://github.com/apache/iceberg/pull/6234#issuecomment-1337015494 cc: @stevenzwu, @jackye1995

[GitHub] [iceberg] KarlManong opened a new issue, #6359: Sometimes org.apache.iceberg.jdbc.TestJdbcTableConcurrency#testConcurrentFastAppends never end

2022-12-05 Thread GitBox
KarlManong opened a new issue, #6359: URL: https://github.com/apache/iceberg/issues/6359 ### Apache Iceberg version 1.1.0 (latest release) ### Query engine Other ### Please describe the bug 🐞 When I run `gradle build`, it gets stuck at `> :iceberg-core:test >

[GitHub] [iceberg] KarlManong closed issue #6359: Sometimes org.apache.iceberg.jdbc.TestJdbcTableConcurrency#testConcurrentFastAppends never end

2022-12-05 Thread GitBox
KarlManong closed issue #6359: Sometimes org.apache.iceberg.jdbc.TestJdbcTableConcurrency#testConcurrentFastAppends never end URL: https://github.com/apache/iceberg/issues/6359

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6091: Spark-3.3: Handle statistics file clean up from expireSnapshots action/procedure

2022-12-05 Thread GitBox
ajantha-bhat commented on code in PR #6091: URL: https://github.com/apache/iceberg/pull/6091#discussion_r1039525785 ## core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java: ## @@ -1234,6 +1245,40 @@ public void testMultipleRefsAndCleanExpiredFilesFailsForIncrementalCl

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6090: Core: Handle statistics file clean up from expireSnapshots

2022-12-05 Thread GitBox
ajantha-bhat commented on code in PR #6090: URL: https://github.com/apache/iceberg/pull/6090#discussion_r1039527780 ## core/src/main/java/org/apache/iceberg/FileCleanupStrategy.java: ## @@ -79,4 +80,15 @@ protected void deleteFiles(Set pathsToDelete, String fileType) {

[GitHub] [iceberg] XN137 commented on pull request #6221: Change SingleBufferInputStream .read signature to match super-method.

2022-12-05 Thread GitBox
XN137 commented on PR #6221: URL: https://github.com/apache/iceberg/pull/6221#issuecomment-1337298037 additional context: the failing check can only be performed when the compiled bytecode contains parameter name information for methods: https://github.com/palantir/gradle-baseline

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1039570176 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkMetaDataTable.java: ## @@ -0,0 +1,713 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1039570407 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkMetaDataTable.java: ## @@ -0,0 +1,713 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1039570783 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkMetaDataTable.java: ## @@ -0,0 +1,713 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1039571147 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkMetaDataTable.java: ## @@ -0,0 +1,713 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] [iceberg] Fokko merged pull request #6221: Change SingleBufferInputStream .read signature to match super-method.

2022-12-05 Thread GitBox
Fokko merged PR #6221: URL: https://github.com/apache/iceberg/pull/6221

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1039674684 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/data/StructRowData.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1039683554 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/data/StructRowData.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #6335: Core: Avoid generating a large ManifestFile when committing

2022-12-05 Thread GitBox
ConeyLiu commented on code in PR #6335: URL: https://github.com/apache/iceberg/pull/6335#discussion_r1039686191 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -499,6 +501,40 @@ protected long snapshotId() { return snapshotId; } + protected stati

[GitHub] [iceberg] ajantha-bhat commented on pull request #6221: Change SingleBufferInputStream .read signature to match super-method.

2022-12-05 Thread GitBox
ajantha-bhat commented on PR #6221: URL: https://github.com/apache/iceberg/pull/6221#issuecomment-1337501228 > but there are some jdk vendors (i.e. on centos7) that also enable this flag on jdk8, thus this check might fail depending on the jdk distribution used. @XN137: Thank you so m

[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #6335: Core: Avoid generating a large ManifestFile when committing

2022-12-05 Thread GitBox
ConeyLiu commented on code in PR #6335: URL: https://github.com/apache/iceberg/pull/6335#discussion_r1039689716 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -838,9 +839,17 @@ public Object updateEvent() { } private void cleanUncommittedApp

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1039702122 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/data/StructRowData.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6250: Docs: Remove redundant configuration from spark docs

2022-12-05 Thread GitBox
RussellSpitzer commented on code in PR #6250: URL: https://github.com/apache/iceberg/pull/6250#discussion_r1039705908 ## docs/spark-getting-started.md: ## @@ -57,8 +57,6 @@ This command creates a path-based catalog named `local` for tables under `$PWD/w ```sh spark-sql --pack

[GitHub] [iceberg] nastra commented on pull request #6335: Core: Avoid generating a large ManifestFile when committing

2022-12-05 Thread GitBox
nastra commented on PR #6335: URL: https://github.com/apache/iceberg/pull/6335#issuecomment-1337557734 It is mentioned in the docs that `MANIFEST_TARGET_SIZE_BYTES` relates to `Target size when merging manifest files`, meaning that this setting only takes effect when merging of manifest fil

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6250: Docs: Remove redundant configuration from spark docs

2022-12-05 Thread GitBox
ajantha-bhat commented on code in PR #6250: URL: https://github.com/apache/iceberg/pull/6250#discussion_r1039751700 ## docs/spark-getting-started.md: ## @@ -57,8 +57,6 @@ This command creates a path-based catalog named `local` for tables under `$PWD/w ```sh spark-sql --packag

[GitHub] [iceberg] nastra commented on a diff in pull request #6353: Make sure S3 stream opened by ReadConf ctor is closed

2022-12-05 Thread GitBox
nastra commented on code in PR #6353: URL: https://github.com/apache/iceberg/pull/6353#discussion_r1039765677 ## parquet/src/main/java/org/apache/iceberg/parquet/ReadConf.java: ## @@ -46,7 +47,7 @@ * * @param type of value to read */ -class ReadConf { +class ReadConf impl

[GitHub] [iceberg] nastra commented on a diff in pull request #6348: Python: Update license-checker

2022-12-05 Thread GitBox
nastra commented on code in PR #6348: URL: https://github.com/apache/iceberg/pull/6348#discussion_r1039781132 ## python/dev/.rat-excludes: ## @@ -0,0 +1,2 @@ +.rat-excludes Review Comment: related to [this old comment](https://github.com/apache/iceberg/pull/5840#discussion_

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1039789754 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/data/StructRowData.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6250: Docs: Remove redundant configuration from spark docs

2022-12-05 Thread GitBox
ajantha-bhat commented on code in PR #6250: URL: https://github.com/apache/iceberg/pull/6250#discussion_r1039831769 ## docs/spark-getting-started.md: ## @@ -57,8 +57,6 @@ This command creates a path-based catalog named `local` for tables under `$PWD/w ```sh spark-sql --packag

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6250: Docs: Remove redundant configuration from spark docs

2022-12-05 Thread GitBox
RussellSpitzer commented on code in PR #6250: URL: https://github.com/apache/iceberg/pull/6250#discussion_r1039836905 ## docs/spark-getting-started.md: ## @@ -57,8 +57,6 @@ This command creates a path-based catalog named `local` for tables under `$PWD/w ```sh spark-sql --pack

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
stevenzwu commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1039841272 ## docs/flink-getting-started.md: ## @@ -712,9 +712,188 @@ INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') */ | compression-strategy | Table write.orc.

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6250: Docs: Remove redundant configuration from spark docs

2022-12-05 Thread GitBox
ajantha-bhat commented on code in PR #6250: URL: https://github.com/apache/iceberg/pull/6250#discussion_r1039848442 ## docs/spark-getting-started.md: ## @@ -57,8 +57,6 @@ This command creates a path-based catalog named `local` for tables under `$PWD/w ```sh spark-sql --packag

[GitHub] [iceberg] pvary commented on a diff in pull request #6299: Flink: support split discovery throttling for streaming read

2022-12-05 Thread GitBox
pvary commented on code in PR #6299: URL: https://github.com/apache/iceberg/pull/6299#discussion_r1039860822 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/enumerator/EnumerationHistory.java: ## @@ -0,0 +1,58 @@ +/* + * Licensed to the Apache Software Foundat

[GitHub] [iceberg] ajantha-bhat opened a new pull request, #6360: Docs: Update Zorder spark support versions.

2022-12-05 Thread GitBox
ajantha-bhat opened a new pull request, #6360: URL: https://github.com/apache/iceberg/pull/6360 Some users are using Zorder with spark-3.1 and facing a confusing error message. Hence, updating the document. Also thought about updating the code to throw the unsupported exception. But

[GitHub] [iceberg] rdblue commented on a diff in pull request #3231: GCM encryption stream

2022-12-05 Thread GitBox
rdblue commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1039891570 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputFile.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

[GitHub] [iceberg] rdblue commented on a diff in pull request #3231: GCM encryption stream

2022-12-05 Thread GitBox
rdblue commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1039893751 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputFile.java: ## @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

[GitHub] [iceberg] rdblue merged pull request #4925: API: Add view interfaces

2022-12-05 Thread GitBox
rdblue merged PR #4925: URL: https://github.com/apache/iceberg/pull/4925

[GitHub] [iceberg] rdblue commented on pull request #4925: API: Add view interfaces

2022-12-05 Thread GitBox
rdblue commented on PR #4925: URL: https://github.com/apache/iceberg/pull/4925#issuecomment-1337860939 Merge! Let me know where the implementation PR is and I'll start looking at that!

[GitHub] [iceberg] rdblue commented on a diff in pull request #3231: GCM encryption stream

2022-12-05 Thread GitBox
rdblue commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1039912960 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputStream.java: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6360: Docs: Update Zorder spark support versions.

2022-12-05 Thread GitBox
RussellSpitzer commented on code in PR #6360: URL: https://github.com/apache/iceberg/pull/6360#discussion_r1039913771 ## docs/spark-procedures.md: ## @@ -271,7 +271,7 @@ Iceberg can compact data files in parallel using Spark with the `rewriteDataFile |---|-

[GitHub] [iceberg] szehon-ho closed issue #4362: Expose human-readable metrics in metadata tables

2022-12-05 Thread GitBox
szehon-ho closed issue #4362: Expose human-readable metrics in metadata tables URL: https://github.com/apache/iceberg/issues/4362

[GitHub] [iceberg] szehon-ho merged pull request #5376: Core: Add readable metrics columns to files metadata tables

2022-12-05 Thread GitBox
szehon-ho merged PR #5376: URL: https://github.com/apache/iceberg/pull/5376

[GitHub] [iceberg] szehon-ho commented on pull request #5376: Core: Add readable metrics columns to files metadata tables

2022-12-05 Thread GitBox
szehon-ho commented on PR #5376: URL: https://github.com/apache/iceberg/pull/5376#issuecomment-1337877894 Thanks @RussellSpitzer @aokolnychyi @chenjunjiedada for detailed reviews

[GitHub] [iceberg] rdblue commented on a diff in pull request #3231: GCM encryption stream

2022-12-05 Thread GitBox
rdblue commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1039937969 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputStream.java: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

[GitHub] [iceberg] rdblue commented on a diff in pull request #3231: GCM encryption stream

2022-12-05 Thread GitBox
rdblue commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1039939654 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputStream.java: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

[GitHub] [iceberg] rdblue commented on a diff in pull request #3231: GCM encryption stream

2022-12-05 Thread GitBox
rdblue commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1039941138 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputStream.java: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

[GitHub] [iceberg] rdblue commented on a diff in pull request #3231: GCM encryption stream

2022-12-05 Thread GitBox
rdblue commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1039944476 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputStream.java: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

[GitHub] [iceberg] rdblue commented on a diff in pull request #3231: GCM encryption stream

2022-12-05 Thread GitBox
rdblue commented on code in PR #3231: URL: https://github.com/apache/iceberg/pull/3231#discussion_r1039945676 ## core/src/main/java/org/apache/iceberg/encryption/AesGcmInputStream.java: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

[GitHub] [iceberg] InvisibleProgrammer closed pull request #6337: Docs: Update Iceberg Hive documentation

2022-12-05 Thread GitBox
InvisibleProgrammer closed pull request #6337: Docs: Update Iceberg Hive documentation URL: https://github.com/apache/iceberg/pull/6337

[GitHub] [iceberg] InvisibleProgrammer commented on pull request #6337: Docs: Update Iceberg Hive documentation

2022-12-05 Thread GitBox
InvisibleProgrammer commented on PR #6337: URL: https://github.com/apache/iceberg/pull/6337#issuecomment-1337926448 Hi, @pvary ! First of all, thank you for the review and approval. Unfortunately, I just learned the `Close` button on an approved PR doesn't mean close and merge, it

[GitHub] [iceberg] InvisibleProgrammer commented on pull request #6337: Docs: Update Iceberg Hive documentation

2022-12-05 Thread GitBox
InvisibleProgrammer commented on PR #6337: URL: https://github.com/apache/iceberg/pull/6337#issuecomment-1337938679 Hi, @pvary ! First of all, thank you for the review and approval. Unfortunately, I just learned the `Close` button on an approved PR doesn't mean close and merge, it

[GitHub] [iceberg] tprelle commented on pull request #6327: ORC: Fix error when projecting nested indentity partition column

2022-12-05 Thread GitBox
tprelle commented on PR #6327: URL: https://github.com/apache/iceberg/pull/6327#issuecomment-1338004394 hi @shardulm94, it seems less intrusive and better than https://github.com/apache/iceberg/pull/4599

[GitHub] [iceberg] Fokko opened a new issue, #6361: Python: Ignore home folder when running tests

2022-12-05 Thread GitBox
Fokko opened a new issue, #6361: URL: https://github.com/apache/iceberg/issues/6361 ### Feature Request / Improvement When you run tests on your local machine, and you have a `~/.pyiceberg.yaml` around, there is a possibility that the `test_missing_uri` will fail because it will pick

[GitHub] [iceberg] Fokko opened a new pull request, #6362: Python: Fix PyArrow import

2022-12-05 Thread GitBox
Fokko opened a new pull request, #6362: URL: https://github.com/apache/iceberg/pull/6362 Tested this in a fresh docker container: ``` ➜ python git:(fd-fix-pyarrow-import) docker run -v `pwd`:/vo/ -t -i python:3.9 bash root@1252c09f932c:/vo# cd /vo/ root@1252c09f932c:/vo#

issues@iceberg.apache.org

2022-12-05 Thread GitBox
danielcweeks commented on code in PR #6324: URL: https://github.com/apache/iceberg/pull/6324#discussion_r1040055218 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -567,8 +569,13 @@ Database convertToDatabase(Namespace namespace, Map meta) {

[GitHub] [iceberg] stevenzwu merged pull request #6299: Flink: support split discovery throttling for streaming read

2022-12-05 Thread GitBox
stevenzwu merged PR #6299: URL: https://github.com/apache/iceberg/pull/6299

[GitHub] [iceberg] stevenzwu commented on pull request #6299: Flink: support split discovery throttling for streaming read

2022-12-05 Thread GitBox
stevenzwu commented on PR #6299: URL: https://github.com/apache/iceberg/pull/6299#issuecomment-1338169362 Thanks @pvary and @hililiwei for the review

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6360: Docs: Update Zorder spark support versions.

2022-12-05 Thread GitBox
RussellSpitzer commented on code in PR #6360: URL: https://github.com/apache/iceberg/pull/6360#discussion_r1040172058 ## docs/spark-procedures.md: ## @@ -271,7 +271,7 @@ Iceberg can compact data files in parallel using Spark with the `rewriteDataFile |---|-

[GitHub] [iceberg] stevenzwu opened a new pull request, #6363: Flink: backport split discovery throttling for FLIP-27 source to 1.14…

2022-12-05 Thread GitBox
stevenzwu opened a new pull request, #6363: URL: https://github.com/apache/iceberg/pull/6363 … and 1.15

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #5967: Flink: Support read options in flink source

2022-12-05 Thread GitBox
stevenzwu commented on code in PR #5967: URL: https://github.com/apache/iceberg/pull/5967#discussion_r1040195457 ## docs/flink-getting-started.md: ## @@ -683,7 +683,58 @@ env.execute("Test Iceberg DataStream"); OVERWRITE and UPSERT can't be set together. In UPSERT mode, if the

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #5967: Flink: Support read options in flink source

2022-12-05 Thread GitBox
stevenzwu commented on code in PR #5967: URL: https://github.com/apache/iceberg/pull/5967#discussion_r1040196996 ## docs/flink-getting-started.md: ## @@ -683,7 +683,58 @@ env.execute("Test Iceberg DataStream"); OVERWRITE and UPSERT can't be set together. In UPSERT mode, if the

[GitHub] [iceberg] github-actions[bot] closed issue #4822: ParallelIterator is using too much memory

2022-12-05 Thread GitBox
github-actions[bot] closed issue #4822: ParallelIterator is using too much memory URL: https://github.com/apache/iceberg/issues/4822

[GitHub] [iceberg] github-actions[bot] commented on issue #4822: ParallelIterator is using too much memory

2022-12-05 Thread GitBox
github-actions[bot] commented on issue #4822: URL: https://github.com/apache/iceberg/issues/4822#issuecomment-1338428008 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #5967: Flink: Support read options in flink source

2022-12-05 Thread GitBox
stevenzwu commented on code in PR #5967: URL: https://github.com/apache/iceberg/pull/5967#discussion_r1040205639 ## docs/flink-getting-started.md: ## @@ -683,7 +683,58 @@ env.execute("Test Iceberg DataStream"); OVERWRITE and UPSERT can't be set together. In UPSERT mode, if the

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6352: AWS: Fix inconsistent behavior of naming S3 location between read and write operations by allowing only s3 bucket name

2022-12-05 Thread GitBox
amogh-jahagirdar commented on code in PR #6352: URL: https://github.com/apache/iceberg/pull/6352#discussion_r1040279885 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3URI.java: ## @@ -74,17 +74,14 @@ class S3URI { this.scheme = schemeSplit[0]; String[] authoritySpl

[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6352: AWS: Fix inconsistent behavior of naming S3 location between read and write operations by allowing only s3 bucket name

2022-12-05 Thread GitBox
amogh-jahagirdar commented on code in PR #6352: URL: https://github.com/apache/iceberg/pull/6352#discussion_r1040280304 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3URI.java: ## @@ -74,17 +74,14 @@ class S3URI { this.scheme = schemeSplit[0]; String[] authoritySpl

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
stevenzwu commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1040359570 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/data/StructRowData.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
stevenzwu commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1040360338 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/data/StructRowData.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

[GitHub] [iceberg] rbalamohan opened a new issue, #6364: Optimise POS reads

2022-12-05 Thread GitBox
rbalamohan opened a new issue, #6364: URL: https://github.com/apache/iceberg/issues/6364 ### Apache Iceberg version 0.14.1 ### Query engine Spark ### Please describe the bug 🐞 Currently combinedFileTask can have more than 1 file. Depending on the nature of

[GitHub] [iceberg] stevenzwu commented on pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
stevenzwu commented on PR #6222: URL: https://github.com/apache/iceberg/pull/6222#issuecomment-1338683010 @hililiwei we should add comprehensive unit tests for `StructRowData`. I have some internal DataGenerators for unit test code with very comprehensive coverage of all field types (inc

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
stevenzwu commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1040393766 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/TestHelpers.java: ## @@ -295,6 +299,161 @@ private static void assertEquals( } } + public stati

[GitHub] [iceberg] chenjunjiedada commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

2022-12-05 Thread GitBox
chenjunjiedada commented on code in PR #6313: URL: https://github.com/apache/iceberg/pull/6313#discussion_r1040420644 ## core/src/main/java/org/apache/iceberg/MetricsConfig.java: ## @@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) { return ne

[GitHub] [iceberg] chenjunjiedada commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

2022-12-05 Thread GitBox
chenjunjiedada commented on code in PR #6313: URL: https://github.com/apache/iceberg/pull/6313#discussion_r1040421576 ## flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkAppenderFactory.java: ## @@ -160,7 +184,8 @@ public EqualityDeleteWriter newEqDeleteWriter(

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

2022-12-05 Thread GitBox
stevenzwu commented on code in PR #6313: URL: https://github.com/apache/iceberg/pull/6313#discussion_r1040423889 ## core/src/main/java/org/apache/iceberg/MetricsConfig.java: ## @@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) { return new Met

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

2022-12-05 Thread GitBox
stevenzwu commented on code in PR #6313: URL: https://github.com/apache/iceberg/pull/6313#discussion_r1040425132 ## flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkAppenderFactory.java: ## @@ -160,7 +184,8 @@ public EqualityDeleteWriter newEqDeleteWriter(

[GitHub] [iceberg] szehon-ho opened a new pull request, #6365: Core: Add position deletes metadata table

2022-12-05 Thread GitBox
szehon-ho opened a new pull request, #6365: URL: https://github.com/apache/iceberg/pull/6365 This breaks up the pr https://github.com/apache/iceberg/pull/4812 , and is just the part to add the table PositionDeletesTable. It is based on @aokolnychyi 's newly-added BatchScan interface,

[GitHub] [iceberg] RussellSpitzer commented on issue #6364: Optimise POS reads

2022-12-05 Thread GitBox
RussellSpitzer commented on issue #6364: URL: https://github.com/apache/iceberg/issues/6364#issuecomment-1338783816 What is a POS file? I’m not familiar with the acronym. On Dec 5, 2022, at 9:00 PM, rbalamohan ***@***.***> wrote: Apache Iceberg version 0.14.1 Quer

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1040547304 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/TestHelpers.java: ## @@ -295,6 +299,161 @@ private static void assertEquals( } } + public stati

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1040554968 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/data/StructRowData.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

[GitHub] [iceberg] nastra commented on pull request #6355: Build: Bump org.eclipse.jgit from 5.13.1.202206130422-r to 6.4.0.202211300538-r

2022-12-05 Thread GitBox
nastra commented on PR #6355: URL: https://github.com/apache/iceberg/pull/6355#issuecomment-1338859690 We can't upgrade because the latest version that supports JDK8 is 5.13.1.202206130422-r. -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [iceberg] nastra closed pull request #6355: Build: Bump org.eclipse.jgit from 5.13.1.202206130422-r to 6.4.0.202211300538-r

2022-12-05 Thread GitBox
nastra closed pull request #6355: Build: Bump org.eclipse.jgit from 5.13.1.202206130422-r to 6.4.0.202211300538-r URL: https://github.com/apache/iceberg/pull/6355 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

[GitHub] [iceberg] dependabot[bot] commented on pull request #6355: Build: Bump org.eclipse.jgit from 5.13.1.202206130422-r to 6.4.0.202211300538-r

2022-12-05 Thread GitBox
dependabot[bot] commented on PR #6355: URL: https://github.com/apache/iceberg/pull/6355#issuecomment-1338859714 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let

[GitHub] [iceberg] jaehyeon-kim commented on issue #4977: Support Kafka Connect within Iceberg

2022-12-05 Thread GitBox
jaehyeon-kim commented on issue #4977: URL: https://github.com/apache/iceberg/issues/4977#issuecomment-1338869374 +1 for Kafka Connect. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1040596892 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/data/StructRowData.java: ## @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

[GitHub] [iceberg] zstraw commented on issue #4550: the snapshot file is lost when write iceberg using flink Failed to open input stream for file File does not exist

2022-12-05 Thread GitBox
zstraw commented on issue #4550: URL: https://github.com/apache/iceberg/issues/4550#issuecomment-1338908066 After digging into the Iceberg code and the log, I can reproduce it when debugging locally. The scenario may happen while Flink is cancelling. 1. IcebergFileCommitter is

[GitHub] [iceberg] hililiwei commented on pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on PR #6222: URL: https://github.com/apache/iceberg/pull/6222#issuecomment-1338913731 > @hililiwei we should add comprehensive unit test for `StructRowData`. > > I have some internal DataGenerators for unit test code with very comprehensive coverage all field types

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6222: Flink: Support inspecting table

2022-12-05 Thread GitBox
hililiwei commented on code in PR #6222: URL: https://github.com/apache/iceberg/pull/6222#discussion_r1040606681 ## docs/flink-getting-started.md: ## @@ -712,9 +712,188 @@ INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') */ | compression-strategy | Table write.orc.