[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

2022-11-10 Thread GitBox
Fokko commented on code in PR #6141: URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019433513 ## python/pyiceberg/expressions/literals.py: ## @@ -125,81 +127,73 @@ def literal(value) -> Literal: @literal.register(bool) -def _(value: bool) -> Literal[bool]: +d

[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

2022-11-10 Thread GitBox
Fokko commented on code in PR #6141: URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019434140 ## python/pyiceberg/expressions/literals.py: ## @@ -125,81 +127,73 @@ def literal(value) -> Literal: @literal.register(bool) -def _(value: bool) -> Literal[bool]: +d

[GitHub] [iceberg] rdblue commented on a diff in pull request #6146: Build: Enable revapi on core/parquet/orc/common/data modules & fix API breaks

2022-11-10 Thread GitBox
rdblue commented on code in PR #6146: URL: https://github.com/apache/iceberg/pull/6146#discussion_r1019448647 ## .palantir/revapi.yml: ## @@ -1,4 +1,85 @@ acceptedBreaks: + "1.0.0": +org.apache.iceberg:iceberg-core: +- code: "java.class.defaultSerializationChanged" +

[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

2022-11-10 Thread GitBox
Fokko commented on code in PR #6141: URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019453639 ## python/pyiceberg/expressions/literals.py: ## @@ -125,81 +127,73 @@ def literal(value) -> Literal: @literal.register(bool) -def _(value: bool) -> Literal[bool]: +d

[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

2022-11-10 Thread GitBox
Fokko commented on code in PR #6141: URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019454610 ## python/pyiceberg/expressions/literals.py: ## @@ -125,81 +127,71 @@ def literal(value) -> Literal: @literal.register(bool) -def _(value: bool) -> Literal[bool]: +d
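
For context on the PR #6141 diff snippets above: the `@literal.register(bool)` lines indicate that `literal` is dispatched on the Python type of its argument. The following is a minimal sketch of that dispatch pattern using `functools.singledispatch`, with an explicit error for unsupported input. The `Literal` dataclass, the `kind` strings, and the error message are simplified stand-ins chosen for illustration; they are not pyiceberg's actual classes or behavior.

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Any


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for a typed literal wrapper."""

    value: Any
    kind: str


@singledispatch
def literal(value: Any) -> Literal:
    # Fallback for unregistered types: fail loudly instead of guessing.
    raise TypeError(f"Invalid literal value: {value!r}")


@literal.register(bool)
def _(value: bool) -> Literal:
    # bool is registered separately so it is not treated as an int.
    return Literal(value, "boolean")


@literal.register(int)
def _(value: int) -> Literal:
    return Literal(value, "long")


@literal.register(str)
def _(value: str) -> Literal:
    return Literal(value, "string")
```

Because `bool` is a subclass of `int`, registering both lets the dispatcher pick the more specific handler for `True` and `False`.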

[GitHub] [iceberg] Fokko merged pull request #6170: Python: Move FileIO initialization to the catalog

2022-11-10 Thread GitBox
Fokko merged PR #6170: URL: https://github.com/apache/iceberg/pull/6170 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] Fokko commented on pull request #6170: Python: Move FileIO initialization to the catalog

2022-11-10 Thread GitBox
Fokko commented on PR #6170: URL: https://github.com/apache/iceberg/pull/6170#issuecomment-1310735121 Thanks @rdblue

[GitHub] [iceberg] singhpk234 commented on pull request #5888: Core: Rollback compaction on conflicts

2022-11-10 Thread GitBox
singhpk234 commented on PR #5888: URL: https://github.com/apache/iceberg/pull/5888#issuecomment-1310757061 Apologies for the delay in updating the PR; I was in the middle of changing my work location. Have addressed the feedback and moved the changes to SnapshotProducer from BaseTran

[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

2022-11-10 Thread GitBox
Fokko commented on code in PR #6141: URL: https://github.com/apache/iceberg/pull/6141#discussion_r1019523027 ## python/pyiceberg/expressions/literals.py: ## @@ -58,25 +60,25 @@ timestamp_to_micros, timestamptz_to_micros, ) -from pyiceberg.utils.singleton import Single

[GitHub] [iceberg] RussellSpitzer commented on issue #6171: iceberg cant read parquet after configuration

2022-11-10 Thread GitBox
RussellSpitzer commented on issue #6171: URL: https://github.com/apache/iceberg/issues/6171#issuecomment-1310831081 In your note you seem to be including both Scala 2.12 and 2.13 libraries; this is probably the issue. Usually when I see noClassDef with Scala classes like ```cala/$less$colon

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-10 Thread GitBox
RussellSpitzer commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1019584432 ## core/src/test/java/org/apache/iceberg/TestPartitioning.java: ## @@ -43,6 +43,12 @@ public class TestPartitioning { required(1, "id", Types.IntegerT

[GitHub] [iceberg] RussellSpitzer commented on issue #6164: The Literals class does not handle literals of type LocalDateTime. This causes errors in expressions involving Timestamp.

2022-11-10 Thread GitBox
RussellSpitzer commented on issue #6164: URL: https://github.com/apache/iceberg/issues/6164#issuecomment-1310882642 Timestamp expressions can be created as in this test case https://github.com/apache/iceberg/blob/master/core/src/test/java/org/apache/iceberg/expressions/TestExpressionP

[GitHub] [iceberg] singhpk234 commented on pull request #4479: Spark 3.2: support rate limit in Spark Streaming

2022-11-10 Thread GitBox
singhpk234 commented on PR #4479: URL: https://github.com/apache/iceberg/pull/4479#issuecomment-1310887312 cc @rdblue

[GitHub] [iceberg] XAZAD opened a new issue, #6172: rewriteDataFiles throws exception in spark 3.2

2022-11-10 Thread GitBox
XAZAD opened a new issue, #6172: URL: https://github.com/apache/iceberg/issues/6172 ### Apache Iceberg version 0.13.0 ### Query engine Spark ### Please describe the bug 🐞 Method rewriteDataFiles throws `org.apache.spark.sql.connector.catalog.CatalogNotFo

[GitHub] [iceberg] Fokko commented on a diff in pull request #6159: Python: Update mypy version

2022-11-10 Thread GitBox
Fokko commented on code in PR #6159: URL: https://github.com/apache/iceberg/pull/6159#discussion_r1019658267 ## python/pyproject.toml: ## @@ -107,7 +107,7 @@ force_grid_wrap = 4 all = true [tool.mypy] -no_implicit_optional = true Review Comment: @LuigiCerone I think we w

[GitHub] [iceberg] alec-heif opened a new pull request, #6173: Fix typo in unused python iceberg paramter

2022-11-10 Thread GitBox
alec-heif opened a new pull request, #6173: URL: https://github.com/apache/iceberg/pull/6173 I happened to notice this typo, which seems obviously wrong. The code isn't yet exercised anywhere so it's not a big deal but figured I might as well fix it.

[GitHub] [iceberg] dmgcodevil opened a new pull request, #6174: iss5675: limit total size of data files for compaction

2022-11-10 Thread GitBox
dmgcodevil opened a new pull request, #6174: URL: https://github.com/apache/iceberg/pull/6174 Sometimes it's not possible to use a filter to limit the size of files for compaction (generic data pipelines) or the total size per partition exceeds JVM heap. Using `totalSize` a user can limit t
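
The idea of capping compaction input by total size can be sketched as a simple greedy selection. This is only an illustration under assumptions: the snippet above is truncated, so the exact semantics of `totalSize` in PR #6174 are not known here, and the function name, the `(path, size)` tuples, and the stop-at-first-overflow rule are all hypothetical.

```python
from typing import Iterable, List, Tuple


def limit_by_total_size(files: Iterable[Tuple[str, int]], total_size: int) -> List[str]:
    """Pick files in order until adding the next one would exceed total_size bytes.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    selected: List[str] = []
    used = 0
    for path, size in files:
        if used + size > total_size:
            # Stop at the first file that does not fit in the remaining budget.
            break
        selected.append(path)
        used += size
    return selected
```

A byte budget like this bounds the work of a single rewrite regardless of how the files are distributed across partitions, which is the situation the PR description says a filter cannot always handle.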

[GitHub] [iceberg] dmgcodevil commented on issue #5675: Limit the number of files for rewrite/compaction action

2022-11-10 Thread GitBox
dmgcodevil commented on issue #5675: URL: https://github.com/apache/iceberg/issues/5675#issuecomment-1311065885 PR: https://github.com/apache/iceberg/pull/6174

[GitHub] [iceberg] dmgcodevil closed issue #3025: Schema field ids overridden when using nested structs

2022-11-10 Thread GitBox
dmgcodevil closed issue #3025: Schema field ids overridden when using nested structs URL: https://github.com/apache/iceberg/issues/3025

[GitHub] [iceberg] dmgcodevil closed issue #3168: How to sort Spark DataFrame ?

2022-11-10 Thread GitBox
dmgcodevil closed issue #3168: How to sort Spark DataFrame ? URL: https://github.com/apache/iceberg/issues/3168

[GitHub] [iceberg] github-actions[bot] closed issue #4634: Backport updated method signature of IcebergTableSource::getScanRuntimeProvider and IcebergTableSink::getSinkRuntimeProvider from Flink 1.15

2022-11-10 Thread GitBox
github-actions[bot] closed issue #4634: Backport updated method signature of IcebergTableSource::getScanRuntimeProvider and IcebergTableSink::getSinkRuntimeProvider from Flink 1.15 to Flink 1.14 URL: https://github.com/apache/iceberg/issues/4634

[GitHub] [iceberg] github-actions[bot] commented on issue #4631: Python: PartitionSpec Construction

2022-11-10 Thread GitBox
github-actions[bot] commented on issue #4631: URL: https://github.com/apache/iceberg/issues/4631#issuecomment-1311066654 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] github-actions[bot] closed issue #4621: [Feature Request] Iceberg integrates with Pulsar, supports java to read iceberg tables sequentially

2022-11-10 Thread GitBox
github-actions[bot] closed issue #4621: [Feature Request] Iceberg integrates with Pulsar, supports java to read iceberg tables sequentially URL: https://github.com/apache/iceberg/issues/4621

[GitHub] [iceberg] github-actions[bot] commented on issue #4762: API: The return value of method is always null in IndexById

2022-11-10 Thread GitBox
github-actions[bot] commented on issue #4762: URL: https://github.com/apache/iceberg/issues/4762#issuecomment-1311066602 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] github-actions[bot] commented on issue #4634: Backport updated method signature of IcebergTableSource::getScanRuntimeProvider and IcebergTableSink::getSinkRuntimeProvider from Flin

2022-11-10 Thread GitBox
github-actions[bot] commented on issue #4634: URL: https://github.com/apache/iceberg/issues/4634#issuecomment-1311066629 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] github-actions[bot] commented on issue #4749: Add sequenceNumber method for ContentFile interface

2022-11-10 Thread GitBox
github-actions[bot] commented on issue #4749: URL: https://github.com/apache/iceberg/issues/4749#issuecomment-1311066615 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] github-actions[bot] closed issue #4631: Python: PartitionSpec Construction

2022-11-10 Thread GitBox
github-actions[bot] closed issue #4631: Python: PartitionSpec Construction URL: https://github.com/apache/iceberg/issues/4631

[GitHub] [iceberg] github-actions[bot] commented on issue #4621: [Feature Request] Iceberg integrates with Pulsar, supports java to read iceberg tables sequentially

2022-11-10 Thread GitBox
github-actions[bot] commented on issue #4621: URL: https://github.com/apache/iceberg/issues/4621#issuecomment-1311066680 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-10 Thread GitBox
aokolnychyi commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1019733840 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -195,41 +198,68 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { }

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-10 Thread GitBox
aokolnychyi commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1019734099 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -195,41 +198,68 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { }

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-10 Thread GitBox
aokolnychyi commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1019734173 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -195,41 +198,68 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { }

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-10 Thread GitBox
aokolnychyi commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1019734281 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -298,4 +324,33 @@ private static boolean compatibleTransforms(Transform t1, Transform ||

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-10 Thread GitBox
aokolnychyi commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1019735287 ## core/src/test/java/org/apache/iceberg/TestPartitioning.java: ## @@ -43,6 +43,12 @@ public class TestPartitioning { required(1, "id", Types.IntegerType

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-10 Thread GitBox
aokolnychyi commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1019736124 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -195,41 +198,75 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { }

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-10 Thread GitBox
aokolnychyi commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1019738831 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -195,41 +198,75 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { }

[GitHub] [iceberg] aokolnychyi commented on issue #6162: Respect fileSequenceNumber in RewriteManifestsSparkAction

2022-11-10 Thread GitBox
aokolnychyi commented on issue #6162: URL: https://github.com/apache/iceberg/issues/6162#issuecomment-1311095921 I am looking into this.

[GitHub] [iceberg] lvyanquan commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

2022-11-10 Thread GitBox
lvyanquan commented on code in PR #6111: URL: https://github.com/apache/iceberg/pull/6111#discussion_r1019761610 ## flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java: ## @@ -145,8 +145,27 @@ protected Catalog createCatalog( baseNamespace =

[GitHub] [iceberg] manuzhang commented on pull request #5392: Spark: Fix a separate table cache being created for each rewriteFiles

2022-11-10 Thread GitBox
manuzhang commented on PR #5392: URL: https://github.com/apache/iceberg/pull/5392#issuecomment-1311199510 @RussellSpitzer @rdblue @ajantha-bhat please take another look. Thanks.

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #2276: Core: Add a util method to combine tasks by partition

2022-11-10 Thread GitBox
aokolnychyi commented on code in PR #2276: URL: https://github.com/apache/iceberg/pull/2276#discussion_r1019834880 ## core/src/main/java/org/apache/iceberg/util/TableScanUtil.java: ## @@ -128,6 +136,61 @@ public static CloseableIterable> planTaskG combinedTasks -> new

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #2276: Core: Add a util method to combine tasks by partition

2022-11-10 Thread GitBox
aokolnychyi commented on code in PR #2276: URL: https://github.com/apache/iceberg/pull/2276#discussion_r1019833535 ## api/src/main/java/org/apache/iceberg/util/StructProjection.java: ## @@ -171,6 +178,11 @@ public StructProjection wrap(StructLike newStruct) { return this;

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #2276: Core: Add a util method to combine tasks by partition

2022-11-10 Thread GitBox
aokolnychyi commented on code in PR #2276: URL: https://github.com/apache/iceberg/pull/2276#discussion_r1019836554 ## api/src/main/java/org/apache/iceberg/util/StructProjection.java: ## @@ -90,6 +90,13 @@ public static StructProjection createAllowMissing( private final Struct

[GitHub] [iceberg] lirui-apache opened a new pull request, #6175: Hive: Add UGI to the key in CachedClientPool

2022-11-10 Thread GitBox
lirui-apache opened a new pull request, #6175: URL: https://github.com/apache/iceberg/pull/6175 This addresses the issue #6071 by adding current UGI to the key of `CachedClientPool.clientPoolCache`.
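
The effect of adding the current user (UGI, Hadoop's `UserGroupInformation`) to a cache key can be illustrated with a small sketch. Everything below is hypothetical and written in Python for brevity: the class name, the `(uri, user)` key shape, and the factory callback are assumptions, not the Java code in `CachedClientPool`.

```python
from typing import Callable, Dict, Tuple


class ClientPoolCache:
    """Hypothetical cache keyed by (metastore URI, user) rather than URI alone."""

    def __init__(self, factory: Callable[[str, str], object]) -> None:
        self._factory = factory
        self._pools: Dict[Tuple[str, str], object] = {}

    def get(self, uri: str, user: str) -> object:
        # Including the user in the key keeps one user's pool from being
        # handed to another user who targets the same metastore URI.
        key = (uri, user)
        if key not in self._pools:
            self._pools[key] = self._factory(uri, user)
        return self._pools[key]
```

With a URI-only key, the second caller would receive whatever pool the first caller created; with the composite key, each user gets a separate entry.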

[GitHub] [iceberg] V-yg commented on issue #4065: Class conflict in ORC benchmark

2022-11-10 Thread GitBox
V-yg commented on issue #4065: URL: https://github.com/apache/iceberg/issues/4065#issuecomment-1311312103 @zhongyujiang Did you solve that problem? I had the same problem

[GitHub] [iceberg] nastra commented on a diff in pull request #6175: Hive: Add UGI to the key in CachedClientPool

2022-11-10 Thread GitBox
nastra commented on code in PR #6175: URL: https://github.com/apache/iceberg/pull/6175#discussion_r1019948958 ## hive-metastore/src/main/java/org/apache/iceberg/hive/CachedClientPool.java: ## @@ -87,4 +92,50 @@ public R run(Action action, boolean retry) throws TExceptio

[GitHub] [iceberg] nastra commented on issue #6172: rewriteDataFiles throws exception in spark 3.2

2022-11-10 Thread GitBox
nastra commented on issue #6172: URL: https://github.com/apache/iceberg/issues/6172#issuecomment-1311340587 The error message mentions `default_iceberg`/`ice` so you'd have to make sure that this catalog is properly set up: `spark.sql.catalog.(catalog_name): ...`. Additional details can be
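
As a rough illustration of the `spark.sql.catalog.(catalog_name)` setup mentioned above, the sketch below builds the Spark properties for a Hive-backed Iceberg catalog and reports which catalog names have no entry in a given configuration. The helper names and the metastore URI are hypothetical; the property keys follow the pattern documented for Iceberg's Spark integration, but check them against the documentation for the Iceberg version in use.

```python
from typing import Dict, List


def iceberg_catalog_conf(name: str, metastore_uri: str) -> Dict[str, str]:
    """Build Spark properties for a Hive-backed Iceberg catalog (illustrative helper)."""
    prefix = f"spark.sql.catalog.{name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "hive",
        f"{prefix}.uri": metastore_uri,
    }


def missing_catalogs(conf: Dict[str, str], names: List[str]) -> List[str]:
    """Return the catalog names that have no spark.sql.catalog.<name> entry."""
    return [n for n in names if f"spark.sql.catalog.{n}" not in conf]
```

Running `missing_catalogs` over the effective Spark configuration for the names in the error message (`default_iceberg`, `ice`) is one quick way to see whether a `CatalogNotFound`-style failure is a configuration gap.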

[GitHub] [iceberg-docs] hililiwei commented on pull request #175: Docs: Update spark-3.0 removal

2022-11-11 Thread GitBox
hililiwei commented on PR #175: URL: https://github.com/apache/iceberg-docs/pull/175#issuecomment-1311366768 @ajantha-bhat Copy that, thx.

[GitHub] [iceberg] ConeyLiu commented on issue #4626: Get null values for for the nested field partition column

2022-11-11 Thread GitBox
ConeyLiu commented on issue #4626: URL: https://github.com/apache/iceberg/issues/4626#issuecomment-1311375929 Hi @kbendick @szehon-ho @nastra, hope you guys could help take a look at this issue again, thanks a lot.

[GitHub] [iceberg] XAZAD commented on issue #6172: rewriteDataFiles throws exception in spark 3.2

2022-11-11 Thread GitBox
XAZAD commented on issue #6172: URL: https://github.com/apache/iceberg/issues/6172#issuecomment-1311380547 As I mentioned before, catalogs are configured and working properly `{ "hive.metastore.uris": "thrift://*", "iceberg.engine.hive.enabled": "true", "spark.sql.extensions":

[GitHub] [iceberg] zhongyujiang commented on issue #4065: Class conflict in ORC benchmark

2022-11-11 Thread GitBox
zhongyujiang commented on issue #4065: URL: https://github.com/apache/iceberg/issues/4065#issuecomment-1311445140 @V-yg No, I haven't.

[GitHub] [iceberg] XAZAD commented on issue #6172: rewriteDataFiles throws exception in spark 3.2

2022-11-11 Thread GitBox
XAZAD commented on issue #6172: URL: https://github.com/apache/iceberg/issues/6172#issuecomment-1311676676 some more details: https://user-images.githubusercontent.com/13484463/201346541-cf319f21-dd02-4781-956b-7ba98e64b599.png full stack trace: ``` java.lang.RuntimeException:

[GitHub] [iceberg] nastra commented on issue #6172: rewriteDataFiles throws exception in spark 3.2

2022-11-11 Thread GitBox
nastra commented on issue #6172: URL: https://github.com/apache/iceberg/issues/6172#issuecomment-1311769354 Could you maybe check whether the issue exists on the latest Iceberg version (1.0.0)?

[GitHub] [iceberg] XAZAD commented on issue #6172: rewriteDataFiles throws exception in spark 3.2

2022-11-11 Thread GitBox
XAZAD commented on issue #6172: URL: https://github.com/apache/iceberg/issues/6172#issuecomment-1311770953 I'll try, but it will take some time because the infrastructure I'm using is huge and it takes time to deploy a new version of the library, even in dev.

[GitHub] [iceberg] renshangtao commented on pull request #5544: Encryption integration and test

2022-11-11 Thread GitBox
renshangtao commented on PR #5544: URL: https://github.com/apache/iceberg/pull/5544#issuecomment-1311808301 @ggershinsky Hello, excuse me. I tested the code, and the file wasn't encrypted; test case execution failed. All the testxxxWithoutKeys() tests failed. Is my configuration incorrect

[GitHub] [iceberg] krvikash commented on pull request #6174: iss5675: limit total size of data files for compaction

2022-11-11 Thread GitBox
krvikash commented on PR #6174: URL: https://github.com/apache/iceberg/pull/6174#issuecomment-1311835852 nit: I think all commits can be squashed into one single commit.

[GitHub] [iceberg] danielcweeks commented on a diff in pull request #6058: Core,Spark: Add metadata to Scan Report

2022-11-11 Thread GitBox
danielcweeks commented on code in PR #6058: URL: https://github.com/apache/iceberg/pull/6058#discussion_r1020398769 ## core/src/main/java/org/apache/iceberg/BaseTableScan.java: ## @@ -141,6 +142,8 @@ public CloseableIterable planFiles() { doPlanFiles(), ()

[GitHub] [iceberg] danielcweeks merged pull request #6058: Core,Spark: Add metadata to Scan Report

2022-11-11 Thread GitBox
danielcweeks merged PR #6058: URL: https://github.com/apache/iceberg/pull/6058

[GitHub] [iceberg] aokolnychyi commented on issue #6162: Respect fileSequenceNumber in RewriteManifestsSparkAction

2022-11-11 Thread GitBox
aokolnychyi commented on issue #6162: URL: https://github.com/apache/iceberg/issues/6162#issuecomment-1311964778 PR #6176 for Spark 3.3 is out.

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6176: Spark 3.3: Preserve file seq numbers while rewriting manifests

2022-11-11 Thread GitBox
aokolnychyi commented on code in PR #6176: URL: https://github.com/apache/iceberg/pull/6176#discussion_r1020417580 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -374,8 +378,9 @@ private static ManifestFile writeManifes

[GitHub] [iceberg] ddrinka opened a new issue, #6177: Allow setting the Parquet format version for datafile writes

2022-11-11 Thread GitBox
ddrinka opened a new issue, #6177: URL: https://github.com/apache/iceberg/issues/6177 ### Feature Request / Improvement The Parquet support in Iceberg currently exposes the `writerVersion` in `Parquet.WriteBuilder`, but this configuration should be extended out to a table property.

[GitHub] [iceberg] ddrinka commented on issue #6177: Allow setting the Parquet format version for datafile writes

2022-11-11 Thread GitBox
ddrinka commented on issue #6177: URL: https://github.com/apache/iceberg/issues/6177#issuecomment-1312058900 There's some relevant conversation here: https://github.com/apache/iceberg/pull/2551

[GitHub] [iceberg] LuigiCerone commented on a diff in pull request #6159: Python: Update mypy version

2022-11-11 Thread GitBox
LuigiCerone commented on code in PR #6159: URL: https://github.com/apache/iceberg/pull/6159#discussion_r1020505411 ## python/pyproject.toml: ## @@ -107,7 +107,7 @@ force_grid_wrap = 4 all = true [tool.mypy] -no_implicit_optional = true Review Comment: @Fokko Yep, you're

[GitHub] [iceberg] flyrain commented on a diff in pull request #6012: Spark 3.3: Add a procedure to generate table changes

2022-11-11 Thread GitBox
flyrain commented on code in PR #6012: URL: https://github.com/apache/iceberg/pull/6012#discussion_r1020522314 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/GenerateChangesProcedure.java: ## @@ -0,0 +1,271 @@ +/* + * Licensed to the Apache Software Founda

[GitHub] [iceberg] LuigiCerone commented on a diff in pull request #6159: Python: Update mypy version

2022-11-11 Thread GitBox
LuigiCerone commented on code in PR #6159: URL: https://github.com/apache/iceberg/pull/6159#discussion_r1020523338 ## python/tests/avro/test_decoder.py: ## @@ -106,9 +107,11 @@ def read(self, size: int = 0) -> bytes: self.pos += 1 return int.to_bytes(1, self.po

[GitHub] [iceberg] Fokko merged pull request #6173: Fix typo in unused python iceberg paramter

2022-11-11 Thread GitBox
Fokko merged PR #6173: URL: https://github.com/apache/iceberg/pull/6173

[GitHub] [iceberg] Fokko commented on a diff in pull request #6159: Python: Update mypy version

2022-11-11 Thread GitBox
Fokko commented on code in PR #6159: URL: https://github.com/apache/iceberg/pull/6159#discussion_r1020553202 ## python/tests/avro/test_decoder.py: ## @@ -106,9 +107,11 @@ def read(self, size: int = 0) -> bytes: self.pos += 1 return int.to_bytes(1, self.pos, byt

[GitHub] [iceberg-docs] willshen opened a new pull request, #176: Fix broken Spark 2.4 and 3.0 runtime jar link on Releases page

2022-11-11 Thread GitBox
willshen opened a new pull request, #176: URL: https://github.com/apache/iceberg-docs/pull/176 Fix broken Spark 2.4 and 3.0 runtime jar link on Releases page to keep up with the changes in the module name/path (https://github.com/apache/iceberg/pull/4158). This is related to #174 but

[GitHub] [iceberg] haizhou-zhao commented on a diff in pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-11 Thread GitBox
haizhou-zhao commented on code in PR #6045: URL: https://github.com/apache/iceberg/pull/6045#discussion_r1020581546 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -365,6 +374,13 @@ public boolean dropNamespace(Namespace namespace) { @Overrid

[GitHub] [iceberg] haizhou-zhao commented on pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-11 Thread GitBox
haizhou-zhao commented on PR #6045: URL: https://github.com/apache/iceberg/pull/6045#issuecomment-131748 @gaborkaszab Thanks for the last round of review. I have some different opinions on whether the preconditions for createNamespace and setProperty are the same or different. Feel free

[GitHub] [iceberg] haizhou-zhao commented on a diff in pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-11 Thread GitBox
haizhou-zhao commented on code in PR #6045: URL: https://github.com/apache/iceberg/pull/6045#discussion_r1020582495 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -426,6 +510,194 @@ public void testSetNamespaceProperties() throws TException

[GitHub] [iceberg] flyrain commented on a diff in pull request #6012: Spark 3.3: Add a procedure to generate table changes

2022-11-11 Thread GitBox
flyrain commented on code in PR #6012: URL: https://github.com/apache/iceberg/pull/6012#discussion_r1020588386 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/GenerateChangesProcedure.java: ## @@ -0,0 +1,271 @@ +/* + * Licensed to the Apache Software Founda

[GitHub] [iceberg] github-actions[bot] closed issue #3726: Slove the problem that failed to remove the data files when using HiveCatalog.dropTable

2022-11-11 Thread GitBox
github-actions[bot] closed issue #3726: Slove the problem that failed to remove the data files when using HiveCatalog.dropTable URL: https://github.com/apache/iceberg/issues/3726 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

[GitHub] [iceberg] github-actions[bot] commented on issue #3710: CVE-2021-44228 - Log4j Remote Execution

2022-11-11 Thread GitBox
github-actions[bot] commented on issue #3710: URL: https://github.com/apache/iceberg/issues/3710#issuecomment-1312279301 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] github-actions[bot] commented on issue #3726: Slove the problem that failed to remove the data files when using HiveCatalog.dropTable

2022-11-11 Thread GitBox
github-actions[bot] commented on issue #3726: URL: https://github.com/apache/iceberg/issues/3726#issuecomment-1312279284 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

[GitHub] [iceberg] github-actions[bot] commented on issue #3716: flink iceberg source reading array types fail with Cast Exception

2022-11-11 Thread GitBox
github-actions[bot] commented on issue #3716: URL: https://github.com/apache/iceberg/issues/3716#issuecomment-1312279293 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] sunchao commented on a diff in pull request #2276: Core: Add a util method to combine tasks by partition

2022-11-11 Thread GitBox
sunchao commented on code in PR #2276: URL: https://github.com/apache/iceberg/pull/2276#discussion_r1020624495 ## core/src/main/java/org/apache/iceberg/util/TableScanUtil.java: ## @@ -128,6 +136,61 @@ public static CloseableIterable> planTaskG combinedTasks -> new Bas

[GitHub] [iceberg] LuigiCerone commented on a diff in pull request #6159: Python: Update mypy version

2022-11-12 Thread GitBox
LuigiCerone commented on code in PR #6159: URL: https://github.com/apache/iceberg/pull/6159#discussion_r1020777087 ## python/tests/avro/test_decoder.py: ## @@ -106,9 +107,11 @@ def read(self, size: int = 0) -> bytes: self.pos += 1 return int.to_bytes(1, self.po

[GitHub] [iceberg] krvikash opened a new pull request, #6178: Core: Remove redundant initialization

2022-11-12 Thread GitBox
krvikash opened a new pull request, #6178: URL: https://github.com/apache/iceberg/pull/6178 Core: Remove redundant initialization -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

[GitHub] [iceberg] dmgcodevil commented on pull request #6174: iss5675: limit total size of data files for compaction

2022-11-12 Thread GitBox
dmgcodevil commented on PR #6174: URL: https://github.com/apache/iceberg/pull/6174#issuecomment-1312545162 I've noticed one thing: `isPartialFileScan(task)` check is redundant b/c `task.files().size() > 1` is always true. see `filteredGroupedTasks`. -- This is an automated message from t

[GitHub] [iceberg] netanelm-upstream commented on issue #4065: Class conflict in ORC benchmark

2022-11-12 Thread GitBox
netanelm-upstream commented on issue #4065: URL: https://github.com/apache/iceberg/issues/4065#issuecomment-1312548043 I have the same issue trying to write Iceberg ORC files using the JAVA API with hive metastore as a catalog. Somehow, it uploads this class: org.apache.orc.storage.ql.ex

[GitHub] [iceberg] github-actions[bot] commented on issue #4739: Apache Iceberg with S3 storage: creating a table whose storage location is S3, Java code fails with "Unable to load region"

2022-11-12 Thread GitBox
github-actions[bot] commented on issue #4739: URL: https://github.com/apache/iceberg/issues/4739#issuecomment-1312601765 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] github-actions[bot] commented on issue #3709: Partition Metadata table breaks with a partition column named "partition"

2022-11-12 Thread GitBox
github-actions[bot] commented on issue #3709: URL: https://github.com/apache/iceberg/issues/3709#issuecomment-1312601781 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] JonasJ-ap opened a new pull request, #6179: AWS: Re-tag files when renaming tables in GlueCatalog

2022-11-12 Thread GitBox
JonasJ-ap opened a new pull request, #6179: URL: https://github.com/apache/iceberg/pull/6179 Follows PR #4402 . As mentioned in https://github.com/apache/iceberg/pull/4402#issuecomment-1261096282: In `GlueCatalog`, if `s3.write.table-name-tag-enabled` and `s3.write.namespace-name-tag

[GitHub] [iceberg] ggershinsky commented on a diff in pull request #3471: Core: Envelope encryption

2022-11-12 Thread GitBox
ggershinsky commented on code in PR #3471: URL: https://github.com/apache/iceberg/pull/3471#discussion_r1020852734 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -349,4 +350,27 @@ private TableProperties() {} public static final String UPSERT_ENABLED =

[GitHub] [iceberg] get1boat opened a new issue, #6180: can't upsert v2 table by add hint /*+ OPTIONS('upsert-enabled'='true') */

2022-11-13 Thread GitBox
get1boat opened a new issue, #6180: URL: https://github.com/apache/iceberg/issues/6180 ### Apache Iceberg version 1.0.0 (latest release) ### Query engine Flink ### Please describe the bug 🐞 table DDL: ` CREATE TABLE hive_catalog.ods_ice.tmp_lkj_kafka2i

[GitHub] [iceberg] krvikash commented on issue #6180: can't upsert v2 table by add hint /*+ OPTIONS('upsert-enabled'='true') */

2022-11-13 Thread GitBox
krvikash commented on issue #6180: URL: https://github.com/apache/iceberg/issues/6180#issuecomment-1312779669 Hi @get1boat, You are missing the primary key while creating the table. > Enabling UPSERT mode using upsert-enabled in the [write options](#Write options) provides mor

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

2022-11-13 Thread GitBox
stevenzwu commented on code in PR #6111: URL: https://github.com/apache/iceberg/pull/6111#discussion_r1020938017 ## flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java: ## @@ -145,8 +145,27 @@ protected Catalog createCatalog( baseNamespace =

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #5984: Core, API: Support incremental scanning with branch

2022-11-13 Thread GitBox
stevenzwu commented on code in PR #5984: URL: https://github.com/apache/iceberg/pull/5984#discussion_r1020940001 ## api/src/main/java/org/apache/iceberg/IncrementalScan.java: ## @@ -21,6 +21,23 @@ /** API for configuring an incremental scan. */ public interface IncrementalScan

[GitHub] [iceberg] Samrose-Ahmed commented on issue #5997: Iceberg table maintenance/compaction within AWS

2022-11-13 Thread GitBox
Samrose-Ahmed commented on issue #5997: URL: https://github.com/apache/iceberg/issues/5997#issuecomment-1312796748 I would recommend running a Spark job. An AWS Glue job is the easiest to get started but considering you're running this once, it'll likely be cheaper to run on EMR (serverless

[GitHub] [iceberg] Fokko commented on a diff in pull request #6159: Python: Update mypy version

2022-11-13 Thread GitBox
Fokko commented on code in PR #6159: URL: https://github.com/apache/iceberg/pull/6159#discussion_r1020961137 ## python/pyiceberg/catalog/__init__.py: ## @@ -120,16 +120,16 @@ def load_catalog(name: str, **properties: Optional[str]) -> Catalog: or if it could not de

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-13 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1020961770 ## python/pyiceberg/expressions/__init__.py: ## @@ -90,16 +110,25 @@ def eval(self, struct: StructProtocol) -> T: """ return self.accessor.get(struct)

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-13 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1020961425 ## python/pyiceberg/expressions/__init__.py: ## @@ -48,12 +64,13 @@ class Bound(ABC): """Represents a bound value expression""" -class Unbound(Generic[B], ABC):

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-13 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1020962953 ## python/pyiceberg/expressions/literals.py: ## @@ -108,7 +110,7 @@ def __ge__(self, other): @singledispatch -def literal(value) -> Literal: +def literal(value: Any)

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-13 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1020969932 ## python/tests/expressions/test_expressions.py: ## @@ -365,7 +344,7 @@ def test_bound_greater_than_or_equal_invert(table_schema_simple: Schema): def test_bound_gre

[GitHub] [iceberg] github-actions[bot] commented on issue #4735: [HadoopFileIO] Empty table directory left after being dropped

2022-11-13 Thread GitBox
github-actions[bot] commented on issue #4735: URL: https://github.com/apache/iceberg/issues/4735#issuecomment-1312867245 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.
