[GitHub] [iceberg] ConeyLiu commented on pull request #4577: Fixes read metadata table failed due to illegal character

2022-11-08 Thread GitBox
ConeyLiu commented on PR #4577: URL: https://github.com/apache/iceberg/pull/4577#issuecomment-1308090077 Thanks @szehon-ho @nastra @chenjunjiedada @RussellSpitzer for your time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [iceberg] ConeyLiu closed issue #4576: Read metadata table failed due to illegal character

2022-11-08 Thread GitBox
ConeyLiu closed issue #4576: Read metadata table failed due to illegal character URL: https://github.com/apache/iceberg/issues/4576 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

2022-11-08 Thread GitBox
stevenzwu commented on code in PR #6111: URL: https://github.com/apache/iceberg/pull/6111#discussion_r1017328011 ## flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java: ## @@ -145,8 +145,27 @@ protected Catalog createCatalog( baseNamespace =

[GitHub] [iceberg] sunchao commented on a diff in pull request #2276: Core: Add option to combine tasks by partition

2022-11-08 Thread GitBox
sunchao commented on code in PR #2276: URL: https://github.com/apache/iceberg/pull/2276#discussion_r1017341461 ## core/src/test/java/org/apache/iceberg/util/TestTableScanUtil.java: ## @@ -136,6 +141,129 @@ public void testTaskMerging() { Assert.assertEquals("Appropriate tas

[GitHub] [iceberg] sunchao commented on a diff in pull request #2276: Core: Add option to combine tasks by partition

2022-11-08 Thread GitBox
sunchao commented on code in PR #2276: URL: https://github.com/apache/iceberg/pull/2276#discussion_r1017342018 ## core/src/main/java/org/apache/iceberg/util/TableScanUtil.java: ## @@ -128,6 +137,66 @@ public static CloseableIterable> planTaskG combinedTasks -> new Bas

[GitHub] [iceberg-docs] ajantha-bhat opened a new pull request, #175: Docs: Update spark-3.0 removal

2022-11-08 Thread GitBox
ajantha-bhat opened a new pull request, #175: URL: https://github.com/apache/iceberg-docs/pull/175 Follow up of https://github.com/apache/iceberg/pull/6093 for docs repo. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [iceberg-docs] ajantha-bhat commented on pull request #175: Docs: Update spark-3.0 removal

2022-11-08 Thread GitBox
ajantha-bhat commented on PR #175: URL: https://github.com/apache/iceberg-docs/pull/175#issuecomment-1308117214 cc: @hililiwei, we might have to do the same for flink 1.13 removal and 1.16 addition. -- This is an automated message from the Apache Git Service. To respond to the message, p

[GitHub] [iceberg] lirui-apache commented on issue #6071: Should ClientPool consider UGI when reusing a connection?

2022-11-08 Thread GitBox
lirui-apache commented on issue #6071: URL: https://github.com/apache/iceberg/issues/6071#issuecomment-1308151509 OK, I'll work on a PR for this -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

[GitHub] [iceberg] amogh-jahagirdar opened a new pull request, #6151: Docs: Update table snapshot retention property descriptions

2022-11-08 Thread GitBox
amogh-jahagirdar opened a new pull request, #6151: URL: https://github.com/apache/iceberg/pull/6151 This PR updates the table snapshot retention property descriptions so that they explicitly mention they control the min snapshots to keep and max age of snapshots on table's main branch. -

[GitHub] [iceberg] amogh-jahagirdar closed pull request #6151: Docs: Update table snapshot retention property descriptions

2022-11-08 Thread GitBox
amogh-jahagirdar closed pull request #6151: Docs: Update table snapshot retention property descriptions URL: https://github.com/apache/iceberg/pull/6151 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [iceberg] amogh-jahagirdar commented on pull request #6151: Docs: Update table snapshot retention property descriptions

2022-11-08 Thread GitBox
amogh-jahagirdar commented on PR #6151: URL: https://github.com/apache/iceberg/pull/6151#issuecomment-1308203757 closing this to avoid confusion, and will raise another one. This change should basically be the opposite. This configuration gets used by default for any branch not just main.

[GitHub] [iceberg] amogh-jahagirdar opened a new pull request, #6152: Docs: Update table snapshot retention property descriptions

2022-11-08 Thread GitBox
amogh-jahagirdar opened a new pull request, #6152: URL: https://github.com/apache/iceberg/pull/6152 Docs: Update table snapshot retention property descriptions to explicitly mention that it is a default for all the table's branches. -- This is an automated message from the Apache Git Serv

[GitHub] [iceberg] zhangpengbigdata opened a new issue, #6153: I found duplicate records when i was repeatedly exporting records from CDC Stream into iceberg partitioned table

2022-11-08 Thread GitBox
zhangpengbigdata opened a new issue, #6153: URL: https://github.com/apache/iceberg/issues/6153 ### Query engine Iceberg 1.0.0 Flink1.13 ### Question Hi all, I found duplicate records when i was repeatedly exporting records from CDC Stream into iceberg partitioned tab

[GitHub] [iceberg] gaborkaszab opened a new pull request, #6154: Core: Rename HMS_TABLE_OWNER to follow naming convention

2022-11-09 Thread GitBox
gaborkaszab opened a new pull request, #6154: URL: https://github.com/apache/iceberg/pull/6154 I introduced this property in #5763, however, I learned since that its value doesn't follow the naming conventions for a table property. Fortunately the change hasn't been released yet so we can s

[GitHub] [iceberg] nastra commented on a diff in pull request #6146: Build: Enable revapi on core/parquet/orc/common/data modules

2022-11-09 Thread GitBox
nastra commented on code in PR #6146: URL: https://github.com/apache/iceberg/pull/6146#discussion_r1017593969 ## core/src/main/java/org/apache/iceberg/BaseReplacePartitions.java: ## @@ -79,6 +79,20 @@ public ReplacePartitions validateNoConflictingData() { return this; }

[GitHub] [iceberg] gaborkaszab commented on a diff in pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-09 Thread GitBox
gaborkaszab commented on code in PR #6045: URL: https://github.com/apache/iceberg/pull/6045#discussion_r1017594660 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -360,5 +360,7 @@ private TableProperties() {} public static final String UPSERT_ENABLED = "w

[GitHub] [iceberg] nastra opened a new issue, #6155: Remove API deprecations for 1.2.0

2022-11-09 Thread GitBox
nastra opened a new issue, #6155: URL: https://github.com/apache/iceberg/issues/6155 ### Feature Request / Improvement iceberg-api / iceberg-core have a few deprecated methods that we should remove before releasing 1.2.0 ### Query engine _No response_ -- This is an au

[GitHub] [iceberg] nastra commented on pull request #6146: Build: Enable revapi on core/parquet/orc/common/data modules

2022-11-09 Thread GitBox
nastra commented on PR #6146: URL: https://github.com/apache/iceberg/pull/6146#issuecomment-1308384970 I've also created https://github.com/apache/iceberg/issues/6155 to remove all of those deprecated methods once 1.1.0 has been released -- This is an automated message from the Apache Git

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-09 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1017626877 ## python/tests/expressions/test_expressions.py: ## @@ -272,6 +235,22 @@ def test_in_empty(): assert In(Reference("foo"), ()) == AlwaysFalse() +def test_in_set()

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5376: Core: Add readable metrics columns to files metadata tables

2022-11-09 Thread GitBox
szehon-ho commented on code in PR #5376: URL: https://github.com/apache/iceberg/pull/5376#discussion_r1017644858 ## api/src/main/java/org/apache/iceberg/DataFile.java: ## @@ -99,10 +99,24 @@ public interface DataFile extends ContentFile { optional(140, "sort_order_id", In

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-09 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1017645513 ## python/tests/expressions/test_expressions.py: ## @@ -272,6 +235,22 @@ def test_in_empty(): assert In(Reference("foo"), ()) == AlwaysFalse() +def test_in_set()

[GitHub] [iceberg] szehon-ho commented on pull request #5376: Core: Add readable metrics columns to files metadata tables

2022-11-09 Thread GitBox
szehon-ho commented on PR #5376: URL: https://github.com/apache/iceberg/pull/5376#issuecomment-1308426921 Update: chatted offline with @RussellSpitzer; will spend a few days to see if it's possible to make the type a dynamic struct instead of a static map, to get the right types for lower and upper bounds.

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-09 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1017649986 ## python/tests/expressions/test_expressions.py: ## @@ -281,15 +260,15 @@ def test_not_in_equal(): def test_bind_in(table_schema_simple: Schema): -bound = BoundI

[GitHub] [iceberg] ethan7811 commented on issue #5963: Aliyun-OssFileIO: Premature end of Content-Length delimited message body

2022-11-09 Thread GitBox
ethan7811 commented on issue #5963: URL: https://github.com/apache/iceberg/issues/5963#issuecomment-1308429585 maybe the root cause is similar to this ... https://github.com/apache/hadoop/pull/2692/files -- This is an automated message from the Apache Git Service. To respond to the mess

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-09 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1017652341 ## python/tests/expressions/test_visitors.py: ## @@ -284,13 +277,14 @@ def test_boolean_expression_visit_raise_not_implemented_error(): def test_bind_visitor_alread

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-09 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1017654076 ## python/tests/expressions/test_expressions.py: ## @@ -365,7 +344,7 @@ def test_bound_greater_than_or_equal_invert(table_schema_simple: Schema): def test_bound_gre

[GitHub] [iceberg] gaborkaszab commented on a diff in pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-09 Thread GitBox
gaborkaszab commented on code in PR #6045: URL: https://github.com/apache/iceberg/pull/6045#discussion_r1017654923 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -518,11 +522,36 @@ private Map convertToMetadata(Database database) { if (data

[GitHub] [iceberg] Fokko opened a new issue, #6156: Python: Fix the explicit annotations of Reference

2022-11-09 Thread GitBox
Fokko opened a new issue, #6156: URL: https://github.com/apache/iceberg/issues/6156 ### Feature Request / Improvement https://github.com/apache/iceberg/pull/6139#issuecomment-1307812458 I did notice some funky behavior with the type system. When using a `Reference(UnboundTerm[T

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-09 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1017656095 ## python/tests/expressions/test_expressions.py: ## @@ -365,7 +344,7 @@ def test_bound_greater_than_or_equal_invert(table_schema_simple: Schema): def test_bound_gre

[GitHub] [iceberg] lvyanquan opened a new pull request, #6157: Docs: Update spotless apply command

2022-11-09 Thread GitBox
lvyanquan opened a new pull request, #6157: URL: https://github.com/apache/iceberg/pull/6157 add Flink 1.16 and remove Spark 3.0 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

[GitHub] [iceberg] luoyuxia commented on issue #6153: I found duplicate records when i was repeatedly exporting records from CDC Stream into iceberg partitioned table

2022-11-09 Thread GitBox
luoyuxia commented on issue #6153: URL: https://github.com/apache/iceberg/issues/6153#issuecomment-1308443668 Maybe the reason is that the key in the source is `id`, but the primary key in the sink is `dt,id`. The primary keys for source and sink aren't equal. Is the following case possible? ``

[GitHub] [iceberg] zhangpengbigdata commented on issue #6153: I found duplicate records when i was repeatedly exporting records from CDC Stream into iceberg partitioned table

2022-11-09 Thread GitBox
zhangpengbigdata commented on issue #6153: URL: https://github.com/apache/iceberg/issues/6153#issuecomment-1308479071 Thank you @luoyuxia for your reply. I just confirmed that the problem is caused by Trino, which I used for the query, because the result is correct when I query with Spark.

[GitHub] [iceberg] Fokko commented on a diff in pull request #6141: Python: Make invalid Literal conversions explicit

2022-11-09 Thread GitBox
Fokko commented on code in PR #6141: URL: https://github.com/apache/iceberg/pull/6141#discussion_r1017702146 ## python/pyiceberg/expressions/literals.py: ## @@ -125,81 +127,71 @@ def literal(value) -> Literal: @literal.register(bool) -def _(value: bool) -> Literal[bool]: +d

[GitHub] [iceberg] nastra commented on issue #5970: Spark: Iceberg: java.io.InvalidClassException: org.apache.iceberg.Schema; local class incompatible: stream classdesc serialVersionUID = 332036701241

2022-11-09 Thread GitBox
nastra commented on issue #5970: URL: https://github.com/apache/iceberg/issues/5970#issuecomment-1308488755 @jornfranke were you able to figure out what caused the issue? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [iceberg] szehon-ho merged pull request #6154: Core: Rename HMS_TABLE_OWNER to follow naming convention

2022-11-09 Thread GitBox
szehon-ho merged PR #6154: URL: https://github.com/apache/iceberg/pull/6154 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

[GitHub] [iceberg] nastra commented on issue #5945: Read Iceberg Table Bug(cannot find field start_date from [org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObj)

2022-11-09 Thread GitBox
nastra commented on issue #5945: URL: https://github.com/apache/iceberg/issues/5945#issuecomment-1308490335 @95liu is this still an issue or can this be closed? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [iceberg] szehon-ho commented on pull request #6154: Core: Rename HMS_TABLE_OWNER to follow naming convention

2022-11-09 Thread GitBox
szehon-ho commented on PR #6154: URL: https://github.com/apache/iceberg/pull/6154#issuecomment-1308490447 Makes sense based on discussion: https://github.com/apache/iceberg/pull/6045#discussion_r1010730010 . Thanks @gaborkaszab and @nastra for review -- This is an automated message from

[GitHub] [iceberg] Fokko commented on issue #6156: Python: Fix the explicit annotations of Reference

2022-11-09 Thread GitBox
Fokko commented on issue #6156: URL: https://github.com/apache/iceberg/issues/6156#issuecomment-1308500288 For the bound ones, I think the key is in the literal method. Turning: ```python @singledispatch def literal(value) -> Literal: ``` Into: ```python @singledispatch
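The `@singledispatch` pattern discussed in this comment can be sketched as below. This is a minimal illustration, not pyiceberg's code and not the (truncated) change proposed in the comment: the `Literal`, `BooleanLiteral`, and `LongLiteral` classes here are simplified, hypothetical stand-ins, and only `functools.singledispatch` itself is a real library call.

```python
from functools import singledispatch
from typing import Generic, TypeVar

T = TypeVar("T")


class Literal(Generic[T]):
    """Hypothetical minimal literal wrapper (assumption, not pyiceberg's class)."""

    def __init__(self, value: T) -> None:
        self.value = value


class BooleanLiteral(Literal[bool]):
    pass


class LongLiteral(Literal[int]):
    pass


@singledispatch
def literal(value) -> Literal:
    # Fallback for value types that have no registered overload.
    raise TypeError(f"Unsupported literal type: {type(value).__name__}")


@literal.register(bool)
def _(value: bool) -> Literal[bool]:
    return BooleanLiteral(value)


@literal.register(int)
def _(value: int) -> Literal[int]:
    return LongLiteral(value)
```

Because `bool` is a subclass of `int`, `singledispatch` picks the more specific `bool` overload for `True`, so `literal(True)` yields a `BooleanLiteral` while `literal(3)` yields a `LongLiteral`.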

[GitHub] [iceberg] luoyuxia commented on issue #6153: I found duplicate records when i was repeatedly exporting records from CDC Stream into iceberg partitioned table

2022-11-09 Thread GitBox
luoyuxia commented on issue #6153: URL: https://github.com/apache/iceberg/issues/6153#issuecomment-1308504155 I guess it's a problem with the trino-iceberg-connector; you can seek help from the Trino community if the connector is maintained by it. -- This is an automated message from the Ap

[GitHub] [iceberg] Fokko merged pull request #6157: Docs: Update spotless apply command

2022-11-09 Thread GitBox
Fokko merged PR #6157: URL: https://github.com/apache/iceberg/pull/6157 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] foarsitter opened a new pull request, #6158: Python: CI poetry cache

2022-11-09 Thread GitBox
foarsitter opened a new pull request, #6158: URL: https://github.com/apache/iceberg/pull/6158 The `setup-python@v4` action supports caching for poetry lock files as shown over here: https://github.com/actions/setup-python/blob/main/docs/advanced-usage.md#caching-packages -- This is an au

[GitHub] [iceberg] Fokko commented on pull request #6158: Python: CI poetry cache

2022-11-09 Thread GitBox
Fokko commented on PR #6158: URL: https://github.com/apache/iceberg/pull/6158#issuecomment-1308551127 Thanks for working on this @foarsitter, this is awesome! Looks like the CI is a little sad because it expects `poetry` to be already installed. -- This is an automated message from the Ap

[GitHub] [iceberg] nastra commented on a diff in pull request #5893: Core: Use avro compression properties from table properties while writing Manifest and Manifest list files.

2022-11-09 Thread GitBox
nastra commented on code in PR #5893: URL: https://github.com/apache/iceberg/pull/5893#discussion_r1017734825 ## core/src/main/java/org/apache/iceberg/util/NumberUtil.java: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contrib

[GitHub] [iceberg] Fokko closed pull request #6158: Python: CI poetry cache

2022-11-09 Thread GitBox
Fokko closed pull request #6158: Python: CI poetry cache URL: https://github.com/apache/iceberg/pull/6158 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail

[GitHub] [iceberg] gaborkaszab commented on a diff in pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-09 Thread GitBox
gaborkaszab commented on code in PR #6045: URL: https://github.com/apache/iceberg/pull/6045#discussion_r1017726904 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -358,6 +361,85 @@ public void testCreateNamespace() throws Exception {

[GitHub] [iceberg] Fokko commented on a diff in pull request #6158: Python: CI poetry cache

2022-11-09 Thread GitBox
Fokko commented on code in PR #6158: URL: https://github.com/apache/iceberg/pull/6158#discussion_r1017798337 ## .github/workflows/python-ci.yml: ## @@ -43,9 +43,14 @@ jobs: steps: - uses: actions/checkout@v3 +- name: Install poetry + run: pip install poetry

[GitHub] [iceberg] Fokko commented on a diff in pull request #6128: Python: Projection

2022-11-09 Thread GitBox
Fokko commented on code in PR #6128: URL: https://github.com/apache/iceberg/pull/6128#discussion_r1017812043 ## python/pyiceberg/expressions/literals.py: ## @@ -213,6 +217,26 @@ class LongLiteral(Literal[int]): def __init__(self, value: int): super().__init__(value

[GitHub] [iceberg] Fokko commented on a diff in pull request #6128: Python: Projection

2022-11-09 Thread GitBox
Fokko commented on code in PR #6128: URL: https://github.com/apache/iceberg/pull/6128#discussion_r1017838594 ## python/pyiceberg/transforms.py: ## @@ -173,6 +201,23 @@ def apply(self, value: Optional[S]) -> Optional[int]: def result_type(self, source: IcebergType) -> Iceber

[GitHub] [iceberg] foarsitter commented on a diff in pull request #6158: Python: CI poetry cache

2022-11-09 Thread GitBox
foarsitter commented on code in PR #6158: URL: https://github.com/apache/iceberg/pull/6158#discussion_r1017838716 ## .github/workflows/python-ci.yml: ## @@ -43,9 +43,14 @@ jobs: steps: - uses: actions/checkout@v3 +- name: Install poetry + run: pip install po

[GitHub] [iceberg] Fokko commented on a diff in pull request #6128: Python: Projection

2022-11-09 Thread GitBox
Fokko commented on code in PR #6128: URL: https://github.com/apache/iceberg/pull/6128#discussion_r1017839219 ## python/pyiceberg/transforms.py: ## @@ -237,7 +282,7 @@ class TimeResolution(IntEnum): SECOND = 0 -class TimeTransform(Transform[S, int], Singleton): +class Da

[GitHub] [iceberg] ggershinsky commented on pull request #5432: AES GCM Stream Spec

2022-11-09 Thread GitBox
ggershinsky commented on PR #5432: URL: https://github.com/apache/iceberg/pull/5432#issuecomment-1308656393 All suggestions SGTM, I'll update the doc accordingly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [iceberg] Fokko merged pull request #6158: Python: CI poetry cache

2022-11-09 Thread GitBox
Fokko merged PR #6158: URL: https://github.com/apache/iceberg/pull/6158 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] Fokko commented on pull request #6158: Python: CI poetry cache

2022-11-09 Thread GitBox
Fokko commented on PR #6158: URL: https://github.com/apache/iceberg/pull/6158#issuecomment-1308669435 Thanks @foarsitter ! 🥳 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [iceberg] luoyuxia commented on issue #2650: flink cdc 2 iceberg0.11.1

2022-11-09 Thread GitBox
luoyuxia commented on issue #2650: URL: https://github.com/apache/iceberg/issues/2650#issuecomment-1308671497 What's your hive version? You may need to add the dependency: ` org.apache.thrift libfb303 0.9.3

[GitHub] [iceberg] LuigiCerone opened a new pull request, #6159: Python: Update mypy version

2022-11-09 Thread GitBox
LuigiCerone opened a new pull request, #6159: URL: https://github.com/apache/iceberg/pull/6159 Closes #6148 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

[GitHub] [iceberg] Fokko commented on pull request #6159: Python: Update mypy version

2022-11-09 Thread GitBox
Fokko commented on PR #6159: URL: https://github.com/apache/iceberg/pull/6159#issuecomment-1308709343 Hey @LuigiCerone, thanks for opening this PR! It looks like some of the rules of mypy were updated, and we also need to brush up some of the annotations. For example, there is now support for r

[GitHub] [iceberg] hililiwei opened a new pull request, #6160: Flink: Support locality with LocalitySplitAssigner

2022-11-09 Thread GitBox
hililiwei opened a new pull request, #6160: URL: https://github.com/apache/iceberg/pull/6160 Create a locality assigner that hands out splits with a locality guarantee. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

[GitHub] [iceberg] Fokko commented on a diff in pull request #6128: Python: Projection

2022-11-09 Thread GitBox
Fokko commented on code in PR #6128: URL: https://github.com/apache/iceberg/pull/6128#discussion_r1018100513 ## python/pyiceberg/transforms.py: ## @@ -249,6 +294,20 @@ def satisfies_order_of(self, other: Transform) -> bool: def result_type(self, source: IcebergType) -> Iceb

[GitHub] [iceberg] ajantha-bhat commented on pull request #4826: Nessie: Use unique path for different table with same name

2022-11-09 Thread GitBox
ajantha-bhat commented on PR #4826: URL: https://github.com/apache/iceberg/pull/4826#issuecomment-1308978553 @RussellSpitzer: Thanks for the review and approval. I think we can merge this as there is already a consensus for the fix (from everyone). The confusion and discussions we

[GitHub] [iceberg] RussellSpitzer merged pull request #4826: Nessie: Use unique path for different table with same name

2022-11-09 Thread GitBox
RussellSpitzer merged PR #4826: URL: https://github.com/apache/iceberg/pull/4826 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceb

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-09 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1018178779 ## python/tests/expressions/test_expressions.py: ## @@ -365,7 +344,7 @@ def test_bound_greater_than_or_equal_invert(table_schema_simple: Schema): def test_bound_gr

[GitHub] [iceberg] rdblue merged pull request #5150: Spark Integration to read from Snapshot ref

2022-11-09 Thread GitBox
rdblue merged PR #5150: URL: https://github.com/apache/iceberg/pull/5150 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apac

[GitHub] [iceberg] aokolnychyi opened a new issue, #6162: Respect fileSequenceNumber in RewriteManifestsSparkAction

2022-11-09 Thread GitBox
aokolnychyi opened a new issue, #6162: URL: https://github.com/apache/iceberg/issues/6162 ### Feature Request / Improvement The action for rewriting manifests must take into account file sequence numbers. See [here](https://github.com/apache/iceberg/pull/6002#discussion_r1006106088

[GitHub] [iceberg] gaborkaszab commented on pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-09 Thread GitBox
gaborkaszab commented on PR #6045: URL: https://github.com/apache/iceberg/pull/6045#issuecomment-1309096900 @danielcweeks I made a separate PR to rename the one existing property so we won't break backward compatibility with this patch. It has already gone in https://github.com/apache/icebe

[GitHub] [iceberg] gaborkaszab commented on issue #6042: Add delete file information to partitions table

2022-11-09 Thread GitBox
gaborkaszab commented on issue #6042: URL: https://github.com/apache/iceberg/issues/6042#issuecomment-1309123141 Hey @szehon-ho, This improvement is also on my list, and I planned to take a look at it in the near future. Is the work already ongoing? Can we sync to avoid duplicate work? -- Th

[GitHub] [iceberg] ajantha-bhat commented on issue #6042: Add delete file information to partitions table

2022-11-09 Thread GitBox
ajantha-bhat commented on issue #6042: URL: https://github.com/apache/iceberg/issues/6042#issuecomment-1309132003 @gaborkaszab : As mentioned, I was planning to work on it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [iceberg] haizhou-zhao commented on a diff in pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-09 Thread GitBox
haizhou-zhao commented on code in PR #6045: URL: https://github.com/apache/iceberg/pull/6045#discussion_r1018262364 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -358,6 +361,85 @@ public void testCreateNamespace() throws Exception {

[GitHub] [iceberg] szehon-ho commented on issue #6042: Add delete file information to partitions table

2022-11-09 Thread GitBox
szehon-ho commented on issue #6042: URL: https://github.com/apache/iceberg/issues/6042#issuecomment-1309167197 Yep, I'm not working on it; you can sync with @ajantha-bhat. I think the main input needed here was about the value to display, if you have any thoughts about it? For sure

[GitHub] [iceberg] haizhou-zhao commented on a diff in pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-09 Thread GitBox
haizhou-zhao commented on code in PR #6045: URL: https://github.com/apache/iceberg/pull/6045#discussion_r1018274521 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -426,6 +508,167 @@ public void testSetNamespaceProperties() throws TException

[GitHub] [iceberg] aokolnychyi opened a new pull request, #6163: Core: Method for building common partition type

2022-11-09 Thread GitBox
aokolnychyi opened a new pull request, #6163: URL: https://github.com/apache/iceberg/pull/6163 This PR adds a method to `Partitioning` to build an intersection of all partition types. This intersection can be used as a clustering key for storage-partitioned joins. -- This is an automated
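The idea of intersecting partition types across specs can be sketched roughly as follows. This is not the code from PR #6163 and does not use the Iceberg API: the `(source_id, transform)` tuple representation and the `common_partition_fields` helper are hypothetical, chosen only to illustrate keeping the fields that every spec shares.

```python
from typing import List, Tuple

# Hypothetical stand-in for a partition field: (source column id, transform name).
PartitionField = Tuple[int, str]


def common_partition_fields(specs: List[List[PartitionField]]) -> List[PartitionField]:
    """Return the fields present in every spec, in the order of the first spec."""
    if not specs:
        return []
    shared = set(specs[0])
    for spec in specs[1:]:
        # Keep only the fields that also appear in this spec.
        shared &= set(spec)
    return [field for field in specs[0] if field in shared]


specs = [
    [(1, "identity"), (2, "day"), (3, "bucket[16]")],
    [(1, "identity"), (3, "bucket[16]")],
    [(3, "bucket[16]"), (1, "identity"), (4, "truncate[4]")],
]
print(common_partition_fields(specs))
```

In this toy model, only fields shared by all specs survive, which is the property a clustering key for storage-partitioned joins would need.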

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-09 Thread GitBox
aokolnychyi commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1018295880 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -195,41 +198,68 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { }

[GitHub] [iceberg] aokolnychyi commented on pull request #6163: Core: Method for building common partition type

2022-11-09 Thread GitBox
aokolnychyi commented on PR #6163: URL: https://github.com/apache/iceberg/pull/6163#issuecomment-1309208030 @sunchao, here is how we can compute a clustering key for storage-partitioned joins. cc @RussellSpitzer @flyrain @szehon-ho @rdblue -- This is an automated message from the

[GitHub] [iceberg] haizhou-zhao commented on a diff in pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-09 Thread GitBox
haizhou-zhao commented on code in PR #6045: URL: https://github.com/apache/iceberg/pull/6045#discussion_r1018359737 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -448,6 +691,153 @@ public void testRemoveNamespaceProperties() throws TExcepti

[GitHub] [iceberg] haizhou-zhao commented on a diff in pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-09 Thread GitBox
haizhou-zhao commented on code in PR #6045: URL: https://github.com/apache/iceberg/pull/6045#discussion_r1018360013 ## hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java: ## @@ -448,6 +691,153 @@ public void testRemoveNamespaceProperties() throws TExcepti

[GitHub] [iceberg] jornfranke commented on issue #5970: Spark: Iceberg: java.io.InvalidClassException: org.apache.iceberg.Schema; local class incompatible: stream classdesc serialVersionUID = 33203670

2022-11-09 Thread GitBox
jornfranke commented on issue #5970: URL: https://github.com/apache/iceberg/issues/5970#issuecomment-1309356700 No, I had other focus. I suspect there is an older version of Iceberg somewhere on the cluster.

[GitHub] [iceberg] joao-parana opened a new issue, #6164: The Literals class does not handle literals of type LocalDateTime. This causes errors in expressions involving Timestamp.

2022-11-09 Thread GitBox
joao-parana opened a new issue, #6164: URL: https://github.com/apache/iceberg/issues/6164 ### Apache Iceberg version 1.0.0 (latest release) ### Query engine _No response_ ### Please describe the bug 🐞 The `Literals` class of the `org.apache.iceberg.expressio

[GitHub] [iceberg] rdblue commented on pull request #5836: Cache dropStats result for ManifestReader iterator

2022-11-09 Thread GitBox
rdblue commented on PR #5836: URL: https://github.com/apache/iceberg/pull/5836#issuecomment-1309365412 I think this is correct and the test looks fine to me. It doesn't exercise the specific case I think was a bug originally, but that probably isn't the purpose.

[GitHub] [iceberg] rdblue merged pull request #5836: Cache dropStats result for ManifestReader iterator

2022-11-09 Thread GitBox
rdblue merged PR #5836: URL: https://github.com/apache/iceberg/pull/5836

[GitHub] [iceberg] rdblue merged pull request #6113: Core: Reduce code duplication around writing JSON collections

2022-11-09 Thread GitBox
rdblue merged PR #6113: URL: https://github.com/apache/iceberg/pull/6113

[GitHub] [iceberg] rdblue commented on pull request #6113: Core: Reduce code duplication around writing JSON collections

2022-11-09 Thread GitBox
rdblue commented on PR #6113: URL: https://github.com/apache/iceberg/pull/6113#issuecomment-1309366911 Thanks, @nastra!

[GitHub] [iceberg] rdblue merged pull request #6140: Python: Fix Evaluator tests

2022-11-09 Thread GitBox
rdblue merged PR #6140: URL: https://github.com/apache/iceberg/pull/6140

[GitHub] [iceberg] danielcweeks merged pull request #6150: Core: Sync client/server properties in REST catalog

2022-11-09 Thread GitBox
danielcweeks merged PR #6150: URL: https://github.com/apache/iceberg/pull/6150

[GitHub] [iceberg] rdblue commented on a diff in pull request #6145: Python: Add initial TableScan implementation

2022-11-09 Thread GitBox
rdblue commented on code in PR #6145: URL: https://github.com/apache/iceberg/pull/6145#discussion_r1018424359 ## python/tests/cli/test_console.py: ## @@ -538,6 +542,7 @@ def test_json_describe_namespace_does_not_exists(_): def test_json_describe_table(_): runner = CliRunne

[GitHub] [iceberg] rdblue commented on a diff in pull request #6145: Python: Add initial TableScan implementation

2022-11-09 Thread GitBox
rdblue commented on code in PR #6145: URL: https://github.com/apache/iceberg/pull/6145#discussion_r1018424138 ## python/pyiceberg/table/__init__.py: ## @@ -32,13 +33,19 @@ from pyiceberg.table.snapshots import Snapshot, SnapshotLogEntry from pyiceberg.table.sorting import Sort

[GitHub] [iceberg] haizhou-zhao commented on pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-09 Thread GitBox
haizhou-zhao commented on PR #6045: URL: https://github.com/apache/iceberg/pull/6045#issuecomment-1309388616 Hey Gabor, Thanks for your last round of review. All your comments make sense to me and taken. Major changes in the latest commit: 1. createNamespace, setProp, removeProp ea

[GitHub] [iceberg] Fokko commented on a diff in pull request #6145: Python: Add initial TableScan implementation

2022-11-09 Thread GitBox
Fokko commented on code in PR #6145: URL: https://github.com/apache/iceberg/pull/6145#discussion_r1018428906 ## python/tests/cli/test_console.py: ## @@ -538,6 +542,7 @@ def test_json_describe_namespace_does_not_exists(_): def test_json_describe_table(_): runner = CliRunner

[GitHub] [iceberg] Fokko commented on a diff in pull request #6131: Python: Add initial TableScan implementation

2022-11-09 Thread GitBox
Fokko commented on code in PR #6131: URL: https://github.com/apache/iceberg/pull/6131#discussion_r1018441052 ## python/pyiceberg/table/__init__.py: ## @@ -90,3 +103,90 @@ def snapshot_by_name(self, name: str) -> Optional[Snapshot]: def history(self) -> List[SnapshotLogEntry

[GitHub] [iceberg] rdblue commented on a diff in pull request #6146: Build: Enable revapi on core/parquet/orc/common/data modules & fix API breaks

2022-11-09 Thread GitBox
rdblue commented on code in PR #6146: URL: https://github.com/apache/iceberg/pull/6146#discussion_r1018458745 ## .palantir/revapi.yml: ## @@ -1,4 +1,85 @@ acceptedBreaks: + "1.0.0": +org.apache.iceberg:iceberg-core: +- code: "java.class.defaultSerializationChanged" +

[GitHub] [iceberg] rdblue commented on a diff in pull request #6146: Build: Enable revapi on core/parquet/orc/common/data modules & fix API breaks

2022-11-09 Thread GitBox
rdblue commented on code in PR #6146: URL: https://github.com/apache/iceberg/pull/6146#discussion_r1018459238 ## .palantir/revapi.yml: ## @@ -1,4 +1,85 @@ acceptedBreaks: + "1.0.0": +org.apache.iceberg:iceberg-core: +- code: "java.class.defaultSerializationChanged" +

[GitHub] [iceberg] rdblue commented on a diff in pull request #6146: Build: Enable revapi on core/parquet/orc/common/data modules & fix API breaks

2022-11-09 Thread GitBox
rdblue commented on code in PR #6146: URL: https://github.com/apache/iceberg/pull/6146#discussion_r1018461087 ## .palantir/revapi.yml: ## @@ -1,4 +1,85 @@ acceptedBreaks: + "1.0.0": +org.apache.iceberg:iceberg-core: +- code: "java.class.defaultSerializationChanged" +

[GitHub] [iceberg] rdblue commented on a diff in pull request #6146: Build: Enable revapi on core/parquet/orc/common/data modules & fix API breaks

2022-11-09 Thread GitBox
rdblue commented on code in PR #6146: URL: https://github.com/apache/iceberg/pull/6146#discussion_r1018461583 ## .palantir/revapi.yml: ## @@ -1,4 +1,85 @@ acceptedBreaks: + "1.0.0": +org.apache.iceberg:iceberg-core: +- code: "java.class.defaultSerializationChanged" +

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-09 Thread GitBox
RussellSpitzer commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1018462633 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -195,41 +198,68 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { }

[GitHub] [iceberg] rdblue commented on a diff in pull request #6146: Build: Enable revapi on core/parquet/orc/common/data modules & fix API breaks

2022-11-09 Thread GitBox
rdblue commented on code in PR #6146: URL: https://github.com/apache/iceberg/pull/6146#discussion_r1018463495 ## .palantir/revapi.yml: ## @@ -11,15 +92,21 @@ acceptedBreaks: - code: "java.method.addedToInterface" new: "method java.lang.String org.apache.iceberg.expr

[GitHub] [iceberg] rdblue commented on a diff in pull request #6146: Build: Enable revapi on core/parquet/orc/common/data modules & fix API breaks

2022-11-09 Thread GitBox
rdblue commented on code in PR #6146: URL: https://github.com/apache/iceberg/pull/6146#discussion_r1018464014 ## .palantir/revapi.yml: ## @@ -11,15 +92,21 @@ acceptedBreaks: - code: "java.method.addedToInterface" new: "method java.lang.String org.apache.iceberg.expr

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-09 Thread GitBox
RussellSpitzer commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1018464405 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -195,41 +198,68 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { }

[GitHub] [iceberg] rdblue commented on a diff in pull request #6146: Build: Enable revapi on core/parquet/orc/common/data modules & fix API breaks

2022-11-09 Thread GitBox
rdblue commented on code in PR #6146: URL: https://github.com/apache/iceberg/pull/6146#discussion_r1018465265 ## core/src/main/java/org/apache/iceberg/rest/HTTPClient.java: ## @@ -269,6 +269,20 @@ public T post( return execute(Method.POST, path, null, body, responseType, h

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6163: Core: Method for building common partition type

2022-11-09 Thread GitBox
RussellSpitzer commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1018466921 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -195,41 +198,68 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { }

[GitHub] [iceberg] rdblue commented on a diff in pull request #6058: Core,Spark: Add metadata to Scan Report

2022-11-09 Thread GitBox
rdblue commented on code in PR #6058: URL: https://github.com/apache/iceberg/pull/6058#discussion_r1018469473 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkCatalog.java: ## @@ -532,6 +537,24 @@ public final void initialize(String name, CaseInsensitiveStringMap
