[GitHub] [iceberg] pvary commented on pull request #6175: Hive: Add UGI to the key in CachedClientPool

2022-11-15 Thread GitBox
pvary commented on PR #6175: URL: https://github.com/apache/iceberg/pull/6175#issuecomment-1315286360 @lirui-apache: I am not sure how the Key/UGI equals/hashCode works, this could create funny situations. Maybe we could add another test case where ``` UserGroupInformation f

[GitHub] [iceberg] Fokko opened a new pull request, #6197: Python: Fix rough edges around literals

2022-11-15 Thread GitBox
Fokko opened a new pull request, #6197: URL: https://github.com/apache/iceberg/pull/6197 From https://github.com/apache/iceberg/pull/6141, for follow-up: > Fix types returned by to when AboveMax/BelowMin are returned Fixed - Add tests for binding with invalid conversions

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-15 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1022503190 ## python/pyiceberg/expressions/literals.py: ## @@ -108,7 +110,7 @@ def __ge__(self, other): @singledispatch -def literal(value) -> Literal: +def literal(value: Any)

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-15 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1022876945 ## python/pyiceberg/typedef.py: ## @@ -36,3 +38,7 @@ def update(self, *args: Any, **kwargs: Any) -> None: Identifier = Tuple[str, ...] Properties = Dict[str, str] Recu

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #5824: Spark: support hilbert curve when rewrite

2022-11-15 Thread GitBox
RussellSpitzer commented on code in PR #5824: URL: https://github.com/apache/iceberg/pull/5824#discussion_r1022950032 ## spark/v3.3/build.gradle: ## @@ -39,6 +39,7 @@ project(":iceberg-spark:iceberg-spark-${sparkMajorVersion}_${scalaVersion}") { } dependencies { +im

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #5824: Spark: support hilbert curve when rewrite

2022-11-15 Thread GitBox
ajantha-bhat commented on code in PR #5824: URL: https://github.com/apache/iceberg/pull/5824#discussion_r1022971182 ## spark/v3.3/build.gradle: ## @@ -39,6 +39,7 @@ project(":iceberg-spark:iceberg-spark-${sparkMajorVersion}_${scalaVersion}") { } dependencies { +impl

[GitHub] [iceberg] nastra commented on a diff in pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2022-11-15 Thread GitBox
nastra commented on code in PR #6169: URL: https://github.com/apache/iceberg/pull/6169#discussion_r1023007309 ## aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java: ## @@ -261,6 +267,8 @@ public class AwsProperties implements Serializable { public static final bool

[GitHub] [iceberg] nastra commented on a diff in pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2022-11-15 Thread GitBox
nastra commented on code in PR #6169: URL: https://github.com/apache/iceberg/pull/6169#discussion_r1023008382 ## open-api/s3-signer-open-api.yaml: ## @@ -0,0 +1,273 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. Se

[GitHub] [iceberg] stevenzwu merged pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

2022-11-15 Thread GitBox
stevenzwu merged PR #6111: URL: https://github.com/apache/iceberg/pull/6111 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.a

[GitHub] [iceberg] stevenzwu commented on pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

2022-11-15 Thread GitBox
stevenzwu commented on PR #6111: URL: https://github.com/apache/iceberg/pull/6111#issuecomment-1315590390 thanks @lvyanquan for the contribution and @hililiwei and @pvary for the reviews.

[GitHub] [iceberg] nastra commented on a diff in pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2022-11-15 Thread GitBox
nastra commented on code in PR #6169: URL: https://github.com/apache/iceberg/pull/6169#discussion_r1023039258 ## aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java: ## @@ -0,0 +1,328 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [iceberg] sunchao commented on a diff in pull request #6163: Core: Method for building grouping key type

2022-11-15 Thread GitBox
sunchao commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1023031018 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -195,41 +198,75 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { } /

[GitHub] [iceberg] jackye1995 commented on pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2022-11-15 Thread GitBox
jackye1995 commented on PR #6169: URL: https://github.com/apache/iceberg/pull/6169#issuecomment-1315625210 1 general question regarding this PR, @nastra @rdblue @danielcweeks this is a feature very specific to AWS S3. What is the general guideline in the community for adding this as a part

[GitHub] [iceberg] nastra closed pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2022-11-15 Thread GitBox
nastra closed pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec URL: https://github.com/apache/iceberg/pull/6169

[GitHub] [iceberg] szehon-ho commented on pull request #5376: Core: Add readable metrics columns to files metadata tables

2022-11-15 Thread GitBox
szehon-ho commented on PR #5376: URL: https://github.com/apache/iceberg/pull/5376#issuecomment-1315643542 Transitive error downloading, restarting

[GitHub] [iceberg] szehon-ho closed pull request #5376: Core: Add readable metrics columns to files metadata tables

2022-11-15 Thread GitBox
szehon-ho closed pull request #5376: Core: Add readable metrics columns to files metadata tables URL: https://github.com/apache/iceberg/pull/5376

[GitHub] [iceberg] szehon-ho opened a new pull request, #5376: Core: Add readable metrics columns to files metadata tables

2022-11-15 Thread GitBox
szehon-ho opened a new pull request, #5376: URL: https://github.com/apache/iceberg/pull/5376 Closes #4362 This adds following columns to all files tables: - readable_metrics, which is struct of: - column_sizes - value_counts - null_value_counts - nan_value_counts

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6163: Core: Method for building grouping key type

2022-11-15 Thread GitBox
aokolnychyi commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1023081485 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -195,41 +198,75 @@ public Void alwaysNull(int fieldId, String sourceName, int sourceId) { }

[GitHub] [iceberg] nastra closed pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2022-11-15 Thread GitBox
nastra closed pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec URL: https://github.com/apache/iceberg/pull/6169

[GitHub] [iceberg] singhpk234 commented on issue #6196: How to use equality delete in Iceberg v2 table

2022-11-15 Thread GitBox
singhpk234 commented on issue #6196: URL: https://github.com/apache/iceberg/issues/6196#issuecomment-1315672501 Writing equality deletes is not supported in Spark as of now; I think only Flink supports it at the moment.

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6163: Core: Method for building grouping key type

2022-11-15 Thread GitBox
aokolnychyi commented on code in PR #6163: URL: https://github.com/apache/iceberg/pull/6163#discussion_r1023102332 ## core/src/main/java/org/apache/iceberg/Partitioning.java: ## @@ -298,4 +331,37 @@ private static boolean compatibleTransforms(Transform t1, Transform ||

[GitHub] [iceberg] aokolnychyi merged pull request #6163: Core: Method for building grouping key type

2022-11-15 Thread GitBox
aokolnychyi merged PR #6163: URL: https://github.com/apache/iceberg/pull/6163

[GitHub] [iceberg] aokolnychyi commented on pull request #6163: Core: Method for building grouping key type

2022-11-15 Thread GitBox
aokolnychyi commented on PR #6163: URL: https://github.com/apache/iceberg/pull/6163#issuecomment-1315675501 Thanks for reviewing, @RussellSpitzer @sunchao!

[GitHub] [iceberg] ahshahid opened a new issue, #6198: colStats flag in TableContext remains false except in situation where delete files are present

2022-11-15 Thread GitBox
ahshahid opened a new issue, #6198: URL: https://github.com/apache/iceberg/issues/6198 ### Feature Request / Improvement Enabling this flag would allow non partition column stats (lower & upper bounds especially) to be available in the Manifest file, which can aid in pruning for ran

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #5824: Spark: support hilbert curve when rewrite

2022-11-15 Thread GitBox
RussellSpitzer commented on code in PR #5824: URL: https://github.com/apache/iceberg/pull/5824#discussion_r1023194843 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/SparkSpaceCurveUDF.java: ## @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #5824: Spark: support hilbert curve when rewrite

2022-11-15 Thread GitBox
RussellSpitzer commented on code in PR #5824: URL: https://github.com/apache/iceberg/pull/5824#discussion_r1023195403 ## spark/v3.3/build.gradle: ## @@ -39,6 +39,7 @@ project(":iceberg-spark:iceberg-spark-${sparkMajorVersion}_${scalaVersion}") { } dependencies { +im

[GitHub] [iceberg] RussellSpitzer commented on issue #6198: colStats flag in TableContext remains false except in situation where delete files are present

2022-11-15 Thread GitBox
RussellSpitzer commented on issue #6198: URL: https://github.com/apache/iceberg/issues/6198#issuecomment-1315787552 What exactly are you commenting on? By default we record all metrics for all columns unless there is a high number of columns. See https://github.com/apache/iceberg/bl

[GitHub] [iceberg] LuigiCerone commented on a diff in pull request #6159: Python: Update mypy version

2022-11-15 Thread GitBox
LuigiCerone commented on code in PR #6159: URL: https://github.com/apache/iceberg/pull/6159#discussion_r1023218766 ## python/pyiceberg/catalog/__init__.py: ## @@ -120,16 +120,16 @@ def load_catalog(name: str, **properties: Optional[str]) -> Catalog: or if it could

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-15 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1023219441 ## python/pyiceberg/expressions/__init__.py: ## @@ -48,12 +65,13 @@ class Bound(ABC): """Represents a bound value expression""" -class Unbound(Generic[B], ABC):

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-15 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1023220273 ## python/pyiceberg/expressions/literals.py: ## @@ -110,11 +106,9 @@ def __ge__(self, other) -> bool: return self.value >= other.value -@singledispatch Revi

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-15 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1023228294 ## python/pyiceberg/expressions/__init__.py: ## @@ -48,12 +65,13 @@ class Bound(ABC): """Represents a bound value expression""" -class Unbound(Generic[B], ABC):

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-15 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1023269577 ## python/pyiceberg/expressions/literals.py: ## @@ -356,18 +326,26 @@ def _(self, type_var: DecimalType) -> Literal[Decimal]: return DecimalLiteral(Decimal(self

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-15 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1023271140 ## python/pyiceberg/expressions/literals.py: ## @@ -356,18 +326,26 @@ def _(self, type_var: DecimalType) -> Literal[Decimal]: return DecimalLiteral(Decimal(self.

[GitHub] [iceberg-docs] dipankarmazumdar opened a new pull request, #177: Docs: Add 5 new blogs from Dremio

2022-11-15 Thread GitBox
dipankarmazumdar opened a new pull request, #177: URL: https://github.com/apache/iceberg-docs/pull/177 This PR adds 5 recent blogs we've written about iceberg to the blogs page: - https://www.dremio.com/subsurface/compaction-in-apache-iceberg-fine-tuning-your-iceberg-tables-data-files

[GitHub] [iceberg] ahshahid commented on issue #6198: colStats flag in TableContext remains false except in situation where delete files are present

2022-11-15 Thread GitBox
ahshahid commented on issue #6198: URL: https://github.com/apache/iceberg/issues/6198#issuecomment-1315884814 What I noticed was that by default when iceberg format files are generated, the manifest file does not contain lower bound/ upper bound info for non-partition columns. For **no

[GitHub] [iceberg] ahshahid commented on issue #6198: colStats flag in TableContext remains false except in situation where delete files are present

2022-11-15 Thread GitBox
ahshahid commented on issue #6198: URL: https://github.com/apache/iceberg/issues/6198#issuecomment-1315888445 @RussellSpitzer Pardon my ignorance, I am a little new to Iceberg, but what I noticed was that because these stats were not present in the Manifest file for non-partition columns, the ra

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-15 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1023280537 ## python/pyiceberg/expressions/__init__.py: ## @@ -48,12 +65,13 @@ class Bound(ABC): """Represents a bound value expression""" -class Unbound(Generic[B], ABC):

[GitHub] [iceberg] RussellSpitzer commented on issue #6198: colStats flag in TableContext remains false except in situation where delete files are present

2022-11-15 Thread GitBox
RussellSpitzer commented on issue #6198: URL: https://github.com/apache/iceberg/issues/6198#issuecomment-1315900993 Do you have a repro of this? Here is an example of creating a table without partitions, then reading the upper bounds of the column from the metadata table which only rea

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-15 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1023297504 ## python/pyiceberg/expressions/__init__.py: ## @@ -48,12 +65,13 @@ class Bound(ABC): """Represents a bound value expression""" -class Unbound(Generic[B], ABC):

[GitHub] [iceberg] RussellSpitzer commented on issue #6198: colStats flag in TableContext remains false except in situation where delete files are present

2022-11-15 Thread GitBox
RussellSpitzer commented on issue #6198: URL: https://github.com/apache/iceberg/issues/6198#issuecomment-1315922002 I think I understand what you are asking about, not the Manifests but the ManifestList files. The manifest list files only store metrics for partitions since they contain entr

[GitHub] [iceberg] ahshahid commented on issue #6198: colStats flag in TableContext remains false except in situation where delete files are present

2022-11-15 Thread GitBox
ahshahid commented on issue #6198: URL: https://github.com/apache/iceberg/issues/6198#issuecomment-1315953425 @RussellSpitzer Thanks for responding. Initially I also spent a good deal of time figuring out which data structure was holding entries in ManifestList and which was holding data of Man

[GitHub] [iceberg] ahshahid commented on issue #6198: colStats flag in TableContext remains false except in situation where delete files are present

2022-11-15 Thread GitBox
ahshahid commented on issue #6198: URL: https://github.com/apache/iceberg/issues/6198#issuecomment-1315968642 "The current thought process there is that a manifest will most likely contain many if not thousands of data file entries but will only contain data files for a few distinct partiti

[GitHub] [iceberg] ahshahid commented on issue #6198: colStats flag in TableContext remains false except in situation where delete files are present

2022-11-15 Thread GitBox
ahshahid commented on issue #6198: URL: https://github.com/apache/iceberg/issues/6198#issuecomment-1315971510 "Because it will include so many datafiles, possibly all of those for a partition, storing metric bounds for non partition columns isn't very useful since it would be the entire ran

[GitHub] [iceberg] danielcweeks commented on pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-15 Thread GitBox
danielcweeks commented on PR #6045: URL: https://github.com/apache/iceberg/pull/6045#issuecomment-1316034660 > @danielcweeks thx for your last round of review. I agree with you that UGI.currentUser is a better default owner, though I plan to make that change in a separate follow up PR. (See

[GitHub] [iceberg] RussellSpitzer commented on issue #6198: colStats flag in TableContext remains false except in situation where delete files are present

2022-11-15 Thread GitBox
RussellSpitzer commented on issue #6198: URL: https://github.com/apache/iceberg/issues/6198#issuecomment-1316063411 The manifest contains DataFile entries, each of which contains all upper and lower bounds of the data files which were persisted in the manifest.

[GitHub] [iceberg] github-actions[bot] commented on issue #3705: Explore spark struct streaming write iceberg and synchronize to hive Metastore

2022-11-15 Thread GitBox
github-actions[bot] commented on issue #3705: URL: https://github.com/apache/iceberg/issues/3705#issuecomment-1316066223 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] RussellSpitzer commented on issue #6198: colStats flag in TableContext remains false except in situation where delete files are present

2022-11-15 Thread GitBox
RussellSpitzer commented on issue #6198: URL: https://github.com/apache/iceberg/issues/6198#issuecomment-1316074552 To fully go over the path: first you read a manifest list. The manifest list contains a list of manifests. The partition filter is compared to each manifest's upper and lowe
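
The two-level pruning discussed in this thread (partition ranges checked per manifest, then per-file column bounds checked inside each surviving manifest) can be sketched roughly as follows. This is only an illustration in plain Python, not Iceberg's planning code: the `DataFile`, `Manifest`, and `plan_files` names, the single integer partition range, and the inclusive `(lower, upper)` tuples are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # inclusive (lower, upper); hypothetical representation


@dataclass
class DataFile:
    path: str
    # column name -> (lower bound, upper bound) recorded for this file
    bounds: Dict[str, Range]


@dataclass
class Manifest:
    path: str
    # partition summary kept at the manifest-list level (assumed: one int column)
    partition_range: Range
    files: List[DataFile] = field(default_factory=list)


def overlaps(bounds: Range, lo: int, hi: int) -> bool:
    """True when [lo, hi] intersects the stored inclusive bounds."""
    return bounds[0] <= hi and bounds[1] >= lo


def plan_files(manifest_list: List[Manifest], partition_filter: Range,
               column: str, column_filter: Range) -> List[str]:
    selected = []
    for manifest in manifest_list:
        # Level 1: skip whole manifests using only the partition summary.
        if not overlaps(manifest.partition_range, *partition_filter):
            continue
        # Level 2: inside a surviving manifest, use per-file column bounds.
        for data_file in manifest.files:
            file_bounds = data_file.bounds.get(column)
            # A file without stats for the column cannot be ruled out.
            if file_bounds is None or overlaps(file_bounds, *column_filter):
                selected.append(data_file.path)
    return selected
```

In this sketch, non-partition column bounds only matter at the second level, which matches the point made in the thread that the manifest list carries partition summaries while each manifest carries the per-file bounds.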

[GitHub] [iceberg] wypoon commented on issue #6042: Add delete file information to partitions table

2022-11-15 Thread GitBox
wypoon commented on issue #6042: URL: https://github.com/apache/iceberg/issues/6042#issuecomment-1316220819 I'm trying to understand the proposed behavior. To go back to @ajantha-bhat's example: Suppose you have a partition `{A}` with `record_count`=6 and `file_count`=2 (3 records in each

[GitHub] [iceberg] aokolnychyi opened a new pull request, #6199: API, Core: Move micros and days conversions to DateTimeUtil

2022-11-15 Thread GitBox
aokolnychyi opened a new pull request, #6199: URL: https://github.com/apache/iceberg/pull/6199 While working on time-related functions in the Spark function catalog, I wanted to use `Transform`. However, calling transforms on primitive values would require unnecessary boxing. That's why I d

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6199: API, Core: Move micros and days conversions to DateTimeUtil

2022-11-15 Thread GitBox
aokolnychyi commented on code in PR #6199: URL: https://github.com/apache/iceberg/pull/6199#discussion_r1023481066 ## api/src/main/java/org/apache/iceberg/transforms/Dates.java: ## @@ -50,24 +48,19 @@ public Integer apply(Integer days) { return null; } - i

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6199: API, Core: Move micros and days conversions to DateTimeUtil

2022-11-15 Thread GitBox
aokolnychyi commented on code in PR #6199: URL: https://github.com/apache/iceberg/pull/6199#discussion_r1023481446 ## api/src/main/java/org/apache/iceberg/util/DateTimeUtil.java: ## @@ -133,4 +134,56 @@ public static long isoTimestampToMicros(String timestampString) { retu

[GitHub] [iceberg] aokolnychyi commented on pull request #6199: API, Core: Move micros and days conversions to DateTimeUtil

2022-11-15 Thread GitBox
aokolnychyi commented on PR #6199: URL: https://github.com/apache/iceberg/pull/6199#issuecomment-1316294974 @kbendick @rdblue @RussellSpitzer @szehon-ho @flyrain, could you take a look at this PR? I need to use these conversions in our Spark function catalog.

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6199: API, Core: Move micros and days conversions to DateTimeUtil

2022-11-15 Thread GitBox
aokolnychyi commented on code in PR #6199: URL: https://github.com/apache/iceberg/pull/6199#discussion_r1023482889 ## api/src/main/java/org/apache/iceberg/util/DateTimeUtil.java: ## @@ -133,4 +134,56 @@ public static long isoTimestampToMicros(String timestampString) { retu

[GitHub] [iceberg] ajantha-bhat commented on issue #6042: Add delete file information to partitions table

2022-11-15 Thread GitBox
ajantha-bhat commented on issue #6042: URL: https://github.com/apache/iceberg/issues/6042#issuecomment-1316362986 > But what about record_count and file_count? Will file_count be 3 (is it supposed to be the total number of data files, including delete files)? And record_count? When is it p

[GitHub] [iceberg] lirui-apache commented on pull request #6175: Hive: Add UGI to the key in CachedClientPool

2022-11-15 Thread GitBox
lirui-apache commented on PR #6175: URL: https://github.com/apache/iceberg/pull/6175#issuecomment-1316470746 @pvary That's a good question. Unfortunately the test case you mentioned won't work. I looked into how UGI implements equals/hashCode and found the behavior is intentional [1]. Hadoo

[GitHub] [iceberg] nastra commented on a diff in pull request #6199: API, Core: Move micros and days conversions to DateTimeUtil

2022-11-16 Thread GitBox
nastra commented on code in PR #6199: URL: https://github.com/apache/iceberg/pull/6199#discussion_r1023681076 ## api/src/main/java/org/apache/iceberg/util/DateTimeUtil.java: ## @@ -133,4 +134,56 @@ public static long isoTimestampToMicros(String timestampString) { return mi

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #6199: API, Core: Move micros and days conversions to DateTimeUtil

2022-11-16 Thread GitBox
szehon-ho commented on code in PR #6199: URL: https://github.com/apache/iceberg/pull/6199#discussion_r1023728546 ## api/src/main/java/org/apache/iceberg/util/DateTimeUtil.java: ## @@ -133,4 +134,56 @@ public static long isoTimestampToMicros(String timestampString) { return

[GitHub] [iceberg] szehon-ho commented on pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-16 Thread GitBox
szehon-ho commented on PR #6045: URL: https://github.com/apache/iceberg/pull/6045#issuecomment-1316669286 @danielcweeks @gaborkaszab Makes sense. For this patch we should go with user.name then on the Database side, to be consistent with Table? Then switch both after 1.1? Probably makes most

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #4627: Parquet: Fixes get null values for the nested field partition column

2022-11-16 Thread GitBox
szehon-ho commented on code in PR #4627: URL: https://github.com/apache/iceberg/pull/4627#discussion_r1023069744 ## parquet/src/main/java/org/apache/iceberg/data/parquet/BaseParquetReaders.java: ## @@ -149,11 +153,14 @@ public ParquetValueReader struct(Types.StructType expected

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #4627: Parquet: Fixes get null values for the nested field partition column

2022-11-16 Thread GitBox
szehon-ho commented on code in PR #4627: URL: https://github.com/apache/iceberg/pull/4627#discussion_r1023767954 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java: ## @@ -116,9 +120,45 @@ public void setPageSource(PageReadStore pageStore, long rowPosi

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #4627: Parquet: Fixes get null values for the nested field partition column

2022-11-16 Thread GitBox
szehon-ho commented on code in PR #4627: URL: https://github.com/apache/iceberg/pull/4627#discussion_r1023773443 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java: ## @@ -116,9 +120,45 @@ public void setPageSource(PageReadStore pageStore, long rowPosi

[GitHub] [iceberg] lvyanquan commented on pull request #6043: Core: Partial Update

2022-11-16 Thread GitBox
lvyanquan commented on PR #6043: URL: https://github.com/apache/iceberg/pull/6043#issuecomment-1316739131 Our business scenarios often require adding table fields, and the original update method will have a lot of duplicate content. I read and tested these codes, which meet our expectations

[GitHub] [iceberg] pvary commented on pull request #4627: Parquet: Fixes get null values for the nested field partition column

2022-11-16 Thread GitBox
pvary commented on PR #4627: URL: https://github.com/apache/iceberg/pull/4627#issuecomment-1316862982 @ConeyLiu: Thanks for the finding and the fix. May I ask you to put the fix in the main branches for Spark (3.3) and Flink (1.16) first, and then with another PR we can backport to all of

[GitHub] [iceberg-docs] Fokko commented on pull request #177: Docs: Add 5 new blogs from Dremio

2022-11-16 Thread GitBox
Fokko commented on PR #177: URL: https://github.com/apache/iceberg-docs/pull/177#issuecomment-1317172326 Thanks for creating this PR @dipankarmazumdar. Looks like you're on fire with the blogs 🔥 Could you rebase the conflicts?

[GitHub] [iceberg] jebnix opened a new pull request, #6202: [REST] Add documentation for the REST Catalog specification

2022-11-16 Thread GitBox
jebnix opened a new pull request, #6202: URL: https://github.com/apache/iceberg/pull/6202 Currently, it is pretty confusing and hard to understand what exactly is the Iceberg REST Catalog and why it exists / how to use it. This page offers the start of documenting it, even though it lacks o

[GitHub] [iceberg] rdblue merged pull request #6187: Revert "Hive: Forward catalog-specific Hive configuration properties …

2022-11-16 Thread GitBox
rdblue merged PR #6187: URL: https://github.com/apache/iceberg/pull/6187

[GitHub] [iceberg] rdblue commented on pull request #6187: Revert "Hive: Forward catalog-specific Hive configuration properties …

2022-11-16 Thread GitBox
rdblue commented on PR #6187: URL: https://github.com/apache/iceberg/pull/6187#issuecomment-1317262605 Thanks, @pavibhai!

[GitHub] [iceberg] rdblue merged pull request #6200: Core: Add time zone info to LocalDate in ExpressionUtil tests

2022-11-16 Thread GitBox
rdblue merged PR #6200: URL: https://github.com/apache/iceberg/pull/6200

[GitHub] [iceberg] rdblue commented on pull request #6200: Core: Add time zone info to LocalDate in ExpressionUtil tests

2022-11-16 Thread GitBox
rdblue commented on PR #6200: URL: https://github.com/apache/iceberg/pull/6200#issuecomment-1317262970 Thanks, @nastra!

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024209550 ## python/pyiceberg/expressions/literals.py: ## @@ -356,18 +326,26 @@ def _(self, type_var: DecimalType) -> Literal[Decimal]: return DecimalLiteral(Decimal(self

[GitHub] [iceberg] rdblue commented on a diff in pull request #6197: Python: Fix rough edges around literals

2022-11-16 Thread GitBox
rdblue commented on code in PR #6197: URL: https://github.com/apache/iceberg/pull/6197#discussion_r1024213544 ## python/pyiceberg/utils/datetime.py: ## @@ -81,13 +81,19 @@ def timestamp_to_micros(timestamp_str: str) -> int: """Converts an ISO-9601 formatted timestamp withou

[GitHub] [iceberg] danielcweeks commented on pull request #6169: AWS,Core: Add S3 REST Signer client + REST Spec

2022-11-16 Thread GitBox
danielcweeks commented on PR #6169: URL: https://github.com/apache/iceberg/pull/6169#issuecomment-1317273978 > 1 general question regarding this PR, @nastra @rdblue @danielcweeks this is a feature very specific to AWS S3. What is the general guideline in the community for adding this as a p

[GitHub] [iceberg] rdblue commented on pull request #6197: Python: Fix rough edges around literals

2022-11-16 Thread GitBox
rdblue commented on PR #6197: URL: https://github.com/apache/iceberg/pull/6197#issuecomment-1317275633 > Do you mean when you run Python on a 32bit computer? This will reduce the precision. What I mean is that Python `int` can contain values that are larger than 64 bits because it us
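
The point about Python `int` exceeding 64 bits can be illustrated with a minimal sketch. The bounds and the helper name `fits_in_int64` are assumptions made for this example; they are not taken from the pyiceberg code under review.

```python
# Illustrative only: the constants and helper below are hypothetical,
# not part of pyiceberg.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def fits_in_int64(value: int) -> bool:
    """Return True if value can be stored in a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


# Python ints are arbitrary precision, so this does not overflow.
big = 2**63  # one past the signed 64-bit maximum

print(big.bit_length())          # number of bits needed for the magnitude
print(fits_in_int64(big))        # range check against the 64-bit bounds
print(fits_in_int64(INT64_MAX))  # the largest value that still fits
```

A check of this kind would let a conversion to a 64-bit long type detect out-of-range values explicitly instead of assuming that every Python `int` fits.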

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024223275 ## python/pyiceberg/typedef.py: ## @@ -14,13 +14,16 @@ # KIND, either express or implied. See the License for the # specific language governing permissions and limita

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r102476 ## python/pyiceberg/expressions/visitors.py: ## @@ -60,125 +60,127 @@ PrimitiveType, ) +R = TypeVar("R") -class BooleanExpressionVisitor(Generic[T], ABC): + +c

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024223922 ## python/pyproject.toml: ## @@ -55,6 +55,8 @@ pyyaml = "6.0.0" pydantic = "1.10.2" fsspec = "2022.10.0" +typing-extensions = 'typing-extensions==4.4.0' Review Comm

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024224914 ## python/pyiceberg/utils/datetime.py: ## @@ -38,6 +38,11 @@ def micros_to_days(timestamp: int) -> int: return timedelta(microseconds=timestamp).days +def micro

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024226741 ## python/pyiceberg/expressions/literals.py: ## @@ -53,24 +49,27 @@ UUIDType, ) from pyiceberg.utils.datetime import ( -date_str_to_days, +date_str_to_dat

[GitHub] [iceberg] rdblue merged pull request #6201: REST: Assign metadata UUID on create transaction

2022-11-16 Thread GitBox
rdblue merged PR #6201: URL: https://github.com/apache/iceberg/pull/6201

[GitHub] [iceberg] haizhou-zhao commented on pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-16 Thread GitBox
haizhou-zhao commented on PR #6045: URL: https://github.com/apache/iceberg/pull/6045#issuecomment-1317323654 Thank you all! In that case, I'll raise a PR for UGI change later and we can merge after 1.1 release. Let me know if there's more I can do for this PR.

[GitHub] [iceberg] rdblue commented on a diff in pull request #6012: Spark 3.3: Add a procedure to generate table changes

2022-11-16 Thread GitBox
rdblue commented on code in PR #6012: URL: https://github.com/apache/iceberg/pull/6012#discussion_r1024278647 ## spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/GenerateChangesProcedure.java: ## @@ -0,0 +1,210 @@ +/* + * Licensed to the Apache Software Foundat

[GitHub] [iceberg] szehon-ho commented on pull request #6045: [iceberg-hive-metastore] Support setting individual and group ownership for Namespace

2022-11-16 Thread GitBox
szehon-ho commented on PR #6045: URL: https://github.com/apache/iceberg/pull/6045#issuecomment-1317390253 @gaborkaszab sorry, I know we are super close, but do you think we should change both to use UserGroupInformation quickly before cutting final 1.1 RC? That way no need to change behavio

[GitHub] [iceberg-docs] willshen commented on pull request #176: Fix broken Spark 2.4 and 3.0 runtime jar links on the Releases page

2022-11-16 Thread GitBox
willshen commented on PR #176: URL: https://github.com/apache/iceberg-docs/pull/176#issuecomment-1317397220 @Fokko/ @rdblue the current releases page is broken, and this PR should help fix that - do you mind taking a look at it? Thank you!

[GitHub] [iceberg-docs] Fokko merged pull request #176: Fix broken Spark 2.4 and 3.0 runtime jar links on the Releases page

2022-11-16 Thread GitBox
Fokko merged PR #176: URL: https://github.com/apache/iceberg-docs/pull/176

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024322147 ## python/tests/expressions/test_visitors.py: ## @@ -339,61 +327,61 @@ def test_always_false_or_always_true_expression_binding(table_schema_simple: Sch [

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024329290 ## python/pyiceberg/expressions/literals.py: ## @@ -356,18 +326,26 @@ def _(self, type_var: DecimalType) -> Literal[Decimal]: return DecimalLiteral(Decimal(self.

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024330356 ## python/pyiceberg/expressions/visitors.py: ## @@ -60,125 +60,127 @@ PrimitiveType, ) +R = TypeVar("R") -class BooleanExpressionVisitor(Generic[T], ABC): + +cl

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024331995 ## python/pyiceberg/typedef.py: ## @@ -14,13 +14,16 @@ # KIND, either express or implied. See the License for the # specific language governing permissions and limitat

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024332336 ## python/pyproject.toml: ## @@ -55,6 +55,8 @@ pyyaml = "6.0.0" pydantic = "1.10.2" fsspec = "2022.10.0" +typing-extensions = 'typing-extensions==4.4.0' Review Comme

[GitHub] [iceberg-docs] dipankarmazumdar commented on pull request #177: Docs: Add 5 new blogs from Dremio

2022-11-16 Thread GitBox
dipankarmazumdar commented on PR #177: URL: https://github.com/apache/iceberg-docs/pull/177#issuecomment-1317517911 > Thanks for creating this PR @dipankarmazumdar. Looks like you're on fire with the blogs 🔥 Could you rebase the conflicts? @Fokko - thanks! I resolved the conflicts. Se

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024455885 ## python/pyiceberg/utils/datetime.py: ## @@ -38,6 +38,11 @@ def micros_to_days(timestamp: int) -> int: return timedelta(microseconds=timestamp).days +def micros

[GitHub] [iceberg] Fokko commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
Fokko commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024456725 ## python/pyiceberg/typedef.py: ## @@ -14,13 +14,16 @@ # KIND, either express or implied. See the License for the # specific language governing permissions and limitat

[GitHub] [iceberg] Fokko opened a new issue, #6203: Python: Add support for `{AboveMin,AboveMax}` for `{int,float,long}` columns

2022-11-16 Thread GitBox
Fokko opened a new issue, #6203: URL: https://github.com/apache/iceberg/issues/6203 ### Feature Request / Improvement When we query an integer column that is `EqualTo('float', 3.4028235e38+1)` we convert it into an `AboveMax`. We can use this information to turn the predicate in an `

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024479325 ## python/pyiceberg/expressions/__init__.py: ## @@ -194,85 +251,108 @@ def __new__(cls, child: BooleanExpression): return AlwaysTrue() elif isinsta

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024480285 ## python/pyiceberg/expressions/__init__.py: ## @@ -281,249 +361,313 @@ def __invert__(self) -> BoundIsNull: return BoundIsNull(self.term) -@dataclass(froze

[GitHub] [iceberg] rdblue commented on a diff in pull request #6139: Python: Remove dataclasses

2022-11-16 Thread GitBox
rdblue commented on code in PR #6139: URL: https://github.com/apache/iceberg/pull/6139#discussion_r1024481680 ## python/pyiceberg/expressions/__init__.py: ## @@ -281,249 +361,313 @@ def __invert__(self) -> BoundIsNull: return BoundIsNull(self.term) -@dataclass(froze
