[GitHub] [iceberg] nastra commented on a diff in pull request #6073: Core: Pass purgeRequested flag to REST server

2022-11-06 Thread GitBox
nastra commented on code in PR #6073: URL: https://github.com/apache/iceberg/pull/6073#discussion_r1015101336 ## core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java: ## @@ -799,6 +799,23 @@ public void testDropTable() { Assert.assertFalse("Table should not exist

[GitHub] [iceberg] chenwyi2 opened a new issue, #6136: if i lost a metadata file, how to recover

2022-11-06 Thread GitBox
chenwyi2 opened a new issue, #6136: URL: https://github.com/apache/iceberg/issues/6136 ### Query engine spark: 3.1 iceberg: 0.14.1 ### Question The situation is: I had a table partitioned by date, and I deleted an old manifest file from HDFS without moving it to trash, and

[GitHub] [iceberg] hililiwei commented on pull request #6075: Flink 1.15: Support change log scan task

2022-11-06 Thread GitBox
hililiwei commented on PR #6075: URL: https://github.com/apache/iceberg/pull/6075#issuecomment-1305209915 @stevenzwu could you please take a look at it when you get a chance? thx. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [iceberg] Fokko commented on a diff in pull request #6069: Python: TableScan Plan files API implementation without residual evaluation

2022-11-06 Thread GitBox
Fokko commented on code in PR #6069: URL: https://github.com/apache/iceberg/pull/6069#discussion_r1015087384 ## python/pyiceberg/expressions/visitors.py: ## @@ -517,7 +519,7 @@ def visit_equal(self, term: BoundTerm, literal: Literal[Any]) -> bool: pos = term.ref().acce

[GitHub] [iceberg] hililiwei commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

2022-11-06 Thread GitBox
hililiwei commented on code in PR #6111: URL: https://github.com/apache/iceberg/pull/6111#discussion_r1015087242 ## flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java: ## @@ -145,8 +145,27 @@ protected Catalog createCatalog( baseNamespace =

[GitHub] [iceberg] nastra commented on a diff in pull request #5984: Core, API: Support incremental scanning with branch

2022-11-06 Thread GitBox
nastra commented on code in PR #5984: URL: https://github.com/apache/iceberg/pull/5984#discussion_r1015080916 ## api/src/main/java/org/apache/iceberg/IncrementalScan.java: ## @@ -21,6 +21,23 @@ /** API for configuring an incremental scan. */ public interface IncrementalScan>

[GitHub] [iceberg] nastra commented on a diff in pull request #6113: Core: Reduce code duplication around writing JSON collections

2022-11-06 Thread GitBox
nastra commented on code in PR #6113: URL: https://github.com/apache/iceberg/pull/6113#discussion_r1015078831 ## core/src/main/java/org/apache/iceberg/util/JsonUtil.java: ## @@ -251,6 +252,11 @@ public static Set getIntegerSet(String property, JsonNode node) { .build()

[GitHub] [iceberg] nastra commented on a diff in pull request #6113: Core: Reduce code duplication around writing JSON collections

2022-11-06 Thread GitBox
nastra commented on code in PR #6113: URL: https://github.com/apache/iceberg/pull/6113#discussion_r1015076718 ## core/src/main/java/org/apache/iceberg/util/JsonUtil.java: ## @@ -374,4 +380,40 @@ void validate(JsonNode element) { element); } } + + public stati

[GitHub] [iceberg] Fokko commented on a diff in pull request #6128: Python: Projection

2022-11-06 Thread GitBox
Fokko commented on code in PR #6128: URL: https://github.com/apache/iceberg/pull/6128#discussion_r1015064168 ## python/pyiceberg/expressions/__init__.py: ## @@ -68,7 +72,6 @@ def eval(self, struct: StructProtocol) -> T: # pylint: disable=W0613 """Returns the value at

[GitHub] [iceberg] hendrikmakait opened a new pull request, #6135: Use pythonic `len()` built-in instead of `length` property

2022-11-06 Thread GitBox
hendrikmakait opened a new pull request, #6135: URL: https://github.com/apache/iceberg/pull/6135 * Replaces `.length` property with the pythonic `__len__` method on `FixedReader` and `FixedType` to enable use of `len()` built-in.
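The change described in #6135 relies on Python's `__len__` protocol. A minimal sketch of that protocol, assuming a hypothetical `FixedWidth` class (a stand-in for illustration only, not the actual `FixedReader` or `FixedType` from pyiceberg):

```python
class FixedWidth:
    """Hypothetical fixed-length type used only to illustrate __len__."""

    def __init__(self, length: int) -> None:
        # Store the fixed width privately instead of exposing a .length property.
        self._length = length

    def __len__(self) -> int:
        # The built-in len(obj) dispatches to this method.
        return self._length


fixed = FixedWidth(16)
print(len(fixed))
```

Defining `__len__` lets callers use the same `len()` call they already use for strings, lists, and bytes, rather than learning a class-specific attribute name.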

[GitHub] [iceberg] ajantha-bhat commented on pull request #6090: Core: Handle statistics file clean up from expireSnapshots

2022-11-06 Thread GitBox
ajantha-bhat commented on PR #6090: URL: https://github.com/apache/iceberg/pull/6090#issuecomment-1305146273 > @findepi: Thinking more about this, As the TableMetadata has just the list of StatisticsFile. And you have mentioned, statisticsFile.snapshotId() is "ID of the Iceberg table's snap

[GitHub] [iceberg] wang-x-xia closed pull request #6132: [0.14] Dell: Fix client serialization bug.

2022-11-06 Thread GitBox
wang-x-xia closed pull request #6132: [0.14] Dell: Fix client serialization bug. URL: https://github.com/apache/iceberg/pull/6132

[GitHub] [iceberg] wang-x-xia commented on pull request #6132: [0.14] Dell: Fix client serialization bug.

2022-11-06 Thread GitBox
wang-x-xia commented on PR #6132: URL: https://github.com/apache/iceberg/pull/6132#issuecomment-1305142658 @ajantha-bhat Thanks! I'll close this PR~

[GitHub] [iceberg] ajantha-bhat commented on pull request #6133: [1.0] Dell: Fix client serialization bug.

2022-11-06 Thread GitBox
ajantha-bhat commented on PR #6133: URL: https://github.com/apache/iceberg/pull/6133#issuecomment-1305140853 I think fixing only in the master branch is enough. As per my knowledge, this porting is unnecessary as Iceberg will not do a release for those branches. The next release is

[GitHub] [iceberg] ajantha-bhat commented on pull request #6132: [0.14] Dell: Fix client serialization bug.

2022-11-06 Thread GitBox
ajantha-bhat commented on PR #6132: URL: https://github.com/apache/iceberg/pull/6132#issuecomment-1305140708 I think fixing only in the master branch is enough. As per my knowledge, this porting is unnecessary as Iceberg will not do a release for those branches. The next release is

[GitHub] [iceberg] zhongyujiang commented on a diff in pull request #6118: Parquet, Core: Fix collection of Parquet metrics when column names co…

2022-11-06 Thread GitBox
zhongyujiang commented on code in PR #6118: URL: https://github.com/apache/iceberg/pull/6118#discussion_r1015035815 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetUtil.java: ## @@ -75,23 +76,27 @@ public static Metrics fileMetrics(InputFile file, MetricsConfig metri

[GitHub] [iceberg] ajantha-bhat commented on pull request #6094: Spark-3.0: Remove spark/v3.0 folder

2022-11-06 Thread GitBox
ajantha-bhat commented on PR #6094: URL: https://github.com/apache/iceberg/pull/6094#issuecomment-1305136186 cc: @Fokko

[GitHub] [iceberg] zhongyujiang commented on pull request #6118: Parquet, Core: Fix collection of Parquet metrics when column names co…

2022-11-06 Thread GitBox
zhongyujiang commented on PR #6118: URL: https://github.com/apache/iceberg/pull/6118#issuecomment-1305083278 The failure seems unrelated to this PR: >* What went wrong: >Could not determine the dependencies of task ':iceberg-flink:iceberg-flink-runtime-1.16:shadowJar'. >See https:/

[GitHub] [iceberg] zhongyujiang commented on pull request #6118: Parquet, Core: Fix collection of Parquet metrics when column names co…

2022-11-06 Thread GitBox
zhongyujiang commented on PR #6118: URL: https://github.com/apache/iceberg/pull/6118#issuecomment-1305081026 @rdblue sure. When collecting metrics from Parquet footer, Iceberg [converts](https://github.com/apache/iceberg/blob/167a8ccd7c578296c40f8fc61c90135e71cf1183/parquet/src/main/java/

[GitHub] [iceberg] jzhuge commented on pull request #4925: API: Add view interfaces

2022-11-06 Thread GitBox
jzhuge commented on PR #4925: URL: https://github.com/apache/iceberg/pull/4925#issuecomment-1305068781 Created #6134 to add the missing field `query-column-names` to SQL view representation in the view spec.

[GitHub] [iceberg] jzhuge opened a new pull request, #6134: Spec: Add query-column-names to SQL view representation in view spec

2022-11-06 Thread GitBox
jzhuge opened a new pull request, #6134: URL: https://github.com/apache/iceberg/pull/6134 The current view spec is missing the field `query-column-names` in the SQL view representation. For SELECT star view queries, the schema for the underlying table or view may change after the view has been c

[GitHub] [iceberg] luoyuxia commented on issue #3124: When writing data to S3 using Glue Catalog, current snapshot ID is -1 and not updated in the metadata file generated

2022-11-06 Thread GitBox
luoyuxia commented on issue #3124: URL: https://github.com/apache/iceberg/issues/3124#issuecomment-1305028751 Has a checkpoint ever completed successfully?

[GitHub] [iceberg] wang-x-xia opened a new pull request, #6133: [1.0] Dell: Fix client serialization bug.

2022-11-06 Thread GitBox
wang-x-xia opened a new pull request, #6133: URL: https://github.com/apache/iceberg/pull/6133 From https://github.com/apache/iceberg/pull/5059.

[GitHub] [iceberg] wang-x-xia opened a new pull request, #6132: [0.14] Dell: Fix client serialization bug.

2022-11-06 Thread GitBox
wang-x-xia opened a new pull request, #6132: URL: https://github.com/apache/iceberg/pull/6132 From https://github.com/apache/iceberg/pull/5059.

[GitHub] [iceberg] luoyuxia commented on issue #3156: Flink reads iceberg in real time and reports errors

2022-11-06 Thread GitBox
luoyuxia commented on issue #3156: URL: https://github.com/apache/iceberg/issues/3156#issuecomment-1305008165 It fails when trying to serialize `BaseCombinedScanTask`; I guess it may be fixed by #1285.

[GitHub] [iceberg] luoyuxia commented on issue #3009: How do I realize the upsert of flink sql through setting, in iceberg 0.12.0

2022-11-06 Thread GitBox
luoyuxia commented on issue #3009: URL: https://github.com/apache/iceberg/issues/3009#issuecomment-1305001100 Hi, you can refer to the docs on [upsert](https://iceberg.apache.org/docs/latest/flink/#upsert).

[GitHub] [iceberg] lvyanquan commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

2022-11-06 Thread GitBox
lvyanquan commented on code in PR #6111: URL: https://github.com/apache/iceberg/pull/6111#discussion_r1014938540 ## flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java: ## @@ -145,8 +145,27 @@ protected Catalog createCatalog( baseNamespace =

[GitHub] [iceberg] ajantha-bhat commented on pull request #6094: Spark-3.0: Remove spark/v3.0 folder

2022-11-06 Thread GitBox
ajantha-bhat commented on PR #6094: URL: https://github.com/apache/iceberg/pull/6094#issuecomment-1304974804 @rdblue: I have rebased this PR now. Thanks.

[GitHub] [iceberg] lvyanquan commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

2022-11-06 Thread GitBox
lvyanquan commented on code in PR #6111: URL: https://github.com/apache/iceberg/pull/6111#discussion_r1014937530 ## flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java: ## @@ -145,8 +145,27 @@ protected Catalog createCatalog( baseNamespace =

[GitHub] [iceberg] rdblue commented on a diff in pull request #5984: Core, API: Support incremental scanning with branch

2022-11-06 Thread GitBox
rdblue commented on code in PR #5984: URL: https://github.com/apache/iceberg/pull/5984#discussion_r1014926208 ## api/src/main/java/org/apache/iceberg/IncrementalScan.java: ## @@ -21,6 +21,23 @@ /** API for configuring an incremental scan. */ public interface IncrementalScan>

[GitHub] [iceberg] rdblue closed pull request #6017: Core, API: Field metadata support

2022-11-06 Thread GitBox
rdblue closed pull request #6017: Core, API: Field metadata support URL: https://github.com/apache/iceberg/pull/6017

[GitHub] [iceberg] rdblue commented on pull request #6017: Core, API: Field metadata support

2022-11-06 Thread GitBox
rdblue commented on PR #6017: URL: https://github.com/apache/iceberg/pull/6017#issuecomment-1304952593 Closing this since there is discussion on the issue about whether it should be done.

[GitHub] [iceberg] rdblue commented on pull request #5150: Spark Integration to read from Snapshot ref

2022-11-06 Thread GitBox
rdblue commented on PR #5150: URL: https://github.com/apache/iceberg/pull/5150#issuecomment-1304951951 This looks good to me. I'm rerunning CI since the failures don't look related to this.

[GitHub] [iceberg] rdblue commented on pull request #5150: Spark Integration to read from Snapshot ref

2022-11-06 Thread GitBox
rdblue commented on PR #5150: URL: https://github.com/apache/iceberg/pull/5150#issuecomment-1304951287 > I am unsure of how to proceed for branches and tags usecase. Just making changes to read from branch/tag in SparkScanBuilder worked before for previous versions of spark - 3.1, 3.2. But

[GitHub] [iceberg] rdblue commented on pull request #6117: Fix typo in `_ManifestEvalVisitor.visit_equal`

2022-11-06 Thread GitBox
rdblue commented on PR #6117: URL: https://github.com/apache/iceberg/pull/6117#issuecomment-1304947082 I reopened this because I think it's a good idea to get it in independently. Thanks for finding this, @ddrinka!

[GitHub] [iceberg] rdblue commented on pull request #6123: Python: Support creating a DateLiteral from a date (#6120)

2022-11-06 Thread GitBox
rdblue commented on PR #6123: URL: https://github.com/apache/iceberg/pull/6123#issuecomment-1304946218 Thanks, @ddrinka! I merged this.

[GitHub] [iceberg] rdblue closed issue #6120: [Python] The structure of a partition definition and partition instance should be consistent

2022-11-06 Thread GitBox
rdblue closed issue #6120: [Python] The structure of a partition definition and partition instance should be consistent URL: https://github.com/apache/iceberg/issues/6120

[GitHub] [iceberg] rdblue merged pull request #6123: Python: Support creating a DateLiteral from a date (#6120)

2022-11-06 Thread GitBox
rdblue merged PR #6123: URL: https://github.com/apache/iceberg/pull/6123

[GitHub] [iceberg] rdblue commented on a diff in pull request #6123: Python: Support creating a DateLiteral from a date (#6120)

2022-11-06 Thread GitBox
rdblue commented on code in PR #6123: URL: https://github.com/apache/iceberg/pull/6123#discussion_r1014922674 ## python/pyiceberg/utils/datetime.py: ## @@ -47,11 +47,16 @@ def micros_to_time(micros: int) -> time: return time(hour=hours, minute=minutes, second=seconds, micr

[GitHub] [iceberg] github-actions[bot] commented on issue #3825: Need an exmple of java code that uses hive as meta store and s3.

2022-11-06 Thread GitBox
github-actions[bot] commented on issue #3825: URL: https://github.com/apache/iceberg/issues/3825#issuecomment-1304938592 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] github-actions[bot] closed issue #3788: flink iceberg catalog support hadoop-conf-dir option to read hadoop conf?

2022-11-06 Thread GitBox
github-actions[bot] closed issue #3788: flink iceberg catalog support hadoop-conf-dir option to read hadoop conf? URL: https://github.com/apache/iceberg/issues/3788

[GitHub] [iceberg] github-actions[bot] closed issue #3825: Need an exmple of java code that uses hive as meta store and s3.

2022-11-06 Thread GitBox
github-actions[bot] closed issue #3825: Need an exmple of java code that uses hive as meta store and s3. URL: https://github.com/apache/iceberg/issues/3825

[GitHub] [iceberg] github-actions[bot] commented on issue #3788: flink iceberg catalog support hadoop-conf-dir option to read hadoop conf?

2022-11-06 Thread GitBox
github-actions[bot] commented on issue #3788: URL: https://github.com/apache/iceberg/issues/3788#issuecomment-1304938610 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

[GitHub] [iceberg] rdblue commented on pull request #6108: SparkBatchQueryScan logs too much - #6106

2022-11-06 Thread GitBox
rdblue commented on PR #6108: URL: https://github.com/apache/iceberg/pull/6108#issuecomment-1304936852 Thanks, @Omega359!

[GitHub] [iceberg] rdblue closed issue #6106: SparkBatchQueryScan logs too much

2022-11-06 Thread GitBox
rdblue closed issue #6106: SparkBatchQueryScan logs too much URL: https://github.com/apache/iceberg/issues/6106

[GitHub] [iceberg] rdblue merged pull request #6108: SparkBatchQueryScan logs too much - #6106

2022-11-06 Thread GitBox
rdblue merged PR #6108: URL: https://github.com/apache/iceberg/pull/6108

[GitHub] [iceberg] rdblue commented on pull request #6128: Python: Projection

2022-11-06 Thread GitBox
rdblue commented on PR #6128: URL: https://github.com/apache/iceberg/pull/6128#issuecomment-1304935954 > Removes the dataclasses from the expressions I hit the same issue in #6127 and fixed it a different way. We should talk about how to do this separately, but I think my update might

[GitHub] [iceberg] rdblue commented on pull request #6128: Python: Projection

2022-11-06 Thread GitBox
rdblue commented on PR #6128: URL: https://github.com/apache/iceberg/pull/6128#issuecomment-1304935137 This is looking great. There are just two issues: 1. The date/time transforms should use the same logic as truncate for numbers since they're basically truncating 2. I don't think tha

[GitHub] [iceberg] rdblue commented on a diff in pull request #6128: Python: Projection

2022-11-06 Thread GitBox
rdblue commented on code in PR #6128: URL: https://github.com/apache/iceberg/pull/6128#discussion_r1014917239 ## python/tests/expressions/test_expressions.py: ## @@ -269,23 +283,23 @@ def test_bind_not_in_equal_term(table_schema_simple: Schema): def test_in_empty(): -a

[GitHub] [iceberg] rdblue commented on a diff in pull request #6128: Python: Projection

2022-11-06 Thread GitBox
rdblue commented on code in PR #6128: URL: https://github.com/apache/iceberg/pull/6128#discussion_r1014917102 ## python/pyiceberg/transforms.py: ## @@ -249,6 +294,20 @@ def satisfies_order_of(self, other: Transform) -> bool: def result_type(self, source: IcebergType) -> Ice

[GitHub] [iceberg] rdblue commented on a diff in pull request #6128: Python: Projection

2022-11-06 Thread GitBox
rdblue commented on code in PR #6128: URL: https://github.com/apache/iceberg/pull/6128#discussion_r1014916582 ## python/pyiceberg/transforms.py: ## @@ -427,6 +486,20 @@ def can_transform(self, source: IcebergType) -> bool: TimestamptzType, } +def proj

[GitHub] [iceberg] rdblue commented on a diff in pull request #6128: Python: Projection

2022-11-06 Thread GitBox
rdblue commented on code in PR #6128: URL: https://github.com/apache/iceberg/pull/6128#discussion_r1014916040 ## python/pyiceberg/transforms.py: ## @@ -173,6 +201,23 @@ def apply(self, value: Optional[S]) -> Optional[int]: def result_type(self, source: IcebergType) -> Icebe

[GitHub] [iceberg] rdblue commented on a diff in pull request #6128: Python: Projection

2022-11-06 Thread GitBox
rdblue commented on code in PR #6128: URL: https://github.com/apache/iceberg/pull/6128#discussion_r1014915923 ## python/pyiceberg/expressions/literals.py: ## @@ -213,6 +217,26 @@ class LongLiteral(Literal[int]): def __init__(self, value: int): super().__init__(valu

[GitHub] [iceberg] rdblue commented on a diff in pull request #6128: Python: Projection

2022-11-06 Thread GitBox
rdblue commented on code in PR #6128: URL: https://github.com/apache/iceberg/pull/6128#discussion_r1014915514 ## python/pyiceberg/expressions/__init__.py: ## @@ -68,7 +72,6 @@ def eval(self, struct: StructProtocol) -> T: # pylint: disable=W0613 """Returns the value at

[GitHub] [iceberg] rdblue commented on a diff in pull request #6131: Python: Add initial TableScan implementation

2022-11-06 Thread GitBox
rdblue commented on code in PR #6131: URL: https://github.com/apache/iceberg/pull/6131#discussion_r1014914272 ## python/pyiceberg/table/__init__.py: ## @@ -90,3 +103,90 @@ def snapshot_by_name(self, name: str) -> Optional[Snapshot]: def history(self) -> List[SnapshotLogEntr

[GitHub] [iceberg] rdblue commented on a diff in pull request #6131: Python: Add initial TableScan implementation

2022-11-06 Thread GitBox
rdblue commented on code in PR #6131: URL: https://github.com/apache/iceberg/pull/6131#discussion_r1014914176 ## python/pyiceberg/table/__init__.py: ## @@ -14,30 +14,43 @@ # KIND, either express or implied. See the License for the # specific language governing permissions and

[GitHub] [iceberg] rdblue commented on pull request #6069: Python: TableScan Plan files API implementation without residual evaluation

2022-11-06 Thread GitBox
rdblue commented on PR #6069: URL: https://github.com/apache/iceberg/pull/6069#issuecomment-1304927324 @Fokko, @dhruv-pratap, I posted an alternative scan API in a draft as #6131. Please take a look. That behaves like this one and allows you to specify optional arguments when creating a sca

[GitHub] [iceberg] rdblue opened a new pull request, #6131: Python: Add initial TableScan implementation

2022-11-06 Thread GitBox
rdblue opened a new pull request, #6131: URL: https://github.com/apache/iceberg/pull/6131 This adds an implementation of `TableScan` that is an alternative to the one in #6069. This doesn't implement `plan_files`, it is just to demonstrate a possible scan API. This scan API works lik

[GitHub] [iceberg] huaxingao commented on pull request #6065: Fix TestAggregateBinding

2022-11-06 Thread GitBox
huaxingao commented on PR #6065: URL: https://github.com/apache/iceberg/pull/6065#issuecomment-1304926068 Thanks! @rdblue @nastra @Fokko

[GitHub] [iceberg] rdblue commented on a diff in pull request #6069: Python: TableScan Plan files API implementation without residual evaluation

2022-11-06 Thread GitBox
rdblue commented on code in PR #6069: URL: https://github.com/apache/iceberg/pull/6069#discussion_r1014904027 ## python/pyiceberg/table/scan.py: ## @@ -0,0 +1,103 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the

[GitHub] [iceberg] rdblue commented on a diff in pull request #6069: Python: TableScan Plan files API implementation without residual evaluation

2022-11-06 Thread GitBox
rdblue commented on code in PR #6069: URL: https://github.com/apache/iceberg/pull/6069#discussion_r1014903928 ## python/pyiceberg/table/scan.py: ## @@ -0,0 +1,103 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the

[GitHub] [iceberg] rdblue commented on a diff in pull request #6069: Python: TableScan Plan files API implementation without residual evaluation

2022-11-06 Thread GitBox
rdblue commented on code in PR #6069: URL: https://github.com/apache/iceberg/pull/6069#discussion_r1014903835 ## python/pyiceberg/table/scan.py: ## @@ -0,0 +1,103 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the

[GitHub] [iceberg] rdblue commented on a diff in pull request #6069: Python: TableScan Plan files API implementation without residual evaluation

2022-11-06 Thread GitBox
rdblue commented on code in PR #6069: URL: https://github.com/apache/iceberg/pull/6069#discussion_r1014903669 ## python/tests/table/test_scan.py: ## @@ -0,0 +1,177 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See th

[GitHub] [iceberg] jzhuge commented on a diff in pull request #4925: API: Add view interfaces

2022-11-06 Thread GitBox
jzhuge commented on code in PR #4925: URL: https://github.com/apache/iceberg/pull/4925#discussion_r1014903529 ## api/src/main/java/org/apache/iceberg/view/SQLViewRepresentation.java: ## @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #4925: API: Add view interfaces

2022-11-06 Thread GitBox
stevenzwu commented on code in PR #4925: URL: https://github.com/apache/iceberg/pull/4925#discussion_r1014899413 ## api/src/main/java/org/apache/iceberg/view/SQLViewRepresentation.java: ## @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #4925: API: Add view interfaces

2022-11-06 Thread GitBox
stevenzwu commented on code in PR #4925: URL: https://github.com/apache/iceberg/pull/4925#discussion_r1014899332 ## api/src/main/java/org/apache/iceberg/view/SQLViewRepresentation.java: ## @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

[GitHub] [iceberg] jzhuge commented on a diff in pull request #4925: API: Add view interfaces

2022-11-06 Thread GitBox
jzhuge commented on code in PR #4925: URL: https://github.com/apache/iceberg/pull/4925#discussion_r1014898267 ## api/src/main/java/org/apache/iceberg/view/ViewBuilder.java: ## @@ -0,0 +1,144 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contri

[GitHub] [iceberg] jzhuge commented on a diff in pull request #4925: API: Add view interfaces

2022-11-06 Thread GitBox
jzhuge commented on code in PR #4925: URL: https://github.com/apache/iceberg/pull/4925#discussion_r1014897964 ## api/src/main/java/org/apache/iceberg/view/ViewBuilder.java: ## @@ -0,0 +1,144 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contri

[GitHub] [iceberg] ddrinka commented on a diff in pull request #6123: Python: Support creating a DateLiteral from a date (#6120)

2022-11-06 Thread GitBox
ddrinka commented on code in PR #6123: URL: https://github.com/apache/iceberg/pull/6123#discussion_r1014897183 ## python/pyiceberg/utils/datetime.py: ## @@ -47,11 +47,16 @@ def micros_to_time(micros: int) -> time: return time(hour=hours, minute=minutes, second=seconds, mic

[GitHub] [iceberg] rdblue commented on a diff in pull request #6069: Python: TableScan Plan files API implementation without residual evaluation

2022-11-06 Thread GitBox
rdblue commented on code in PR #6069: URL: https://github.com/apache/iceberg/pull/6069#discussion_r1014895867 ## python/pyiceberg/expressions/visitors.py: ## @@ -526,7 +528,7 @@ def visit_equal(self, term: BoundTerm, literal: Literal[Any]) -> bool: if lower > literal.v

[GitHub] [iceberg] rdblue commented on a diff in pull request #6069: Python: TableScan Plan files API implementation without residual evaluation

2022-11-06 Thread GitBox
rdblue commented on code in PR #6069: URL: https://github.com/apache/iceberg/pull/6069#discussion_r1014895768 ## python/pyiceberg/expressions/visitors.py: ## @@ -517,7 +519,7 @@ def visit_equal(self, term: BoundTerm, literal: Literal[Any]) -> bool: pos = term.ref().acc

[GitHub] [iceberg] rdblue commented on a diff in pull request #6073: Core: Pass purgeRequested flag to REST server

2022-11-06 Thread GitBox
rdblue commented on code in PR #6073: URL: https://github.com/apache/iceberg/pull/6073#discussion_r1014895355 ## core/src/test/java/org/apache/iceberg/rest/RESTCatalogAdapter.java: ## @@ -320,7 +321,10 @@ public T handleRequest( case DROP_TABLE: { - C

[GitHub] [iceberg] rdblue commented on a diff in pull request #6073: Core: Pass purgeRequested flag to REST server

2022-11-06 Thread GitBox
rdblue commented on code in PR #6073: URL: https://github.com/apache/iceberg/pull/6073#discussion_r1014895276 ## core/src/main/java/org/apache/iceberg/rest/CatalogHandlers.java: ## @@ -222,8 +222,8 @@ public static LoadTableResponse createTable( throw new IllegalStateExcept

[GitHub] [iceberg] rdblue commented on a diff in pull request #6073: Core: Pass purgeRequested flag to REST server

2022-11-06 Thread GitBox
rdblue commented on code in PR #6073: URL: https://github.com/apache/iceberg/pull/6073#discussion_r1014894952 ## core/src/test/java/org/apache/iceberg/rest/TestHTTPClient.java: ## @@ -219,7 +219,7 @@ private static Item doExecuteRequest( restClient.head(path, headers, o

[GitHub] [iceberg] rdblue commented on a diff in pull request #6073: Core: Pass purgeRequested flag to REST server

2022-11-06 Thread GitBox
rdblue commented on code in PR #6073: URL: https://github.com/apache/iceberg/pull/6073#discussion_r1014894879 ## core/src/test/java/org/apache/iceberg/rest/RESTCatalogAdapter.java: ## @@ -320,7 +321,10 @@ public T handleRequest( case DROP_TABLE: { - C

[GitHub] [iceberg] rdblue commented on a diff in pull request #6073: Core: Pass purgeRequested flag to REST server

2022-11-06 Thread GitBox
rdblue commented on code in PR #6073: URL: https://github.com/apache/iceberg/pull/6073#discussion_r1014894712 ## core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java: ## @@ -799,6 +799,23 @@ public void testDropTable() { Assert.assertFalse("Table should not exist

[GitHub] [iceberg] rdblue commented on a diff in pull request #6073: Core: Pass purgeRequested flag to REST server

2022-11-06 Thread GitBox
rdblue commented on code in PR #6073: URL: https://github.com/apache/iceberg/pull/6073#discussion_r1014894248 ## core/src/main/java/org/apache/iceberg/rest/CatalogHandlers.java: ## @@ -222,13 +222,27 @@ public static LoadTableResponse createTable( throw new IllegalStateExce

[GitHub] [iceberg] rdblue commented on pull request #6056: Parquet: Remove the row position since parquet row group has it natively

2022-11-06 Thread GitBox
rdblue commented on PR #6056: URL: https://github.com/apache/iceberg/pull/6056#issuecomment-1304896473 Do we trust this value from Parquet?

[GitHub] [iceberg] rdblue commented on a diff in pull request #6058: Core,Spark: Add metadata to Scan Report

2022-11-06 Thread GitBox
rdblue commented on code in PR #6058: URL: https://github.com/apache/iceberg/pull/6058#discussion_r1014893913 ## core/src/main/java/org/apache/iceberg/EnvironmentContext.java: ## @@ -0,0 +1,75 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more cont

[GitHub] [iceberg] rdblue commented on a diff in pull request #6058: Core,Spark: Add metadata to Scan Report

2022-11-06 Thread GitBox
rdblue commented on code in PR #6058: URL: https://github.com/apache/iceberg/pull/6058#discussion_r1014893653 ## core/src/main/java/org/apache/iceberg/BaseTableScan.java: ## @@ -141,6 +142,8 @@ public CloseableIterable planFiles() { doPlanFiles(), () -> {

[GitHub] [iceberg] rdblue commented on pull request #6065: Fix TestAggregateBinding

2022-11-06 Thread GitBox
rdblue commented on PR #6065: URL: https://github.com/apache/iceberg/pull/6065#issuecomment-1304895651 Thanks, @huaxingao!

[GitHub] [iceberg] rdblue merged pull request #6065: Fix TestAggregateBinding

2022-11-06 Thread GitBox
rdblue merged PR #6065: URL: https://github.com/apache/iceberg/pull/6065

[GitHub] [iceberg] rdblue merged pull request #6064: Support 2-level list and maps type in RemoveIds.

2022-11-06 Thread GitBox
rdblue merged PR #6064: URL: https://github.com/apache/iceberg/pull/6064

[GitHub] [iceberg] rdblue commented on pull request #6064: Support 2-level list and maps type in RemoveIds.

2022-11-06 Thread GitBox
rdblue commented on PR #6064: URL: https://github.com/apache/iceberg/pull/6064#issuecomment-1304895348 Looks good to me. Merged.

[GitHub] [iceberg] rdblue commented on pull request #6076: Python: Replace mmh3 with mmhash3

2022-11-06 Thread GitBox
rdblue commented on PR #6076: URL: https://github.com/apache/iceberg/pull/6076#issuecomment-1304894694 Thanks, @Fokko!

[GitHub] [iceberg] rdblue merged pull request #6076: Python: Replace mmh3 with mmhash3

2022-11-06 Thread GitBox
rdblue merged PR #6076: URL: https://github.com/apache/iceberg/pull/6076

[GitHub] [iceberg] rdblue commented on pull request #6108: SparkBatchQueryScan logs too much - #6106

2022-11-06 Thread GitBox
rdblue commented on PR #6108: URL: https://github.com/apache/iceberg/pull/6108#issuecomment-1304894200 Running CI again.

[GitHub] [iceberg] rdblue commented on pull request #6094: Spark-3.0: Remove spark/v3.0 folder

2022-11-06 Thread GitBox
rdblue commented on PR #6094: URL: https://github.com/apache/iceberg/pull/6094#issuecomment-1304894017 @ajantha-bhat, I merged #6093. Can you rebase this one?

[GitHub] [iceberg] rdblue merged pull request #6093: Spark-3.0: Remove/update spark-3.0 mention from Docs and Builds

2022-11-06 Thread GitBox
rdblue merged PR #6093: URL: https://github.com/apache/iceberg/pull/6093

[GitHub] [iceberg] rdblue commented on a diff in pull request #6093: Spark-3.0: Remove/update spark-3.0 mention from Docs and Builds

2022-11-06 Thread GitBox
rdblue commented on code in PR #6093: URL: https://github.com/apache/iceberg/pull/6093#discussion_r1014892191 ## docs/aws.md: ## @@ -488,7 +488,7 @@ disaster recovery, etc. For using cross-region access points, we need to additionally set `use-arn-region-enabled` catalog prope

[GitHub] [iceberg] rdblue commented on pull request #6110: API: Hash floats -0.0 and 0.0 to the same bucket

2022-11-06 Thread GitBox
rdblue commented on PR #6110: URL: https://github.com/apache/iceberg/pull/6110#issuecomment-1304892916 Thanks, @fb913bf0de288ba84fe98f7a23d35edfdb22381! Looks great.

[GitHub] [iceberg] rdblue merged pull request #6110: API: Hash floats -0.0 and 0.0 to the same bucket

2022-11-06 Thread GitBox
rdblue merged PR #6110: URL: https://github.com/apache/iceberg/pull/6110

[GitHub] [iceberg] rdblue commented on pull request #6118: Parquet, Core: Fix collection of Parquet metrics when column names co…

2022-11-06 Thread GitBox
rdblue commented on PR #6118: URL: https://github.com/apache/iceberg/pull/6118#issuecomment-1304892034 @zhongyujiang can you explain the fix a bit more clearly?

[GitHub] [iceberg] rdblue commented on a diff in pull request #6123: Python: Support creating a DateLiteral from a date (#6120)

2022-11-06 Thread GitBox
rdblue commented on code in PR #6123: URL: https://github.com/apache/iceberg/pull/6123#discussion_r1014891415 ## python/pyiceberg/utils/datetime.py: ## @@ -47,11 +47,16 @@ def micros_to_time(micros: int) -> time: return time(hour=hours, minute=minutes, second=seconds, micr

[GitHub] [iceberg] rdblue commented on a diff in pull request #6127: Python: Add expression evaluator

2022-11-06 Thread GitBox
rdblue commented on code in PR #6127: URL: https://github.com/apache/iceberg/pull/6127#discussion_r1014890146 ## python/pyiceberg/expressions/__init__.py: ## @@ -281,6 +289,30 @@ def __invert__(self) -> BoundIsNull: return BoundIsNull(self.term) +def coerce_unary_ar

[GitHub] [iceberg] rdblue commented on a diff in pull request #6127: Python: Add expression evaluator

2022-11-06 Thread GitBox
rdblue commented on code in PR #6127: URL: https://github.com/apache/iceberg/pull/6127#discussion_r1014889665 ## python/tests/expressions/test_evaluator.py: ## @@ -0,0 +1,203 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreemen

[GitHub] [iceberg] Fokko merged pull request #6130: Build: Bump mkdocs from 1.4.1 to 1.4.2 in /python

2022-11-06 Thread GitBox
Fokko merged PR #6130: URL: https://github.com/apache/iceberg/pull/6130