[GitHub] [iceberg] rubenvdg commented on issue #6397: Python Instructions currently do not work for testing

2022-12-11 Thread GitBox
rubenvdg commented on issue #6397: URL: https://github.com/apache/iceberg/issues/6397#issuecomment-1345518706 Maybe it's just my lack of Poetry experience, but this is what I did: ``` curl -sSL https://install.python-poetry.org | python3 - # install poetry poetry shell # shell

[GitHub] [iceberg] rubenvdg commented on issue #6397: Python Instructions currently do not work for testing

2022-12-11 Thread GitBox
rubenvdg commented on issue #6397: URL: https://github.com/apache/iceberg/issues/6397#issuecomment-1345519019 And so what happens is this: ``` (pyiceberg-py3.9) ➜ python git:(master) ✗ python Python 3.9.7 (default, Nov 16 2021, 15:21:45) [Clang 13.0.0 (clang-1300.0.29.3)] on

[GitHub] [iceberg] Fokko merged pull request #6254: Python: implement `to_pandas`

2022-12-11 Thread GitBox
Fokko merged PR #6254: URL: https://github.com/apache/iceberg/pull/6254 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

[GitHub] [iceberg] Fokko commented on pull request #6254: Python: implement `to_pandas`

2022-12-11 Thread GitBox
Fokko commented on PR #6254: URL: https://github.com/apache/iceberg/pull/6254#issuecomment-1345636980 Thanks @dungdm93 for working on this 🙌🏻

[GitHub] [iceberg] Fokko merged pull request #6403: Build: Bump duckdb from 0.6.0 to 0.6.1 in /python

2022-12-11 Thread GitBox
Fokko merged PR #6403: URL: https://github.com/apache/iceberg/pull/6403

[GitHub] [iceberg] rdblue opened a new pull request, #6405: API: Add Aggregate expression evaluation

2022-12-11 Thread GitBox
rdblue opened a new pull request, #6405: URL: https://github.com/apache/iceberg/pull/6405 This PR has classes for implementing aggregation expressions in the API module.

[GitHub] [iceberg] github-actions[bot] commented on issue #5040: Difficulty to investigate hive-metastore locking issue

2022-12-11 Thread GitBox
github-actions[bot] commented on issue #5040: URL: https://github.com/apache/iceberg/issues/5040#issuecomment-1345701670 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] github-actions[bot] commented on issue #5025: Reduce the number of equity-deletes using bloom filter

2022-12-11 Thread GitBox
github-actions[bot] commented on issue #5025: URL: https://github.com/apache/iceberg/issues/5025#issuecomment-1345701685 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] github-actions[bot] commented on issue #5000: Proposal: FlinkSQL supports partition transform by computed columns

2022-12-11 Thread GitBox
github-actions[bot] commented on issue #5000: URL: https://github.com/apache/iceberg/issues/5000#issuecomment-1345701702 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] github-actions[bot] commented on issue #4959: revapi plugin will not work in an non-english locale computer

2022-12-11 Thread GitBox
github-actions[bot] commented on issue #4959: URL: https://github.com/apache/iceberg/issues/4959#issuecomment-1345701729 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

[GitHub] [iceberg] rdblue commented on pull request #6405: API: Add Aggregate expression evaluation

2022-12-11 Thread GitBox
rdblue commented on PR #6405: URL: https://github.com/apache/iceberg/pull/6405#issuecomment-1345703757 @huaxingao, I was looking at #6252 and I wanted to try out implementing aggregation in either the core or API modules so that the majority of the logic could be shared rather than needing

[GitHub] [iceberg] rdblue commented on pull request #6252: push down min/max/count to iceberg

2022-12-11 Thread GitBox
rdblue commented on PR #6252: URL: https://github.com/apache/iceberg/pull/6252#issuecomment-1345704862 @huaxingao, as I looked at this, the main thing that I think we should change is moving most of the logic into core or API so that it can be reused across query engines. I tried to do that

[GitHub] [iceberg] rdblue commented on a diff in pull request #6252: push down min/max/count to iceberg

2022-12-11 Thread GitBox
rdblue commented on code in PR #6252: URL: https://github.com/apache/iceberg/pull/6252#discussion_r1045324550 ## api/src/main/java/org/apache/iceberg/expressions/AggregateUtil.java: ## @@ -0,0 +1,142 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

[GitHub] [iceberg] rdblue commented on a diff in pull request #6252: push down min/max/count to iceberg

2022-12-11 Thread GitBox
rdblue commented on code in PR #6252: URL: https://github.com/apache/iceberg/pull/6252#discussion_r1045324763 ## api/src/main/java/org/apache/iceberg/expressions/BoundAggregate.java: ## @@ -37,11 +37,34 @@ public BoundReference ref() { return term().ref(); } + public

[GitHub] [iceberg] dchristle commented on issue #3703: DeleteOrphanFiles or ExpireSnapshots outofmemory

2022-12-11 Thread GitBox
dchristle commented on issue #3703: URL: https://github.com/apache/iceberg/issues/3703#issuecomment-1345717163 @RussellSpitzer We have also hit this issue after doing a large copy of rows into a single Iceberg table. We could have avoided it by more carefully partitioning before the insert,

[GitHub] [iceberg] zinking commented on a diff in pull request #6405: API: Add Aggregate expression evaluation

2022-12-11 Thread GitBox
zinking commented on code in PR #6405: URL: https://github.com/apache/iceberg/pull/6405#discussion_r1045358208 ## api/src/main/java/org/apache/iceberg/expressions/BoundAggregate.java: ## @@ -44,4 +57,85 @@ public Type type() { return term().type(); } } + + public

[GitHub] [iceberg] hililiwei commented on pull request #6394: Flink: Port Support read options in flink source to 1.14 & 1.16

2022-12-11 Thread GitBox
hililiwei commented on PR #6394: URL: https://github.com/apache/iceberg/pull/6394#issuecomment-1345972897 ``` diff --git a/flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/source/IcebergTableSource.java b/flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/IcebergTab

[GitHub] [iceberg] arunb2w opened a new issue, #6406: Overlapping data in data files even after sorting

2022-12-11 Thread GitBox
arunb2w opened a new issue, #6406: URL: https://github.com/apache/iceberg/issues/6406 ### Apache Iceberg version 0.14.0 ### Query engine Spark ### Please describe the bug 🐞 I have performed below steps to analyze table metadata after rewrite based on sort s

[GitHub] [iceberg] zinking commented on a diff in pull request #6382: Implement ShuffleOperator to collect data statistics

2022-12-11 Thread GitBox
zinking commented on code in PR #6382: URL: https://github.com/apache/iceberg/pull/6382#discussion_r1045453077 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/ShuffleRecordWrapper.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundati

[GitHub] [iceberg] chenjunjiedada opened a new pull request, #6407: Flink: use SerializableTable for source

2022-12-11 Thread GitBox
chenjunjiedada opened a new pull request, #6407: URL: https://github.com/apache/iceberg/pull/6407 This revives effort from https://github.com/apache/iceberg/pull/2987.

[GitHub] [iceberg] Fokko commented on a diff in pull request #6342: Python: Introduce SchemaVisitorPerPrimitiveType

2022-12-11 Thread GitBox
Fokko commented on code in PR #6342: URL: https://github.com/apache/iceberg/pull/6342#discussion_r1045479465 ## python/pyiceberg/schema.py: ## @@ -317,6 +331,97 @@ def primitive(self, primitive: PrimitiveType) -> T: """Visit a PrimitiveType""" +class SchemaVisitorPe