Re: [I] select distinct on table scan [iceberg-python]

2024-03-21 Thread via GitHub
Fokko closed issue #403: select distinct on table scan URL: https://github.com/apache/iceberg-python/issues/403

Re: [I] select distinct on table scan [iceberg-python]

2024-02-13 Thread via GitHub
Fokko commented on issue #403: URL: https://github.com/apache/iceberg-python/issues/403#issuecomment-1941430839 The Iceberg metadata does not contain the information needed to optimize a distinct query :( What would help is a lazy implementation of an Arrow dataset, which should keep the memory footprin
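
As a rough illustration of the lazy, batch-at-a-time idea mentioned in the comment above, here is a minimal sketch in plain Python. It is not PyIceberg or PyArrow API: the function name `distinct_values` and the batch shape (a dict mapping column names to lists of values) are assumptions made for this example. The point is only that distinct values can be accumulated one batch at a time, so peak memory scales with the number of distinct values and the size of a single batch rather than with the whole table.

```python
from typing import Any, Dict, Hashable, Iterable, Iterator, List, Set


def distinct_values(
    batches: Iterable[Dict[str, List[Any]]], column: str
) -> Set[Hashable]:
    """Collect the distinct values of `column` across a stream of batches.

    Hypothetical helper for illustration only. Each batch is assumed to be
    a dict of column name -> list of values. Only one batch is held in
    memory at a time, plus the running set of distinct values.
    """
    seen: Set[Hashable] = set()
    for batch in batches:
        # Merge this batch's values into the running set, then drop the batch.
        seen.update(batch[column])
    return seen


def example_batches() -> Iterator[Dict[str, List[Any]]]:
    """Made-up data standing in for a lazily read table scan."""
    yield {"city": ["Amsterdam", "Berlin", "Amsterdam"]}
    yield {"city": ["Berlin", "Paris"]}
    yield {"city": ["Paris", None]}


result = distinct_values(example_batches(), "city")
```

With real Arrow data, a record batch could be turned into this dict shape with `RecordBatch.to_pydict()`, although a column-wise Arrow computation would likely be more efficient than going through Python objects. Note that every data file still has to be read, since, as the comment says, the metadata alone cannot answer a distinct query.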

[I] select distinct on table scan [iceberg-python]

2024-02-09 Thread via GitHub
carcmarc opened a new issue, #403: URL: https://github.com/apache/iceberg-python/issues/403 ### Feature Request / Improvement Support a table scan that returns the distinct values of fields. Example: selected_fields=('distinct column_name',). Potentially added as a PyIceberg Expression or