Re: [I] Spark SQL Extensions: Rewrite manifests [iceberg]

via GitHub Fri, 19 Jan 2024 23:59:13 -0800


zeddit commented on issue #1598:
URL: https://github.com/apache/iceberg/issues/1598#issuecomment-1901892857


   @rdblue @aokolnychyi Does Spark's `rewrite_manifests()` respect the order of 
data files, according to column statistics, for a sorted Iceberg table?
   I found that Spark always puts newly added data files at the top of 
the manifest, so pyiceberg reads them out as the first few rows of the 
dataframe.
   
   This behavior is good for a table sorted in descending order but not for one 
sorted in ascending order. 
   
   Is it possible to respect the sort order of the Iceberg table's fields when 
arranging the data files in rewritten manifests?
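
   To make the request concrete, here is a minimal sketch of the ordering being asked for. It is not Iceberg or pyiceberg code: `DataFileEntry` and `order_entries` are hypothetical names, and the integer bounds stand in for the per-file lower/upper column statistics of the sort column.

```python
from dataclasses import dataclass


@dataclass
class DataFileEntry:
    # Hypothetical stand-in for a manifest entry; not an Iceberg or pyiceberg class.
    path: str
    lower_bound: int  # assumed lower bound of the sort column for this file
    upper_bound: int  # assumed upper bound of the sort column for this file


def order_entries(entries, ascending=True):
    """Order entries by the sort column's bounds, following the table's sort direction."""
    if ascending:
        # Files holding the smallest values come first.
        return sorted(entries, key=lambda e: (e.lower_bound, e.upper_bound))
    # Files holding the largest values come first.
    return sorted(entries, key=lambda e: (e.upper_bound, e.lower_bound), reverse=True)


# The newest file (c.parquet) sits at the top, as described for Spark's current behavior.
entries = [
    DataFileEntry("c.parquet", 200, 299),
    DataFileEntry("a.parquet", 0, 99),
    DataFileEntry("b.parquet", 100, 199),
]

print([e.path for e in order_entries(entries)])
print([e.path for e in order_entries(entries, ascending=False)])
```

   With an ascending sort order the entries would be listed a, b, c; with a descending sort order they would be listed c, b, a, which matches what happens today when the newest files hold the largest values.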


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

