zeddit commented on issue #1598: URL: https://github.com/apache/iceberg/issues/1598#issuecomment-1901892857
@rdblue @aokolnychyi does Spark's `rewrite_manifests()` respect the order of data files according to column statistics for a sorted Iceberg table? I found that Spark always puts newly added data files at the top of the manifest, so PyIceberg reads them out as the first few rows of the DataFrame. This behavior is good for tables sorted in descending order but not for tables sorted in ascending order. Is it possible to respect the sort order of the Iceberg table's fields when ordering the data files in rewritten manifests?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org