huaxingao commented on PR #6252:
URL: https://github.com/apache/iceberg/pull/6252#issuecomment-1761680652

   @atifiu File statistics describe the whole file, so they are no longer 
accurate and can't be used once the query has a filter.
   
   For example, say you have a table (col int) where the max of col is 100 and 
the min is 0, so the statistics file is
   ```
   max       min
   100        0
   ```
   If you have `SELECT MAX(col) FROM table`, we can check the statistics file 
and simply return 100. But if you have `SELECT MAX(col) FROM table WHERE col < 
70`, we can't use the statistics file any more. We only know that 
`MAX(col)` is smaller than 70, but we have no idea what the actual value is, so 
we have to compute it from the data.
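
To make this concrete, here is a rough Python sketch (not Iceberg code; `ColumnStats`, `max_from_stats`, and `max_with_filter` are hypothetical names made up for illustration). It shows two data sets with identical min/max statistics that give different answers for `MAX(col) WHERE col < 70`, which is why the statistics alone can't answer the filtered query.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ColumnStats:
    """Hypothetical per-file min/max statistics for a single int column."""
    min_value: int
    max_value: int


def max_from_stats(stats: ColumnStats) -> int:
    # No filter: the stored max is exactly MAX(col), no data scan needed.
    return stats.max_value


def max_with_filter(rows: List[int], predicate: Callable[[int], bool]) -> Optional[int]:
    # With a filter we have to look at the rows themselves.
    matching = [value for value in rows if predicate(value)]
    return max(matching) if matching else None


# Two different data sets that produce the same statistics (min 0, max 100).
rows_a = [0, 50, 100]
rows_b = [0, 69, 100]
stats_a = ColumnStats(min(rows_a), max(rows_a))
stats_b = ColumnStats(min(rows_b), max(rows_b))

print(stats_a == stats_b)                             # same statistics
print(max_from_stats(stats_a))                        # answered from stats alone
print(max_with_filter(rows_a, lambda v: v < 70))      # needs the data
print(max_with_filter(rows_b, lambda v: v < 70))      # different answer, same stats
```

Both data sets have the same statistics, yet the filtered max is 50 for one and 69 for the other, so the only thing the statistics tell us is that the result is below 70.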
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

