KartikKapur opened a new pull request, #11905:
URL: https://github.com/apache/iceberg/pull/11905

   **Background**
   At Pinterest, we have started relying heavily on Iceberg metrics for 
offline validation as well as query speedups. Counts are consistently useful 
for all columns, and upper/lower bounds are useful for numeric columns. However, 
for struct columns (typically objects with encoded strings), ranges are 
of little use and only add space overhead, with the potential for driver OOM. 
There isn't an easy way to specify metrics per data type, so we wanted to 
contribute a solution that allows specifying a truncate length of 0, which 
causes strings and binaries to be fully truncated.
   
   **Changes**
   Updated the checks in the Truncate MetricsMode to allow a non-negative length 
rather than a strictly positive one. Updated the corresponding checks so that a 
bound is only put into the bounds map if it is non-null.
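
A minimal sketch of the behavior described above, for illustration only. It is not the code in this PR: the class `TruncateBoundsSketch` and all of its methods are hypothetical, and the only thing borrowed from Iceberg is the `truncate(N)` mode-string syntax. It assumes that a length of 0 yields a null bound and that null bounds are skipped when filling the bounds map. Only lower bounds (plain prefixes) are shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the Iceberg codebase.
final class TruncateBoundsSketch {
  // Matches mode strings such as "truncate(16)" or "truncate(0)".
  private static final Pattern TRUNCATE =
      Pattern.compile("truncate\\((\\d+)\\)", Pattern.CASE_INSENSITIVE);

  // Zero is accepted; only negative lengths are rejected.
  static int checkLength(int length) {
    if (length < 0) {
      throw new IllegalArgumentException("Truncate length must be non-negative: " + length);
    }
    return length;
  }

  // Extracts N from "truncate(N)" and validates it.
  static int parseTruncateLength(String mode) {
    Matcher matcher = TRUNCATE.matcher(mode.trim());
    if (!matcher.matches()) {
      throw new IllegalArgumentException("Not a truncate mode: " + mode);
    }
    return checkLength(Integer.parseInt(matcher.group(1)));
  }

  // Returns a prefix of at most `length` code points, or null when length is 0
  // (assumption: null means "record no bound for this column").
  static String truncateLower(String value, int length) {
    checkLength(length);
    if (value == null || length == 0) {
      return null;
    }
    if (value.codePointCount(0, value.length()) <= length) {
      return value;
    }
    return value.substring(0, value.offsetByCodePoints(0, length));
  }

  // Only non-null bounds are written into the map.
  static void putIfPresent(Map<Integer, String> bounds, int fieldId, String bound) {
    if (bound != null) {
      bounds.put(fieldId, bound);
    }
  }

  public static void main(String[] args) {
    int length = parseTruncateLength("truncate(0)");
    Map<Integer, String> lowerBounds = new LinkedHashMap<>();
    putIfPresent(lowerBounds, 1, truncateLower("iceberg", length)); // skipped, bound is null
    putIfPresent(lowerBounds, 2, truncateLower("iceberg", 3));      // stored as a 3-character prefix
    System.out.println(lowerBounds);
  }
}
```

In this sketch, field 1 ends up with no entry at all rather than an empty-string bound, which is what keeps the bounds map small for columns where ranges carry no value.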
   
   **Tests**
   
   Added appropriate unit tests 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

