paleolimbot opened a new issue, #46205:
URL: https://github.com/apache/arrow/issues/46205

   ### Describe the enhancement requested
   
   Once variant and geometry support is added, the C++ Parquet implementation
will have several logical types with a sort order of UNKNOWN. The current
statistics implementation does not calculate a null count or add it to the
column metadata when the sort order is unknown, so this statistic will be
missing for geometry, geography, and variant. For geometry in particular, the
null count is needed to push down a query rectangle effectively; without it
there is no mechanism to detect row groups that are completely null.
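
   To illustrate why the missing null count matters for pushdown, here is a
minimal sketch of the reader-side check. `ColumnChunkInfo` and
`CanSkipAsAllNull` are hypothetical names made up for this example and are not
the parquet-cpp API; the only assumption is that a reader can see a value
count and an optional null count per row group.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for per-row-group column metadata; this is NOT a
// real parquet-cpp class.
struct ColumnChunkInfo {
  int64_t num_values;                 // total values in the chunk, nulls included
  std::optional<int64_t> null_count;  // absent when the writer did not record it
};

// Returns true only when the chunk is provably all null. When the null count
// is missing, the reader cannot prove anything and has to scan the chunk.
bool CanSkipAsAllNull(const ColumnChunkInfo& chunk) {
  return chunk.null_count.has_value() && *chunk.null_count == chunk.num_values;
}
```

With the current behavior, the `null_count` for an UNKNOWN sort order column
would always be absent in this sketch, so the check could never return true.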
   
   I'm not sure what the best way to implement this is. For geometry
specifically, we could keep track of the null count in `GeoStatistics`, but
that wouldn't help with variant. I'm also not sure whether the null count and
statistics should be written at the page level for these types.
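
   One possible shape, sketched under assumptions rather than proposed as the
implementation: count nulls in a small accumulator that does not depend on
sort order, so the same code path would cover geometry, geography, and
variant. `NullCountTracker` is a hypothetical name and does not exist in
parquet-cpp.

```cpp
#include <cstdint>

// Hypothetical accumulator, not part of parquet-cpp. It only counts values
// and nulls, so it needs no ordering on the column's type.
class NullCountTracker {
 public:
  // Record one batch: `num_values` slots written, `num_nulls` of them null.
  void Update(int64_t num_values, int64_t num_nulls) {
    num_values_ += num_values;
    null_count_ += num_nulls;
  }

  int64_t num_values() const { return num_values_; }
  int64_t null_count() const { return null_count_; }

  // True when at least one value was seen and every value was null.
  bool all_null() const { return num_values_ > 0 && null_count_ == num_values_; }

 private:
  int64_t num_values_ = 0;
  int64_t null_count_ = 0;
};
```

Whether such a counter would live next to `GeoStatistics`, in the generic
statistics path, or at the page level is exactly the open question above.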
   
   Noted by @wgtmac in https://github.com/apache/arrow/pull/45459
   
   ### Component(s)
   
   Parquet, C++


