wgtmac opened a new issue, #34326:
URL: https://github.com/apache/arrow/issues/34326

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   The Parquet `ColumnWriter` takes the `null_count` of a data page from the page statistics, as shown below ([link](https://github.com/apache/arrow/blob/main/cpp/src/parquet/column_writer.cc#L952)):
   ```cpp
     EncodedStatistics page_stats = GetPageStatistics();
   
     int32_t null_count = static_cast<int32_t>(page_stats.null_count);
   
     DataPageV2 page(combined, num_values, null_count, num_rows, encoding_,
                       def_levels_byte_length, rep_levels_byte_length, uncompressed_size,
                       pager_->has_compressor(), page_stats);
   ```
   
   However, `null_count` is never set when page statistics are disabled, because `GetPageStatistics()` then returns a default-constructed `EncodedStatistics`:
   ```cpp
     EncodedStatistics GetPageStatistics() override {
       EncodedStatistics result;
       if (page_statistics_) result = page_statistics_->Encode();
       return result;
     }
   ```
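
One possible direction, given only as a minimal sketch and not as the actual Arrow code or fix: count nulls per page independently of the statistics object, so the value passed to `DataPageV2` does not depend on whether statistics are enabled. `EncodedStatsSketch`, `PageWriterSketch`, and all of their members are hypothetical stand-ins invented for this illustration. The sketch also assumes a default-constructed statistics object reports a null count of zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for parquet::EncodedStatistics; only the field that
// matters for this issue is modelled, and it is assumed to default to zero.
struct EncodedStatsSketch {
  int64_t null_count = 0;
};

// Hypothetical stand-in for the column writer's per-page bookkeeping.
struct PageWriterSketch {
  bool stats_enabled = false;
  int64_t stats_null_count = 0;  // updated only when statistics are enabled
  int64_t page_null_count = 0;   // updated for every batch, with or without statistics

  void WriteBatch(int64_t num_values, int64_t num_nulls) {
    assert(num_nulls >= 0 && num_nulls <= num_values);
    page_null_count += num_nulls;
    if (stats_enabled) stats_null_count += num_nulls;
  }

  // Same shape as the GetPageStatistics() quoted in this issue: the result is
  // only filled in when statistics are enabled.
  EncodedStatsSketch GetPageStatistics() const {
    EncodedStatsSketch result;
    if (stats_enabled) result.null_count = stats_null_count;
    return result;
  }

  // Pattern described in this issue: null count read from the statistics.
  int32_t NullCountFromStats() const {
    return static_cast<int32_t>(GetPageStatistics().null_count);
  }

  // Alternative: null count read from the independently maintained counter.
  int32_t NullCountTracked() const {
    return static_cast<int32_t>(page_null_count);
  }
};
```

With `stats_enabled` left at `false`, a call such as `WriteBatch(10, 3)` leaves `NullCountFromStats()` at the default value while `NullCountTracked()` reflects the three nulls. With statistics enabled, both accessors agree.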
   
   ### Component(s)
   
   C++, Parquet

