mapleFU opened a new issue, #45257: URL: https://github.com/apache/arrow/issues/45257
### Describe the bug, including details regarding any error messages, version, and platform.

The code at [1] cleans up the min-max statistics in Parquet. For ByteArray, we may "merge" multiple statistics when reading from a file. This goes wrong in the code below when `min = ""`:

1. The encoded value checked at [2] is empty, so `PlainDecode` is not called, yet `has_min_max_` is `true`. The `ByteArray` stays default-constructed, which leaves `ptr == nullptr` [3].
2. When `TypedStatistics::Merge` is called, it calls the cleanup at [1], which discards the incoming pair because of the null pointer, so the min-max statistics are left unchanged.

So when statistics with `min = ""` are merged, the min-max keeps the old statistics.

[1] https://github.com/apache/arrow/blob/ea47172bd80b5ee040c19e605f7e4a6f872b470f/cpp/src/parquet/statistics.cc#L408
[2] https://github.com/apache/arrow/blob/ea47172bd80b5ee040c19e605f7e4a6f872b470f/cpp/src/parquet/statistics.cc#L609
[3] https://github.com/apache/arrow/blob/ea47172bd80b5ee040c19e605f7e4a6f872b470f/cpp/src/parquet/types.h#L587

### Component(s)

C++, Parquet
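
The failure mode reported in this issue can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Arrow implementation: `ToyByteArray`, `ToyStats`, `View`, `ToString`, `Clean`, and `MergedMinAfterEmptyMin` are hypothetical names, and the logic models just the three behaviours the report describes (an empty encoded value skips decoding, cleanup rejects a null pointer, and merge silently keeps the old values).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>

// Toy stand-in for a byte-array view: default-constructed means ptr == nullptr.
struct ToyByteArray {
  uint32_t len = 0;
  const uint8_t* ptr = nullptr;
};

// Hypothetical helper: non-owning view over a std::string.
ToyByteArray View(const std::string& s) {
  return ToyByteArray{static_cast<uint32_t>(s.size()),
                      reinterpret_cast<const uint8_t*>(s.data())};
}

// Hypothetical helper: copy a view into a std::string (null view -> "").
std::string ToString(const ToyByteArray& b) {
  if (b.ptr == nullptr) return std::string();
  return std::string(reinterpret_cast<const char*>(b.ptr), b.len);
}

// Toy statistics object modelling only the behaviour described in the issue.
class ToyStats {
 public:
  // An empty encoded value skips decoding, so the member keeps ptr == nullptr
  // even though has_min_max_ is true.
  ToyStats(const std::string& encoded_min, const std::string& encoded_max,
           bool has_min_max)
      : has_min_max_(has_min_max) {
    if (!encoded_min.empty()) { min_buf_ = encoded_min; min_ = View(min_buf_); }
    if (!encoded_max.empty()) { max_buf_ = encoded_max; max_ = View(max_buf_); }
  }
  // The views point into this object's own buffers, so forbid copying.
  ToyStats(const ToyStats&) = delete;
  ToyStats& operator=(const ToyStats&) = delete;

  void Merge(const ToyStats& other) {
    if (!other.has_min_max_) return;
    SetMinMaxPair({other.min_, other.max_});
  }

  std::string min() const { return ToString(min_); }
  std::string max() const { return ToString(max_); }

 private:
  using Pair = std::pair<ToyByteArray, ToyByteArray>;

  // Assumed cleanup rule: a null pointer on either side invalidates the pair.
  static std::optional<Pair> Clean(Pair mm) {
    if (mm.first.ptr == nullptr || mm.second.ptr == nullptr) return std::nullopt;
    return mm;
  }

  void SetMinMaxPair(Pair mm) {
    auto cleaned = Clean(mm);
    if (!cleaned) return;  // a pair whose min is "" is dropped here
    const std::string new_min = ToString(cleaned->first);
    const std::string new_max = ToString(cleaned->second);
    if (!has_min_max_) {
      has_min_max_ = true;
      min_buf_ = new_min;
      max_buf_ = new_max;
    } else {
      if (new_min < ToString(min_)) min_buf_ = new_min;
      if (new_max > ToString(max_)) max_buf_ = new_max;
    }
    min_ = View(min_buf_);
    max_ = View(max_buf_);
  }

  bool has_min_max_;
  std::string min_buf_, max_buf_;  // owned storage backing the views
  ToyByteArray min_, max_;
};

// Merge statistics whose min is "" into statistics covering ["b", "c"].
std::string MergedMinAfterEmptyMin() {
  ToyStats acc("b", "c", true);
  ToyStats with_empty_min("", "a", true);
  acc.Merge(with_empty_min);
  return acc.min();
}
```

In this model `MergedMinAfterEmptyMin()` returns `"b"`, although the true minimum of the merged data is the empty string: the incoming pair is rejected by the cleanup step and the accumulator keeps its previous values, which matches the symptom described above.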