xiewajueji opened a new issue, #45061: URL: https://github.com/apache/arrow/issues/45061
### Describe the bug, including details regarding any error messages, version, and platform.

I use this method to estimate the size of the RowGroup currently being written. With 1000 columns and rows averaging 20 MB each, the dictionaries of all columns add up to more than 1 GB, which is larger than the configured RowGroup size.

```cpp
int64_t EstimatedDataEncodedSize() override {
  return kDataPageBitWidthBytes +
         RlePreserveBufferSize(static_cast<int>(buffered_indices_.size()), bit_width());
}
```

I found that this method does not count the dictionary size, while the Java implementation does. Would there be any problem if the dictionary size were counted?

### Component(s)

C++
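
To illustrate the gap described in the report, here is a minimal, self-contained sketch. It is not Arrow's code: `ToyDictColumn`, its fields, and both estimate functions are hypothetical names, and the index-size formula is a simplified bit-packing bound rather than what `RlePreserveBufferSize` actually computes. The per-column numbers are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a dictionary-encoded column writer (not Arrow's API).
struct ToyDictColumn {
  int64_t num_buffered_indices;  // dictionary indices buffered for the current page
  int bit_width;                 // bits used per dictionary index
  int64_t dict_encoded_bytes;    // bytes held by the dictionary's distinct values

  // Simplified estimate: 1 header byte plus the bit-packed indices, rounded up.
  int64_t IndicesOnlyEstimate() const {
    assert(bit_width >= 0 && num_buffered_indices >= 0);
    return 1 + (num_buffered_indices * bit_width + 7) / 8;
  }

  // Same estimate, but also counting the dictionary itself.
  int64_t EstimateWithDictionary() const {
    return IndicesOnlyEstimate() + dict_encoded_bytes;
  }
};

// Sum an estimate over all columns of the row group being written.
inline int64_t SumEstimates(const std::vector<ToyDictColumn>& columns,
                            bool include_dictionary) {
  int64_t total = 0;
  for (const auto& column : columns) {
    total += include_dictionary ? column.EstimateWithDictionary()
                                : column.IndicesOnlyEstimate();
  }
  return total;
}
```

With 1000 such columns, each holding 1000 buffered 10-bit indices and a 1.2 MB dictionary, the indices-only total stays near 1.25 MB while the total including dictionaries exceeds 1.2 GB, so a row-group size check based on the first number would not trigger.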