abbit opened a new issue, #142:
URL: https://github.com/apache/arrow-go/issues/142
### Describe the usage question you have. Please include as many useful details as possible.

`parquet.thrift` in the parquet-format repo describes the meaning of the `RowGroup` field `total_byte_size` [as](https://github.com/apache/parquet-format/blob/737ea12e56357e83b14fd3e27ef274145beed399/src/main/thrift/parquet.thrift#L920C7-L920C76)

> Total byte size of all the uncompressed column data in this row group

The same is true of the C++ implementation of Parquet in the arrow repo. In the Go implementation, however, `total_byte_size` is described [as](https://github.com/apache/arrow-go/blob/14844aea32054a0b7cc086df58a4a74610b0b306/parquet/metadata/row_group.go#L62)

> TotalByteSize is the total size of this rowgroup on disk

The difference between these two values can be large when compression is applied to the column chunks.

My question is: is this an intentional inconsistency with the format definition and the other implementations? And if so, why has this distinction been made?

### Component(s)

Go, Parquet

P.S.: This issue is a duplicate of https://github.com/apache/arrow/issues/44205, but as far as I can see, this repo is now the main location of Arrow Go.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org