adamreeve opened a new issue, #45073:
URL: https://github.com/apache/arrow/issues/45073

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   As part of adding Parquet encryption to arrow-rs 
(https://github.com/apache/arrow-rs/pull/6637), @rok and I found that arrow-rs 
could not read the example files in parquet-testing due to invalid repetition 
levels. arrow-rs complains that:
   ```
   Parquet error: first repetition level of batch must be 0
   ```
   
   This is because the int64 list column data is written with the repetition 
levels flipped. A level of 0 should mark the start of a new list, but 1 is 
used instead:
   
https://github.com/apache/arrow/blob/b655852b260d3b8c3fe457795df0f42a2ff9c98c/cpp/src/parquet/encryption/test_encryption_util.cc#L121
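
For illustration, here is a minimal pure-Python sketch of how a reader might rebuild lists from repetition levels, assuming a single-level list column with no nulls or empty lists (so every level has a matching leaf value). The function name `split_lists` and the sample values and levels are hypothetical and are not taken from the test files or from either implementation. Only the error text mirrors the arrow-rs message quoted above.

```python
def split_lists(values, rep_levels):
    """Group leaf values into lists using Parquet-style repetition levels.

    Simplifying assumptions (not from the issue): max repetition level is 1,
    and there are no nulls or empty lists, so len(values) == len(rep_levels).
    """
    if len(values) != len(rep_levels):
        raise ValueError("values and repetition levels must have equal length")
    if rep_levels and rep_levels[0] != 0:
        # The first value always starts a new record, so its level must be 0.
        raise ValueError("first repetition level of batch must be 0")

    rows = []
    for value, level in zip(values, rep_levels):
        if level == 0:
            # Level 0: this value starts a new list (a new row).
            rows.append([value])
        elif level == 1:
            # Level 1: this value continues the current list.
            rows[-1].append(value)
        else:
            raise ValueError(f"unexpected repetition level {level}")
    return rows


# Correctly ordered levels: two lists of two values each.
rows = split_lists([1, 2, 3, 4], [0, 1, 0, 1])  # [[1, 2], [3, 4]]

# Flipped levels, as described in this issue, are rejected up front.
try:
    split_lists([1, 2, 3, 4], [1, 0, 1, 0])
except ValueError as exc:
    flipped_error = str(exc)
```

A strict reader of this kind fails fast on the flipped levels, whereas a lenient reader has no list to attach the first value to and may end up dropping it.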
   
   Related to this, is it also a bug that Arrow would read these files without 
complaining? If I test reading one of these files into Arrow format with 
PyArrow, the first leaf value is skipped.
   
   ### Component(s)
   
   C++, Parquet


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
