lesterfan opened a new issue, #46094:
URL: https://github.com/apache/arrow/issues/46094

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   On the current main branch, the [specified behavior](https://github.com/apache/arrow/blob/409a016e5fdfa28cabd580b7ec81c42991c0748e/cpp/src/arrow/util/rle_encoding_internal.h#L114)
 of `arrow::util::RleDecoder::Get` is to return false when there are no more 
elements. However, for some bit widths `Get` keeps returning true after the 
last encoded value has been read. This failing test case illustrates the problem: 
https://github.com/lesterfan/arrow/commit/27220552f7333907c3e1c2bd536e3303736f1cc1
   
   I added logging to the above test case to print the original and decoded 
vectors and their lengths; the output is below. In this particular case, with 
`bit_width = 17`, the `RleDecoder` appears to read four additional values (each 
15794567) at the end of the buffer, so 104 values are decoded instead of the 
original 100.
   ```
   Note: Google Test filter = *Rle.DecoderGetBitWidthRoundTrip*
   [==========] Running 1 test from 1 test suite.
   [----------] Global test environment set-up.
   [----------] 1 test from Rle
   [ RUN      ] Rle.DecoderGetBitWidthRoundTrip
   values.size(): 100
   values: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
   decoded_values.size(): 104
   decoded_values: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 15794567, 15794567, 15794567, 15794567, 
   /Users/lester/work/arrow/cpp/src/arrow/util/rle_encoding_test.cc:383: Failure
   Expected equality of these values:
     values
       Which is: { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... }
     decoded_values
       Which is: { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... }
   
   [  FAILED  ] Rle.DecoderGetBitWidthRoundTrip (0 ms)
   [----------] 1 test from Rle (0 ms total)
   
   [----------] Global test environment tear-down
   [==========] 1 test from 1 test suite ran. (0 ms total)
   [  PASSED  ] 0 tests.
   [  FAILED  ] 1 test, listed below:
   [  FAILED  ] Rle.DecoderGetBitWidthRoundTrip
   ```
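
For reference, the contract the test relies on can be sketched as a drain-until-false loop. This is a minimal illustration only: `ToyDecoder`, `DrainDecoder`, and `MakeValues` are hypothetical names invented here, not part of Arrow, and the sketch models the expected behavior rather than reproducing the bug in `RleDecoder`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a decoder; NOT arrow::util::RleDecoder.
// It only models the documented contract: Get() returns false once
// every stored value has been consumed.
class ToyDecoder {
 public:
  explicit ToyDecoder(std::vector<uint32_t> values)
      : values_(std::move(values)) {}

  // Writes the next value to *out and returns true, or returns false
  // (leaving *out untouched) when no values remain.
  bool Get(uint32_t* out) {
    if (pos_ >= values_.size()) return false;
    *out = values_[pos_++];
    return true;
  }

 private:
  std::vector<uint32_t> values_;
  std::size_t pos_ = 0;
};

// Calls Get() until it reports exhaustion and collects the results.
template <typename Decoder>
std::vector<uint32_t> DrainDecoder(Decoder& decoder) {
  std::vector<uint32_t> decoded;
  uint32_t value = 0;
  while (decoder.Get(&value)) decoded.push_back(value);
  return decoded;
}

// Same shape as the input in the log above: 50 zeros, then 50 ones.
std::vector<uint32_t> MakeValues() {
  std::vector<uint32_t> values(50, 0);
  values.insert(values.end(), 50, 1);
  return values;
}
```

With a decoder that honors the contract, `DrainDecoder` returns exactly the 100 values from `MakeValues()`. The failing test observes 104 values from the real decoder, which means `Get` returned true four times too many.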
   
   ### Component(s)
   
   Parquet

