richardstartin commented on code in PR #10044:
URL: https://github.com/apache/pinot/pull/10044#discussion_r1059802361


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/BaseImmutableDictionary.java:
##########
@@ -259,8 +259,7 @@ protected int binarySearch(byte[] value) {
 
     while (low <= high) {
       int mid = (low + high) >>> 1;
-      byte[] midValue = _valueReader.getBytes(mid, _numBytesPerValue);
-      int compareResult = ByteArray.compare(midValue, value);
+      int compareResult = _valueReader.compareUnsignedBytes(mid, 
_numBytesPerValue, value);

Review Comment:
   The existing logic here appears to be broken for fixed length 
`FixedByteValueReaderWriter`: it compares `value` with the _padded_ stored 
value retrieved by `ValueReader.getBytes`. This comparison will break ties 
using the lengths when mismatches can't be found, meaning the comparison will 
never yield zero unless `value` is also padded. I can only find testing for 
this method with byte arrays with the same size, so this problem would not 
arise in tests, but I can't find where the padding required to make this work 
with arbitrary length byte arrays is taking place.
   
   I decided to roll this change back as addressing this problem is bigger than 
the scope of the change I wanted to make, and `STRING` columns are much more 
common than `BYTES`. I refactored the code to make adding an unsigned 
comparison trivial once this has been resolved.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to