[ https://issues.apache.org/jira/browse/LUCENE-10052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17400387#comment-17400387 ]
Michael McCandless commented on LUCENE-10052:
---------------------------------------------

I have a quick initial PR for this, adding multiple overloaded {{newBytesRef}} methods to {{LuceneTestCase}} and then calling them in just a few tests.

> Add LuceneTestCase.newBytesRef methods
> --------------------------------------
>
>                 Key: LUCENE-10052
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10052
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Major
>
> {{BytesRef}} is a super useful Lucene utility class, referencing a slice
> (offset + length) of an underlying possibly larger {{byte[]}}. We use it all
> over the place in our APIs.
>
> But the {{offset}} is trappy – we programmers sometimes forget to add the
> {{offset}} when accessing the underlying bytes. Or sometimes we accidentally
> add it twice, as just happened in our (Amazon Product Search) Lucene usage.
> Such errors are devious because they often do not matter since typically
> {{offset}} will be zero, but then suddenly when the rare {{BytesRef}} arrives
> with non-zero {{offset}}, BOOM.
>
> I think we should improve our testing here by making it simple to randomize a
> {{BytesRef}} creation to sometimes use non-zero offset and also to sometimes
> leave extra padding on the end of the underlying {{byte[]}} to catch another
> trappy case where we use {{bytesRef.bytes.length}} when we were supposed to
> use {{bytesRef.length}}.
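
For illustration only, here is a minimal sketch of the kind of randomization the issue describes. It is not the code from the PR: {{Slice}}, {{newRandomSlice}} and {{buggyToArray}} are hypothetical names, and {{Slice}} is a plain-Java stand-in for {{BytesRef}} so the example runs without Lucene on the classpath. The padding range (0 to 3 bytes on each side) and the junk fill byte are also assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;

public class RandomSliceSketch {

  /** Hypothetical stand-in for BytesRef: a slice of a possibly larger byte[]. */
  static final class Slice {
    final byte[] bytes;
    final int offset;
    final int length;

    Slice(byte[] bytes, int offset, int length) {
      this.bytes = bytes;
      this.offset = offset;
      this.length = length;
    }

    /** Correct access: copies exactly the bytes in [offset, offset + length). */
    byte[] toArray() {
      return Arrays.copyOfRange(bytes, offset, offset + length);
    }
  }

  /**
   * Wraps content in a slice whose backing array has a random number of junk
   * bytes before and after it (assumed range: 0..3 on each side).
   */
  static Slice newRandomSlice(byte[] content, Random random) {
    int offset = random.nextInt(4);
    int padding = random.nextInt(4);
    byte[] backing = new byte[offset + content.length + padding];
    Arrays.fill(backing, (byte) 0x7F); // junk, so stray reads are visible
    System.arraycopy(content, 0, backing, offset, content.length);
    return new Slice(backing, offset, content.length);
  }

  /** Deliberately buggy reader: ignores offset and uses bytes.length. */
  static byte[] buggyToArray(Slice s) {
    return Arrays.copyOf(s.bytes, s.bytes.length);
  }

  public static void main(String[] args) {
    byte[] content = "lucene".getBytes(StandardCharsets.UTF_8);
    Random random = new Random();
    int caught = 0;
    int iterations = 100;
    for (int i = 0; i < iterations; i++) {
      Slice s = newRandomSlice(content, random);
      // The correct reader must always round-trip the content.
      if (!Arrays.equals(content, s.toArray())) {
        throw new AssertionError("correct reader failed");
      }
      // The buggy reader is exposed whenever offset or padding is non-zero.
      if (!Arrays.equals(content, buggyToArray(s))) {
        caught++;
      }
    }
    System.out.println("buggy reader caught in " + caught + " of " + iterations + " runs");
  }
}
```

The point of the sketch is that a reader which is only correct for zero offset and an exactly sized backing array passes when the slice is always created the simple way, but fails on most randomized slices.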