[
https://issues.apache.org/jira/browse/LUCENE-10052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17400387#comment-17400387
]
Michael McCandless commented on LUCENE-10052:
---------------------------------------------
I have a quick initial PR for this, adding multiple overloaded {{newBytesRef}}
methods to {{LuceneTestCase}} and then calling them in just a few tests.
> Add LuceneTestCase.newBytesRef methods
> --------------------------------------
>
> Key: LUCENE-10052
> URL: https://issues.apache.org/jira/browse/LUCENE-10052
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Priority: Major
>
> {{BytesRef}} is a super useful Lucene utility class, referencing a slice
> (offset + length) of an underlying possibly larger {{byte[]}}. We use it all
> over the place in our APIs.
>
> But the {{offset}} is trappy – we programmers sometimes forget to add the
> {{offset}} when accessing the underlying bytes. Or sometimes we accidentally
> add it twice, as just happened in our (Amazon Product Search) Lucene usage.
> Such errors are devious because they often do not matter since typically
> {{offset}} will be zero, but then suddenly when the rare {{BytesRef}} arrives
> with non-zero {{offset}}, BOOM.
>
> I think we should improve our testing here by making it simple to randomize a
> {{BytesRef}} creation to sometimes use non-zero offset and also to sometimes
> leave extra padding on the end of the underlying {{byte[]}} to catch another
> trappy case where we use {{bytesRef.bytes.length}} when we were supposed to
> use {{bytesRef.length}}.
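
To make the two traps described in the issue concrete, here is a minimal, self-contained sketch of the randomization idea. It is not the code from the PR: {{Slice}} is a hypothetical stand-in for {{BytesRef}} so the example runs without Lucene on the classpath, and {{newRandomSlice}}, {{copyCorrect}}, {{copyForgettingOffset}} and {{copyUsingArrayLength}} are names invented for illustration only. The padding sizes (none, or 1 to 5 bytes) are also an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;

public class RandomSliceSketch {

  /** Hypothetical stand-in for BytesRef: a slice (offset + length) of a possibly larger byte[]. */
  static final class Slice {
    final byte[] bytes;
    final int offset;
    final int length;

    Slice(byte[] bytes, int offset, int length) {
      this.bytes = bytes;
      this.offset = offset;
      this.length = length;
    }
  }

  /** Hypothetical helper: copies content into a new array with random padding on each side. */
  static Slice newRandomSlice(Random random, byte[] content) {
    // Half the time no padding, otherwise 1 to 5 bytes (assumed sizes).
    int startPad = random.nextBoolean() ? 0 : 1 + random.nextInt(5);
    int endPad = random.nextBoolean() ? 0 : 1 + random.nextInt(5);
    byte[] backing = new byte[startPad + content.length + endPad];
    random.nextBytes(backing); // garbage in the padding makes misreads visible
    System.arraycopy(content, 0, backing, startPad, content.length);
    return new Slice(backing, startPad, content.length);
  }

  /** Correct access: honors both offset and length. */
  static byte[] copyCorrect(Slice s) {
    return Arrays.copyOfRange(s.bytes, s.offset, s.offset + s.length);
  }

  /** Trap 1: forgets to add the offset. */
  static byte[] copyForgettingOffset(Slice s) {
    return Arrays.copyOfRange(s.bytes, 0, s.length);
  }

  /** Trap 2: uses bytes.length where length was intended. */
  static byte[] copyUsingArrayLength(Slice s) {
    return Arrays.copyOfRange(s.bytes, s.offset, s.bytes.length);
  }

  public static void main(String[] args) {
    Random random = new Random(42);
    byte[] content = "lucene".getBytes(StandardCharsets.UTF_8);
    int offsetBugsCaught = 0;
    int lengthBugsCaught = 0;
    for (int i = 0; i < 100; i++) {
      Slice s = newRandomSlice(random, content);
      if (!Arrays.equals(content, copyCorrect(s))) {
        throw new AssertionError("correct reader must always round-trip");
      }
      if (!Arrays.equals(content, copyForgettingOffset(s))) {
        offsetBugsCaught++;
      }
      if (!Arrays.equals(content, copyUsingArrayLength(s))) {
        lengthBugsCaught++;
      }
    }
    System.out.println("offset bug caught in " + offsetBugsCaught + " of 100 slices");
    System.out.println("length bug caught in " + lengthBugsCaught + " of 100 slices");
  }
}
```

With a plain wrapper that always uses offset 0 and an exactly sized array, both buggy readers would pass every time. Randomizing the padding makes each of them fail on a sizable fraction of iterations, which is the point of the proposal.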