[ 
https://issues.apache.org/jira/browse/LUCENE-10052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17400387#comment-17400387
 ] 

Michael McCandless commented on LUCENE-10052:
---------------------------------------------

I have a quick initial PR for this, adding multiple overloaded {{newBytesRef}} 
methods to {{LuceneTestCase}} and then calling them in just a few tests.
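
For illustration only, here is a minimal sketch of what a randomizing helper of this kind might look like. It is not the code from the PR: {{SliceRef}}, {{RandomSlices}} and {{newSliceRef}} are hypothetical stand-ins (so the example runs without Lucene on the classpath), and the padding bound of 8 is an arbitrary assumption.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical stand-in for a BytesRef-like slice (offset + length) of a byte[]. */
final class SliceRef {
  final byte[] bytes;
  final int offset;
  final int length;

  SliceRef(byte[] bytes, int offset, int length) {
    this.bytes = bytes;
    this.offset = offset;
    this.length = length;
  }

  /** Copies out exactly the referenced bytes, honoring offset and length. */
  byte[] toArray() {
    return Arrays.copyOfRange(bytes, offset, offset + length);
  }
}

final class RandomSlices {
  /**
   * Illustrative helper (assumed name): wraps {@code content} in a larger backing
   * array with a random start offset and random padding at the end.
   */
  static SliceRef newSliceRef(Random random, byte[] content) {
    int startPadding = random.nextInt(8); // 0..7 junk bytes before the slice
    int endPadding = random.nextInt(8);   // 0..7 junk bytes after the slice
    byte[] backing = new byte[startPadding + content.length + endPadding];
    random.nextBytes(backing);            // junk padding makes misuse visible
    System.arraycopy(content, 0, backing, startPadding, content.length);
    return new SliceRef(backing, startPadding, content.length);
  }

  public static void main(String[] args) {
    byte[] content = {1, 2, 3, 4};
    SliceRef ref = newSliceRef(new Random(), content);
    // The logical content is unchanged regardless of offset and padding.
    System.out.println(Arrays.equals(content, ref.toArray()));
  }
}
```

A test that builds its inputs through such a helper, instead of wrapping a {{byte[]}} directly, exercises the non-zero {{offset}} and padded cases on some fraction of runs.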

> Add LuceneTestCase.newBytesRef methods
> --------------------------------------
>
>                 Key: LUCENE-10052
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10052
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Major
>
> {{BytesRef}} is a super useful Lucene utility class, referencing a slice 
> (offset + length) of an underlying, possibly larger {{byte[]}}.  We use it all 
> over the place in our APIs.
>
> But the {{offset}} is trappy – we programmers sometimes forget to add the 
> {{offset}} when accessing the underlying bytes.  Or sometimes we accidentally 
> add it twice, as just happened in our (Amazon Product Search) Lucene usage.  
> Such errors are devious because they usually go unnoticed: {{offset}} is 
> typically zero, but then suddenly, when the rare {{BytesRef}} arrives 
> with non-zero {{offset}}, BOOM.
>
> I think we should improve our testing here by making it simple to randomize 
> {{BytesRef}} creation to sometimes use a non-zero offset, and also to sometimes 
> leave extra padding on the end of the underlying {{byte[]}}, to catch another 
> trappy case where we use {{bytesRef.bytes.length}} when we were supposed to 
> use {{bytesRef.length}}.
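
The two traps described above (forgetting {{offset}}, and using the backing array's length instead of the slice's {{length}}) can be sketched as follows. This is an illustrative example only: {{ByteSlice}} and {{OffsetTraps}} are hypothetical names standing in for a {{BytesRef}}-like class, not Lucene code. Both buggy methods give the right answer on a "tight" slice (zero offset, no padding) and only go wrong on a padded one.

```java
/** Hypothetical stand-in for a BytesRef-like slice; not Lucene's class. */
final class ByteSlice {
  final byte[] bytes;
  final int offset;
  final int length;

  ByteSlice(byte[] bytes, int offset, int length) {
    this.bytes = bytes;
    this.offset = offset;
    this.length = length;
  }
}

final class OffsetTraps {
  /** Correct: reads bytes[offset .. offset + length). */
  static int sumCorrect(ByteSlice ref) {
    int sum = 0;
    for (int i = ref.offset; i < ref.offset + ref.length; i++) {
      sum += ref.bytes[i];
    }
    return sum;
  }

  /** Buggy: forgets to add offset, so it reads bytes[0 .. length). */
  static int sumForgetsOffset(ByteSlice ref) {
    int sum = 0;
    for (int i = 0; i < ref.length; i++) {
      sum += ref.bytes[i];
    }
    return sum;
  }

  /** Buggy: uses bytes.length instead of length, so it also reads the end padding. */
  static int sumUsesArrayLength(ByteSlice ref) {
    int sum = 0;
    for (int i = ref.offset; i < ref.bytes.length; i++) {
      sum += ref.bytes[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    // Tight slice: offset 0, no padding. All three methods agree.
    ByteSlice tight = new ByteSlice(new byte[] {1, 2, 3}, 0, 3);
    System.out.println(sumCorrect(tight));         // 6
    System.out.println(sumForgetsOffset(tight));   // 6
    System.out.println(sumUsesArrayLength(tight)); // 6

    // Same logical content {1, 2, 3}, but with offset 2 and one byte of end padding.
    ByteSlice padded = new ByteSlice(new byte[] {9, 9, 1, 2, 3, 7}, 2, 3);
    System.out.println(sumCorrect(padded));         // 6
    System.out.println(sumForgetsOffset(padded));   // 19
    System.out.println(sumUsesArrayLength(padded)); // 13
  }
}
```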



