rmuir commented on issue #11976:
URL: https://github.com/apache/lucene/issues/11976#issuecomment-1327601907

   The test is wrong: the start offsets are correct. The input stream is 4 
characters long.
   I would expect `0, 1, 2, 2, 3` for the start offsets and `1, 2, 3, 3, 4` for 
the end offsets.
   Both `1` and `月` should have the same offsets, since they come from the same 
input character, `㋀`.
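
   To make the expected offsets concrete, here is a minimal Python sketch, not 
Lucene code. It assumes a hypothetical 4-character input `ab㋀c` (the actual 
test input is not shown here) and a simplified per-character NFKC expansion in 
which every output character keeps the offsets of the input character it came 
from. The function name `char_tokens_with_offsets` is made up for illustration.

```python
import unicodedata


def char_tokens_with_offsets(text):
    """Emit one token per NFKC-expanded character.

    Each token carries the (start, end) offsets of the *input* character
    it was derived from, so expansions share their source's offsets.
    """
    tokens = []
    for i, ch in enumerate(text):
        # A single input character may expand to several output characters.
        for out in unicodedata.normalize("NFKC", ch):
            tokens.append((out, i, i + 1))
    return tokens


# Hypothetical input: 4 characters, the third being U+32C0 (㋀).
sample = "ab\u32c0c"
tokens = char_tokens_with_offsets(sample)

terms = [t for t, _, _ in tokens]
starts = [s for _, s, _ in tokens]
ends = [e for _, _, e in tokens]

print(terms)
print(starts)
print(ends)
```

   In this sketch the two tokens produced from `㋀` both report start offset 2 
and end offset 3, and the offsets never decrease from one token to the next.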
   
   
   Nothing needs to go backwards. #9820 is not related; it is just a catch-all 
for misunderstandings about offsets.
   
   The test/issue should not be named "combining character", as there are no 
combining characters involved. "Combining character" has a very specific 
meaning in Unicode, and this is not that.
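
   As a quick check of the terminology, the following Python sketch uses the 
standard `unicodedata` module to inspect `㋀` (U+32C0). It relies on the usual 
reading that combining characters are those with a General Category in the 
Mark group (`Mn`, `Mc`, `Me`); U+0301 COMBINING ACUTE ACCENT is included only 
as a contrasting example and is not part of the issue.

```python
import unicodedata

telegraph_january = "\u32c0"  # ㋀

# The character has a compatibility decomposition, which NFKC applies.
decomposition = unicodedata.decomposition(telegraph_january)
expanded = unicodedata.normalize("NFKC", telegraph_january)

# General categories of the expanded characters: a digit and a letter.
expanded_categories = [unicodedata.category(c) for c in expanded]

# For contrast, a real combining character (not involved in this issue).
acute_category = unicodedata.category("\u0301")

print(decomposition)
print(expanded, expanded_categories)
print(acute_category)
```

   Neither character produced by the expansion is in a Mark category, so this 
is a compatibility decomposition rather than anything involving combining 
characters.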


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

