iverase commented on code in PR #13948:
URL: https://github.com/apache/lucene/pull/13948#discussion_r1877894071
##########
lucene/core/src/java/org/apache/lucene/util/UnicodeUtil.java:
##########
@@ -627,35 +629,58 @@ public static String toHexString(String s) {
}
/**
- * Interprets the given byte array as UTF-8 and converts to UTF-16. It is the responsibility of
- * the caller to make sure that the destination array is large enough.
+ * Interprets the given {@link RandomAccessInput} slice as UTF-8 and converts to UTF-16. It is the
+ * responsibility of the caller to make sure that the destination array is large enough.
+ *
+ * <p>NOTE: Full characters are read, even if this reads past the length passed (and can result in
+ * an IOException if invalid UTF-8 is passed). Explicit checks for valid UTF-8 are not performed.
+ */
+ // TODO: broken if chars.offset != 0
+ public static int UTF8toUTF16(RandomAccessInput input, long offset, int length, char[] out)
Review Comment:
I reverted these changes. If we see that it is performance-sensitive, we can
re-add it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]