zhongyujiang commented on code in PR #11161:
URL: https://github.com/apache/iceberg/pull/11161#discussion_r1770764300

##########
core/src/test/java/org/apache/iceberg/TestMetricsTruncation.java:
##########
@@ -202,11 +202,20 @@ public void testTruncateStringMax() {
     String test5 = "\uDBFF\uDFFF\uDBFF\uDFFF";
     String test6 = "\uD800\uDFFF\uD800\uDFFF";
     // Increment the previous character
-    String test6_2_expected = "\uD801\uDC00";
+    String test6_1_expected = "\uD801\uDC00";

Review Comment:
   I think the old name was a typo: "\uD800\uDFFF" is a Unicode surrogate pair that encodes a single code point, so its length is 1.


##########
core/src/test/java/org/apache/iceberg/TestMetricsTruncation.java:
##########
@@ -202,11 +202,20 @@ public void testTruncateStringMax() {
     String test5 = "\uDBFF\uDFFF\uDBFF\uDFFF";
     String test6 = "\uD800\uDFFF\uD800\uDFFF";
     // Increment the previous character
-    String test6_2_expected = "\uD801\uDC00";
+    String test6_1_expected = "\uD801\uDC00";
     String test7 = "\uD83D\uDE02\uD83D\uDE02\uD83D\uDE02";
     String test7_2_expected = "\uD83D\uDE02\uD83D\uDE03";
     String test7_1_expected = "\uD83D\uDE03";
+    // Increment the max UTF-8 character will overflow
+    String test8 = "a\uDBFF\uDFFFc";
+    String test8_2_expected = "b";

Review Comment:
   Every character in `test5` is `MAX_CODE_POINT`, so no upper bound exists for it. `test8` adds a case where incrementing `MAX_CODE_POINT` overflows, but an upper bound can still be obtained by incrementing the preceding character.
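
   To make the two review comments above concrete, here is a minimal sketch of the behavior the tests describe. It is not Iceberg's implementation: the class `UpperBoundSketch` and the helper `upperBound` are hypothetical names, and the sketch assumes truncation counts code points rather than `char`s. It only checks `Character.isValidCodePoint`, so it does not skip unassigned code points or the lone-surrogate range.

```java
class UpperBoundSketch {
  // Hypothetical helper (an assumption for illustration, not an Iceberg API).
  // Truncates `value` to `length` code points, then increments the last code
  // point that can be incremented, dropping everything after it.
  // Returns null when no upper bound of that length exists.
  static String upperBound(String value, int length) {
    int count = value.codePointCount(0, value.length());
    if (count <= length) {
      return value; // nothing is truncated, so the value bounds itself
    }
    int end = value.offsetByCodePoints(0, length);
    StringBuilder sb = new StringBuilder(value.substring(0, end));
    // Walk backwards over code points until one can be incremented.
    for (int i = length - 1; i >= 0; i--) {
      int offset = sb.offsetByCodePoints(0, i);
      int next = sb.codePointAt(offset) + 1;
      if (Character.isValidCodePoint(next)) {
        sb.setLength(offset); // drop this code point and everything after it
        sb.appendCodePoint(next);
        return sb.toString();
      }
      // next is past Character.MAX_CODE_POINT: overflow, try the previous one
    }
    return null;
  }

  public static void main(String[] args) {
    String pair = "\uD800\uDFFF";
    System.out.println(pair.length()); // prints 2 (UTF-16 code units)
    System.out.println(pair.codePointCount(0, pair.length())); // prints 1
    System.out.println(upperBound("a\uDBFF\uDFFFc", 2)); // prints b
    System.out.println(upperBound("\uDBFF\uDFFF\uDBFF\uDFFF", 1)); // prints null
  }
}
```

   Under these assumptions, `test6` truncated to one code point yields "\uD801\uDC00" (U+103FF incremented to U+10400), which is why the expected value belongs to the length-1 case. `test8` truncated to two code points yields "b", because the second code point overflows and the first is incremented instead.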