[
https://issues.apache.org/jira/browse/HADOOP-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195110#comment-14195110
]
Stephen Chu commented on HADOOP-11165:
--------------------------------------
This is related to
http://stackoverflow.com/questions/25404373/java-8-utf-8-encoding-issue-java-bug:
"It is a property of the “Modified UTF-8” encoding to store surrogate pairs (or
even unpaired chars of that range) like individual characters. And it’s an
error if a decoder claiming to use standard UTF-8 uses “Modified UTF-8”. This
seems to have been fixed with Java 8."
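So when the test runs under Java 8, the Strings no longer match, because {{new
String(UTF8.getBytes(before), "UTF-8")}} decodes the bytes as standard UTF-8
rather than "Modified UTF-8", and surrogates that were encoded as individual
characters do not decode back to the original chars.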
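The mismatch can be reproduced without Hadoop. The following is a minimal sketch, not the TestUTF8 code: it assumes {{DataOutputStream.writeUTF}} (the JDK's own "Modified UTF-8" encoder) is a fair stand-in for {{UTF8.getBytes}}, and the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Hadoop.
public class ModifiedUtf8Demo {

    // Encode with the JDK's Modified UTF-8 (writeUTF) and drop the
    // 2-byte length prefix so only the encoded characters remain.
    static byte[] modifiedUtf8(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        byte[] framed = buf.toByteArray();
        return Arrays.copyOfRange(framed, 2, framed.length);
    }

    // Decode with the matching Modified UTF-8 decoder (readUTF),
    // which needs the 2-byte length prefix put back in front.
    static String decodeModified(byte[] payload) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        buf.write(payload.length >>> 8);
        buf.write(payload.length);
        buf.write(payload, 0, payload.length);
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            return in.readUTF();
        }
    }

    // Decode as standard UTF-8, as the test does via new String(bytes, "UTF-8").
    static String decodeStandard(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String before = "a\uD800b"; // contains an unpaired high surrogate
        byte[] bytes = modifiedUtf8(before);
        System.out.println("modified round trip equal: "
            + before.equals(decodeModified(bytes)));
        System.out.println("standard round trip equal: "
            + before.equals(decodeStandard(bytes)));
    }
}
```

On Java 8 and later the standard decoder should replace the encoded surrogate with U+FFFD, so only the Modified UTF-8 round trip is expected to compare equal. That is the same kind of {{?}} versus {{�}} difference shown in the failure output.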
> TestUTF8 fails when run against java 8
> --------------------------------------
>
> Key: HADOOP-11165
> URL: https://issues.apache.org/jira/browse/HADOOP-11165
> Project: Hadoop Common
> Issue Type: Test
> Reporter: Ted Yu
> Assignee: Stephen Chu
> Priority: Minor
>
> Using jdk1.8.0_20 , I got:
> {code}
> testGetBytes(org.apache.hadoop.io.TestUTF8) Time elapsed: 0.007 sec <<< FAILURE!
> junit.framework.ComparisonFailure:
> expected:<쑼ь⣄鬘㟻햫紖燺[?炀⑰풸낓⨵ἲꬌホ쭷㛕曬䟊⁍䴥䳠領蟭뱻宭竕昚鍳튇ꊕ혶齲쏈㠮胨䩦隼䍻킿喝벁ࢼ듿饭玳Մ剌䒤?䳛슟녚沖᯳?訨
> 牙⍖?䎠旘薑春觀葝礫⁑ﻱ⣽゚굿뒦ݦ︀偆?]古絥萟浐> but
> was:<쑼ь⣄鬘㟻햫紖燺[�炀⑰풸낓⨵ἲꬌホ쭷㛕曬䟊⁍䴥䳠領蟭뱻宭竕昚鍳튇ꊕ혶齲쏈㠮胨䩦隼䍻킿喝벁ࢼ듿饭玳Մ剌䒤�䳛슟녚᯳�訨牙⍖�䎠旘薑春觀葝礫⁑ﻱ⣽゚굿뒦ݦ︀偆�]古絥萟浐>
> at junit.framework.Assert.assertEquals(Assert.java:100)
> at junit.framework.Assert.assertEquals(Assert.java:107)
> at junit.framework.TestCase.assertEquals(TestCase.java:269)
> at org.apache.hadoop.io.TestUTF8.testGetBytes(TestUTF8.java:58)
> testIO(org.apache.hadoop.io.TestUTF8) Time elapsed: 0.002 sec <<< FAILURE!
> junit.framework.ComparisonFailure:
> expected:<...ᨍ⁖粩⧬车﹂脖朷䝄懒댵突疼資⍣眠畠忁[?]䪐ゑ鬍鍅遻ꈸ釡> but
> was:<...ᨍ⁖粩⧬车﹂脖朷䝄懒댵突疼資⍣眠畠忁[�]䪐ゑ鬍鍅遻ꈸ釡>
> at junit.framework.Assert.assertEquals(Assert.java:100)
> at junit.framework.Assert.assertEquals(Assert.java:107)
> at junit.framework.TestCase.assertEquals(TestCase.java:269)
> at org.apache.hadoop.io.TestUTF8.testIO(TestUTF8.java:86)
> {code}