https://issues.apache.org/bugzilla/show_bug.cgi?id=51400
Bug #: 51400
Summary: Use of "new String(byte[] b, String enc)" hits Sun JVM bottleneck
Product: Tomcat 6
Version: 6.0.32
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Catalina
AssignedTo: [email protected]
ReportedBy: [email protected]
Classification: Unclassified
Created attachment 27186
--> https://issues.apache.org/bugzilla/attachment.cgi?id=27186
Patch with optimizations
We're using Tomcat 6 for a high-volume, high-concurrency service (Evernote).
At times we've seen a performance slowdown within the service, which we've
traced to a concurrency flaw in the JVM code that translates named encodings
(e.g. "utf-8") into Charsets. The result is a number of threads blocked while
trying to convert a byte array to a String or vice versa, for example:
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.cs.FastCharsetProvider.charsetForName(Unknown Source)
- waiting to lock <0x00007ff3b4cc85b0> (a sun.nio.cs.StandardCharsets)
at java.nio.charset.Charset.lookup2(Unknown Source)
at java.nio.charset.Charset.lookup(Unknown Source)
at java.nio.charset.Charset.isSupported(Unknown Source)
at java.lang.StringCoding.lookupCharset(Unknown Source)
at java.lang.StringCoding.decode(Unknown Source)
at java.lang.String.<init>(Unknown Source)
at org.apache.tomcat.util.buf.ByteChunk.toStringInternal(ByteChunk.java:499)
at org.apache.tomcat.util.buf.StringCache.toString(StringCache.java:315)
at org.apache.tomcat.util.buf.ByteChunk.toString(ByteChunk.java:492)
at org.apache.tomcat.util.buf.MessageBytes.toString(MessageBytes.java:213)
at org.apache.tomcat.util.http.MimeHeaders.getHeader(MimeHeaders.java:319)
at org.apache.coyote.Request.getHeader(Request.java:330)
at org.apache.catalina.connector.Request.getHeader(Request.java:1854)
at org.apache.catalina.connector.RequestFacade.getHeader(RequestFacade.java:643)
This isn't a true deadlock, since each thread will eventually finish, but it
can significantly affect concurrency if there are a number of threads making
heavy use of:
new String(byte[] b, String encoding)
String.getBytes()
String.getBytes(String encoding)
This is, unfortunately, a known bottleneck within the JVM:
http://blog.inuus.com/vox/2008/05/the-mysteries-of-java-character-set-performance.html
http://halfbottle.blogspot.com/2009/07/charset-continued-i-wrote-about.html
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6790402
To avoid this bottleneck in the JVM, we've patched our server to pass an
explicit Charset object when encoding and decoding Strings rather than the
name of the charset, and added a ConcurrentHashMap<String, Charset> to look up
Charsets by encoding name.
I've attached a patch with our fixes against 6.0.32.
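For readers without the attachment, the approach described above can be
sketched roughly as follows. This is not the attached patch: the class name
CharsetCache and its methods are hypothetical, chosen only to illustrate
caching Charset objects by name and then calling the Charset-based overloads
(available since Java 6) instead of the name-based ones.

```java
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper for illustration; not the class from the attached patch.
final class CharsetCache {

    // Maps a lower-cased encoding name to its resolved Charset.
    private static final ConcurrentMap<String, Charset> CACHE =
            new ConcurrentHashMap<String, Charset>();

    private CharsetCache() {
    }

    // Resolves an encoding name, going through Charset.forName() (the
    // contended JVM path) only the first time a given name is seen.
    static Charset forName(String encoding) {
        String key = encoding.toLowerCase(Locale.ENGLISH);
        Charset charset = CACHE.get(key);
        if (charset == null) {
            // Throws an unchecked exception for unknown or illegal names,
            // so only supported charsets are ever cached.
            charset = Charset.forName(encoding);
            Charset previous = CACHE.putIfAbsent(key, charset);
            if (previous != null) {
                charset = previous;
            }
        }
        return charset;
    }

    // Replacement for new String(bytes, off, len, encodingName).
    static String decode(byte[] bytes, int off, int len, String encoding) {
        return new String(bytes, off, len, forName(encoding));
    }

    // Replacement for s.getBytes(encodingName).
    static byte[] encode(String s, String encoding) {
        return s.getBytes(forName(encoding));
    }
}
```

One behavioural difference to keep in mind: the name-based String methods
throw the checked UnsupportedEncodingException, whereas Charset.forName()
throws the unchecked UnsupportedCharsetException or
IllegalCharsetNameException, so callers that relied on the checked exception
would need adjusting.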
Just as a random FYI - the same issue hits MySQL's Java connector, so we'd
occasionally see Tomcat and MySQL fighting over this same JVM chokepoint:
http://bugs.mysql.com/bug.php?id=61105