https://bz.apache.org/bugzilla/show_bug.cgi?id=69552
Bug ID: 69552
Summary: Performance regression in MessageBytes.toBytes() when
using non-default charset
Product: Tomcat 9
Version: 9.0.98
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P2
Component: Util
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: -----
Created attachment 39982
--> https://bz.apache.org/bugzilla/attachment.cgi?id=39982&action=edit
Speed test
In our high-volume, latency-sensitive application,
MessageBytes.toBytes() is a bottleneck on the critical path, with an aggregate
impact of 0.06% (small but real). The method contains a fast path
that activates only when the charset is
ByteChunk.DEFAULT_CHARSET (ISO_8859_1), so applications using UTF-8 (such as
ours) always take the slow path. I don't immediately see a reason for
the bias towards the default charset.
The attached speed test demonstrates a 40x difference between the fast and slow
paths, as well as significantly different memory allocation (fast path
allocates nothing).
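
To illustrate the kind of difference being measured, here is a minimal
sketch of the two styles of conversion. It is not the Tomcat source and not
the attached speed test; the class and method names (ToBytesPathsSketch,
encodeSimple, encodeGeneral) are hypothetical. The simple path narrows each
char into a caller-supplied buffer, which is only correct when every char
fits in one byte, as with ISO_8859_1. The general path goes through the
charset and allocates on every call.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; names are not taken from Tomcat.
final class ToBytesPathsSketch {

    // Simple path: one char becomes one byte, written into a buffer the
    // caller already owns, so nothing is allocated here.
    // Only correct when every char is <= 0xFF (the ISO_8859_1 range).
    static int encodeSimple(char[] chars, int start, int len, byte[] dest) {
        for (int i = 0; i < len; i++) {
            dest[i] = (byte) chars[start + i];
        }
        return len;
    }

    // General path: works for any charset, but allocates a ByteBuffer
    // (inside Charset.encode) and a result array on every call.
    static byte[] encodeGeneral(char[] chars, int start, int len, Charset charset) {
        ByteBuffer bb = charset.encode(CharBuffer.wrap(chars, start, len));
        byte[] out = new byte[bb.remaining()];
        bb.get(out);
        return out;
    }

    public static void main(String[] args) {
        char[] value = "name=value".toCharArray();
        byte[] reusable = new byte[value.length];

        int written = encodeSimple(value, 0, value.length, reusable);
        byte[] general = encodeGeneral(value, 0, value.length, StandardCharsets.UTF_8);

        // For pure ASCII input both paths produce the same bytes.
        System.out.println(written == general.length && Arrays.equals(reusable, general));
    }
}
```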
MessageBytes.toBytes() is usually called from JSPs that are included like this:
<jsp:include page="myNestedJsp">
  <jsp:param .../>
</jsp:include>
which forces the nested JSP to re-parse the full set of parameters.
Ideally we can find a way to move the common case (UTF-8) to the fast path, but
I hesitate to suggest anything without understanding the bias towards
ISO_8859_1.
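
For discussion only, one possible direction is sketched below. It relies on
the fact that chars in the 7-bit ASCII range (0x00 to 0x7F) encode to the
same single byte in both ISO_8859_1 and UTF-8, so a byte-per-char copy is
safe for such input under either charset. Whether this fits
MessageBytes.toBytes() depends on the reason for the current ISO_8859_1
restriction, which is not known here. The class and method names
(AsciiFastPathSketch, tryEncodeAscii) are hypothetical and do not exist in
Tomcat.

```java
// Hypothetical illustration only; not a proposed patch.
final class AsciiFastPathSketch {

    // Tries a byte-per-char copy into a caller-supplied buffer.
    // Returns true if every char was 7-bit ASCII and dest now holds the
    // encoded bytes. Returns false on the first non-ASCII char, in which
    // case dest is partially written and the caller should fall back to
    // a full charset encoder.
    static boolean tryEncodeAscii(char[] chars, int start, int len, byte[] dest) {
        for (int i = 0; i < len; i++) {
            char c = chars[start + i];
            if (c > 0x7F) {
                return false;
            }
            dest[i] = (byte) c;
        }
        return true;
    }
}
```

A caller would try this first and use the existing encoder path only when
it returns false, so non-ASCII parameter values would behave as they do today.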