https://bz.apache.org/bugzilla/show_bug.cgi?id=69552

            Bug ID: 69552
           Summary: Performance regression in MessageBytes.toBytes() when
                    using non-default charset
           Product: Tomcat 9
           Version: 9.0.98
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Util
          Assignee: dev@tomcat.apache.org
          Reporter: jeng...@amazon.com
  Target Milestone: -----

Created attachment 39982
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=39982&action=edit
Speed test

Our high-volume, latency-sensitive application shows that
MessageBytes.toBytes() is a bottleneck on our critical path, with an aggregate
impact of 0.06% (small but real).  Investigation shows there is a fast path
within the method that activates only when the charset is
ByteChunk.DEFAULT_CHARSET (ISO_8859_1), so applications using UTF-8 (such as
ours) are always sent to the slow path.  I don't immediately see a reason for
the bias towards the default charset.
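
To illustrate the kind of charset-gated fast path described above, here is a
minimal standalone sketch. It is not Tomcat's code: the class
CharsetFastPathSketch, its toBytes(char[], Charset) helper, and the local
DEFAULT_CHARSET constant are hypothetical stand-ins, and the sketch returns a
new array for simplicity rather than reusing a buffer.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetFastPathSketch {

    // Assumption: plays the role of ByteChunk.DEFAULT_CHARSET from the report.
    static final Charset DEFAULT_CHARSET = StandardCharsets.ISO_8859_1;

    // Hypothetical helper, not the real MessageBytes.toBytes() implementation.
    static byte[] toBytes(char[] chars, Charset charset) {
        if (DEFAULT_CHARSET.equals(charset)) {
            // Fast path: ISO-8859-1 maps each char <= 0xFF to one byte,
            // so a plain narrowing loop is enough and no encoder is needed.
            byte[] out = new byte[chars.length];
            for (int i = 0; i < chars.length; i++) {
                char c = chars[i];
                out[i] = (c <= 0xFF) ? (byte) c : (byte) '?';
            }
            return out;
        }
        // Slow path: any other charset (including UTF-8) goes through the
        // general-purpose encoder, which allocates intermediate buffers.
        ByteBuffer encoded = charset.encode(CharBuffer.wrap(chars));
        byte[] out = new byte[encoded.remaining()];
        encoded.get(out);
        return out;
    }

    public static void main(String[] args) {
        char[] param = "name=value".toCharArray();
        // Same ASCII input, two different code paths.
        System.out.println(toBytes(param, StandardCharsets.ISO_8859_1).length);
        System.out.println(toBytes(param, StandardCharsets.UTF_8).length);
    }
}
```

In this sketch, pure-ASCII input produces identical bytes on both paths, which
is why it seems plausible that a UTF-8 application could benefit from a similar
shortcut; whether that is safe in MessageBytes is exactly the open question.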

The attached speed test demonstrates a 40x difference between the fast and slow
paths, as well as significantly different memory allocation (fast path
allocates nothing).

MessageBytes.toBytes() is usually called by JSPs that are referenced in a
manner similar to this:

<jsp:include page="myNestedJsp">
 <jsp:param .../>
</jsp:include>

which forces the nested JSP to re-parse the full set of parameters.

Ideally we can find a way to move the common case (UTF-8) to the fast path, but
I hesitate to suggest anything without understanding the bias towards
ISO_8859_1.

