Re: [tomcat] branch main updated: Add control of byte decoding errors to ByteChunk and StringCache

Mark Thomas Fri, 23 Jun 2023 01:07:25 -0700

On 23/06/2023 08:34, Rémy Maucherat wrote:

On Thu, Jun 22, 2023 at 8:55 PM <ma...@apache.org> wrote:


This is an automated email from the ASF dual-hosted git repository.

markt pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tomcat.git


The following commit(s) were added to refs/heads/main by this push:
      new 944951302e Add control of byte decoding errors to ByteChunk and StringCache
944951302e is described below

commit 944951302e2f478879411dbff353f5818ad44121
Author: Mark Thomas <ma...@apache.org>
AuthorDate: Wed Jun 14 12:25:21 2023 +0100

     Add control of byte decoding errors to ByteChunk and StringCache


<snip/>

+    /**
+     * Converts the current content of the byte buffer to a String using the configured character set.
+     *
+     * @param malformedInputAction      Action to take if the input is malformed
+     * @param unmappableCharacterAction Action to take if a byte sequence can't be mapped to a character
+     *
+     * @return The result of converting the bytes to a String
+     *
+     * @throws CharacterCodingException If an error occurs during the conversion
+     */
+    public String toStringInternal(CodingErrorAction malformedInputAction, CodingErrorAction unmappableCharacterAction)
+            throws CharacterCodingException {
          if (charset == null) {
              charset = DEFAULT_CHARSET;
          }
          // new String(byte[], int, int, Charset) takes a defensive copy of the
          // entire byte array. This is expensive if only a small subset of the
          // bytes will be used. The code below is from Apache Harmony.
-        CharBuffer cb = charset.decode(ByteBuffer.wrap(buff, start, end - start));
+        CharBuffer cb = charset.newDecoder().onMalformedInput(malformedInputAction)
+                .onUnmappableCharacter(unmappableCharacterAction).decode(ByteBuffer.wrap(buff, start, end - start));


Looking at the code, this is not equivalent: charset.decode uses
thread locals and so on. I will make a change so that charset.decode
is used if both actions are REPLACE, if you don't mind. I'm pretty sure
benchmarking would show no performance difference though.


Seems reasonable to me.

Mark
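
For illustration, the change proposed above could look roughly like the
following. This is only a sketch under assumptions, not the code that was
committed: the class name DecodeSketch and the static decode helper are
hypothetical stand-ins for ByteChunk.toStringInternal, with the buffer,
offsets and charset passed in as parameters instead of being read from
fields. It relies on the documented behaviour that Charset.decode always
replaces malformed-input and unmappable-character sequences.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for ByteChunk; not the actual Tomcat class.
final class DecodeSketch {

    static String decode(byte[] buff, int start, int end, Charset charset,
            CodingErrorAction malformedInputAction, CodingErrorAction unmappableCharacterAction)
            throws CharacterCodingException {
        ByteBuffer bb = ByteBuffer.wrap(buff, start, end - start);
        CharBuffer cb;
        if (malformedInputAction == CodingErrorAction.REPLACE
                && unmappableCharacterAction == CodingErrorAction.REPLACE) {
            // Charset.decode always uses REPLACE for both error types,
            // so the result is the same and the original code path is kept.
            cb = charset.decode(bb);
        } else {
            // Any other combination needs an explicitly configured decoder.
            cb = charset.newDecoder().onMalformedInput(malformedInputAction)
                    .onUnmappableCharacter(unmappableCharacterAction).decode(bb);
        }
        return cb.toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        // 0xFF can never appear in well-formed UTF-8.
        byte[] bad = { 'a', (byte) 0xFF, 'b' };
        String replaced = decode(bad, 0, bad.length, StandardCharsets.UTF_8,
                CodingErrorAction.REPLACE, CodingErrorAction.REPLACE);
        System.out.println(replaced.length());
        try {
            decode(bad, 0, bad.length, StandardCharsets.UTF_8,
                    CodingErrorAction.REPORT, CodingErrorAction.REPORT);
        } catch (CharacterCodingException e) {
            System.out.println("REPORT rejected the input");
        }
    }
}
```

With REPLACE for both actions the invalid byte becomes U+FFFD, while
REPORT surfaces the problem as a CharacterCodingException, which is the
control the commit adds.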

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org
