https://bz.apache.org/bugzilla/show_bug.cgi?id=66141

            Bug ID: 66141
           Summary: useBomIfPresent removes UTF-BOM without modifying HTTP
                    Content-Length
           Product: Tomcat 9
           Version: 9.0.26
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Catalina
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: -----

Created attachment 38325
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=38325&action=edit
script to compare filesystem to HTTP-transfer (configure $SRC and wget-URL)

The DefaultServlet modifies files in transit: for UTF-8 and UTF-16 it strips
the BOM but still sends the unmodified file size as Content-Length, which
results in a transfer timeout. UTF-32 files are left intact.

When the file is downloaded with wget, the retry appends the last bytes of the
file a second time, as many bytes as the BOM is long. E.g. with the 3-byte
UTF-8 BOM, the content "TEST" becomes "TESTEST".

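The duplication can be reproduced without a server. The following is a minimal
simulation, not Tomcat or wget code. It assumes that the retry resumes at the
offset already received and that the resumed bytes come from the raw file on
disk (BOM included); the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class BomRetrySimulation {
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Returns what a retrying client ends up with for a UTF-8 file with BOM. */
    static String simulateDownload(String content) {
        byte[] body = content.getBytes(StandardCharsets.UTF_8);

        // File as stored on disk: BOM followed by the content.
        byte[] onDisk = new byte[UTF8_BOM.length + body.length];
        System.arraycopy(UTF8_BOM, 0, onDisk, 0, UTF8_BOM.length);
        System.arraycopy(body, 0, onDisk, UTF8_BOM.length, body.length);

        // The server announces the on-disk size but sends the body without BOM.
        int declaredLength = onDisk.length;
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        received.write(body, 0, body.length);

        // Assumption: the client sees too few bytes, times out and resumes at
        // the offset it already has; the resumed range is read from the raw file.
        if (received.size() < declaredLength) {
            int offset = received.size();
            received.write(onDisk, offset, declaredLength - offset);
        }
        return new String(received.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(simulateDownload("TEST")); // prints TESTEST
    }
}
```
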
It looks to me as if the Tomcat code at
https://github.com/apache/tomcat/blob/6a667943c5da6b5d61ac6bec1d7c9de061e3217c/java/org/apache/catalina/servlets/DefaultServlet.java#L1051
does not set conversionRequired for the removal of the BOM, so at
https://github.com/apache/tomcat/blob/6a667943c5da6b5d61ac6bec1d7c9de061e3217c/java/org/apache/catalina/servlets/DefaultServlet.java#L1079
the 'Content-Length' is written before the BOM is stripped. As a result,
clients wait for further bytes that never arrive.
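
As a rough sketch of the rule the header would have to follow (this is not the
DefaultServlet code; the class, method and constant names are hypothetical):
either the announced length accounts for the stripped BOM bytes, or no fixed
length is announced when the body is converted.

```java
class ContentLengthSketch {
    /** Hypothetical marker: send no fixed Content-Length. */
    static final long UNKNOWN = -1L;

    /**
     * Length to announce for a static file.
     *
     * @param fileLength         size of the file on disk, in bytes
     * @param strippedBomBytes   BOM bytes removed before sending (0 if none)
     * @param conversionRequired true if the body is re-encoded on the fly
     */
    static long declaredContentLength(long fileLength, int strippedBomBytes,
                                      boolean conversionRequired) {
        if (conversionRequired) {
            // The output size is not known up front in this case.
            return UNKNOWN;
        }
        return fileLength - strippedBomBytes;
    }

    public static void main(String[] args) {
        // 3-byte UTF-8 BOM + "TEST" is 7 bytes on disk, 4 bytes on the wire.
        System.out.println(declaredContentLength(7, 3, false)); // prints 4
    }
}
```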

Additionally, why does UTF-32 work? Its code lacks the 'skip' that all the
other encodings have:

UTF-8 skips and returns:
https://github.com/apache/tomcat/blob/6a667943c5da6b5d61ac6bec1d7c9de061e3217c/java/org/apache/catalina/servlets/DefaultServlet.java#L1275

UTF-32 does not skip, it just returns the encoding name:
https://github.com/apache/tomcat/blob/6a667943c5da6b5d61ac6bec1d7c9de061e3217c/java/org/apache/catalina/servlets/DefaultServlet.java#L1287
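
For comparison, here is a sketch of BOM detection that reports the number of
bytes to skip for every encoding alike, so the caller can both skip them and
correct the length. It is an illustration only, not the Tomcat implementation;
the BomSniffer and Bom names are made up for this example.

```java
class BomSniffer {
    /** Hypothetical result type: charset name plus number of BOM bytes. */
    static final class Bom {
        final String charset;
        final int length;

        Bom(String charset, int length) {
            this.charset = charset;
            this.length = length;
        }
    }

    /** Returns the BOM found at the start of head, or null if there is none. */
    static Bom detect(byte[] head) {
        // The 4-byte UTF-32 BOMs are checked first because FF FE 00 00 also
        // starts with the UTF-16LE BOM FF FE. A UTF-16LE file that begins
        // with a NUL character is therefore read as UTF-32LE here.
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) return new Bom("UTF-32BE", 4);
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) return new Bom("UTF-32LE", 4);
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return new Bom("UTF-8", 3);
        if (startsWith(head, 0xFE, 0xFF)) return new Bom("UTF-16BE", 2);
        if (startsWith(head, 0xFF, 0xFE)) return new Bom("UTF-16LE", 2);
        return null;
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'T', 'E', 'S', 'T'};
        Bom bom = detect(utf8);
        System.out.println(bom.charset + " skip=" + bom.length); // prints UTF-8 skip=3
    }
}
```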

See the attached test script, written for Micro Focus ZENworks, which uses
Tomcat. Micro Focus received this bug as report #02286060 "ZCM Webserver
2020.01 is not transparent to BOM and mishandling modified filesize" on -05-05
but declined to report it upstream on -06-21, giving this reason:

> Our engineering team come back with the analyses. Looking from ZENworks
> perspective there is no functionality impact. It seems Tomcat is used for your
> own for purpose, where the issue is happening. For that reason the suggestion
> is that you should report this case/scenario to the tomcat team. In case it
> will fixed from the Tomcat side, with every major ZENworks update a new
> version of Tomcat will be consumed.
