https://issues.apache.org/bugzilla/show_bug.cgi?id=54602
Mark Thomas changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
--- Comment #8 from Mark Thomas ---
Thanks again for this bug report. It prompted me to take a much closer look at
UTF-8 decoding and I found a number of edge cases in both URI processing and
request body processing.
trunk is now using the
--- Comment #7 from Mark Thomas ---
(In reply to comment #5)
> Where do you think I started talking about request bodies?
Sorry about the confusion. You were looking at 7.0.x and I was looking at
trunk. When you mentioned an InputStream I
--- Comment #6 from Remy Maucherat ---
Yes, this is probably confusing to you. Your problem with URI processing and
trunk was that I forgot to port a call to recycle, which likely caused a
problem there [and you can ignore the rest, it does
--- Comment #5 from NateC ---
Where do you think I started talking about request bodies?
org.apache.catalina.connector.Request uses a B2CConverter for URI decoding; the
variable is called URIConverter. B2CConverter uses ReadConverter, which
--- Comment #4 from Mark Thomas ---
The original report was about URI processing. Now you are talking about request
bodies.
There are multiple issues here. So far I have found / suspect:
a) Invalid sequences are not rejected quickly enough
--- Comment #3 from NateC ---
InputStreamReader defaults to replacement characters, so it won't reject those
characters; it just replaces them with the replacement character.
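As an illustration of the difference between replacing and rejecting, here is
a minimal sketch using only the JDK (no Tomcat classes). The class name, method
names and the sample input are hypothetical and chosen for this example; it is
not the code path Tomcat uses. It decodes the same malformed bytes once through
an InputStreamReader and once through a CharsetDecoder configured with
CodingErrorAction.REPORT.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8ReplaceVsReport {

    // Hypothetical sample: 'a', then 0xFF (never valid in UTF-8), then 'b'.
    static final byte[] INPUT = { 'a', (byte) 0xFF, 'b' };

    // Decode through InputStreamReader, which substitutes U+FFFD for
    // malformed input instead of failing.
    static String decodeWithReader(byte[] bytes) {
        try (InputStreamReader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decode with a decoder that reports malformed input; returns true
    // if the input was rejected.
    static boolean strictDecodeRejects(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return false;
        } catch (CharacterCodingException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Reader path: the bad byte becomes a replacement character.
        System.out.println(decodeWithReader(INPUT).equals("a\uFFFDb"));
        // Strict path: the bad byte causes the decode to fail.
        System.out.println(strictDecodeRejects(INPUT));
    }
}
```

With the reader, the caller never learns that the input was malformed; with
the reporting decoder, the failure surfaces as an exception that can be turned
into an error response.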
The underlying InputStreamReader holds on to those remaining bytes because it
is
--- Comment #2 from Mark Thomas ---
Part of the problem here is that the UTF-8 decoder should reject bytes 5-8 as
an invalid sequence but doesn't. That is a JVM bug that needs to be reported to
Oracle.
Given the widespread use of UTF-8 I s
--- Comment #1 from Mark Thomas ---
I do see code that is meant to recycle the converter. Do you have a test case /
can you write a Tomcat unit test that demonstrates that the converter isn't
being recycled?
Incomplete byte sequences shoul