On 29/06/2019 21:01, Mark Thomas wrote:
> On 29/06/2019 02:26, GitBox wrote:
>> alpire opened a new pull request #176: CoyoteAdapter: fix out-of-bounds read 
>> in checkNormalize
>> URL: https://github.com/apache/tomcat/pull/176
>>  
>>  
>>    On malformed requests, checkNormalize would throw an 
>> ArrayIndexOutOfBoundsException leading to a 500 response. This change fixes 
>> checkNormalize to return false instead of throwing exception on those 
>> inputs, and adds a few tests to check the new functionality.
> 
> That there are URI sequences that can trigger this is certainly
> something that we need to fix.
> 
> On a slightly different topic, this got me thinking.>
> checkNormalize() is a test that runs on every request. It is essentially
> there to ensure that whatever character set has been specified for the
> Connector is "safe". In this case "safe" means when the bytes are
> converted to characters we don't get any unexpected "." or "/"
> characters in the result that make the character version of the URI
> non-normalized.
> 
> Rather than run this test on every request, we could:
> - pre-calculate the character sets we consider to be safe
> - throw an exception if the user attempts to configure the Connector to
>   use one
> 
> I think we could safely exclude any character set where any of the
> following were not true:
> 
> - 0x00 <-> '\u0000'
> - 0x2e <-> '.'
> - 0x2f <-> '/'
> - 0x5c <-> '\'
> 
> We could create a unit test that maintains/checks the list of "safe"
> character set canonical names. After creating the character set in
> Connector.setURIEncoding(), if the canonical name of the resulting
> character set is not in the safe list, throw an exception.
> 
> Then remove checkNormalize() and replace with a comment that explains
> why the conversion is known to be safe.
> 
> There are several loops through the URI in checkNormalize(). I think -
> I'd need to test to confirm - that removing them would provide a
> measurable performance benefit.
> 
> Thoughts?

I thought about this some more over the weekend and I think it is a good
idea in theory but a bad idea in practice.

checkNormalize() serves a purpose that wasn't addressed in my comments
above. Broken decoders. There is a history of issues with Java's UTF-8
decoder that could be exploited in the way checkNormalize() is designed
to protect against. While I am reasonably confident Java's UTF-8 decoder
is now safe, I am not confident that every Java decoder is bug free.
Therefore, I think the checkNormalize() call needs to be retained for
every request.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to