Hi,

I am currently looking at the request line parsing. I'll try and set out
each issue in turn.

End of line parsing
===================

Prior to the recent changes, Tomcat allowed CRLF or LF to mark the end
of a line. The unwanted side effect was that CR could appear in the
header value. This caused problems and was tightened up to only allow
CRLF as a line terminator.

Currently Tomcat requires CRLF everywhere apart from the end of the
request line for an HTTP/0.9 request, where it also allows LF.
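
To make the current rule concrete, here is a minimal sketch in Java. It
is not Tomcat's actual parser: the class name, the method name and the
allowBareLf flag are all hypothetical. It finds the end of a line,
requires CRLF by default, accepts a bare LF only when the caller opts in
(the HTTP/0.9 request line case) and never accepts a CR that is not
immediately followed by LF.

```java
// Hypothetical helper for illustration only; not Tomcat's real parser.
final class LineTerminatorSketch {

    /**
     * Returns the number of characters before the line terminator, or -1
     * if the line is malformed or has no terminator.
     *
     * @param data        raw input starting at the beginning of a line
     * @param allowBareLf true to accept LF on its own (assumed to be used
     *                    only for the HTTP/0.9 request line)
     */
    static int contentLength(String data, boolean allowBareLf) {
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (c == '\r') {
                // CR is only valid as the first half of CRLF, so it can
                // never end up inside the line content.
                if (i + 1 < data.length() && data.charAt(i + 1) == '\n') {
                    return i;
                }
                return -1;
            }
            if (c == '\n') {
                // Bare LF: accepted only when explicitly allowed.
                return allowBareLf ? i : -1;
            }
        }
        return -1; // no terminator found
    }
}
```

With allowBareLf set to false, "Host: a" followed by CRLF is accepted
and the same line followed by only LF is rejected. Passing true
everywhere would correspond to accepting CRLF or LF on every line while
still keeping CR out of header values.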

This requirement to accept just LF as a line terminator first emerged in
the W3C spec [1]. RFC 1945 [2] and RFC 2616 [3] retained this as a
recommendation for all line terminators, but RFC 7230 [4] no longer
includes this recommendation.

RFC 7230 also removes the expectation that a server that supports
HTTP/1.1 will support HTTP/0.9.

Arguably the current spec for HTTP/0.9 is [3].

The Servlet spec references RFC 7230 and RFC 1945 so arguably HTTP/0.9
support is expected.


SP vs whitespace
================

Tomcat currently accepts any combination of SP and HTAB where RFC 7230
calls for a single SP. This stems from a recommendation in RFC 2616
which is no longer present in RFC 7230.
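
The difference between the two behaviours can be sketched as follows.
This is only an illustration: both method names are made up and neither
is taken from Tomcat's code. The lenient version skips any run of SP and
HTAB, which is what Tomcat accepts today. The strict version accepts
exactly one SP, as RFC 7230 specifies.

```java
// Hypothetical helpers for illustration only; not Tomcat's real parser.
final class SeparatorSketch {

    private static boolean isSpOrHtab(char c) {
        return c == ' ' || c == '\t';
    }

    /**
     * Lenient: skips one or more SP/HTAB characters starting at pos.
     * Returns the index of the first character after the run, or -1 if
     * there is no separator at pos.
     */
    static int skipLenient(String s, int pos) {
        int i = pos;
        while (i < s.length() && isSpOrHtab(s.charAt(i))) {
            i++;
        }
        return i > pos ? i : -1;
    }

    /**
     * Strict: requires exactly one SP at pos. Returns pos + 1, or -1 if
     * the character at pos is not SP or is followed by more whitespace.
     */
    static int skipStrict(String s, int pos) {
        if (pos >= s.length() || s.charAt(pos) != ' ') {
            return -1;
        }
        // A second SP or HTAB cannot start the next token, so reject it.
        if (pos + 1 < s.length() && isSpOrHtab(s.charAt(pos + 1))) {
            return -1;
        }
        return pos + 1;
    }
}
```

For a request line that starts with "GET", then SP HTAB SP, then the
target, skipLenient finds the target after the whole run whereas
skipStrict rejects the line.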


I think we have three options.

1. No changes.
   CRLF is required everywhere apart from HTTP/0.9 where LF is also
   accepted.
   Any combination of SP/HTAB is accepted where SP is required.

2. Tighten up as per RFC 7230
   a) Require CRLF for all line endings
   b) Require SP where specified
   c) Drop HTTP/0.9 support

3. Relax the recent changes to allow CRLF or LF as a line terminator
   everywhere without allowing CR to appear in a request header.

I think we should follow 1) for Tomcat 7, 8 & 9.

I'm leaning towards 1) for 10.0.x as well, with a view to discussing 2)
in the Servlet project, i.e. explicitly dropping HTTP/0.9 support and
the "Tolerant applications" requirements of RFC 1945 for Jakarta EE 10
(Tomcat 10.1.x).

In short, this means largely doing nothing apart from maybe adding a few
more tests to explicitly check behaviour for various edge cases.

I'll note that the recent change to requiring CRLF as a line terminator
caused regressions with valid HTTP/0.9 requests but that these should
now be resolved.

We have had one user issue reported where a custom client was using LFLF
as a line terminator and requests are now being rejected. Given that was
never valid, I'm OK with that.

Thoughts?

Mark



[1] https://www.w3.org/Protocols/HTTP/AsImplemented.html
[2] https://tools.ietf.org/html/rfc1945
[3] https://tools.ietf.org/html/rfc2616
[4] https://tools.ietf.org/html/rfc7230





With all of the above in mind I propose:

- Doing nothing! I think Tomcat is striking the right balance here.

This means:
GET /CRLF   -> processed as HTTP/0.9
GET /LF     -> processed as HTTP/0.9
GET / CRLF  -> processed as HTTP/1.1 and rejected as invalid
GET / LF    -> processed as HTTP/1.1 and rejected as invalid

I want to write some tests to check this is behaving as expected but I'm
not expecting any changes to the parsing at this point.
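
As a starting point for those tests, the four cases above can be written
down as a small classifier. This is a sketch of the expected outcomes,
not Tomcat's parser and not an existing Tomcat test. The class, enum and
method names are hypothetical. It also simplifies: fields are split on a
single SP only, so it does not model the SP/HTAB tolerance, and it does
not check the method name.

```java
// Hypothetical classifier for illustration only; not Tomcat's real parser.
final class RequestLineSketch {

    enum Outcome { HTTP_0_9, HTTP_1_X, REJECTED }

    static Outcome classify(String raw) {
        int end;
        boolean bareLf;
        if (raw.endsWith("\r\n")) {
            end = raw.length() - 2;
            bareLf = false;
        } else if (raw.endsWith("\n")) {
            end = raw.length() - 1;
            bareLf = true;
        } else {
            return Outcome.REJECTED; // no line terminator
        }
        String line = raw.substring(0, end);
        if (line.indexOf('\r') >= 0 || line.indexOf('\n') >= 0) {
            return Outcome.REJECTED; // stray CR or LF inside the line
        }
        int firstSp = line.indexOf(' ');
        if (firstSp <= 0) {
            return Outcome.REJECTED; // missing method or missing target
        }
        int secondSp = line.indexOf(' ', firstSp + 1);
        if (secondSp < 0) {
            // "method SP target" with no version field: HTTP/0.9.
            // Both CRLF and bare LF are accepted here.
            return line.length() > firstSp + 1
                    ? Outcome.HTTP_0_9 : Outcome.REJECTED;
        }
        // A second SP means a version field is expected, so the stricter
        // rules apply: CRLF only and a recognised version string.
        String target = line.substring(firstSp + 1, secondSp);
        String version = line.substring(secondSp + 1);
        if (bareLf || target.isEmpty()) {
            return Outcome.REJECTED;
        }
        if (version.equals("HTTP/1.0") || version.equals("HTTP/1.1")) {
            return Outcome.HTTP_1_X;
        }
        return Outcome.REJECTED;
    }
}
```

In this sketch "GET /" followed by CRLF or LF is classified as HTTP_0_9,
while "GET / " (with the trailing SP) followed by CRLF or LF is
classified as REJECTED because the version field is empty.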
