New submission from Meitar Moscovitz:
SUMMARY:
In the Python standard library, the BaseHTTPRequestHandler class’s
send_header() method[0] does not correctly construct HTTP/1.1 message headers
as described by Section 4.2 of RFC 2616[1] when it is given maliciously crafted
input, leaving applications that rely on it vulnerable to HTTP response
splitting[2] unless they take extra precautions themselves. A similar
vulnerability affects the class’s send_response_only() method, although it is
less likely to be exploitable in the wild. This second vulnerability can result
in HTTP response splitting due to incorrect construction of the Reason-Phrase
portion of an HTTP Status-Line.[3]
Since these APIs are designed to handle user-supplied input, it is reasonable
to assume that developers will expect the standard library to consume arbitrary
input safely. Unfortunately, the library fails to do that in these cases.
According to a simple GitHub code search, slightly more than 100,000
repositories are directly using BaseHTTPRequestHandler,[4] so it is possible
that a significant percentage of those applications are affected by this
vulnerability.
PYTHON VERSIONS AFFECTED:
* Current development tip at time of writing
(https://hg.python.org/cpython/file/tip/Lib/http/server.py#l511)
* Current stable version 3.6 release
(https://hg.python.org/cpython/file/3.6/Lib/http/server.py#l508)
* Current stable version 2.7.13 release
(https://hg.python.org/cpython/file/2.7/Lib/BaseHTTPServer.py#l412)
DETAILS:
According to the HTTP specification, the field content of an HTTP message
header *MUST NOT* contain the sequence CRLF (carriage return, line feed). The
RFC defines a message header as:
    message-header = field-name ":" [ field-value ]
    field-name     = token
    field-value    = *( field-content | LWS )
    field-content  = <the OCTETs making up the field-value
                     and consisting of either *TEXT or combinations
                     of token, separators, and quoted-string>
The RFC defines *TEXT to be the same as defined in RFC 822 section 3.3:[5]
    text        =  <any CHAR, including bare    ; => atoms, specials,
                    CR & bare LF, but NOT       ;  comments and
                    including CRLF>             ;  quoted-strings are
                                                ;  NOT recognized.
However, the send_header() method does not check that the message header
field-name and field-content are free of CRLF sequences. The vulnerable
Python 3.x code is in Lib/http/server.py on lines 507 and 508:
    self._headers_buffer.append(
        ("%s: %s\r\n" % (keyword, value)).encode('latin-1', 'strict'))
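To make the effect of that formatting expression concrete, the following is a
minimal sketch that applies the same expression outside of http.server. The
names keyword and value mirror the snippet above; the sample value and the
split into lines are illustrative assumptions, not part of the library.

```python
# Illustrative sketch: reproduces only the string formatting quoted above,
# not the full BaseHTTPRequestHandler machinery.
keyword = "Set-Cookie"
# Hypothetical attacker-controlled value containing embedded CRLF sequences.
value = "user=\r\nContent-Length: 6\r\n\r\nHACKED"

raw = ("%s: %s\r\n" % (keyword, value)).encode("latin-1", "strict")

# Splitting on CRLF shows the bytes as a recipient would read them, line by
# line: the intended header, an injected header, an empty line that ends the
# header block, and then attacker-chosen content.
lines = raw.split(b"\r\n")
for line in lines:
    print(line)
```

Because the empty element marks the end of the header block, a client would
treat everything after it as the start of the response body.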
An impacted application is one that passes user-provided input into the
send_header() method, as is common when setting HTTP cookies. An affected
application might run Python code such as:
    # assumes "from urllib import parse" at module level
    def do_POST(self):
        # receive user-supplied data from a POST’ed HTTP request
        form_input = parse.unquote_plus(
            self.rfile.read(int(self.headers.get('content-length'))).decode('utf8')
        ).split('=')
        username = form_input[1]  # extract a user-supplied value
        self.send_response(200)  # status line plus Server and Date headers
        # use that value, assuming library will provide safety!
        self.send_header('Set-Cookie', 'user={}'.format(username))
        self.end_headers()
        # ... send HTTP response body ...
Assuming the code above, this HTTP POST request…
    POST https://victim.example/ HTTP/1.1
    Host: victim.example
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 10

    user=alice
…would produce something like this (safe) HTTP response:
    HTTP/1.0 200 OK
    Server: BaseHTTP/0.6 Python/3.4.5
    Date: Thu, 19 Jan 2017 22:58:44 GMT
    Set-Cookie: user=alice

    ...HTTP RESPONSE BODY...
However, if an attacker supplies the following maliciously crafted HTTP POST
payload…
    POST https://victim.example/ HTTP/1.1
    Host: victim.example
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 46

    user=%0d%0aContent-Length: 6%0d%0a%0d%0aHACKED
…then the application would serve a page that simply read “HACKED” as its
output:
    HTTP/1.0 200 OK
    Server: BaseHTTP/0.6 Python/3.4.5
    Date: Thu, 19 Jan 2017 22:58:44 GMT
    Set-Cookie: user=
    Content-Length: 6

    HACKED
The remainder of the application’s original, intended HTTP response would be
ignored by the client. This allows the attacker to submit arbitrary data to
clients, including hijacking complete pages and executing XSS attacks.
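The decoding step behind this attack can be checked in isolation. The sketch
below assumes the same parsing as the do_POST() example (percent-decoding with
urllib.parse.unquote_plus, then splitting on '='); the variable names are
illustrative only.

```python
from urllib import parse

# The request body from the malicious example.
body = "user=%0d%0aContent-Length: 6%0d%0a%0d%0aHACKED"

# Length of the body in characters, to compare with the request's
# Content-Length header.
body_length = len(body)

# Percent-decoding turns each %0d%0a into raw CR and LF characters, which
# then flow unmodified into the header value.
username = parse.unquote_plus(body).split("=")[1]

print(body_length)
print(repr(username))
```

The decoded value begins with a CRLF, so the Set-Cookie header ends right
after "user=" and the attacker's own header and body follow.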
SOLUTION:
To fix this issue, ensure that CRLF sequences in the relevant user-supplied
values are replaced with a single SPACE character, as the RFC instructs.[1] For
example, Python code referencing
    value
should become
    value.replace("\r\n", ' ')
Similar replacements should be made to the other relevant arguments in the
affected methods.
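As a rough illustration of the proposed replacement, the sketch below wraps it
in two hypothetical helpers, sanitize_header_text() and format_header(). These
names are assumptions made for this example and do not exist in the standard
library; the attached patches may be structured differently.

```python
def sanitize_header_text(text):
    """Replace each CRLF sequence with a single space (hypothetical helper)."""
    return str(text).replace("\r\n", " ")


def format_header(keyword, value):
    """Build one header line from sanitized parts.

    Uses the same formatting expression as send_header(), but applies the
    proposed replacement to both arguments first.
    """
    line = "%s: %s\r\n" % (
        sanitize_header_text(keyword),
        sanitize_header_text(value),
    )
    return line.encode("latin-1", "strict")


# The attacker-controlled value from the example request.
malicious = "user=\r\nContent-Length: 6\r\n\r\nHACKED"
safe_line = format_header("Set-Cookie", malicious)
print(safe_line)
```

With the replacement applied, the only CRLF left in the result is the
terminator added by the formatting expression itself, so the injected text
stays inside a single Set-Cookie header. The sketch handles only the
two-character CRLF sequence, as proposed above.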
Two patches are attached to this email, one for each associated affected
version of Python. Apply each patch with:
    cd Python-$VERSION
    patch -p1 < python-$VERSION.patch
where $VERSION is the version of Python to be patched.