Jean-Marc Le Peuvedic added the comment:
The exception is raised in the start_response function provided by web.py's
WSGIGateway class in wsgiserver3.py:1997.
# According to PEP , when using Python 3, the response status
# and headers must be bytes masquerading as unicode; that is, they
# must be of type "str" but are restricted to code points in the
# "latin-1" set.
Therefore, header values must be strings whenever start_response is called.
WSGI servers must accumulate headers in some data structure and must call the
supplied "start_response" function only once they have gathered all the headers
and converted all the values to strings.
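
To make the rule concrete, here is a minimal sketch of the kind of check a
strict start_response performs. The function name check_wsgi_response and the
error messages are my own illustration, not the code from wsgiserver3.py.

```python
def check_wsgi_response(status, headers):
    """Hypothetical validator for the rule quoted above (not web.py's code)."""
    if not isinstance(status, str):
        raise TypeError("WSGI response status is not of type str.")
    for name, value in headers:
        if not isinstance(name, str):
            raise TypeError(
                "WSGI response header key %r is not of type str." % (name,))
        if not isinstance(value, str):
            raise TypeError(
                "WSGI response header value %r is not of type str." % (value,))
        # "Bytes masquerading as unicode": every code point must fit latin-1,
        # otherwise encode() raises UnicodeEncodeError.
        name.encode("latin-1", "strict")
        value.encode("latin-1", "strict")


# A str value is accepted silently.
check_wsgi_response("404 Not Found", [("Content-Length", "469")])

# An int value is rejected, which is the failure discussed in this report.
try:
    check_wsgi_response("404 Not Found", [("Content-Length", 469)])
except TypeError as exc:
    int_error = str(exc)
```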
The fault I observed is not strictly speaking caused by a bug in Python lib
"server.py". Rather, it is a component interaction failure caused by
inadequately defined semantics. The interaction between web.py and server.py is
quite complex, and no component is faulty when considered alone.
Let me explain:
Response and headers management in server.py is handled by three methods of
class BaseHTTPRequestHandler:
- send_response : puts response in buffer
- send_header : converts to string and adds to buffer
  ("%s: %s\r\n" % (keyword, value)).encode('latin-1', 'strict')
- end_headers : flushes buffer to socket
This implementation is correct even if send_header is called with an
int value.
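
A quick way to see why an int is harmless there: the "%s" conversion calls
str() on the value before the result is encoded. The helper name
format_header_line below is hypothetical. Only the expression inside it is
taken from the snippet above.

```python
def format_header_line(keyword, value):
    # Same expression as in send_header; %s applies str() to the value.
    return ("%s: %s\r\n" % (keyword, value)).encode("latin-1", "strict")


int_line = format_header_line("Content-Length", 469)
str_line = format_header_line("Content-Length", "469")

# Both calls produce the same bytes on the wire.
print(int_line == str_line)
```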
Now, web.py's application.py defines a "wsgi(env, start_resp)" function, which
gets plugged into the CherryPy WSGI HTTP server.
The server is an instance of class wsgiserver.CherryPyWSGIServer created in
httpserver.py:169 (digging deeper, actually at line 195).
This server is implemented as an HTTPServer configured to use gateways of type
class WSGIGateway_10 to handle requests.
A gateway is basically an instance of a class initialized with an HTTPRequest
instance, that has a "respond" method. Of course the WSGIGateway implements
"respond" as described in the WSGI standard: it calls the WSGI-compliant web
app, which is a function(environ, start_response(status, headers)) returning an
iterator (for chunked HTTP responses). The start_response function provided by
class WSGIGateway is where the failure occurs.
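
For readers unfamiliar with the calling convention, the sketch below shows the
shape of that interaction. Both demo_app and respond are hypothetical names of
mine, and respond is a heavily simplified stand-in for a gateway, not the
CherryPy implementation. It only assumes the standard library's
wsgiref.util.setup_testing_defaults to build a plausible environ.

```python
from wsgiref.util import setup_testing_defaults


def demo_app(environ, start_response):
    """A tiny WSGI application: all header values are already str."""
    body = b"not found\n"
    start_response("404 Not Found", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def respond(app):
    """Hypothetical, much simplified gateway: call the app, collect output."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real gateway validates status and headers at this point.
        captured["status"] = status
        captured["headers"] = list(headers)

    environ = {}
    setup_testing_defaults(environ)
    chunks = list(app(environ, start_response))
    return captured["status"], captured["headers"], b"".join(chunks)


status, headers, body = respond(demo_app)
```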
When the application calls web.py's app.run(), the function runwsgi in web.py's
wsgi.py gets called. This function determines whether it receives requests via
CGI or directly. In my case it starts an HTTP server using web.py's runsimple
function (file httpserver.py:158).
This function never returns, and runs the CherryPyWSGIServer, but it first
wraps the wsgi function in two WSGI middleware callables. Both are defined in
web.py's httpserver.py file. The interesting one is StaticMiddleware (line
281). Its role is to hijack URLs starting with /static, as is the case with my
missing CSS file. In order to serve those static resources quickly, its
implementation uses StaticApp (a WSGI function serving static stuff, defined at
line 225), which extends Python's SimpleHTTPRequestHandler. That's where the two
libraries connect.
StaticApp changes the way headers are processed using overloaded methods for
send_response, send_header and end_headers. This means that, when StaticApp
calls SimpleHTTPRequestHandler.send_head() to send the HEAD part of the
response, the headers are managed using the overloaded methods. When
send_head() finds out that my CSS file does not exist and calls send_error(), a
Content-Length header gets written, but it is not converted to a string, because
the overloaded implementation just stores the header name and value in a list
as they come.
When StaticApp has finished gathering headers using Python's send_head(), it
immediately calls start_response provided by WSGIGateway, where the failure
occurs.
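
The sequence above can be reproduced in miniature. This is only a sketch under
stated assumptions: RawHeaderCollector is a hypothetical stand-in for the
overloaded methods (it does not subclass SimpleHTTPRequestHandler), and
strict_start_response is a hypothetical stand-in for the gateway's check. The
int Content-Length mimics what send_error passes according to this report.

```python
class RawHeaderCollector:
    """Hypothetical stand-in: stores headers exactly as they come."""

    def __init__(self):
        self.status = None
        self.headers = []

    def send_response(self, code, message=""):
        self.status = "%s %s" % (code, message)

    def send_header(self, keyword, value):
        # No str() conversion and no latin-1 encoding happens here.
        self.headers.append((keyword, value))

    def end_headers(self):
        pass


def strict_start_response(status, headers):
    """Hypothetical gateway-side check: every header value must be str."""
    for name, value in headers:
        if not isinstance(value, str):
            raise TypeError(
                "header %s has a value of type %s, expected str"
                % (name, type(value).__name__))


collector = RawHeaderCollector()
collector.send_response(404, "Not Found")
collector.send_header("Content-Type", "text/html;charset=utf-8")
collector.send_header("Content-Length", 469)  # int, as in the failing case
collector.end_headers()

try:
    strict_start_response(collector.status, collector.headers)
    failure = None
except TypeError as exc:
    failure = str(exc)
```

Neither half is wrong in isolation: the collector faithfully records what it
was given, and the check faithfully enforces the WSGI rule. The TypeError only
appears when the two are combined.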
The bug in Python is not strictly that send_header gets called with an int in
send_error. Rather, it is a documentation bug which fails to mention that
send_header/end_headers MUST CONVERT TO STRING and ENCODE IN LATIN-1.
Therefore the correction I proposed is still invalid, because the combination
of web.py and server.py still does not properly encode the headers after the
correction.
In conclusion I would say that:
- In Python lib, the bug is a documentation bug, where documentation fails to
indicate that send_header and/or end_headers can receive header names or
values which are not strings and not encoded in strict latin-1, and that it is
their responsibility to convert and encode them.
- In web.py, the bug is that the implementation of the overloaded methods fails
to properly convert and encode the headers.
Of course, changing int to str does no harm and makes everything more
resilient, but does not fix the underlying bug.
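
As a rough illustration of what honouring that responsibility could look like
on the overriding side, here is a sketch. EncodingHeaderCollector is a
hypothetical class of mine, not a patch against web.py, and it checks only
that values are str and fit in latin-1.

```python
class EncodingHeaderCollector:
    """Hypothetical override set that converts and checks each header."""

    def __init__(self):
        self.status = None
        self.headers = []

    def send_response(self, code, message=""):
        self.status = "%s %s" % (code, message)

    def send_header(self, keyword, value):
        # Convert to str first, as the stock send_header does via "%s".
        name, text = str(keyword), str(value)
        # Raises UnicodeEncodeError if a code point falls outside latin-1.
        name.encode("latin-1", "strict")
        text.encode("latin-1", "strict")
        self.headers.append((name, text))

    def end_headers(self):
        pass


collector = EncodingHeaderCollector()
collector.send_response(404, "Not Found")
collector.send_header("Content-Length", 469)  # int in, str out
```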
--
___
Python tracker
<https://bugs.python.org/issue33663>
___