[issue17322] urllib.request add_header() currently allows trailing spaces (and other weird stuff)

karl Sun, 03 Mar 2013 11:14:32 -0800

karl added the comment:

Hello,


So I tested a bit. The production rules defined by the specification are clear. 
Spaces before and after the field name are forbidden. 

    header-field   = field-name ":" OWS field-value BWS
    field-name     = token
    field-value    = *( field-content / obs-fold )
    field-content  = *( HTAB / SP / VCHAR / obs-text )
    obs-fold       = CRLF ( SP / HTAB )
                   ; obsolete line folding
                   ; see Section 3.2.4

and 

  token = 1*tchar

and tchar as 

  tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
   "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA

These are the production rules for header fields in HTTP messages (so both 
requests and responses). 

You can have funky headers; I guess it would be interesting to add some to the 
urllib tests too. Basically, the library should have something that checks 
whether a header name contains only tchar characters, and issues a warning or 
raises an exception when it does not.
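
A minimal sketch of such a check, assuming the tchar set quoted above. The 
names is_valid_field_name and check_field_name are hypothetical and are not 
part of urllib; the strict flag is only an illustration of the "warning or 
exception" choice.

```python
import re
import warnings

# A token is one or more tchar, using the tchar set from the grammar above.
_TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_valid_field_name(name):
    """Return True if name consists only of tchar characters (non-empty)."""
    return _TOKEN_RE.fullmatch(name) is not None


def check_field_name(name, strict=False):
    """Warn (or raise ValueError when strict) on an invalid field name.

    Hypothetical helper: returns True for a valid name, False after warning.
    """
    if is_valid_field_name(name):
        return True
    message = "invalid HTTP header field name: %r" % (name,)
    if strict:
        raise ValueError(message)
    warnings.warn(message)
    return False
```

With this sketch, "Content-Type" passes, while "Content-Type " (trailing 
space), " Content-Type" (leading space) and the empty string are rejected.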

curl has a bug too, IMHO. Though, one might argue that it is practical for 
testing bugs. :)

On the parsing side, the handling is clear for trailing spaces but unclear for 
leading spaces. I sent a long email explaining the issue to the HTTP WG.

See http://lists.w3.org/Archives/Public/ietf-http-wg/2013JanMar/1166.html

Let's see what the answers will be.

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue17322>
_______________________________________
