[issue2193] Cookie Colon Name Bug
BM added the comment:

To Carsten Klein: It would be great if you read more carefully before posting something here.

NAME=VALUE

NAME is the cookie’s name, and VALUE is its value. Thus the header Set-Cookie: id=waldo sets a cookie with name id and value waldo. Both the cookie NAME and its VALUE may be any sequence of characters except semi-colon, comma, or whitespace.

The text above says "any sequence of characters" EXCEPT three characters:

1. semi-colon
2. comma
3. whitespace

In English this means that "any sequence of characters" INCLUDES a colon, and thus a colon IS a valid character.

BTW, this stupid bug is three years old, while the rest of the world implemented it right (Java, Ruby etc). Also, the Python implementation of this part is at least... strange (being polite here). Instead of excluding the illegal characters, it does the opposite: it lists everything that is allowed and then goes mad in the rest of the code... :-(

--
Python tracker <http://bugs.python.org/issue2193>
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue2193] Cookie Colon Name Bug
New submission from BM:

According to David M. Kristol, only comma, space and semi-colon are forbidden in a cookie name. However, Python's Cookie.py rejects a colon too. Meanwhile, the Java Cookie class in the servlet implementation allows a colon, and so does Perl. The fix would be to add the colon character to the _LegalChars variable.

--
components: Library (Lib)
messages: 63023
nosy: BM
severity: major
status: open
title: Cookie Colon Name Bug
versions: Python 2.4, Python 2.5, Python 2.6
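As a rough illustration of the proposed fix, here is a minimal sketch of allow-list validation for cookie names. The character set in LEGAL_CHARS and the helper is_legal_key are assumptions made up for this example; they are not copied from Cookie.py.

```python
import string

# Assumed allow-list for illustration only; not taken from Cookie.py.
LEGAL_CHARS = string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~"

# The proposed fix, as a sketch: extend the allow-list with a colon.
LEGAL_CHARS_WITH_COLON = LEGAL_CHARS + ":"


def is_legal_key(key, legal_chars=LEGAL_CHARS):
    """Return True if key is non-empty and every character is in the allow-list."""
    return bool(key) and all(ch in legal_chars for ch in key)
```

With the original allow-list, is_legal_key("section:key") is False; with LEGAL_CHARS_WITH_COLON it is True, while a name containing a semi-colon is still rejected.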
[issue2193] Cookie Colon Name Bug
BM added the comment:

OK, I see, and I agree that there is actually no standard here that we can call a standard. But let me try to put it another way:

1. The documentation refers to the same RFC 2109: http://docs.python.org/lib/module-Cookie.html But that RFC is slightly older than David's later edition.

2. David M. Kristol's cookie overview also says that only comma, semi-colon and space are not allowed. Here you go: http://arxiv.org/abs/cs.SE/0105018

3. Java implements the *same* RFC 2109 but supports a colon too, as opposed to the Python version. Here is the link to the source of Tomcat 6 (the latest one): http://www.google.com/codesearch?hl=en&q=show:okuSsOjruck:iKnUOb7eVzc:kvBYp8tS5ms&sa=N&ct=rd&cs_p=ftp://apache.mirrors.pair.com/tomcat/tomcat-6/v6.0.10/src/apache-tomcat-6.0.10-src.zip&cs_f=apache-tomcat-6.0.10-src/java/javax/servlet/http/Cookie.java&start=1 As you can see, 0x3a is not excluded. The snippet is:
---
private static final String tspecials = ",; ";

private boolean isToken(String value) {
    int len = value.length();
    for (int i = 0; i < len; i++) {
        char c = value.charAt(i);
        if (c < 0x20 || c >= 0x7f || tspecials.indexOf(c) != -1)
            return false;
    }
    return true;
}
---
I agree, Java is not a standard, just yet another (buggy) language. :-) Still, it means something...

4. The Perl module from CPAN does the same and allows a colon. http://search.cpan.org/~gaas/libwww-perl-5.808/lib/HTTP/Cookies.pm

5. You probably refer to the old Netscape spec (http://wp.netscape.com/newsref/std/cookie_spec.html), which for instance allows an unquoted "," in the expires field, so new parsers usually need a special ad-hoc way to get it right. The difference between the old cookie format and the new one is that the cookie name begins with a $. So the old format expects these cookies to be separated by a semi-colon, not a comma.

6. I am not very sure that the tokens you are talking about refer to the NAME of the Set-Cookie NAME=VALUE pair, because the same section allows white space between tokens, which is not quite true here. Moreover, braces etc. *are* allowed. Comma, space and semi-colon are disallowed because the parser has to know where each part starts and ends. Parsers do not care about the other symbols...

7. Maybe we should ask D. Kristol about this after all. :-)

Hm... What do you think? :)
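For comparison with the allow-list approach in Cookie.py, the Java isToken check quoted above can be transliterated to Python roughly as follows. This is only a sketch of the deny-list idea; the names TSPECIALS and is_token are chosen for this example and do not exist in Cookie.py.

```python
# Deny-list, mirroring the `tspecials` string in the quoted Java snippet.
TSPECIALS = ",; "


def is_token(value):
    """Reject control characters, non-ASCII characters, and comma, semi-colon, space."""
    for ch in value:
        code = ord(ch)
        if code < 0x20 or code >= 0x7F or ch in TSPECIALS:
            return False
    return True
```

Under this check a name such as "section:key" or "a[1]" passes, while "a;b" and "a b" are rejected. Like the Java original, it returns True for an empty string.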
[issue2193] Cookie Colon Name Bug
BM added the comment:

Well, as D. M. Kristol says, there is no standard for this particular topic. And an RFC is not a standard but a request for comments...

Personally, I added a colon to Cookie.py to stop Trac and other Python-based software from crashing, because such cookies appear quite often. For some reason people treat a colon as a namespace separator, like in XML, so cookies like "section:key=value" are common. But this is not a real fix, just a quick local workaround.

You can also find lots of cookies that contain "[" and "]", a slash, or even a space quoted as "%20", which means a "%" inside the token -- just look at what godaddy.com gives you. :-)

So I see another problem here: it is not just the colon; the implementation should be slightly different. Currently the Python code lists the allowed characters. I am not sure that is the correct way to implement it. I would rather list the disallowed characters (see the example in Java). Here is also an example of how Ruby does it (see lib/webrick/cookie.rb):
--
def self.parse(str)
  if str
    ret = []
    cookie = nil
    ver = 0
    str.split(/[;,]\s+/).each{|x|   # <--- Here you go.
      key, val = x.split(/=/,2)
      val = val ? HTTPUtils::dequote(val) : ""
      case key
      when "$Version"; ver = val.to_i
      when "$Path";    cookie.path = val
      when "$Domain";  cookie.domain = val
      when "$Port";    cookie.port = val
      else
        ret << cookie if cookie
        cookie = self.new(key, val)
        cookie.version = ver
      end
    }
    ret << cookie if cookie
    ret
  end
end
--
I still doubt that the cookie NAME is the same thing as a token, because the spec still disallows only comma, semi-colon and space. Well, I don't know, but we can either stick to the Request For Comments or behave like the rest of the world and avoid future interference.

It would be nice to hear some comments on policy... Tim?
--
nosy: +tim_one
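The splitting step of the Ruby parser quoted above can be sketched in Python as shown below. It covers only the split on ";" or "," followed by whitespace and the split on the first "="; the $Version, $Path, $Domain and $Port handling is left out, and the quote removal is a simplified stand-in for HTTPUtils::dequote. The function name parse_cookie_header is hypothetical.

```python
import re


def parse_cookie_header(header):
    """Return (name, value) pairs without restricting characters in the name."""
    pairs = []
    # Same separator pattern as the Ruby code: ';' or ',' followed by whitespace.
    for item in re.split(r"[;,]\s+", header):
        # Split on the first '=' only; a missing value becomes "".
        key, _, val = item.partition("=")
        # Simplified dequoting (assumption): drop one pair of surrounding quotes.
        if len(val) >= 2 and val[0] == '"' and val[-1] == '"':
            val = val[1:-1]
        pairs.append((key, val))
    return pairs
```

Because the name is never checked against an allow-list, a header such as "section:key=value; id=waldo" splits into two pairs without raising an error.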