Adding more confusion to the pile, HTML5[1] now specifies that JavaScript can set Unicode characters through document.cookie and that they must be encoded as UTF-8 in the header. Quick testing with Chrome shows it does just that (i.e. U+00E1 is sent as 0xC3 0xA1). If client-side and server-side application code are going to interoperate, then we would need to accept these characters in a Cookie header and allow them to be sent in a Set-Cookie header. However, this is ambiguous given the Netscape spec and its implicit assumption of ISO-8859-1: read that way, the same two octets are two characters (U+00C3 U+00A1) rather than one.
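To make the ambiguity concrete, here is a minimal sketch in plain Java. The class and method names are made up for illustration and are not Tomcat code; it only shows what a server would see if it decoded the octets Chrome sends using each of the two charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CookieCharsetDemo {

    // Octets a browser following the HTML5 rule would put in the Cookie header.
    static byte[] utf8Octets(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Server-side view: decode the raw header octets with a chosen charset.
    static String decode(byte[] octets, Charset charset) {
        return new String(octets, charset);
    }

    // Render octets as "0xC3 0xA1" style text for display.
    static String hex(byte[] octets) {
        StringBuilder sb = new StringBuilder();
        for (byte b : octets) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] wire = utf8Octets("\u00e1");

        // The two octets observed from Chrome.
        System.out.println(hex(wire));

        // Decoded as UTF-8 the original single character comes back.
        System.out.println(decode(wire, StandardCharsets.UTF_8).length());

        // Decoded as ISO-8859-1 every octet becomes its own character.
        System.out.println(decode(wire, StandardCharsets.ISO_8859_1).length());
    }
}
```

Nothing in the header says which of the two readings the sender intended, so whichever charset the parser picks will be wrong for some clients.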
[1] http://www.w3.org/html/wg/drafts/html/master/single-page.html#cookie

On Jan 1, 2014, at 10:18 AM, Jeremy Boynes <jboy...@apache.org> wrote:

> On Jan 1, 2014, at 8:59 AM, Mark Thomas <ma...@apache.org> wrote:
>
>> On 26/12/2013 19:23, Jeremy Boynes wrote:
>>> On Dec 26, 2013, at 2:47 AM, Mark Thomas <ma...@apache.org> wrote:
>>>
>>> Focusing on the 8-bit issue addressed by the patch, leaving the other
>>> RFC6265 thread for broader discussion ...
>>>
>>>>> The change only allows these characters in values if version ==
>>>>> 0 where Netscape’s rather than RFC2109’s syntax applies (per
>>>>> the Servlet spec). The Netscape spec is vague in that it does
>>>>> not define “OPAQUE_STRING” at all and defines “VALUE” as
>>>>> containing equally undefined “characters” although
>>>>> historically[1] those have been taken to be OCTETs as permitted
>>>>> by RFC2616’s “*TEXT” variant of “field-content.” The change
>>>>> will continue to reject these characters in names and in
>>>>> unquoted values when version != 0 (RFC2109’s “word” rule)
>>>>>
>>>>> [1] based on comments by Fielding et al. on http-state and
>>>>> what I’ve seen in the wild
>>>>
>>>> Can you provide references for [1]?
>>>
>>> This is the mail in the run up to RFC6265 that triggered the
>>> discussion:
>>> http://www.ietf.org/mail-archive/web/http-state/current/msg01232.html
>>
>> Thanks for that reference. What a complete mess. RFC6265 really
>> dropped the ball on this. The grammar for cookie-value is a disaster.
>> So far the issues include:
>> - no support for 0x80 to 0xFF
>> - no support for \" sequences
>> - no support for using whitespace, comma, semi-colon, backslash
>>
>> I was beginning to think that factoring out the cookie generation /
>> parsing and then providing different implementations (one for Netscape
>> + RFC2109 - roughly what we have now with a few fixes, one for RFC6265
>> and maybe one very relaxed) would be the way to go. Having looked at
>> the first issue that plan already looks like it needs a re-think.
>>
>> I'm still hoping that by documenting all the various issues in one
>> place we will be able to come up with a solution that both addresses
>> all the issues you have raised and is better than the handful of
>> system properties we have currently.
>
> I think they did a reasonable job given the mess cookies are in the wild
> today. They summarize this in the preamble:
>> The recommendations for cookie generation provided in Section 4 represent a
>> preferred subset of current server behavior, and even the more liberal
>> cookie processing algorithm provided in Section 5 does not recommend all of
>> the syntactic and semantic variations in use today.
>
> Section 4 recommends guidelines for servers generating cookies. I interpret
> that as being “if you follow these guidelines, you have a good chance of
> actually getting back the value you tried to set.” The rules above (no 8-bit,
> no escaping, no Netscape delimiters) reflect that principle. A server
> application can step outside those guidelines but "thar ther be dragons."
>
> --
> Jeremy