https://issues.apache.org/bugzilla/show_bug.cgi?id=55951

--- Comment #3 from Konstantin Kolinko <knst.koli...@gmail.com> ---
(In reply to Jeremy Boynes from comment #0)
> The HTML5 specification is specifying that cookie values may contain
> characters that are not part of US-ASCII or ISO-8859-1 and that those
> codepoints should be UTF-8 encoded for display.
> 
> http://www.w3.org/html/wg/drafts/html/master/single-page.html#cookie
>

What is the exact wording?

The link above is broken: the current version of that document has no "cookie"
anchor. All I can find are references to the [COOKIES] document (the
#refsCOOKIES anchor), which is RFC 6265.

http://tools.ietf.org/html/rfc6265


RFC 6265 does not allow non-ASCII characters in the cookie value of a
Set-Cookie header. Quoting Section 4.1.1 (Set-Cookie, Syntax):

 set-cookie-header = "Set-Cookie:" SP set-cookie-string
 set-cookie-string = cookie-pair *( ";" SP cookie-av )
 cookie-pair       = cookie-name "=" cookie-value
 cookie-name       = token
 cookie-value      = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
 cookie-octet      = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
                       ; US-ASCII characters excluding CTLs,
                       ; whitespace DQUOTE, comma, semicolon,
                       ; and backslash

The cookie-value is limited to US-ASCII, even when quoted.
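
As an illustration only (this is not Tomcat code, and the class and method
names are hypothetical), the grammar above could be checked in Java roughly
like this. The sketch assumes the value is already available as a String and
tests each char against the cookie-octet ranges:

```java
final class CookieValueCheck {

    // cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
    static boolean isCookieOctet(int c) {
        return c == 0x21
                || (c >= 0x23 && c <= 0x2B)
                || (c >= 0x2D && c <= 0x3A)
                || (c >= 0x3C && c <= 0x5B)
                || (c >= 0x5D && c <= 0x7E);
    }

    // cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
    static boolean isValidCookieValue(String value) {
        int start = 0;
        int end = value.length();
        // Accept one matching pair of surrounding double quotes.
        if (end >= 2 && value.charAt(0) == '"' && value.charAt(end - 1) == '"') {
            start++;
            end--;
        }
        for (int i = start; i < end; i++) {
            // Anything above 0x7E, including every non-ASCII char, is rejected.
            if (!isCookieOctet(value.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidCookieValue("abc123"));    // true
        System.out.println(isValidCookieValue("\"abc\""));   // true
        System.out.println(isValidCookieValue("a;b"));       // false
        System.out.println(isValidCookieValue("caf\u00e9")); // false
    }
}
```

Quoting does not widen the allowed set: a quoted value containing a space,
comma, or non-ASCII character is still rejected by this check.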

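By contrast, attributes (cookie-av) are defined more loosely. For example,
path-value accepts any CHAR except CTLs and ";". Note, however, that CHAR as
defined in RFC 5234 is still 7-bit US-ASCII (%x01-7F), so read strictly this
does not permit raw UTF-8 octets either: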

 path-av           = "Path=" path-value
 path-value        = <any CHAR except CTLs or ";">


For reference, the place where RFC 6265 mentions UTF-8 is Section 5.4 (The
Cookie Header). Quoting:

   NOTE: Despite its name, the cookie-string is actually a sequence of
   octets, not a sequence of characters.  To convert the cookie-string
   (or components thereof) into a sequence of characters (e.g., for
   presentation to the user), the user agent might wish to try using the
   UTF-8 character encoding [RFC3629] to decode the octet sequence.
   This decoding might fail, however, because not every sequence of
   octets is valid UTF-8.
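
To make the "decoding might fail" point concrete, here is a minimal sketch of
such a best-effort decode in Java. It is only an illustration, not what Tomcat
does; the class and method names are hypothetical. It uses a CharsetDecoder
configured to report malformed input instead of silently substituting U+FFFD:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class CookieOctetDecoder {

    // Returns the decoded text, or empty if the octets are not valid UTF-8.
    static Optional<String> tryDecodeUtf8(byte[] octets) {
        try {
            String text = StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(octets))
                    .toString();
            return Optional.of(text);
        } catch (CharacterCodingException e) {
            // Not every octet sequence is valid UTF-8.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = {0x63, 0x61, 0x66, (byte) 0xC3, (byte) 0xA9}; // "café" in UTF-8
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};            // "café" in ISO-8859-1
        System.out.println(tryDecodeUtf8(utf8).isPresent());   // true
        System.out.println(tryDecodeUtf8(latin1).isPresent()); // false
    }
}
```

A caller that gets an empty result still has the raw octets and can decide
how to present them.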
