https://issues.apache.org/bugzilla/show_bug.cgi?id=45406
--- Comment #11 from Ran Rubinstein <[EMAIL PROTECTED]> 2008-07-21 03:41:16 PST ---
(In reply to comment #10)

Will, I'm sorry to drag this on, but I want to understand fully where I'm wrong here.

AFAIK, an ASCII URL with one character represented in %-encoding, such as http://www.google.com/q=%D7%05, does represent a legal UTF-16 encoded URL. UTF-16 %-encoding does not mean that the client sends two bytes or a wchar for each letter in the URL. Rather, it sends the URL in ASCII, except for the parts of the query string that are not ASCII; those are %-encoded, with the bytes determined by the selected encoding (usually UTF-8). This is also the behavior of Java's built-in URLEncoder.encode() and URLDecoder.decode() functions.

So a UTF-16 encoded URL can look like this: http://www.google.com/q=%D7%05 and be legal. Is my concept completely off-base?

If this is true, I see no reason for Tomcat not to support it, except of course that the current architecture does not allow it, since the %-decoding and string-building classes are separate: ByteChunk expects, well, a chunk of bytes, which it translates to a string according to the given encoding, and UDecoder translates the URL to this chunk of bytes.

I suggest that instead, when processing URLs/URIs, Tomcat use a combined approach that is compatible with the %-encoding rule that only non-ASCII characters are %-encoded.
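To make the suggested combined approach concrete, here is a rough Java sketch. It is not Tomcat code: the class name CombinedPercentDecoder and its decode() method are made up for illustration, and the only library calls it relies on are from the JDK. Literal characters are copied through as they are, and each run of %XX escapes is collected into bytes and converted with the chosen charset.

```java
import java.io.ByteArrayOutputStream;
import java.net.URLEncoder;
import java.nio.charset.Charset;

// Hypothetical illustration only; these names do not exist in Tomcat.
class CombinedPercentDecoder {

    // Copies unescaped characters through unchanged and decodes each run of
    // consecutive %XX escapes as bytes in the named charset.
    static String decode(String s, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        StringBuilder out = new StringBuilder();
        ByteArrayOutputStream run = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%') {
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Bad hex digits at index " + i);
                }
                run.write((hi << 4) | lo);   // collect the raw byte
                i += 3;
            } else {
                flush(run, out, charset);    // a literal character ends the byte run
                out.append(c);
                i++;
            }
        }
        flush(run, out, charset);
        return out.toString();
    }

    // Converts any pending escaped bytes to characters and clears the buffer.
    private static void flush(ByteArrayOutputStream run, StringBuilder out, Charset charset) {
        if (run.size() > 0) {
            out.append(new String(run.toByteArray(), charset));
            run.reset();
        }
    }

    public static void main(String[] args) throws Exception {
        String url = "http://www.google.com/q=%D7%05";
        // The same two escaped bytes give different characters per charset.
        System.out.println(decode(url, "UTF-16LE"));
        System.out.println(decode(url, "UTF-16BE"));
        // The JDK encoder escapes U+05D7 differently depending on the charset.
        System.out.println(URLEncoder.encode("\u05D7", "UTF-16LE")); // %D7%05
        System.out.println(URLEncoder.encode("\u05D7", "UTF-8"));    // %D7%97
    }
}
```

Under this sketch the example URL decodes to U+05D7 when the bytes are read as UTF-16LE and to U+D705 when they are read as UTF-16BE. The bytes D7 05 are not a well-formed UTF-8 sequence, so the escapes only yield the intended character if the decoder knows which charset the client used.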