https://issues.apache.org/bugzilla/show_bug.cgi?id=48550

--- Comment #6 from Attila Király <kiralyattila...@gmail.com> 2010-12-18 05:54:48 EST ---
(In reply to comment #5)
> (In reply to comment #3)
> > - On client side FF 3.6, Chrome 8, Opera 11 and IE9 Beta (and as I found on the
> > web older versions too) use the character encoding of the page to encode the
> > query parameters. So if the html is served with utf-8 encoding the query
> > parameters are encoded with utf-8.
> 
> Could you provide references to the above? I had trouble finding official
> default values for the URL character encoding used by browsers.

I am afraid I cannot give official references. I tested the exact browser
versions mentioned above myself (with UTF-8 and ISO-8859-1 encoded pages and
links), and they behave as I described. The behavior is also mentioned in:
- Tomcat wiki: http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q9
"Many browsers are starting to offer (default) options of encoding URIs using
UTF-8 instead of ISO-8859-1. Some browsers appear to use the encoding of the
current page to encode URIs for links (see the note above regarding browser
behavior for POST encoding)."
- MozillaZine KB about the Firefox "network.standard-url.encode-query-utf8"
config property: 
http://kb.mozillazine.org/Network.standard-url.encode-query-utf8
"For compatibility with these websites, as well as parity with IE and Opera,
Mozilla now treats the query portion of a URI (the part following the ?)
differently than the rest.[...]
Encode the query portion of IRIs using the same encoding as the current page.
(Default)"
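
A small sketch of why this matters on the server side, assuming Java 10 or later. It uses the JDK's URLEncoder/URLDecoder as a stand-in for what a browser and a servlet container do (they are not the actual code of either, and URLEncoder applies form encoding rather than exact URI escaping). The class and method names are made up for illustration.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QueryEncodingDemo {
    // Stand-in for a browser encoding a query value with the page's charset.
    static String encodeAsBrowser(String value, Charset pageCharset) {
        return URLEncoder.encode(value, pageCharset);
    }

    // Stand-in for a container decoding with its configured URI encoding.
    static String decodeAsServer(String encoded, Charset uriEncoding) {
        return URLDecoder.decode(encoded, uriEncoding);
    }

    public static void main(String[] args) {
        String original = "\u00e1"; // a-acute
        String sent = encodeAsBrowser(original, StandardCharsets.UTF_8);
        System.out.println(sent); // %C3%A1

        // Matching charsets round-trip the value.
        System.out.println(
            decodeAsServer(sent, StandardCharsets.UTF_8).equals(original)); // true

        // A UTF-8 page with an ISO-8859-1 server yields two characters instead of one.
        System.out.println(
            decodeAsServer(sent, StandardCharsets.ISO_8859_1).equals(original)); // false
    }
}
```

With a UTF-8 page and a container decoding as ISO-8859-1, the single character arrives as two (U+00C3 followed by U+00A1), which is the mismatch discussed in this bug.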

Additionally Jetty is also using UTF-8 by default:
Jetty wiki:
http://docs.codehaus.org/display/JETTY/International+Characters+and+Character+Encodings#InternationalCharactersandCharacterEncodings-InternationalcharactersinURLs
"The W3C organization's HTML standard now recommends the use of UTF-8:
http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars and accordingly
jetty-6 series uses a default of UTF-8."

> 
> There's also the trouble of users being able to override the default and revert
> back to (most likely) ISO-8859-1 encoding.
> 
> Right now, I'm -1 for making URIEncoding="UTF-8" by default since it might
> break a lot of servers, but I'm willing to be convinced. For the record, I
> always set URIEncoding="UTF-8" on my projects but we don't want an
> out-of-the-box server configuration to surprise anyone.

This is true. However, it seems to me that the web is moving in a UTF-8 based
direction, so I think the default encoding in Tomcat should be changed at some
point. Since that is a backward compatibility issue, it should be made in a
major release. 7.0 could be that release; if it is not done now, the next
opportunity is 8.0. I am not saying developers cannot live without this change.
I can cope with it as I always have (I only mentioned my reasons here because
this issue was already open).

Probably my real problem is that query parameter decoding is inconsistent
between servlet containers and there is no way to control it on a per-webapp
basis (instead of a server-wide option) in Tomcat. I could use the
"useBodyEncodingForURI=true" attribute, but that is still a modification in
server.xml.

I would also be happy with a Jetty-like solution. In Jetty 7.2, UTF-8 is the
default for query decoding, but it can be overridden on a per-request basis with
request.setAttribute("org.eclipse.jetty.server.Request.queryEncoding",
"ISO-8859-1");. Tomcat could have something like that. Then in a filter I could
call:

// for Glassfish 3 query decoding; already done anyway, as it is needed for
// POSTs too in all servlet containers
request.setCharacterEncoding("UTF-8");
// for Jetty, just to be sure
request.setAttribute("org.eclipse.jetty.server.Request.queryEncoding", "UTF-8");
// for Tomcat 7 and up
request.setAttribute("org.apache.tomcat.Request.queryEncoding or similar", "UTF-8");

and get a safe, portable way for at least 3 servlet containers.
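
To make the idea concrete, here is a minimal sketch of how a container might honor such a per-request override, assuming Java 10 or later. It is not Jetty's or Tomcat's actual implementation: a plain Map stands in for the request attributes, and the class and method names are hypothetical. Only the attribute name is taken from the Jetty example above.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class QueryEncodingOverride {
    // Attribute name quoted above for Jetty 7.2.
    static final String QUERY_ENCODING_ATTR =
        "org.eclipse.jetty.server.Request.queryEncoding";

    // Hypothetical container logic: UTF-8 by default, overridable per request.
    static Charset resolveQueryCharset(Map<String, Object> requestAttributes) {
        Object override = requestAttributes.get(QUERY_ENCODING_ATTR);
        return override == null
            ? StandardCharsets.UTF_8
            : Charset.forName(override.toString());
    }

    // Decodes one raw query parameter value with the resolved charset.
    static String decodeParam(String rawValue, Map<String, Object> requestAttributes) {
        return URLDecoder.decode(rawValue, resolveQueryCharset(requestAttributes));
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = new HashMap<>();

        // No override: the UTF-8 default applies.
        System.out.println(decodeParam("%C3%A1", attributes).equals("\u00e1")); // true

        // A filter sets the attribute before parameters are read.
        attributes.put(QUERY_ENCODING_ATTR, "ISO-8859-1");
        System.out.println(decodeParam("%E1", attributes).equals("\u00e1")); // true
    }
}
```

The point of the design is that the webapp, not server.xml, chooses the encoding, and the choice has to be made before the first parameter is read.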
