If you take that form and post it - how does the server know that the
content is UTF-8? (Answer: it doesn't)
The HTML directives tell the browser to encode everything as UTF-8 on
the way to the web server. But nothing explicitly tells the web server
what the charset of the incoming request is.
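To make the gap concrete, here is a minimal standalone sketch (plain JDK, no servlet container) of what travels in a urlencoded POST body and what happens when the receiving side guesses the charset. The class and method names are made up for illustration; only `URLEncoder` and `URLDecoder` are real JDK classes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormDecodingDemo {

    // Percent-encode a field value the way a browser would for a UTF-8 form.
    static String encodeAsBrowser(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Decode the body with whichever charset the receiving side assumes.
    static String decodeAsServer(String body, String assumedCharset) {
        try {
            return URLDecoder.decode(body, assumedCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String typed = "\u65e5\u672c\u8a9e"; // three Japanese characters
        String body = encodeAsBrowser(typed);

        // The body is only percent-encoded bytes; it carries no charset label.
        System.out.println(body);

        String wrong = decodeAsServer(body, "ISO-8859-1");
        String right = decodeAsServer(body, "UTF-8");
        System.out.println(wrong.length());      // 9: one char per UTF-8 byte
        System.out.println(right.equals(typed)); // true
    }
}
```

The same bytes decode to two different strings, and nothing in the body itself says which reading is the intended one.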
See the servlet spec for more details, in particular:
4.9 Request data encoding
Currently, many browsers do not send a char encoding qualifier with the
Content-Type header, leaving open the determination of the character
encoding for reading HTTP requests. The default encoding of a request
the container uses to create the request reader and parse POST data must
be “ISO-8859-1” if none has been specified by the client request.
However, in order to indicate to the developer in this case the failure
of the client to send a character encoding, the container returns null
from the getCharacterEncoding method.
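The rule quoted above can be sketched as a small helper. This is not Tomcat's code; `declaredCharset` and `effectiveEncoding` are hypothetical names, and the parsing is deliberately simplified. It only illustrates the two behaviours the spec text describes: report null when the client sent no charset, and fall back to ISO-8859-1 when actually decoding.

```java
import java.util.Locale;

public class RequestCharset {

    // Default named in the quoted spec text.
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // Returns the charset parameter of a Content-Type value, or null when
    // the client did not send one (the case the spec text describes).
    static String declaredCharset(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters follow.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // Encoding actually used for parsing: declared value, else the default.
    static String effectiveEncoding(String contentType) {
        String declared = declaredCharset(contentType);
        return declared != null ? declared : DEFAULT_ENCODING;
    }

    public static void main(String[] args) {
        String plain = "application/x-www-form-urlencoded";
        String labelled = "application/x-www-form-urlencoded; charset=utf-8";
        System.out.println(declaredCharset(plain));      // null
        System.out.println(effectiveEncoding(plain));    // ISO-8859-1
        System.out.println(effectiveEncoding(labelled)); // utf-8
    }
}
```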
-Tim
Andre-John Mas wrote:
Thanks for the answer on this point. Reading section 3.7.1 of RFC 2616
indicates that a request can specify a character set other than the
default. For this reason the following should technically be legal:
<form action="" method="post"
enctype="application/x-www-form-urlencoded; charset=utf-8"
accept-charset="utf-8">
What I see, from testing on my Mac, is that Firefox and Safari fail to
pass the charset attribute, but Opera does. What I do notice here is
that even though Opera does specify the character set, Tomcat ignores
it, replacing the submitted Japanese characters with question marks.
This suggests that the UTF-8 was accepted but was then converted to
ISO-8859-1, for which no equivalent mapping was available. With
Firefox and Safari I get the same behaviour when I specify:
request.setCharacterEncoding("UTF-8");
Basically I am not getting the Japanese characters as typed in the form.
There is a problem here.
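For what it is worth, the question-mark symptom can be reproduced outside Tomcat. The sketch below uses plain JDK calls and made-up method names; it does not show where in the container the conversion happens, only that forcing Japanese text into ISO-8859-1 yields `?`, whereas UTF-8 bytes merely misread as ISO-8859-1 can still be recovered.

```java
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {

    // Lossy: characters with no ISO-8859-1 mapping are replaced by '?'.
    static String forceIntoLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Not lossy: UTF-8 bytes read as ISO-8859-1 give garbage, but every
    // byte survives as one char.
    static String misreadUtf8AsLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Undo the misreading by going back to bytes and decoding as UTF-8.
    static String recover(String misread) {
        byte[] bytes = misread.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "\u65e5\u672c\u8a9e"; // three Japanese characters
        System.out.println(forceIntoLatin1(typed)); // ???
        System.out.println(recover(misreadUtf8AsLatin1(typed)).equals(typed)); // true
    }
}
```

Seeing literal question marks rather than accented garbage therefore points at a lossy conversion somewhere along the way, not just a wrong guess when reading the bytes.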