Thanks for the answer on this point. Section 3.7.1 of RFC 2616 indicates that a request can specify a character set other than the default. For this reason, the following should technically be legal:

<form action="" method="post" enctype="application/x-www-form-urlencoded; charset=utf-8" accept-charset="utf-8">
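To make the expectation concrete, here is a minimal sketch of how a receiver could honour a charset parameter on the Content-Type of a form post, falling back to ISO-8859-1 when none is given. This is not Tomcat's code; the class and the helpers charsetOf and decodeValue are hypothetical names made up for illustration, and the fallback is my assumption about the container default.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class FormCharsetSketch {

    // Hypothetical helper: pull the charset parameter out of a Content-Type
    // value, falling back to ISO-8859-1 (assumed default) when it is absent.
    public static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    return Charset.forName(name);
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    // Hypothetical helper: percent-decode one form value using the charset
    // announced in the Content-Type header.
    public static String decodeValue(String encoded, String contentType) {
        try {
            return URLDecoder.decode(encoded, charsetOf(contentType).name());
        } catch (UnsupportedEncodingException e) {
            // Not expected: the charset was already resolved by Charset.forName.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Three Japanese characters (U+65E5 U+672C U+8A9E), percent-encoded as UTF-8.
        String value = "%E6%97%A5%E6%9C%AC%E8%AA%9E";
        String labelled = decodeValue(value, "application/x-www-form-urlencoded; charset=utf-8");
        String unlabelled = decodeValue(value, "application/x-www-form-urlencoded");
        System.out.println("with charset parameter: " + labelled.length() + " characters");
        System.out.println("without charset parameter: " + unlabelled.length() + " characters");
    }
}
```

With the parameter present, the nine bytes decode to the three characters that were typed; without it, each byte is read as a separate ISO-8859-1 character.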

What I see, from testing on my Mac, is that Firefox and Safari do not send the charset parameter, but Opera does. Even though Opera does specify the character set, Tomcat ignores it and replaces the submitted Japanese characters with question marks. This suggests that the UTF-8 input was accepted but then converted to ISO-8859-1, for which no equivalent mapping was available. With Firefox and Safari I get the same behaviour when I specify:

   request.setCharacterEncoding("UTF-8");

Basically I am not getting the Japanese characters as typed in the form. There is a problem here.
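As a rough illustration of why I read the question marks as a conversion to ISO-8859-1, the sketch below uses only the JDK and no servlet classes. The class and method names are made up for the example. It contrasts encoding Japanese text to ISO-8859-1, where unmappable characters are replaced, with merely reading UTF-8 bytes as ISO-8859-1, which gives garbled Latin-1 text rather than question marks.

```java
import java.nio.charset.StandardCharsets;

public class QuestionMarkSketch {

    // Encode to ISO-8859-1 and back. Characters with no ISO-8859-1 mapping
    // are replaced by the charset's replacement byte, which is '?'.
    public static String viaLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Take the UTF-8 bytes and misread them as ISO-8859-1. Every byte maps
    // to some character, so nothing is replaced, but the text is garbled.
    public static String utf8ReadAsLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String typed = "\u65E5\u672C\u8A9E"; // three Japanese characters
        System.out.println("encoded to ISO-8859-1: " + viaLatin1(typed));
        System.out.println("UTF-8 bytes misread, length: " + utf8ReadAsLatin1(typed).length());
    }
}
```

If this reading is right, the question marks mean the text was correctly decoded at some point and then pushed through an ISO-8859-1 encoder, rather than simply being decoded with the wrong charset.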

André-John

On 6-Oct-08, at 22:22 , William A. Rowe, Jr. wrote:

Andre-John Mas wrote:
Just to repeat what I stated in the ticket:

The problem I have with the suggested approach is that it treats UTF-8
as an exception, rather than the norm for my whole application server.
I am not sure that I should have to specify the encoding before handling
every request. For a web site that is completely in UTF-8 that is a lot
of duplicated code.

Because of RFC 2616, section 3.7.1:

  The "charset" parameter is used with some media types to define the
  character set (section 3.4) of the data. When no explicit charset
  parameter is provided by the sender, media subtypes of the "text"
  type are defined to have a default charset value of "ISO-8859-1" when
  received via HTTP. Data in character sets other than "ISO-8859-1" or
  its subsets MUST be labeled with an appropriate charset value. See
  section 3.4.1 for compatibility problems.

Also, I ask the question: why should we allow one behaviour for the URI
in the container and not allow the same for the POST?

Because the same does not apply; it's not a specific encoding.
Header fields are ISO-8859-1 per section 2.2, but URIs aren't defined
as *TEXT.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


