Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tomcat Wiki" for change notification.

The "Tomcat/UTF-8" page has been changed by KonstantinKolinko.
The comment on this change is: Removed all content of the page. The up-to-date version of all this is in FAQ/CharacterEncoding..
http://wiki.apache.org/tomcat/Tomcat/UTF-8?action=diff&rev1=13&rev2=14

--------------------------------------------------

+ This page is obsolete. See [[FAQ/CharacterEncoding|FAQ/CharacterEncoding]] for the up-to-date version.
- 1.
- JSP pages must include the header:
+ ----
+ CategoryObsolete
- {{{ <%@ page
-   contentType="text/html; charset=UTF-8"
- %> }}}
- 2.
- For translation of inputs coming back from the browser there must be a
- method that translates from the browser's ISO-8859-1 to UTF-8. ISO-8859-1
- is the default character encoding for servers and browsers according to the
- [[http://www.ietf.org/rfc/rfc2616.txt|HTTP specification]] section 3.4.1.
-
- {{{ /**
-  * Convert ISO-8859-1 format string (which is the default sent by IE
-  * to the UTF-8 format that the database is in.
-  */
- public String toUTF8(String isoString)
- {
-   String utf8String = null;
-   if (null != isoString && !isoString.equals(""))
-   {
-     try
-     {
-       byte[] stringBytesISO = isoString.getBytes("ISO-8859-1");
-       utf8String = new String(stringBytesISO, "UTF-8");
-     }
-     catch(UnsupportedEncodingException e)
-     {
-       throw new RuntimeException(e);
-     }
-   }
-   else
-   {
-     utf8String = isoString;
-   }
-   return utf8String;
- } }}}
- I have found that these three steps are all that is necessary to make your
- site accept any language that UTF-8 can work with. I extend my thanks to
- those of you on the Tomcat users list who helped me find these little gems.
-
- (from the tomcat-user mailing list)
-
- '''Note''' This method is not useful because it doesn't work with non-ASCII character. "stringBytesISO" is an ISO-8859-1 byte stream. We can't use it as an UTF-8 byte stream if it contains non-ASCII character.
-
- '''Alternative solution'''
-
- The solution suggested above works, but from the architecture perspective the correct way is to add a filter to the Tomcat that will do necessary correction for the application deployed without any additional changes to the rest of the code.
-
- 1. Make sure JSP header is set as suggested:
- {{{
- <%@ page contentType="text/html; charset=UTF-8"%>
- }}}
-
- 2. Example of filter:
-
- {{{import java.io.*;
- import java.util.*;
- import javax.servlet.*;
- import javax.servlet.http.*;
-
- public class CharsetFilter implements Filter
- {
-   private String encoding;
-
-   public void init(FilterConfig config) throws ServletException
-   {
-     encoding = config.getInitParameter("requestEncoding");
-
-     if( encoding==null ) encoding="UTF-8";
-   }
-
-   public void doFilter(ServletRequest request, ServletResponse response, FilterChain next)
-   throws IOException, ServletException
-   {
-     // Respect the client-specified character encoding
-     // (see HTTP specification section 3.4.1)
-     if(null == request.getCharacterEncoding())
-       request.setCharacterEncoding(encoding);
-
-     next.doFilter(request, response);
-   }
-
-   public void destroy(){}
- }
- }}}
-
- Corresponding portion of web.xml configuration will look like:
-
- {{{ <!--CharsetFilter start-->
-
- <filter>
-   <filter-name>Charset Filter</filter-name>
-   <filter-class>CharsetFilter</filter-class>
-   <init-param>
-     <param-name>requestEncoding</param-name>
-     <param-value>UTF-8</param-value>
-   </init-param>
- </filter>
-
- <filter-mapping>
-   <filter-name>Charset Filter</filter-name>
-   <url-pattern>/*</url-pattern>
- </filter-mapping>
-
- <!--CharsetFilter end-->}}}
-
- The suggested solution originates from [[http://people.comita.spb.ru/users/sergeya/java/ruschars.html|Sergey Astakhov (all texts are in russian)]] (serg...@comita.spb.ru)
-
- '''Important note''': Note that this filter should be as far towards the front of your filter chain as possible. If some other code calls request.getParameter (or a similar method) before this filter is invoked, then the encoding will not be set properly, and your parameters will still be decoded improperly.
-
- '''- TIP -'''
-
- Update the file $CATALINA_HOME/conf/server.xml for UTF-8 support by connectors.
- Example:
-
- {{{<Connector port="8080"
-   URIEncoding="UTF-8"/>}}}
-
- or
-
- {{{<Connector port="8080"
-   useBodyEncodingForURI="true"/>}}}
-
- * ''URIEncoding'' specifies the character encoding used to decode the URI.
- * ''useBodyEncodingForURI'' indicates whether to use the encoding specified in contentType (or explicitly set using Request.setCharacterEncoding() method) to decode the URI query parameters. The default value is set to "false".
-
- '''Note''' that this changes the behavior of reading GET parameters from the request URI and will not affect POST parameters at all.
-
- == See Also ==
- * http://wiki.apache.org/tomcat/Tomcat/UTF-8
- * http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/
-

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org
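
As an illustration of the removed notes on URI decoding and on the toUTF8 helper, here is a minimal, self-contained Java sketch of why the charset used to decode a request URI matters. It is not Tomcat code. The class name UriDecodingSketch, its two helper methods, and the sample value caf%C3%A9 are hypothetical and chosen only for this example, and java.net.URLDecoder merely stands in for the connector's URI decoding, which it approximates at best. FAQ/CharacterEncoding remains the reference for current guidance.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class UriDecodingSketch {

    // Percent-decode a query value using the given charset name.
    // Stand-in (assumption) for what a connector does with the request URI.
    static String decodeAs(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Re-interpret a string that was wrongly decoded as ISO-8859-1:
    // recover the original bytes, then decode those bytes as UTF-8.
    static String repair(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "café" sent by a browser as UTF-8 and percent-encoded.
        String raw = "caf%C3%A9";

        // Decoded with the right charset: c, a, f, U+00E9.
        String good = decodeAs(raw, "UTF-8");

        // Decoded as ISO-8859-1: each byte becomes one char (U+00C3, U+00A9).
        String bad = decodeAs(raw, "ISO-8859-1");

        System.out.println(good.equals("caf\u00e9"));
        System.out.println(bad.equals("caf\u00c3\u00a9"));
        System.out.println(repair(bad).equals(good));
    }
}
```

The repair step only recovers text whose bytes really were UTF-8 on the wire. Applied to a string that was genuinely ISO-8859-1, it would corrupt non-ASCII characters, which may be what the removed note on toUTF8 was warning about. Decoding with the right charset in the first place avoids the problem.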