Nicolas and Yonik,

Thank you both for your excellent responses--this fixed my problem. Now it's time to go back and remove all the hacks I was using to pin this thing together without proper utf-8 support.
Thanks again,
Peter

[EMAIL PROTECTED] wrote:
I think Tomcat defaults to the operating system's default encoding, e.g. cp1252 on a
typical Windows installation.

You need to add the attribute URIEncoding="UTF-8" to the Connector you use in
the server.xml configuration file.
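
As a rough sketch of what that could look like (the port, protocol, timeout and
redirectPort values here are only assumptions borrowed from a stock Tomcat
install; keep whatever your existing Connector already has and just add the
URIEncoding attribute):

```xml
<!-- server.xml: hypothetical HTTP connector; only URIEncoding is the actual change -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8" />
```

Tomcat has to be restarted for the change to take effect.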

Nicolas

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of Yonik Seeley
Sent: Friday, March 7, 2008 18:53
To: solr-user@lucene.apache.org
Subject: Re: Illegal xml/html character; unicode problems near solr

On Fri, Mar 7, 2008 at 12:30 PM, Peter Cline <[EMAIL PROTECTED]> wrote:
 The following is a snippet of a link to use a facet:
 search-faceted.html?q=[* TO
 *]&amp;facet=true&amp;rows=25&amp;fq=name_facet:&#34;Brasseur de
 Bourbourg, abb%C3%A9, 1814-1874, former owner&#34;"

 These characters are correctly specified. When it returns, I get an
 illegal character error. Examining the XML, I get an fq value of:
 name_facet:"Brasseur de Bourbourg, abbÃÂ(c), 1814-1874, former owner"

Is this bad XML part of the responseHeader (parameters that are simply
being echoed back)?
If so, it's most likely the config on whatever servlet container you
are using... you need to configure it to accept UTF-8 URLs rather than
latin-1 (Tomcat defaults to the old-style latin-1 AFAIK)
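
To illustrate the kind of mismatch I mean, here is a small Python sketch (not
Solr or Tomcat code, just a stand-in for what the container does when it
decodes the URL). It decodes the same percent-encoded bytes from the link
once as UTF-8 and once as latin-1:

```python
from urllib.parse import unquote

# Fragment of the fq value from the link: %C3%A9 is the UTF-8 encoding of "é".
raw = "abb%C3%A9"

# What the client intended: the two bytes decoded together as UTF-8.
as_utf8 = unquote(raw, encoding="utf-8")

# What a container configured for latin-1 produces: each byte becomes
# its own character, so one "é" turns into two junk characters.
as_latin1 = unquote(raw, encoding="latin-1")

print(repr(as_utf8))
print(repr(as_latin1))
```

The latin-1 result has one character more than the UTF-8 one, which is the
same sort of garbling that shows up in the echoed fq value.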

-Yonik
