Hi,
If you use Solr 4.8.1, you no longer have to add URIEncoding="UTF-8" to the
Tomcat configuration file:
https://wiki.apache.org/solr/SolrTomcat
Regards,
Aurélien MAZOYER
On 29.07.2014 14:22, Gulliver Smith wrote:
I have Solr 4.8.1 under Tomcat 7 on Debian Linux. The connector in
Tomcat's server.xml has been changed to include character encoding
UTF-8:
<Connector port="8080" protocol="HTTP/1.1"
URIEncoding="UTF-8"
connectionTimeout="20000"
redirectPort="8443" />
I am posting to the server from PHP 5.5 using curl. The extract POST was
intercepted, and I confirmed that everything is being encoded in UTF-8.
However, the responses to query commands, whether XML or JSON, return
field values such as title_fr in something that looks like Latin-1
(ISO-8859-1) when displayed in a browser or editor.
E.g.: "title_fr":[" appelé au téléphone"]
The highlights in the query response, however, do display their
characters correctly.
E.g. "text_fr":[" \n \n \n \n \n \n \n \n \n \n \nappelé au
téléphone\nappelé au téléphone\n
PHP's utf8_decode() doesn't make sense of the title_fr value.
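To make the symptom concrete, here is a minimal Python sketch (not Solr or
PHP code) of how UTF-8 text looks when a viewer reads its bytes as
ISO-8859-1, and of how one accidental extra round of encoding could be
undone. The helper names misread_as_latin1 and repair_double_encoding are
hypothetical, and whether double encoding is really what happens to
title_fr is only an assumption.

```python
# Illustration only: these helpers are hypothetical and are not part of
# Solr, Tomcat, or PHP.

def misread_as_latin1(text):
    """Return what a viewer shows if it reads UTF-8 bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair_double_encoding(text):
    """Undo one accidental Latin-1 -> UTF-8 re-encoding, if there was one.

    If the reverse step fails, the text was probably not double encoded,
    so it is returned unchanged.
    """
    try:
        return text.encode("iso-8859-1").decode("utf-8")
    except UnicodeError:
        return text


if __name__ == "__main__":
    title = "appel\u00e9 au t\u00e9l\u00e9phone"
    garbled = misread_as_latin1(title)
    print(garbled)                          # each accented letter becomes two characters
    print(repair_double_encoding(garbled))  # back to the original title
```

If the reverse step recovers readable text from a stored value, that would
suggest the value was encoded twice somewhere before indexing rather than
damaged in the response.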
Is there something to configure to fix this and get proper UTF-8
results for everything?
Thanks
Gulliver