Jack,

thanks for the hint, but we have already set URIEncoding="UTF-8" on all our Tomcat instances, too.

Regards
Andreas

>>> "Jack Krupansky" 18.10.12 17.11 Uhr >>>
It may be that your container does not have UTF-8 enabled. For example, with Tomcat you need something like:

Make sure your "Connector" element has URIEncoding="UTF-8" (for Tomcat.)

-- Jack Krupansky

-----Original Message-----
From: Andreas Kahl
Sent: Thursday, October 18, 2012 10:53 AM
To: solr-user@lucene.apache.org
Subject: How to retrieve field contents as UTF-8 from Solr-Index with SolrJ

Hello everyone,

we are trying to implement a simple Servlet that queries a Solr 3.5 index with SolrJ. The query we send is an identifier, in order to retrieve a single record. From the result we extract one field to return. This field contains an XML document with characters from several European and Asian alphabets, so we need UTF-8.

Now we have the problem that the string returned by

marcXml = results.get(0).getFirstValue("marcxml").toString();

is not valid UTF-8, so the resulting XML document is not well formed.

Here is what we do in Java:
<<
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("q", query.toString());
params.set("fl", "marcxml");
params.set("rows", "1");
try {
    QueryResponse result = server.query(params, SolrRequest.METHOD.POST);
    SolrDocumentList results = result.getResults();
    if (!results.isEmpty()) {
        marcXml = results.get(0).getFirstValue("marcxml").toString();
    }
} catch (Exception ex) {
    Logger.getLogger(MarcServer.class.getName()).log(Level.SEVERE, null, ex);
}
>>

Charset.defaultCharset() is "UTF-8" on both the querying machine and the Solr server. We also tried BinaryResponseParser as well as XMLResponseParser when instantiating CommonsHttpSolrServer.

Does anyone have a solution to this? Is this related to https://issues.apache.org/jira/browse/SOLR-2034 ? Is there possibly a workaround?

Regards
Andreas
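
A note on the symptom described above: a Java String is held as UTF-16 internally, so text that "is not valid UTF-8" usually means that bytes were decoded or re-encoded with the wrong charset at some step (for example, UTF-8 bytes read as ISO-8859-1). The following is a minimal diagnostic sketch, not part of the original thread and not a confirmed fix for this case. It uses only the JDK (Java 7 or later); the class name Utf8Check and the methods decodeStrict and repairIfMisdecoded are hypothetical names chosen for illustration. It checks whether a string looks like UTF-8 that was mis-decoded as ISO-8859-1 and, if so, reverses that step.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of SolrJ or of the original thread.
class Utf8Check {

    /** Decodes bytes as UTF-8 and throws on malformed input instead of substituting U+FFFD. */
    public static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    /**
     * Heuristic: if s could be UTF-8 bytes that were decoded as ISO-8859-1,
     * return the re-decoded text; otherwise return s unchanged.
     */
    public static String repairIfMisdecoded(String s) {
        // A char above U+00FF cannot come from an ISO-8859-1 decode, so leave s alone.
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return s;
            }
        }
        // ISO-8859-1 maps each char 0x00..0xFF back to exactly one byte.
        byte[] raw = s.getBytes(StandardCharsets.ISO_8859_1);
        try {
            return decodeStrict(raw);
        } catch (CharacterCodingException e) {
            // The bytes are not valid UTF-8, so s was probably not mis-decoded this way.
            return s;
        }
    }

    public static void main(String[] args) {
        // "Müller 東京", written with escapes so the source file encoding does not matter.
        String original = "M\u00FCller \u6771\u4EAC";
        // Simulate the suspected fault: UTF-8 bytes read as ISO-8859-1.
        String garbled = new String(original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(original));                      // prints false
        System.out.println(repairIfMisdecoded(garbled).equals(original));  // prints true
    }
}
```

If repairIfMisdecoded changes the value of marcXml, the string was most likely mis-decoded before it reached the servlet code. If it returns the value unchanged, the output side (for example, the character encoding set on the servlet response before its writer is obtained) would be the next place to look. The check is a heuristic: text that legitimately consists of Latin-1 characters forming a valid UTF-8 byte sequence would also be rewritten.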