Some servlet containers don't handle UTF-8 correctly out of the box. The Solr wiki has information on configuring them.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Monday, October 01, 2007 9:45 AM
To: solr-user@lucene.apache.org
Subject: Re: Searching combined English-Japanese index

On 10/1/07, Maximilian Hütter <[EMAIL PROTECTED]> wrote:
> Yonik Seeley wrote:
> > On 10/1/07, Maximilian Hütter <[EMAIL PROTECTED]> wrote:
> >> When I search using an English term, I get results but the Japanese
> >> is not encoded correctly in the response. (although it is UTF-8
> >> encoded)
> >
> > One quick thing to try is the python writer (wt=python) to see the
> > actual unicode values of what you are getting back (since the python
> > writer automatically escapes non-ascii). That can help rule out
> > incorrect charset handling by clients.
> >
> > -Yonik
>
> Thanks for the tip, it turns out that the unicode values are wrong...
> I mean the browser displays correctly what is sent. But I don't know
> how solr gets these values.

OK, so they never got into the index correctly.
The most likely explanation is that the charset wasn't set correctly
when the update message was sent to Solr.

-Yonik
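
For anyone hitting the same thing, here is a rough sketch of what "setting the charset when the update message is sent" could look like from a Python client. It only illustrates the idea and is not taken from the thread: the update URL, the field names (id, title), and the helper names build_update_request and misdecoded are all assumptions for the example. The second helper shows how the same UTF-8 bytes turn into wrong values if the receiving side reads them as ISO-8859-1, which is one way bad values could end up in an index.

```python
import urllib.request
from xml.sax.saxutils import escape

# Assumption: a local Solr instance; adjust host, port, and path for your setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_update_request(doc_id, title, url=SOLR_UPDATE_URL):
    """Build (but do not send) an XML update request that declares UTF-8.

    The field names "id" and "title" are hypothetical; use your schema's fields.
    """
    xml = (
        "<add><doc>"
        f'<field name="id">{escape(doc_id)}</field>'
        f'<field name="title">{escape(title)}</field>'
        "</doc></add>"
    )
    # Encode the body as UTF-8 and say so in the Content-Type header,
    # so the server does not have to guess the charset.
    return urllib.request.Request(
        url,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def misdecoded(text):
    """Return what a receiver sees if it reads UTF-8 bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


if __name__ == "__main__":
    req = build_update_request("1", "日本語")
    print(req.get_header("Content-type"))
    # ascii() escapes non-ASCII characters, which makes wrong values easy to spot.
    print(ascii("日本語"))
    print(ascii(misdecoded("日本語")))
    # To actually send the request: urllib.request.urlopen(req)
```

Comparing the two escaped strings makes the failure visible: the correctly handled text is three characters, while the misread version is nine unrelated Latin-1 characters. If the escaped values coming back from Solr look like the second form, the bytes were probably misinterpreted on the way in rather than on the way out.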