RE: Searching combined English-Japanese index

Lance Norskog Mon, 01 Oct 2007 12:10:16 -0700

Some servlet containers don't do UTF-8 out of the box. There is information
about this on the wiki.


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Monday, October 01, 2007 9:45 AM
To: solr-user@lucene.apache.org
Subject: Re: Searching combined English-Japanese index

On 10/1/07, Maximilian Hütter <[EMAIL PROTECTED]> wrote:
> Yonik Seeley schrieb:
> > On 10/1/07, Maximilian Hütter <[EMAIL PROTECTED]> wrote:
> >> When I search using an English term, I get results but the Japanese 
> >> is not encoded correctly in the response. (although it is UTF-8 
> >> encoded)
> >
> > One quick thing to try is the python writer (wt=python) to see the 
> > actual unicode values of what you are getting back (since the python 
> > writer automatically escapes non-ascii).  That can help rule out 
> > incorrect charset handling by clients.
> >
> > -Yonik
> >
> Thanks for the tip, it turns out that the unicode values are wrong... 
> I mean the browser displays correctly what is sent. But I don't know 
> how solr gets these values.
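
For illustration only, here is a minimal Python sketch of what "wrong unicode values" typically look like when UTF-8 bytes are decoded with a single-byte charset somewhere along the way. The sample string, the helper name `escaped`, and the choice of Latin-1 as the wrong charset are assumptions; nothing in the thread says which charset was actually applied.

```python
def escaped(s):
    """Render a string with non-ASCII characters escaped (hypothetical helper),
    similar in spirit to inspecting escaped output from wt=python."""
    return s.encode("unicode_escape").decode("ascii")

# Assumed sample text: the three characters of "Japanese language".
good = "\u65e5\u672c\u8a9e"

# Simulate the suspected failure: UTF-8 bytes read as Latin-1.
mojibake = good.encode("utf-8").decode("latin-1")

print(escaped(good))      # \u65e5\u672c\u8a9e
print(escaped(mojibake))  # \xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e

# If this is the failure mode, the damage is reversible.
repaired = mojibake.encode("latin-1").decode("utf-8")
```

Three code points above U+3000 are what a correctly indexed value should show; nine code points below U+0100 suggest the bytes were decoded with the wrong charset before indexing.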

OK, so they never got into the index correctly.
The most likely explanation is that the charset wasn't set correctly when
the update message was sent to Solr.

-Yonik
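
A rough sketch of setting the charset explicitly on an update message, in Python. The update URL, the field names `id` and `text`, and the helper name `build_update_request` are assumptions for illustration, not details from this thread; adjust them to the actual install and schema.

```python
import urllib.request
from xml.sax.saxutils import escape

def build_update_request(doc_id, text,
                         url="http://localhost:8983/solr/update"):  # assumed URL
    """Build (but do not send) an XML update request encoded as UTF-8."""
    xml = (
        '<add><doc>'
        '<field name="id">%s</field>'      # assumed field name
        '<field name="text">%s</field>'    # assumed field name
        '</doc></add>'
    ) % (escape(doc_id), escape(text))
    body = xml.encode("utf-8")  # the bytes on the wire are UTF-8
    # Declare the same charset in the header so the server does not
    # fall back to a default such as ISO-8859-1.
    headers = {"Content-Type": "text/xml; charset=utf-8"}
    return urllib.request.Request(url, data=body, headers=headers)

req = build_update_request("doc1", "\u65e5\u672c\u8a9e")
# To actually send it: urllib.request.urlopen(req)
```

The point is that the body encoding and the declared charset agree; a mismatch between the two is one way the values could end up wrong in the index.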
