Re: Cyrillic characters

Tricia Williams Wed, 19 Jul 2006 08:29:59 -0700

Hi Yonik,

I was incorrect to describe it as _solr encoding_. Hoss suggested that it might be a form error - I haven't checked this yet but it sounds plausible. What I called the _solr url encoding_ was the q= parameter translated into <I'm not sure what> encoding in the url. As I mention in my PS, this translated value is not the same as when I use IE to post the same form values.

You mentioned in another earlier post that q=h%c3%e9 would find matching hits. My experience shows that while the UTF-8 encoded query doesn't generate any exceptions, no results are matched. However, q=h%e9llo would find matching results (the result set I'd match in Luke). So assuming that I can fix the form encoding errors so that the characters are encoded as UTF-8, I believe that I would continue to return incorrect results. Will Cyrillic characters be treated any differently than the diacritic in your example?
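To make the difference between the two query forms concrete, here is a minimal Python sketch (not part of the original exchange) that percent-encodes the same word as UTF-8 and as ISO-8859-1, then decodes each form with the "wrong" charset. The sample word "héllo" is an assumption based on the q=h%e9llo example. One possible explanation for the behaviour described, not confirmed in this thread, is that the servlet container decodes the query string as ISO-8859-1; Tomcat's connector has a URIEncoding attribute that controls this.

```python
from urllib.parse import quote, unquote

word = "héllo"  # assumed sample word, taken from the q=h%e9llo example

# The same word percent-encoded under two different charsets.
utf8_form = quote(word, encoding="utf-8")      # 'h%C3%A9llo'
latin1_form = quote(word, encoding="latin-1")  # 'h%E9llo'

# Decoding with the matching charset round-trips cleanly.
assert unquote(utf8_form, encoding="utf-8") == word
assert unquote(latin1_form, encoding="latin-1") == word

# Decoding UTF-8 bytes as ISO-8859-1 yields two characters instead of one,
# so the term no longer matches what was indexed.
misread = unquote(utf8_form, encoding="latin-1")

print(utf8_form, latin1_form, repr(misread))
```

If the container really does decode as ISO-8859-1, a UTF-8 encoded query would arrive as the misread two-character form and match nothing, while the single-byte %e9 form would match, which is consistent with the symptoms above.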


   I have solr running in tomcat 5.5.17.

Thanks for all your help,
Tricia


On Tue, 18 Jul 2006, Yonik Seeley wrote:

On 7/18/06, Tricia Williams <[EMAIL PROTECTED]> wrote:

 My sample query is: ...... (the english word _canada_
translated into russian) or
%D0%9A%D0%B0%D0%BD%D0%B0%D0%B4%D0%B0 (utf-8) or
%26%231050%3B%26%231072%3B%26%231085%3B%26%231072%3B%26%231076%3B%26%231072%3B
(solr url encoding)
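As an aside, the two encoded values quoted above can be inspected with a short Python sketch (an illustration added here, not something from the thread). It suggests that the third form is not a Solr-specific encoding but the percent-encoded version of HTML numeric character references, which is what some browsers submit when a form's charset cannot represent the typed characters. That reading is an assumption about the cause; the decoding itself is mechanical.

```python
import html
from urllib.parse import unquote

# Values copied from the quoted message.
utf8_form = "%D0%9A%D0%B0%D0%BD%D0%B0%D0%B4%D0%B0"
third_form = (
    "%26%231050%3B%26%231072%3B%26%231085%3B"
    "%26%231072%3B%26%231076%3B%26%231072%3B"
)

# Percent-decoding the UTF-8 form gives the Cyrillic word directly.
decoded_utf8 = unquote(utf8_form, encoding="utf-8")

# Percent-decoding the third form gives HTML numeric character references,
# i.e. plain ASCII text such as "&#1050;".
decoded_third = unquote(third_form)

# Resolving those references yields the same Cyrillic word.
resolved_third = html.unescape(decoded_third)

print(decoded_utf8)
print(decoded_third)
print(resolved_third)
```

Both forms name the same six characters, but a server that only percent-decodes the third form would search for the literal text "&#1050;..." rather than the Cyrillic word.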


Hi Tricia,

Could you clarify what you mean by "solr url encoding"? Where do you see this?

The servlet container decodes URLs, and I'm not sure where in Solr
that URLs are encoded.

-Yonik
