Hi Yonik,

I was incorrect to describe it as _solr encoding_. Hoss suggested that it might be a form error. I haven't checked this yet, but it sounds plausible. What I called the _solr url encoding_ was the q= parameter translated into <I'm not sure what> encoding in the URL. As I mentioned in my P.S., this translated value is not the same as the one I get when I use IE to post the same form values.
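For reference, here is a minimal Python sketch (not part of the original exchange) that decodes the two percent-encoded strings quoted further down in this thread. It suggests that the third form is the percent-encoded version of HTML numeric character references (`&#1050;` and so on), which browsers tend to emit when a form's character set cannot represent the typed characters. The variable names are illustrative only.

```python
from urllib.parse import unquote
import html

# The two encoded forms of the query, copied from the quoted message.
utf8_form = "%D0%9A%D0%B0%D0%BD%D0%B0%D0%B4%D0%B0"
other_form = ("%26%231050%3B%26%231072%3B%26%231085%3B"
              "%26%231072%3B%26%231076%3B%26%231072%3B")

# Percent-decode the first form as UTF-8 bytes.
decoded_utf8 = unquote(utf8_form, encoding="utf-8")

# Percent-decoding the second form yields plain ASCII text made of
# HTML numeric character references, not the characters themselves.
entity_text = unquote(other_form)

# Resolving the references gives the actual characters.
decoded_entities = html.unescape(entity_text)

print(entity_text)
print(decoded_utf8 == decoded_entities)
```

If that reading is right, the server receives the literal text `&#1050;...` as the query rather than Cyrillic letters, unless something on the server side resolves the references.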

You mentioned in another earlier post that q=h%c3%e9 would find matching hits. In my experience, the UTF-8 encoded query doesn't generate any exceptions, but it matches no results. However, q=h%e9llo does find matching results (the same result set I'd match in Luke). So even if I fix the form encoding errors so that the characters are encoded as UTF-8, I believe I would still get incorrect results. Will Cyrillic characters be treated any differently than the diacritic in your example?
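A minimal Python sketch of the difference between the two encodings of the diacritic is below. It is not from the thread, and it assumes, without confirmation here, that the servlet container decodes the query string as ISO-8859-1 rather than UTF-8; that would be one possible explanation for why only the single-byte form matches. The word `héllo` and the variable names are illustrative.

```python
from urllib.parse import quote, unquote

word = "h\u00e9llo"  # "héllo", written with an escape to keep the source ASCII

# The same word percent-encoded under two different character sets.
latin1_query = quote(word, encoding="latin-1")  # one byte for the accented letter
utf8_query = quote(word, encoding="utf-8")      # two bytes for the accented letter

# Assumption: the server decodes every query as ISO-8859-1.
seen_from_latin1 = unquote(latin1_query, encoding="latin-1")
seen_from_utf8 = unquote(utf8_query, encoding="latin-1")

print(latin1_query, utf8_query)
print(seen_from_latin1 == word)  # the single-byte form round-trips
print(seen_from_utf8 == word)    # the UTF-8 form is misread as two characters

# Cyrillic has no single-byte representation in ISO-8859-1 at all.
try:
    quote("\u041a\u0430\u043d\u0430\u0434\u0430", encoding="latin-1")
    cyrillic_fits_latin1 = True
except UnicodeEncodeError:
    cyrillic_fits_latin1 = False
print(cyrillic_fits_latin1)
```

Under that assumption, Cyrillic queries could only arrive intact as UTF-8, so they would depend on the container being told to decode URLs as UTF-8.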

I have Solr running in Tomcat 5.5.17.

Thanks for all your help,
Tricia


On Tue, 18 Jul 2006, Yonik Seeley wrote:

On 7/18/06, Tricia Williams <[EMAIL PROTECTED]> wrote:
 My sample query is: ...... (the english word _canada_
translated into russian) or
%D0%9A%D0%B0%D0%BD%D0%B0%D0%B4%D0%B0 (utf-8) or
%26%231050%3B%26%231072%3B%26%231085%3B%26%231072%3B%26%231076%3B%26%231072%3B
(solr url encoding)

Hi Tricia,
Could you clarify what you mean by "solr url encoding"? Where do you see this?
The servlet container decodes URLs, and I'm not sure where in Solr
that URLs are encoded.

-Yonik
