: We have data having such symbols like :  µ
: Indexed data has  -    Dose:"0 µL"
: Now , when  it is searched as  - Dose:"0 µL"
        ...
: Query Q value observed  : <str name="q">S257:"0 µL/injection"</str>

First off: your "when searched as" example does not match up to your 
"Query Q" observed value (ie: different field names, extra "/injection" 
text at the end), suggesting that you may have cut/pasted something you 
didn't mean to -- so take the rest of this advice with a grain of salt.

If I ignore your "when it is searched as" example and focus entirely on 
what you say you've indexed the data as, and the Q value you are using (in 
what looks like the echoParams output), then the first thing that jumps 
out at me is that it looks like your servlet container (or perhaps your 
web browser, if that's where you tested this) is not dealing with the 
unicode correctly -- because although I see a "µ" in the first three 
lines I quoted above (UTF8: 0xC2 0xB5), in your value observed I'm seeing 
it preceded by a "Â" (UTF8: 0xC3 0x82) ... suggesting that perhaps the 
"µ" did not get URL encoded properly when the request was made to your 
servlet container?

In particular, you might want to take a look at...

https://wiki.apache.org/solr/FAQ#Why_don.27t_International_Characters_Work.3F
http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config
The example/exampledocs/test_utf8.sh script included with solr
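
If it helps to see the byte-level symptom, here is a rough Python sketch 
of how a correctly URL encoded "µ" can turn into "Âµ". It assumes the 
receiving side decodes the request bytes as ISO-8859-1 instead of UTF-8; 
that is only a guess at what is happening in your setup, and the query 
string is just the one from your mail.

```python
from urllib.parse import quote, unquote

# The query as you say you want to send it (MICRO SIGN is U+00B5).
query = 'Dose:"0 \u00b5L"'

# Correct client side: percent-encode the UTF-8 bytes of the query.
encoded = quote(query, safe="")
print(encoded)  # Dose%3A%220%20%C2%B5L%22

# Server decoding the bytes as UTF-8 recovers the original text.
good = unquote(encoded, encoding="utf-8")

# ASSUMPTION: a server decoding the same bytes as ISO-8859-1 sees
# 0xC2 and 0xB5 as two separate characters, "Â" followed by "µ".
bad = unquote(encoded, encoding="latin-1")

print(good)  # Dose:"0 µL"
print(bad)   # Dose:"0 ÂµL"

# Re-encoding the damaged text as UTF-8 gives the 0xC3 0x82 prefix
# in front of the original 0xC2 0xB5.
print("\u00b5".encode("utf-8").hex())      # c2b5
print("\u00c2\u00b5".encode("utf-8").hex())  # c382c2b5
```

If the bad form is what shows up in your echoParams output, the charset 
config of the container (second link above) is the first place to check.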




-Hoss
