When I use HttpClient and its PostMethod to post a query containing Chinese characters, Solr either returns no records or returns everything.

    ... ...
    method = new PostMethod(solrReq);
    method.getParams().setContentCharset("UTF-8");
    method.setRequestHeader("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8");
    ... ...
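
For reference, here is a minimal sketch of what UTF-8 form encoding does to a query with a Chinese term, using only the JDK's URLEncoder and URLDecoder rather than HttpClient itself. The class name FormEncodingCheck and its two helper methods are made up for illustration, and the sample query string is an assumption chosen to include non-ASCII text.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper class, not part of HttpClient or Solr.
class FormEncodingCheck {

    // Encode one name/value pair as application/x-www-form-urlencoded (UTF-8).
    static String encodeParam(String name, String value) {
        try {
            return URLEncoder.encode(name, "UTF-8") + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Decode a form-encoded value back to the original string.
    static String decodeValue(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u80fd\u529b is the two-character Chinese term, escaped so the
        // source file does not depend on the compiler's default encoding.
        String query = "type:message AND customer_id:413 AND subject_zhs:\u80fd\u529b ";

        String body = encodeParam("q", query);
        // prints q=type%3Amessage+AND+customer_id%3A413+AND+subject_zhs%3A%E8%83%BD%E5%8A%9B+
        System.out.println(body);

        // Strip the leading "q=" and decode; prints true if the round trip is lossless.
        System.out.println(decodeValue(body.substring(2)).equals(query));
    }
}
```

In this encoding, ":" becomes %3A, each space becomes "+", and each Chinese character becomes three percent-escaped UTF-8 bytes. Decoding with the same charset restores the original string.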

I used tcpdump and found that the query my application sends to Solr is a URL-encoded query string (see the "q=xxx" part):

    ../....SPOST /solr/413/select HTTP/1.1
    Content-Type: application/x-www-form-urlencoded; charset=UTF-8
    Accept: */*
    User-Agent: Jakarta Commons-HttpClient/3.1
    Host: 172.20.73.142:8080
    Content-Length: 192

    q=type%3Amessage+AND+customer_id%3A413+AND+subject_zhs%3A%E8%83%BD%E5%8A%9B+&hl.fl=&qt=standard&wt=standard&rows=20
    17:09:55.592527 IP xxx> yyy.webcache: tcp 0
    ... ...

I believe this URL encoding is what causes the Solr query to fail. I found this by copying the URL-encoded query above into a file and posting it with curl, which gave the same error. If I replace the query with the decoded string, it works with Solr:

    curl -v -H 'Content-type:application/x-www-form-urlencoded; charset=utf-8' http://localhost:8080/solr/413/select --data @/tmp/chinese_query

When /tmp/chinese_query contains the following, it works with Solr:

    q=type:message+AND+customer_id:413+AND+subject_zhs:能力+&hl.fl=&qt=standard&wt=standard&rows=20

But if I switch /tmp/chinese_query to the URL-encoded string, it fails again with the same error:

    q=type%3Amessage+AND+customer_id%3A413+AND+subject_zhs%3A%E8%83%BD%E5%8A%9B+&hl.fl=&qt=standard&wt=standard&rows=20

So, my conclusions:

1) Solr (I am using 3.5) only accepts the decoded query string; it fails with a URL-encoded query.
2) HttpClient sends out a URL-encoded string no matter what (there seems to be no way to make it send a POST request without URL-encoding the body).

Am I missing something, or do you have any suggestions about what I am doing wrong?

Thanks,
Jie