On Feb 18, 2009, at 1:53 PM, revathy arun wrote:
I am using PHP cURL to post data to Solr.
The container is Tomcat.
I have URIEncoding set to UTF-8 in Tomcat's server.xml file.
This is how it's indexed:
....
$header[] = "Content-Type: text/xml; charset=utf-8";
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string);
$data = curl_exec($ch);
......
However, the document I am sending does not seem to have the UTF-8
encoding.
What does Solr have stored for the documents? If you haven't set your
indexed fields to be stored, go ahead and do so (and restart/reindex)
for troubleshooting and do a /select?q=*:* to see what got stored for
the documents you're having trouble finding. I imagine if you have
encoding issues, that will show up as mangled stored text that
couldn't be analyzed properly.
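As a rough sketch of that check in PHP: the base URL below is a hypothetical placeholder for your Tomcat setup, the JSON response writer (wt=json) is assumed to be enabled, and solr_select_url and fetch_stored_docs are illustrative names rather than part of any library. It just dumps what Solr has stored so you can look at the text by eye.

```php
<?php
// Hypothetical base URL; adjust host, port, and webapp path to your Tomcat setup.
const SOLR_BASE = 'http://localhost:8080/solr';

// Build a /select URL; http_build_query percent-encodes the parameters.
function solr_select_url(string $base, string $query = '*:*', int $rows = 10): string
{
    $params = [
        'q'    => $query,
        'wt'   => 'json',
        'rows' => $rows,
    ];
    return rtrim($base, '/') . '/select?' . http_build_query($params, '', '&');
}

// Fetch the raw response body with cURL; returns null if the request fails.
function fetch_stored_docs(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Usage (needs a running Solr, so it is left commented out):
// echo fetch_stored_docs(solr_select_url(SOLR_BASE));
```

If accented or non-Latin characters come back as sequences like "Ã©" or as "?", the bytes were most likely mis-encoded before or during the POST rather than at query time.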
How are you getting $post_string in your code?
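I ask because the Content-Type header only declares UTF-8; the bytes in $post_string have to actually be UTF-8 as well. I don't know how you build it, so the following is only a minimal sketch under that assumption: build_solr_add_xml is a hypothetical helper, and the field names are made up. It refuses input that is not well-formed UTF-8 and XML-escapes the values before wrapping them in an <add> message.

```php
<?php
// Hypothetical helper: builds a Solr <add> message from field => value pairs.
function build_solr_add_xml(array $fields): string
{
    $xml = '<add><doc>';
    foreach ($fields as $name => $value) {
        $name  = (string) $name;
        $value = (string) $value;
        // preg_match with the /u modifier returns 1 only for well-formed UTF-8.
        if (preg_match('//u', $value) !== 1) {
            throw new InvalidArgumentException("Field '$name' is not valid UTF-8");
        }
        // Escape &, <, >, and quotes so the value cannot break the XML.
        $xml .= '<field name="' . htmlspecialchars($name, ENT_QUOTES | ENT_XML1, 'UTF-8') . '">'
              . htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8')
              . '</field>';
    }
    return $xml . '</doc></add>';
}

// Usage with made-up field names:
// $post_string = build_solr_add_xml(['id' => '1', 'title' => 'café & crème']);
```

If the helper throws, the source data (database, file, form input) is probably in another encoding such as ISO-8859-1 and needs converting before it is posted.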
Erik