When I send plain UTF-8 text (non-English) to be indexed, everything is fine, but with HTML I get wrong characters in place of the non-ASCII symbols. So

    $this->solr->extractContents($url, strip_tags($code), array("literal.url"=>$url, "fmap.content"=>"body"));

works well, but

    $this->solr->extractContents($url, $code, array("literal.url"=>$url, "fmap.content"=>"body"));

does not. What's the problem?

I'm using the SOLR-PHP client (code.google.com/p/solr-php-client/), but I don't think the problem is there. In both cases the "text/plain" content type is set in the request (I've updated the standard library code).

SOLR 1.4.1 / Tomcat 6 / Fedora 12