Hi,

I am trying to index documents in various languages (foroyo, Chinese,
Japanese). These have been converted from PDF to text using xpdf.
I am using the standard analyzer for content analysis, but searches return
nothing for some of the files.

My guess is that these documents are not UTF-8 encoded and hence Solr
does not return any results for them.


Is there any way to check the encoding of a text/PDF document, or to
convert it to UTF-8?
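
To make the question concrete, here is a rough sketch (in Python) of the
kind of check and conversion I have in mind. The function names and the
list of candidate encodings are my own assumptions, not anything taken
from Solr or xpdf, and trial decoding is only a heuristic: the first
encoding that decodes without error is not necessarily the right one.

```python
from typing import Sequence, Tuple


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(
    data: bytes,
    # Assumed candidates for Japanese/Chinese text; adjust for your corpus.
    candidates: Sequence[str] = ("utf-8", "shift_jis", "euc_jp", "gb18030", "big5"),
) -> Tuple[bytes, str]:
    """Try each candidate encoding in order and re-encode as UTF-8.

    Returns the UTF-8 bytes and the name of the first encoding that
    decoded without error. This is a guess, not a reliable detection.
    """
    for enc in candidates:
        try:
            text = data.decode(enc)
        except UnicodeDecodeError:
            continue
        return text.encode("utf-8"), enc
    raise ValueError("none of the candidate encodings could decode the data")
```

For example, is_valid_utf8(open(path, "rb").read()) would flag the files
that are not UTF-8 before they are sent to Solr.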

While indexing, I am sending the charset header as UTF-8.

Any pointers?

Thanks
