Hello,
Is my email reaching the list? I'm new to Solr and have some questions;
could anyone help me with answers?
I index files directly, extracting their content with the Tika embedded in
Solr. Normal files are no problem. But when I index a file that has another
file embedded in it, I can't get the content of the embedded file. For
example, I have a Word document (.doc) with a PDF embedded in it, and the
content of the PDF is not indexed. When I use the same Tika jar directly to
extract the document, I do get the content of the embedded file.
I know Tika has been able to extract embedded files since version 1.3. My
Solr version is 4.9.1, and the Tika used in this version of Solr is 1.5, so
I don't understand why I can't get the content of the embedded file.
Could anyone help me? Thank you very much.
Ping Liu
18 June 2015