Hi,

I am currently facing an issue where the text in image files such as .jpg
and .bmp is not being extracted and indexed. After indexing, Tika did
extract all the metadata and index it under the attr_* fields.
However, the content field is always empty for image files. For other
document types such as .doc, the content is extracted correctly.

I have already updated tika-parsers-1.17.jar, setting extractInlineImages
to true under \org\apache\tika\parser\pdf\.


What could be the reason?

I have just upgraded to Solr 7.3.0.

Regards,
Edwin
