Hi,

I am currently facing an issue where the text in image files such as .jpg and .bmp is not being extracted and indexed. After indexing, Tika does extract all the metadata and index it under the attr_* fields. However, the content field is always empty for image files. For other document types such as .doc, the content is extracted correctly.
I have already updated tika-parsers-1.17.jar, under \org\apache\tika\parser\pdf\, to set extractInlineImages to true. What could be the reason? I have just upgraded to Solr 7.3.0.

Regards,
Edwin