Thanks Andrea. I can see that Tika 1.5 supports both compressed (ZIP) and image (JPG) formats. If that's the case, why could SolrCell not index the .zip and .jpg documents? Am I missing something here? No error is thrown anywhere in the process, and the Java program completes successfully. But when I query through the Solr UI, only 8 files are indexed.

Attached is a simple screenshot of the file types I am trying to index.

Thanks & Regards
Vijay

On 15 April 2015 at 15:27, Andrea Gazzarini <a.gazzar...@gmail.com> wrote:

> Hi Vijay,
> here you can find all supported formats by Tika, which is internally used
> by SolrCell:
>
>  * https://tika.apache.org/1.4/formats.html
>  * https://tika.apache.org/1.5/formats.html
>  * https://tika.apache.org/1.6/formats.html
>  * https://tika.apache.org/1.7/formats.html
>
> Best,
> Andrea
>
> On 04/15/2015 04:20 PM, Vijaya Narayana Reddy Bhoomi Reddy wrote:
>
>> Hi,
>>
>> I am trying to index various binary file types into Solr. However, some
>> file types seem to be ignored and are not getting indexed, though the
>> metadata is being extracted successfully for all the types.
>>
>> Specifically, zip files and jpg files are not getting indexed, whereas
>> pdf and MS Office documents are getting indexed. Hence I am wondering
>> whether there is a defined list of indexable file types.
>>
>> Moreover, I am just wondering why Solr could not index the jpg and zip
>> documents when it was able to extract the metadata from those files?
>>
>> The code snippet is as below:
>>
>> contentStreamUpdateReq.addFile(file, fileType);
>> contentStreamUpdateReq.setParam("literal.id", literalId);
>> contentStreamUpdateReq.setParam("uprefix", "attr_");
>> contentStreamUpdateReq.setParam("fmap.content", "content");
>> contentStreamUpdateReq.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
>> solrServer.request(contentStreamUpdateReq);
>>
>> Thanks & Regards
>> Vijay
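
One way to see what SolrCell actually extracts from a zip or jpg file is to send the same parameters as in the snippet quoted above, plus extractOnly=true, which asks the extracting handler to return the Tika output instead of indexing the document. Below is a minimal sketch that only builds such a request URL using the JDK; the base URL, the core name "collection1", the id "doc-1" and the class and method names are hypothetical, and the file itself would still have to be POSTed as the request body.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class ExtractUrlSketch {

    // Same parameters as the quoted snippet; extractOnly is the extra switch
    // that returns the extracted content without indexing it.
    static Map<String, String> paramsFor(String literalId, boolean extractOnly) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("literal.id", literalId);
        params.put("uprefix", "attr_");
        params.put("fmap.content", "content");
        if (extractOnly) {
            params.put("extractOnly", "true");
        }
        return params;
    }

    // Appends the handler path and the URL-encoded parameters to the base URL.
    // "/update/extract" is the usual path of the extracting handler; adjust it
    // if solrconfig.xml registers the handler elsewhere.
    static String buildExtractUrl(String baseUrl, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl).append("/update/extract");
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical base URL and document id, for illustration only.
        String base = "http://localhost:8983/solr/collection1";
        System.out.println(buildExtractUrl(base, paramsFor("doc-1", true)));
    }
}
```

Comparing the extractOnly response for a pdf with the one for a zip or jpg should show whether any body text comes back at all, or only metadata fields.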