Thanks Andrea. I can see that Tika 1.5 supports both compressed (ZIP) and
image (JPG) formats. If that's the case, why could SolrCell not index the
.zip and .jpg documents? Am I missing something here? No error is
thrown anywhere in the process, and the Java program completes successfully.
But when I query the Solr UI, only 8 files are indexed.
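
One way to narrow this down might be to ask the extracting handler to return
what Tika produces for a .zip or .jpg file without indexing it, using the
`extractOnly` request parameter. Below is a minimal sketch that only builds
such a request URL with the same parameters as the quoted snippet further
down. The base URL, the core name `collection1`, and the id `doc1.zip` are
assumptions for illustration and are not taken from the actual setup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class ExtractOnlyUrl {

    // Builds a URL for Solr's /update/extract handler.
    // baseUrl and literalId are hypothetical values supplied by the caller.
    static String buildExtractUrl(String baseUrl, String literalId, boolean extractOnly) {
        // Same parameters as in the SolrJ snippet, plus extractOnly.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("literal.id", literalId);
        params.put("uprefix", "attr_");
        params.put("fmap.content", "content");
        params.put("extractOnly", String.valueOf(extractOnly));

        StringBuilder url = new StringBuilder(baseUrl).append("/update/extract");
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    // URL-encodes a single query component as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed local Solr instance and core name.
        System.out.println(
            buildExtractUrl("http://localhost:8983/solr/collection1", "doc1.zip", true));
    }
}
```

In the SolrJ code the equivalent would presumably be one more
`setParam("extractOnly", "true")` call on the same request. The file itself
still has to be sent as the request body in either case.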

Attached is a simple screenshot of the file types I am trying to index.

Thanks & Regards
Vijay

On 15 April 2015 at 15:27, Andrea Gazzarini <a.gazzar...@gmail.com> wrote:

> Hi Vijay,
> here you can find all the formats supported by Tika, which is used
> internally by SolrCell:
>
>  * https://tika.apache.org/1.4/formats.html
>  * https://tika.apache.org/1.5/formats.html
>  * https://tika.apache.org/1.6/formats.html
>  * https://tika.apache.org/1.7/formats.html
>
> Best,
> Andrea
>
>
>
>
> On 04/15/2015 04:20 PM, Vijaya Narayana Reddy Bhoomi Reddy wrote:
>
>> Hi,
>>
>> I am trying to index various binary file types into Solr. However, some
>> file types seem to be ignored and are not getting indexed, though the
>> metadata is being extracted successfully for all the types.
>>
>> Specifically, zip files and jpg files are not getting indexed, whereas
>> pdf and MS Office documents are getting indexed. Hence I am wondering
>> whether there is a defined list of indexable file types.
>>
>> Moreover, I am just wondering why Solr could not index the jpg and zip
>> documents when it was able to extract the metadata from those files?
>>
>> The code snippet is as below:
>>
>> contentStreamUpdateReq.addFile(file, fileType);
>> contentStreamUpdateReq.setParam("literal.id", literalId);
>> contentStreamUpdateReq.setParam("uprefix", "attr_");
>> contentStreamUpdateReq.setParam("fmap.content", "content");
>> contentStreamUpdateReq.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
>> solrServer.request(contentStreamUpdateReq);
>>
>> Thanks & Regards
>> Vijay
>>
>>
>
