Yes, Solr is distributed with Tika. Look in:
./solr/contrib/extraction/lib

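Solr upgrades its bundled Tika as new Tika versions come out, so the
jar files in that directory are whatever was current when that Solr
release was built. The exact set of files can differ between releases.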
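
If you just want to see which Tika jars a given Solr install ships,
something like the following sketch would list them. It assumes the jar
names start with "tika-" (an assumption on my part, check your own
directory), and find_tika_jars is a made-up helper name, not anything
Solr provides.

```python
from pathlib import Path


def find_tika_jars(lib_dir):
    """Return the sorted names of files matching tika-*.jar directly inside lib_dir."""
    # Assumption: Tika jars are named "tika-<module>-<version>.jar".
    return sorted(p.name for p in Path(lib_dir).glob("tika-*.jar"))


if __name__ == "__main__":
    # Directory from the message above, relative to where Solr is unpacked.
    for name in find_tika_jars("./solr/contrib/extraction/lib"):
        print(name)
```

Run it against the old and new installs and compare the two lists to
see what changed between versions.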

The integration is fairly loosely coupled. If you're using some
external program (say a SolrJ program) to parse the files, there's no
requirement to use the jars distributed with Solr; use whatever suits
your fancy. An external program just constructs a SolrInputDocument to
send to Solr. What you use to create that document is irrelevant. See:
https://lucidworks.com/2012/02/14/indexing-with-solrj/ for some
background.
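
To make the "Solr doesn't care how you built the document" point
concrete, here is a rough sketch in Python rather than SolrJ, using
only the standard library and Solr's JSON update endpoint. The URL,
the collection name "mycollection", the field "content_txt" and the
helper name build_update_request are all placeholders I made up; the
text would come from whatever parser you chose to run on your side.

```python
import json
import urllib.request


def build_update_request(solr_base, collection, doc):
    """Build (but do not send) a POST that adds one document via the JSON update API."""
    url = f"{solr_base.rstrip('/')}/{collection}/update?commit=true"
    # Solr accepts a JSON array of documents as the request body.
    body = json.dumps([doc]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical document: field names depend entirely on your schema.
    doc = {"id": "doc1", "content_txt": "text extracted by your own parser"}
    req = build_update_request("http://localhost:8983/solr", "mycollection", doc)
    print(req.full_url)
    # To actually index it against a running Solr:
    # urllib.request.urlopen(req)
```

Nothing in that request says which parser, or which version of it,
produced the text, which is why the jars under Solr's own directory
don't matter for this approach.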

If you're using the ExtractingRequestHandler, where you just send the
semi-structured docs to Solr (PDFs, Word or whatever), then needing to
know anything about individual Tika-related jar files is kind of
strange.
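
For comparison, a sketch of what sending a raw file to the
ExtractingRequestHandler looks like from the client side, again in
plain Python. It assumes the handler is registered at /update/extract
(the usual path in the sample configs, but check your solrconfig.xml);
the collection name, the id and the helper name build_extract_request
are placeholders.

```python
import urllib.parse
import urllib.request


def build_extract_request(solr_base, collection, doc_id, file_bytes, content_type):
    """Build (but do not send) a POST of a raw file to the extracting handler."""
    # literal.id supplies the unique key for the document Solr creates from the file.
    params = urllib.parse.urlencode({"literal.id": doc_id, "commit": "true"})
    url = f"{solr_base.rstrip('/')}/{collection}/update/extract?{params}"
    return urllib.request.Request(
        url,
        data=file_bytes,
        headers={"Content-Type": content_type},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder bytes; in practice read them from the PDF or Word file.
    req = build_extract_request(
        "http://localhost:8983/solr", "mycollection", "doc1",
        b"%PDF-...", "application/pdf",
    )
    print(req.full_url)
    # To actually send it against a running Solr:
    # urllib.request.urlopen(req)
```

The client never touches a Tika class here; all the parsing happens
inside Solr with whatever jars that Solr release ships.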

If your predecessors wrote some custom code that runs as part of Solr,
I don't know what to say...

Best,
Erick

On Tue, Mar 19, 2019 at 10:47 AM Tannen, Lev (USAEO) [Contractor]
<lev.tan...@usdoj.gov.invalid> wrote:
>
> Thank you Shawn.
> I assumed that Tika had been integrated with Solr. In the project written
> before me, they used two Tika files taken from the Solr distribution. I am
> trying to do the same with Solr 7.7.1. However, this version contains a
> different set of Tika-related files, so I am confused. Does Solr no longer
> have Tika integrated, or can I just not recognize the files?
>
> -----Original Message-----
> From: Shawn Heisey <apa...@elyograg.org>
> Sent: Tuesday, March 19, 2019 11:11 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Upgrading tika
>
> On 3/19/2019 9:03 AM, levtannen wrote:
> > Could anybody suggest what files I need in order to use the latest
> > version of Tika, and where to find them?
>
> This mailing list is solr-user.  Tika is an entirely separate project from 
> Solr within the Apache Foundation.  To get help with Tika, you'll need to ask 
> that project.
>
> https://tika.apache.org/mail-lists.html
>
> Thanks,
> Shawn
