You can just download Tika from the Apache site; it's a separate product
and has a command-line interface.
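
For what it's worth, here is a rough sketch of driving the Tika app jar
from Python so you can see exactly what comes out for one file. The jar
name and the helper names (tika_command, run_tika) are my own
assumptions, not anything from your setup; adjust the path to whatever
tika-app jar you downloaded. It needs java on the PATH.

```python
import subprocess

# Assumption: path to the tika-app jar you downloaded; the real file
# name normally includes a version number.
TIKA_JAR = "tika-app.jar"

# Output modes of the Tika command-line app used in this sketch.
ALLOWED_MODES = {"--text", "--metadata", "--xml", "--json"}


def tika_command(document, mode="--text", jar=TIKA_JAR):
    """Build the java command line for a single Tika run."""
    if mode not in ALLOWED_MODES:
        raise ValueError("unsupported mode: %s" % mode)
    return ["java", "-jar", jar, mode, document]


def run_tika(document, mode="--text", jar=TIKA_JAR):
    """Run Tika on one file and return whatever it prints to stdout."""
    result = subprocess.run(
        tika_command(document, mode, jar),
        capture_output=True,
        text=True,
        check=True,  # raise if java or Tika exits with an error
    )
    return result.stdout
```

Calling run_tika("sample.pdf", "--metadata") should list the metadata
field names Tika produces for that file, which is the quickest way to
check whether they carry a prefix.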

Or, to use the Solr extract handler, go through the Solr tutorial, which
explains it: https://lucene.apache.org/solr/4_7_0/tutorial.html
Specifically, see http://wiki.apache.org/solr/ExtractingRequestHandler and
http://wiki.apache.org/solr/TikaExtractOnlyExampleOutput
Ignore the dynamic field part for now and just try extract-only first;
you may not need the rest.
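
A minimal sketch of an extract-only call, in case it helps. I am
assuming the stock example server at http://localhost:8983/solr with
the handler mapped at /update/extract; your host, core and handler path
may differ. The function names are made up for illustration. With
extractOnly=true Solr returns what Tika extracted and does not index
anything.

```python
import urllib.parse
import urllib.request

# Assumption: default example server and handler path; change to match
# your own Solr install and core.
EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def build_extract_only_request(data, content_type="application/octet-stream",
                               url=EXTRACT_URL):
    """Build a POST request that asks Solr to extract without indexing."""
    params = urllib.parse.urlencode({
        "extractOnly": "true",    # return Tika's output, index nothing
        "extractFormat": "text",  # plain text instead of the default XHTML
        "wt": "json",             # JSON response envelope
    })
    return urllib.request.Request(
        url + "?" + params,
        data=data,  # raw file bytes as the request body
        headers={"Content-Type": content_type},
    )


def extract_only(path, content_type="application/octet-stream"):
    """Send one file to the extract handler and return the response text."""
    with open(path, "rb") as handle:
        request = build_extract_only_request(handle.read(), content_type)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

The response should contain the extracted text plus a metadata section
for the file, so you can compare the field names there with what ends
up in your index. Pass the real content type (for example
"application/pdf") if you know it.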

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Tue, Mar 18, 2014 at 4:18 PM, Anders Gustafsson
<anders.gustafs...@pedago.fi> wrote:
> Thanks for the quick reply. I am a bit of a newb when it comes to Solr, Lux 
> and Tika so I would appreciate if you could give me some quick pointers how 
> to use/call Tika directly and/or how to send one file directly and  storing 
> the dynamic field?
>
>
>
> --
> Anders Gustafsson
> Engineer, CNI, CNE6, ASE
> Pedago, The Aaland Islands (N60 E20)
> www.pedago.fi
> phone +358 18 12060
> mobile +358 40506 7099
>
>
>>>> Alexandre Rafalovitch <arafa...@gmail.com> 2014-03-18 11:13 >>>
> Have you tried just using Tika directly and seeing what gets output?
> Maybe it is all prefixed somehow. Or sending one file as a sample
> directly to the extract handler and temporarily storing the ignored_*
> dynamicField to see what actually happens?
>
