You can just download Tika from the Apache site; it is a separate product and has a command-line interface.
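For the Tika route, here is a rough sketch in Python of running the Tika app on a single file to see what it outputs. It assumes Java is on your PATH and that the downloaded jar has been saved as tika-app.jar (the real file name carries a version number, so adjust it). The function names are made up for illustration; --text, --metadata and --json are options of the Tika app command line.

```python
import subprocess

# Assumed name of the downloaded Tika app jar; the real file includes a version number.
TIKA_JAR = "tika-app.jar"

# Output modes of the Tika app CLI used in this sketch.
ALLOWED_MODES = ("--text", "--metadata", "--json")


def tika_command(document, mode="--text", jar=TIKA_JAR):
    """Build the java command line that runs the Tika app on one document."""
    if mode not in ALLOWED_MODES:
        raise ValueError("unsupported mode: %s" % mode)
    return ["java", "-jar", str(jar), mode, str(document)]


def run_tika(document, mode="--text", jar=TIKA_JAR):
    """Run Tika on the document and return whatever it prints to stdout."""
    result = subprocess.run(
        tika_command(document, mode, jar),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tika exits with an error
    )
    return result.stdout
```

Calling run_tika("sample.pdf", "--metadata") on one of your own files should show the metadata field names exactly as Tika emits them, which helps when checking whether they arrive prefixed.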
Or, to use the Solr extract handler, go through the Solr tutorial, which explains it:
https://lucene.apache.org/solr/4_7_0/tutorial.html

Specifically, see:
http://wiki.apache.org/solr/ExtractingRequestHandler
and
http://wiki.apache.org/solr/TikaExtractOnlyExampleOutput

Ignore the dynamic field part for now and just try extract-only first; you may not need the rest.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Tue, Mar 18, 2014 at 4:18 PM, Anders Gustafsson <anders.gustafs...@pedago.fi> wrote:
> Thanks for the quick reply. I am a bit of a newb when it comes to Solr, Lux
> and Tika so I would appreciate if you could give me some quick pointers how
> to use/call Tika directly and/or how to send one file directly and storing
> the dynamic field?
>
> --
> Anders Gustafsson
> Engineer, CNI, CNE6, ASE
> Pedago, The Aaland Islands (N60 E20)
> www.pedago.fi
> phone +358 18 12060
> mobile +358 40506 7099
>
> >>> Alexandre Rafalovitch <arafa...@gmail.com> 2014-03-18 11:13 >>>
> Have you tried just using Tika directly and seeing what gets output?
> Maybe it is all prefixed somehow. Or sending one file as a sample
> directly to the extract handler and temporarily storing the ignored_*
> dynamicField to see what actually happens?