On 10/11/2013 9:32 AM, PeteBleackley wrote:
> I tried changing the options to -Dauto -Dfiletypes=pdf. This gave me a 404
> error, apparently caused by post.jar adding /extract to the end of the URL

To send files with post.jar that way, the core needs an /update/extract
handler, and the tika core under example-DIH does not define one. That
would explain the 404.

The example-DIH configurations are intended to use and illustrate the
dataimport handler - documents are imported using the /dataimport
handler and its config file, not sent directly with post.jar.

Here's a page covering what you would need in order to send PDFs
directly rather than import them using DIH:

http://wiki.apache.org/solr/ExtractingRequestHandler
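
As a rough illustration of where such a request would go, here is a
minimal Python sketch that only builds the target URL and sends nothing.
The host, port, core name "tika" and document id "doc1" are assumptions
for the example, and extract_url is a hypothetical helper, not part of
Solr. The literal.id and commit parameters are described on the wiki page
above. If the core has no /update/extract handler in its solrconfig.xml,
a POST to this URL returns the same 404.

```python
from urllib.parse import urlencode


def extract_url(base_url, core, doc_id, commit=True):
    """Build the URL a client would POST a PDF to for extraction.

    base_url, core and doc_id are illustrative values chosen by the caller;
    nothing here checks that the handler actually exists on the server.
    """
    # literal.id supplies the unique key for the extracted document.
    params = {"literal.id": doc_id}
    if commit:
        # Ask Solr to commit after indexing so the document is searchable.
        params["commit"] = "true"
    return f"{base_url.rstrip('/')}/{core}/update/extract?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed local example install; adjust host, port and core as needed.
    print(extract_url("http://localhost:8983/solr", "tika", "doc1"))
```

The PDF itself would then be sent as the body of a POST to that URL, once
the handler has been added to the core's configuration.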

Thanks,
Shawn
