On 3/23/2015 3:08 AM, Srinivas wrote:
> Presently in my project we are using Apache Tika for reading file
> metadata. Whenever we handle large files (more than 100000 characters),
> Tika generates an error saying the file contains more than 100000
> characters. Is it possible to handle large files using Tika? Please
> let me know.

This sounds like a Tika problem.  This is a Solr mailing list.  You may
find some Tika expertise here, but it is the wrong place for a question
about Tika itself.

Solr does use the Tika parser, in the contrib module for the
ExtractingRequestHandler.  I have never heard of such a limitation in
the context of the ExtractingRequestHandler, and I've heard people
complain about OutOfMemory exceptions when they index 4 gigabyte PDF
files with our extracting handler ... so I am guessing that you are
using Tika in your own software.  If that is correct, you'll need to ask
your question on a Tika mailing list.
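If you are calling Tika directly, the message you describe matches
Tika's default write limit: the content handler stops after 100,000
characters unless told otherwise.  A minimal sketch of how to disable
that limit, assuming you parse with AutoDetectParser and a
BodyContentHandler (adjust to match your actual code):

```java
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;

public class LargeFileExtract {
    public static void main(String[] args) throws Exception {
        AutoDetectParser parser = new AutoDetectParser();
        // Passing -1 disables the default 100,000-character write limit
        // that triggers the "contained more than 100000 characters" error.
        BodyContentHandler handler = new BodyContentHandler(-1);
        Metadata metadata = new Metadata();
        try (InputStream stream = new FileInputStream(args[0])) {
            parser.parse(stream, handler, metadata);
        }
        // Extracted metadata is available whether or not the body is large.
        System.out.println(metadata);
    }
}
```

Note that removing the limit means the whole body is buffered in memory,
so very large documents may need a bigger heap.  But again, the Tika
list is the right place to confirm this.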

Thanks,
Shawn
