Hi, I'm not sure, but you have probably hit a Tika exception. Have you checked the Apache Tika mailing list?
Hmm, I just googled "Your document contained more than 100000 characters" and found a page on Stack Overflow. According to it, there is an API to change the limit, but I don't know whether Solr lets you change it. If there is no way to change the limit in Solr, you can open a JIRA ticket.

koji
--
http://soleami.com/blog/mahout-and-machine-learning-training-course-is-here.html

(13/12/23 2:17), Nutan wrote:
Why do I get this error:

org.apache.tika.sax.WriteOutContentHandler$WriteLimitReachedException: Your document contained more than 100000 characters, and so your requested limit has been reached. To receive the full text of the document, increase your limit. (Text up to the limit is however available).
    at org.apache.tika.sax.WriteOutContentHandler.characters(WriteOutContentHandler.java:140)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.xpath.MatchingContentHandler.characters(MatchingContentHandler.java:85)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)

when I added this to solrconfig.xml?

<requestDispatcher handleSelect="false" >
  <requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="200048" />
</requestDispatcher>

--
View this message in context: http://lucene.472066.n3.nabble.com/document-contained-more-than-100000-characters-tp4107792.html
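For anyone reading this thread later, here is a rough sketch of the mechanism behind the exception in the trace above. It is not Tika's code: WriteLimitSketch, LimitedTextHandler and LimitReachedException are hypothetical names, and only the JDK's SAX classes are used. The idea is that a content handler buffers extracted text and raises a SAXException once a character limit is exceeded, which is independent of the upload size set by multipartUploadLimitInKB. In Tika itself, as far as I know, the BodyContentHandler(int writeLimit) constructor takes that limit, -1 disables it, and the no-argument constructor defaults to 100000 characters. Whether Solr exposes a setting for it is still the open question here.

```java
import java.util.Arrays;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WriteLimitSketch {

    /** Hypothetical stand-in for Tika's WriteLimitReachedException. */
    static class LimitReachedException extends SAXException {
        LimitReachedException(int limit) {
            super("Your document contained more than " + limit + " characters");
        }
    }

    /** Hypothetical handler: buffers text up to writeLimit characters; -1 means unlimited. */
    static class LimitedTextHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private final int writeLimit;

        LimitedTextHandler(int writeLimit) {
            this.writeLimit = writeLimit;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (writeLimit == -1 || text.length() + length <= writeLimit) {
                text.append(ch, start, length);
            } else {
                // Keep the text up to the limit, then signal that the limit was hit.
                text.append(ch, start, writeLimit - text.length());
                throw new LimitReachedException(writeLimit);
            }
        }

        int length() {
            return text.length();
        }
    }

    /** Builds a dummy document of docLength characters. */
    private static char[] dummyDocument(int docLength) {
        char[] doc = new char[docLength];
        Arrays.fill(doc, 'x');
        return doc;
    }

    /** Returns true if feeding docLength characters trips the given limit. */
    static boolean hitsLimit(int writeLimit, int docLength) {
        try {
            new LimitedTextHandler(writeLimit).characters(dummyDocument(docLength), 0, docLength);
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    /** Returns how many characters were buffered, whether or not the limit was hit. */
    static int extractedLength(int writeLimit, int docLength) {
        LimitedTextHandler handler = new LimitedTextHandler(writeLimit);
        try {
            handler.characters(dummyDocument(docLength), 0, docLength);
        } catch (SAXException e) {
            // Limit reached: the text up to the limit is still available.
        }
        return handler.length();
    }

    public static void main(String[] args) {
        System.out.println("limit 100000 hit: " + hitsLimit(100000, 150000));
        System.out.println("limit -1 hit: " + hitsLimit(-1, 150000));
        System.out.println("chars kept at limit 100000: " + extractedLength(100000, 150000));
    }
}
```

With a limit of 100000, a 150000-character document trips the exception but the first 100000 characters are kept, which matches the "Text up to the limit is however available" note in the error message. With -1 the whole text is buffered.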