Hi,
The stack trace points to Tika, which was most likely extracting
indexable plain text from a document when the error occurred.
Tika's job is one of the dirtiest you can think of in the whole indexing
business. We throw all kinds of more or less
documented/broken/misguided/ill-designed/cruft/truncated documents at it
and want it to do miracles in understanding that stuff and getting the
plain text out.
It does quite a good job most of the time, but sometimes it just gets
trapped. There is almost no chance of getting Tika bug-free (whatever
that means when the requirements are ill-defined), so we must live with
accidents like the one you have here, where Tika apparently decided that
it needed a large amount of memory to parse a document.
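The "Caused by" part of the trace quoted below fails inside
StringBuilder.append, called from SolrContentHandler.characters while
TXTParser was feeding it text. That suggests the extracted text of a
single document grew past what one Java array can hold. A minimal sketch
of the general defence, capping how many characters are collected per
document, could look like this. The class BoundedTextBuffer and its limit
are hypothetical illustrations, not part of Solr or Tika:

```java
// Hypothetical helper (not a Solr or Tika class): collects extracted
// text but refuses to grow past a fixed number of characters.
class BoundedTextBuffer {
    private final StringBuilder text = new StringBuilder();
    private final int maxChars;
    private boolean truncated = false;

    BoundedTextBuffer(int maxChars) {
        if (maxChars < 0) {
            throw new IllegalArgumentException("maxChars must be >= 0");
        }
        this.maxChars = maxChars;
    }

    // Same shape as a SAX characters(char[], int, int) callback.
    void characters(char[] ch, int start, int length) {
        int room = maxChars - text.length();
        if (length > room) {
            // Keep only what still fits and remember that text was dropped.
            truncated = true;
            length = Math.max(room, 0);
        }
        text.append(ch, start, length);
    }

    boolean isTruncated() {
        return truncated;
    }

    int length() {
        return text.length();
    }

    @Override
    public String toString() {
        return text.toString();
    }
}
```

With a cap in place, an oversized document yields truncated text instead
of taking the whole JVM down with an OutOfMemoryError.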
There are two ways out:
a) You very strictly control whatever document enters Tika by using a
white-list.
b) You don't run Tika inside Solr, but move it into a separate
process, let it crash, and restart it automatically.
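Both options can be sketched in plain Java (8 or later). This is only an
illustration under assumptions: the class IsolatedExtractor, the allowed
extensions, the size cap, the child heap size and the location of
tika-app.jar are all made up for the example. It starts one child JVM per
document rather than restarting a long-running service, which is the
simplest form of "let it crash":

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Solr or Tika.
class IsolatedExtractor {
    // Assumption: only these extensions are ever handed to Tika.
    static final Set<String> ALLOWED =
            new HashSet<>(Arrays.asList("txt", "pdf", "doc", "docx", "html"));
    // Assumption: anything above 50 MB is rejected up front.
    static final long MAX_BYTES = 50L * 1024 * 1024;

    // (a) White-list gate: decide from file name and size alone.
    static boolean isAllowed(String fileName, long sizeBytes) {
        String name = fileName.toLowerCase(Locale.ROOT);
        int dot = name.lastIndexOf('.');
        String ext = dot < 0 ? "" : name.substring(dot + 1);
        return ALLOWED.contains(ext) && sizeBytes >= 0 && sizeBytes <= MAX_BYTES;
    }

    // Command line for a child JVM with its own small heap.
    // Assumption: tika-app.jar sits in the working directory.
    static List<String> tikaCommand(String file) {
        return Arrays.asList("java", "-Xmx512m", "-jar", "tika-app.jar",
                "--text", file);
    }

    // (b) Run the extraction in a separate process. Returns true only if
    // the child exits with status 0 before the timeout; a crash or hang
    // in the child never touches the calling JVM.
    static boolean runIsolated(List<String> command, File output,
                               long timeoutSeconds)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectOutput(output);                          // extracted text
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);  // child errors
        Process child = pb.start();
        if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            child.destroyForcibly();                        // hung: kill it
            return false;
        }
        return child.exitValue() == 0;
    }
}
```

A caller would check isAllowed(name, size) first, then call
runIsolated(tikaCommand(path), outFile, 60) and send the text in outFile
to Solr as an ordinary field. Documents for which it returns false can be
logged and skipped, so indexing continues with the next one.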
Regards,
Harald.
On 25.07.2014 19:32, Ameya Aware wrote:
Please find below entire stack trace:
ERROR - 2014-07-25 13:14:22.202; org.apache.solr.common.SolrException; null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
    at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:790)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:439)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:636)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
    at java.util.Arrays.copyOf(Unknown Source)
    at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
    at java.lang.AbstractStringBuilder.append(Unknown Source)
    at java.lang.StringBuilder.append(Unknown Source)
    at org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:303)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
    at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
    at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
    at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
    at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:278)
    at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:88)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
WARN - 2014-07-25 13:14:22.263; org.eclipse.jetty.servlet.ServletHandler; Error for /solr/collection1/update/extract
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
    at java.util.Arrays.copyOf(Unknown Source)
    at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
    at java.lang.AbstractStringBuilder.append(Unknown Source)
    at java.lang.StringBuilder.append(Unknown Source)
    at org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:303)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
    at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
    at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
    at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
    at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:278)
    at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:88)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
Thanks,
Ameya
On Fri, Jul 25, 2014 at 12:36 PM, Greg Walters <greg.walt...@answers.com>
wrote:
Would you include the entire stack trace for your OOM message? Are you
seeing this on the client or server side?
Thanks,
Greg
On Jul 25, 2014, at 10:21 AM, Ameya Aware <ameya.aw...@gmail.com> wrote:
Hi,
I am in the process of indexing a lot of documents, but after around
90000 documents I am getting the error below:
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
I am passing the parameters below to Solr:
java -Xms6144m -Xmx6144m -XX:MaxPermSize=512m
-Dcom.sun.management.jmxremote -XX:+UseParNewGC -XX:+UseCompressedOops
-XX:+UseConcMarkSweepGC
-XX:+CMSIncrementalMode -XX:+CMSParallelRemarkEnabled
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=70 -XX:ConcGCThreads=6
-XX:ParallelGCThreads=6 -jar start.jar
Also, I am auto-committing after 20000 documents.
I searched on Google for this but could not find any specific answer.
Can anybody help with this?
Thanks,
Ameya