Good morning!
Recently we ran into an OutOfMemoryError while optimizing our index. It seems to be related to the NIO classes and their memory handling. I'll describe the environment, the error, and what we did to try to solve the problem; unfortunately, none of our approaches was successful.

The environment:

- Tested with both Solr 3.3 & 3.4
- SuSE SLES 11 (x64) virtual machine with 16GB RAM
- ulimit: virtual memory 14834560 KB (~14GB)
- Java: java-1_6_0-ibm-1.6.0-124.5
- Apache Tomcat/6.0.29

- Index size (on filesystem): ~5GB, 1.1 million text documents

The error:
First, building the index from scratch with a MySQL DIH into an empty index directory works fine. Rebuilding with &command=full-import while the old segment files are still in place fails with an OutOfMemoryError, as does optimizing the index.
An optimize fails after some time with:

SEVERE: java.io.IOException: background merge hit exception: _6p(3.4):Cv1150724 _70(3.4):Cv667 _73(3.4):Cv7 _72(3.4):Cv4 _71(3.4):Cv1 into _74 [optimize]
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2552)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2472)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:410)
        at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
        at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:154)
        at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:107)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:61)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:735)
Caused by: java.io.IOException: Map failed
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:765)
        at org.apache.lucene.store.MMapDirectory$MMapIndexInput.<init>(MMapDirectory.java:264)
        at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:216)
        at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:89)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:115)
        at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:710)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4378)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3917)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:388)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:456)
Caused by: java.lang.OutOfMemoryError: Map failed
        at sun.nio.ch.FileChannelImpl.map0(Native Method)
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:762)
        ... 9 more

Then we changed the mergeScheduler and mergePolicy to

<mergeScheduler class="org.apache.lucene.index.SerialMergeScheduler" />
<mergePolicy class="org.apache.lucene.index.LogByteSizeMergePolicy"/>

which led to a slightly different error message:

SEVERE: java.io.IOException: Map failed
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:765)
        at org.apache.lucene.store.MMapDirectory$MMapIndexInput.<init>(MMapDirectory.java:264)
        at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:216)
        at org.apache.lucene.index.TermVectorsReader.<init>(TermVectorsReader.java:85)
        at org.apache.lucene.index.SegmentCoreReaders.openDocStores(SegmentCoreReaders.java:221)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:117)
        at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:710)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4248)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3917)
        at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:37)
        at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2725)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2535)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2472)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:410)
        at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
        at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:154)
        at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:107)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:61)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:735)
Caused by: java.lang.OutOfMemoryError: Map failed
        at sun.nio.ch.FileChannelImpl.map0(Native Method)
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:762)
        ... 33 more

Both traces have in common that sun.nio.ch.FileChannelImpl.map runs out of memory.

We watched "free" while doing the optimize and saw that the optimize ate up all the free RAM. When no RAM was left, the exception was thrown. No swapping at all.
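For reference, this is roughly how we watched it (a sketch; the Solr URL in the comment is from our setup and will differ on other machines):

```shell
#!/bin/sh
# Rough sketch of the monitoring loop we used while the optimize ran.
# parse_free pulls the "free" column (in MB) out of `free -m` output.
parse_free() {
  awk '/^Mem:/ {print $4}'
}

# During the optimize we ran something like:
#   curl -s "http://localhost:8080/solr/update?optimize=true" &
#   while kill -0 $! 2>/dev/null; do
#     echo "$(date +%T) free: $(free -m | parse_free) MB"
#     sleep 2
#   done

# Demonstrate the parsing on a fixed sample line:
echo "Mem: 16000 9000 7000 0 100 2000" | parse_free   # prints 7000
```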


What we did to solve the problem:

We found a link from this list and tried the solutions mentioned there:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg54094.html

No success.
We set ulimit -v unlimited and changed the mergePolicy (see above).
The last thing we tried: we decreased the RAM given to the JVM. That at least worked; the optimize finished.
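Our rough model of why the smaller heap helped, as far as we understand it (the heap size below is a made-up example, and the model ignores native code, stacks and other overhead): the virtual-memory ulimit has to cover both the JVM heap and everything MMapDirectory maps, so shrinking the heap leaves more address space for the mapped segment files.

```shell
# Back-of-the-envelope: address space left for mmap under our ulimit.
# heap_gb is a hypothetical -Xmx value, not our real setting.
limit_kb=14834560                        # our `ulimit -v` value, in KB
limit_gb=$(( limit_kb / 1024 / 1024 ))   # integer division: 14 GB
heap_gb=8
mmap_budget=$(( limit_gb - heap_gb ))
echo "address space left for mmap: ${mmap_budget} GB"   # prints 6 GB
```

With a ~5GB index, a merge that keeps old and new segments mapped at the same time could need on the order of 10GB of address space, so a 6GB budget would not suffice; with a smaller heap it fits. That would match what we observed, but it is only our guess.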

But in the long run this is no solution. Our index is growing constantly, and RAM that is sufficient today may not be tomorrow. We wonder how the memory management works here. If another process on the machine uses a reasonable amount of RAM for a short time, an optimize will fail whenever it starts at the wrong point in time. And what do you do with a 25GB index? Do you need a server with 50GB+ of RAM to do an optimize? That does not make any sense, so there must be something we can do about it, something we missed.
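For the 25GB case, our (possibly wrong) understanding is that an optimize needs virtual address space rather than physical RAM for the mapped files, roughly:

```shell
# Hypothetical sizing for a 25GB index: during the merge, old and new
# segments may be mapped at the same time, plus the JVM heap on top.
# index_gb and heap_gb are made-up example values.
index_gb=25
heap_gb=4
needed_gb=$(( 2 * index_gb + heap_gb ))
echo "approx. virtual address space needed: ${needed_gb} GB"   # prints 54 GB
```

If that model is right, a 64-bit JVM with "ulimit -v unlimited" should handle it without 50GB of physical RAM, since mapped pages live in the OS page cache and can be evicted, which is exactly why the failures we see confuse us.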

Maybe someone has some bright ideas?
Thanks a lot and best regards
Ralf
