Just one more point I should mention: with such a large index (not doc-count large, but content-wise), I imagine a lot of your 16GB of RAM is being used by the OS disk cache - which is good. That's another reason you don't want to give too much RAM to the JVM. But you still want to give it enough to avoid the OOM :) That assumes the RAM you are using is being used legitimately - and I don't yet have a reason to think it is not.

Also, there have been a report or two of a lockup that didn't appear to involve an OOM, so this is not guaranteed to solve that. However, since the lockup comes right after the OOM, it's the likely first thing to fix. Once the memory problems are taken care of, the locking issue can be addressed if you find it still remains. My bet is that fixing the OOM will clear it up.

- Mark

Mark Miller wrote:
Okay, it sounds like your index is fairly small then (1 million docs). Since you are only faceting on a couple of fields and sorting on one, that's going to take a bit of RAM, but not really that much. So what's likely happening is that because you are committing so often, you are getting multiple searchers queued up at a time - with enough of them, and perhaps some other things taking up RAM, that could be your issue.

The good news is, you have 16GB of RAM - don't stick with the default RAM allocation. I'm guessing it defaults to around a gig - it generally depends on your platform and machine capabilities. I'd try giving it a min heap setting of around 1 gig and a max heap setting of 2 or 3 gig. That should be more than enough, unless you have a memory leak from something else (which is the unlikely guess at this point, I think).

So -Xmx2g or something is what I'd suggest. Or 3. If you have 16GB, then unless it's being used elsewhere, you might as well use a bit more of it (though not too much if it isn't necessary - a very large heap can be troublesome for the garbage collector in certain situations). It sounds like your app just needs a bit more room.
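For example, with a stock Tomcat 6 install you could put something like this in CATALINA_HOME/bin/setenv.sh (just a sketch - the exact file and variable depend on how your Tomcat gets launched, so adjust for your setup):

    # give the Solr JVM a 1 gig floor and a 2 gig ceiling
    export JAVA_OPTS="$JAVA_OPTS -Xms1g -Xmx2g"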

You also might lower the maxWarmingSearchers setting if that makes sense.
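In solrconfig.xml that's the maxWarmingSearchers element. A sketch - 2 is only an illustrative value, tune it to your commit rate:

    <!-- cap how many searchers may be warming at once; once the cap is hit,
         further commits error out instead of stacking more warming
         searchers (and their caches) onto the heap -->
    <maxWarmingSearchers>2</maxWarmingSearchers>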

- Mark

Jerome L Quinn wrote:
Hi and thanks for looking at the problem ...


Mark Miller <markrmil...@gmail.com> wrote on 01/15/2009 02:58:24 PM, re: Help with Solr 1.3 lockups?:

How much RAM are you giving the JVM? That's running out of memory loading a FieldCache, which can be a memory-intensive data structure. It pretty much points to the JVM not having enough RAM to do what you want. How many fields do you sort on? How many fields do you facet on? How much RAM do you have available and how much have you given Solr? How many documents are you working with?

I'm using the stock Tomcat and JVM settings. I see the VM footprint sitting at 877M right now. It hasn't locked up yet this time around.

There are 2 fields we facet on and 1 that we sort on. The machine has 16G of memory, and the index is currently sitting at 38G, though I haven't run an optimize in a while. There are about 1 million docs in the index, though we have 3 full copies of the data stored in different fields and processed in different ways.

I do a commit every 10 docs or 3 seconds, whichever comes first. We're approximating real-time updating.
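For anyone following along, that policy would look roughly like the following if it were expressed as Solr-side autoCommit in solrconfig.xml (a sketch, not our actual config, and assuming the maxDocs/maxTime options are available in your 1.3 build):

    <updateHandler class="solr.DirectUpdateHandler2">
      <!-- commit after 10 added docs or 3000 ms, whichever comes first -->
      <autoCommit>
        <maxDocs>10</maxDocs>
        <maxTime>3000</maxTime>
      </autoCommit>
    </updateHandler>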

The index is currently sitting on NFS, which I know isn't great for performance. I didn't think it could cause reliability issues, though.


As far as restarting a failed server goes, the best technique is generally external. I would recommend a script/program on another machine that hits the Solr instance with a simple query every now and again. If you don't get a valid response within a reasonable amount of time, or after a reasonable number of tries, fire off alert emails and issue a command to that server to restart the JVM. Or something to that effect.

I suspect I'll add a watchdog, no matter what's causing the problem here.
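Probably something along these lines (a rough Java sketch only - the ping URL assumes the standard /admin/ping handler on localhost:8983, and restart-solr.sh is a hypothetical placeholder for whatever actually bounces our Tomcat):

import java.net.HttpURLConnection;
import java.net.URL;

/**
 * Minimal external watchdog sketch: ping Solr every 30 seconds and run a
 * restart command after 3 consecutive failures. Adjust the URL, timeouts,
 * thresholds, and restart command for the real environment.
 */
public class SolrWatchdog {
    private static final String PING_URL = "http://localhost:8983/solr/admin/ping"; // assumed URL
    private static final String RESTART_CMD = "/usr/local/bin/restart-solr.sh";     // placeholder

    public static void main(String[] args) throws Exception {
        int failures = 0;
        while (true) {
            if (pingOk()) {
                failures = 0;
            } else if (++failures >= 3) {
                // this is also where the alert emails would go out
                Runtime.getRuntime().exec(RESTART_CMD);
                failures = 0;
            }
            Thread.sleep(30000);
        }
    }

    private static boolean pingOk() {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(PING_URL).openConnection();
            conn.setConnectTimeout(5000);  // fail fast if Solr is wedged
            conn.setReadTimeout(10000);
            return conn.getResponseCode() == 200;
        } catch (Exception e) {
            return false;
        }
    }
}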

However, you should figure out why you are running out of memory. You
don't want to use more resources than you have available if you can help
it.

Definitely. That's on the agenda :-)

Thanks,
Jerry



- Mark

Jerome L Quinn wrote:
Hi, all.

I'm running Solr 1.3 inside Tomcat 6.0.18. I'm running a modified query parser, tokenizer, and highlighter, and have a CustomScoreQuery for dates.

After some amount of time, I see Solr stop responding to update requests. When crawling through the logs, I see the following pattern:

Jan 12, 2009 7:27:42 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
Jan 12, 2009 7:28:11 PM org.apache.solr.common.SolrException log
SEVERE: Error during auto-warming of key:org.apache.solr.search.queryresult...@ce0f92b9:java.lang.OutOfMemoryError

        at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
        at org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:167)
        at org.apache.lucene.index.SegmentMergeInfo.next(SegmentMergeInfo.java:66)
        at org.apache.lucene.index.MultiSegmentReader$MultiTermEnum.next(MultiSegmentReader.java:492)
        at org.apache.lucene.search.FieldCacheImpl$7.createValue(FieldCacheImpl.java:267)
        at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
        at org.apache.lucene.search.FieldCacheImpl.getInts(FieldCacheImpl.java:245)
        at org.apache.solr.search.function.IntFieldSource.getValues(IntFieldSource.java:50)
        at org.apache.solr.search.function.SimpleFloatFunction.getValues(SimpleFloatFunction.java:41)
        at org.apache.solr.search.function.BoostedQuery$CustomScorer.<init>(BoostedQuery.java:111)
        at org.apache.solr.search.function.BoostedQuery$CustomScorer.<init>(BoostedQuery.java:97)
        at org.apache.solr.search.function.BoostedQuery$BoostedWeight.scorer(BoostedQuery.java:88)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
        at org.apache.lucene.search.Searcher.search(Searcher.java:126)
        at org.apache.lucene.search.Searcher.search(Searcher.java:105)
        at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:966)
        at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:838)
        at org.apache.solr.search.SolrIndexSearcher.access$000(SolrIndexSearcher.java:56)
        at org.apache.solr.search.SolrIndexSearcher$2.regenerateItem(SolrIndexSearcher.java:260)
        at org.apache.solr.search.LRUCache.warm(LRUCache.java:194)
        at org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1518)
        at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1018)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:314)
        at java.util.concurrent.FutureTask.run(FutureTask.java:149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:896)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:735)

Jan 12, 2009 7:28:11 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
Throwable occurred: java.lang.OutOfMemoryError
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:414)
        at java.net.ServerSocket.implAccept(ServerSocket.java:464)
        at java.net.ServerSocket.accept(ServerSocket.java:432)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:61)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:310)
        at java.lang.Thread.run(Thread.java:735)

<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>
<< Java dumps core and heap at this point >>
<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>

Jan 12, 2009 7:28:21 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SingleInstanceLock: write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:85)
        at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1140)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:938)
        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:116)
        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:122)
        at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:167)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:221)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:59)
        at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:196)
        at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:735)


After this, all future updates cause the same write lock failure.

I'm willing to believe my code is somehow causing Solr to run out of memory, though I'd like to know if anyone sees the problem on vanilla Solr. An even bigger problem is the fact that once Solr is wedged, it stays that way until a human notices and restarts things. Tomcat stays running and there's no automatic detection that will either restart Solr or restart the Tomcat container.

Any suggestions on either front?

Thanks,
Jerry Quinn




