Thanks Doug,
It would be nice to have the full stack trace for btpool0-2051...
I have analyzed LRUCache several times; it does have some strange synchronization, for example:

public void warm(SolrIndexSearcher searcher, SolrCache old) throws IOException {
  ...
  LRUCache other = (LRUCache) old;
  ...
  synchronized (other.map) {   // ??? locking another cache's internal map ???
  ...
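For context, my reading of the locking pattern involved (a paraphrased sketch based on the stack traces quoted below, not the actual Solr LRUCache source): get() holds the cache monitor while the backing LinkedHashMap compares keys, so if a key's equals() blocks on some other monitor, every other thread touching that cache queues up behind it:

import java.util.LinkedHashMap;
import java.util.Map;

// Paraphrased sketch, not the real org.apache.solr.search.LRUCache.
class LRUCacheSketch {
    // An anonymous LinkedHashMap subclass in the real class would explain
    // the "org.apache.solr.search.LRUCache$1" monitor seen in the dumps.
    private final Map<Object, Object> map = new LinkedHashMap<Object, Object>();

    public Object get(Object key) {
        synchronized (map) {        // cache-wide monitor
            return map.get(key);    // hashing/lookup calls key.equals(...),
                                    // e.g. BooleanQuery/PhraseQuery/Vector.equals,
                                    // while the cache monitor is still held
        }
    }
}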
/** A Query that matches documents containing a particular sequence of terms.
 * A PhraseQuery is built by QueryParser for input like <code>"new york"</code>.
 *
 * <p>This query may be combined with other terms or queries with a
 * {@link BooleanQuery}.
 */
org.apache.lucene.search.PhraseQuery
/** Returns true iff <code>o</code> is equal to this. */
public boolean equals(Object o) {
  if (!(o instanceof PhraseQuery))
    return false;
  PhraseQuery other = (PhraseQuery) o;
  return (this.getBoost() == other.getBoost())   // PhraseQuery.java:286
    && (this.slop == other.slop)
    && this.terms.equals(other.terms)
    && this.positions.equals(other.positions);
}
terms & positions: java.util.Vector (synchronized)
- I don't think it may cause deadlock... at least we can stress-test
using "New York"
Lucene & SOLR developers should avoid using java.util.Vector which is
extremely prone to deadlocks in trivial cases...
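To make that last point concrete, here is a minimal, self-contained sketch (my own illustration, not Lucene/Solr code) of the pattern visible in the thread dump quoted below: Vector.equals locks its own monitor and then touches the other list (size(), element access), which needs the other Vector's monitor. Two threads comparing the same pair of vectors in opposite order can therefore end up deadlocked:

import java.util.Vector;

public class VectorEqualsDeadlock {
    public static void main(String[] args) throws InterruptedException {
        final Vector<String> a = new Vector<String>();
        for (int i = 0; i < 1000; i++) { a.add("new"); a.add("york"); }
        final Vector<String> b = new Vector<String>(a);   // identical content

        // Thread 1 locks a (synchronized equals), then needs b's monitor to walk it.
        Thread t1 = new Thread(new Runnable() {
            public void run() { while (true) a.equals(b); }
        }, "compare-a-b");
        // Thread 2 locks b, then needs a's monitor: reversed lock order.
        Thread t2 = new Thread(new Runnable() {
            public void run() { while (true) b.equals(a); }
        }, "compare-b-a");

        t1.setDaemon(true);   // let the JVM exit even when both threads are stuck
        t2.setDaemon(true);
        t1.start();
        t2.start();
        Thread.sleep(10000);  // on most runs both threads deadlock well before this
        System.out.println("t1 blocked: " + (t1.getState() == Thread.State.BLOCKED)
                         + ", t2 blocked: " + (t2.getState() == Thread.State.BLOCKED));
    }
}

This appears to be exactly what the dump shows when two PhraseQuery cache keys are compared concurrently from two threads; switching terms/positions to an unsynchronized list (or always comparing under a consistent lock order) would avoid the nested monitors.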
P.S.
After ~72 hours of runtime: an OOM problem with BEA JRockit R27:
jrockit-R27.4.0-jdk1.6.0_02 (AMD Opteron, 64-bit, SLES 10 SP1, Tomcat 5.5.26). About 100k queries a day...
Jul 17, 2008 11:08:07 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object size: 3149016, Num elements: 393625
        at org.apache.solr.util.OpenBitSet.<init>(OpenBitSet.java:86)
        at org.apache.solr.search.DocSetHitCollector.collect(DocSetHitCollector.java:63)
        at org.apache.solr.search.SolrIndexSearcher$9.collect(SolrIndexSearcher.java:1072)
        at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:320)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:146)
        at org.apache.lucene.search.Searcher.search(Searcher.java:118)
        at org.apache.lucene.search.Searcher.search(Searcher.java:97)
        at org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1069)
        at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:804)
        at org.apache.solr.search.SolrIndexSearcher.getDocListAndSet(SolrIndexSearcher.java:1245)
        at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:96)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:148)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:117)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:902)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:280)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:237)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
        at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:834)
        at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:640)
        at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1286)
        at java.lang.Thread.run(Thread.java:619)
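If I am reading the numbers right, the failing allocation is the long[] backing a single OpenBitSet: 393,625 longs * 8 bytes = 3,149,000 bytes (plus the array header gives the reported 3,149,016), i.e. room for 393,625 * 64 = ~25.2 million document bits. So every uncached DocSet collected against this index costs roughly 3 MB on its own.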
P.P.S.
I'll send the thread dump in a separate email.
Quoting Doug Steigerwald <[EMAIL PROTECTED]>:
It happened again last night. I cronned a script that ran jstack on
the process every 5 minutes just to see what was going on. Here's a
snippet:
"btpool0-2668" prio=10 tid=0x00002aac3a905800 nid=0x76ed waiting for
monitor entry [0x000000005e584000..0x000000005e585a10]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.solr.search.LRUCache.get(LRUCache.java:129)
- waiting to lock <0x00002aaabcdd9450> (a
org.apache.solr.search.LRUCache$1)
at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:730)
at
org.apache.solr.search.SolrIndexSearcher.getDocList(SolrIndexSearcher.java:693)
at
org.apache.solr.search.CollapseFilter.<init>(CollapseFilter.java:137)
at
org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:97)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:148)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:117)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:942)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:280)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:237)
At the time of this dump there were 547 threads active (going by occurrences of Thread.State in the log).
Here's some more:
"btpool0-2051" prio=10 tid=0x00002aac39144c00 nid=0x4012 waiting for
monitor entry [0x0000000045bfc000..0x0000000045bfdd90]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.Vector.size(Unknown Source)
- waiting to lock <0x00002aaac0af0ea0> (a java.util.Vector)
at java.util.AbstractList.listIterator(Unknown Source)
at java.util.AbstractList.listIterator(Unknown Source)
at java.util.AbstractList.equals(Unknown Source)
at java.util.Vector.equals(Unknown Source)
- locked <0x00002aaac0ae8d30> (a java.util.Vector)
at org.apache.lucene.search.PhraseQuery.equals(PhraseQuery.java:286)
at java.util.AbstractList.equals(Unknown Source)
at
org.apache.lucene.search.DisjunctionMaxQuery.equals(DisjunctionMaxQuery.java:243)
at
org.apache.lucene.search.BooleanClause.equals(BooleanClause.java:102)
at java.util.AbstractList.equals(Unknown Source)
at
org.apache.lucene.search.BooleanQuery.equals(BooleanQuery.java:461)
at java.util.HashMap.getEntry(Unknown Source)
at java.util.LinkedHashMap.get(Unknown Source)
at org.apache.solr.search.LRUCache.get(LRUCache.java:129)
- locked <0x00002aaabcdd51f8> (a org.apache.solr.search.LRUCache$1)
This was also at the bottom of the jstack dump:
Found one Java-level deadlock:
=============================
"btpool0-2782":
  waiting to lock monitor 0x0000000041e6c568 (object 0x00002aaabcdd9450, a org.apache.solr.search.LRUCache$1),
  which is held by "btpool0-2063"
"btpool0-2063":
  waiting to lock monitor 0x0000000041e6b068 (object 0x00002aaac0ae8d30, a java.util.Vector),
  which is held by "btpool0-2051"
"btpool0-2051":
  waiting to lock monitor 0x0000000041e6b110 (object 0x00002aaac0af0ea0, a java.util.Vector),
  which is held by "btpool0-2063"
Java stack information for the threads listed above:
===================================================
"btpool0-2782":
at org.apache.solr.search.LRUCache.get(LRUCache.java:129)
- waiting to lock <0x00002aaabcdd9450> (a
org.apache.solr.search.LRUCache$1)
at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:730)
at
org.apache.solr.search.SolrIndexSearcher.getDocList(SolrIndexSearcher.java:693)
at
org.apache.solr.search.CollapseFilter.<init>(CollapseFilter.java:137)
at
org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:97)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:148)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:117)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:942)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:280)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:237)
Thanks.
Doug
On Jul 15, 2008, at 11:39 AM, Noble Paul നോബിള് नोब्ळ् wrote:
Can we collect more information? It would be nice to know what the threads are doing when it hangs.
If you are using *nix, issue kill -3 <pid>
It will print out the stack trace of all the threads in the VM. That may tell us the state of each thread, which could help us suggest something.
On Tue, Jul 15, 2008 at 8:59 PM, Fuad Efendi <[EMAIL PROTECTED]> wrote:
I constantly have the same problem; sometimes I have OutOfMemoryError in the logs, sometimes not. It's not predictable. I minimized all caches, and it still happens even with 8192M. CPU usage is 375%-400% (two dual-core Opterons), SUN Java 5. Moved to BEA JRockit 5 yesterday; it looks 30 times faster (25% CPU load with 4096M RAM), no problems yet, let's see...
Strange: Tomcat simply hangs instead of exiting.
There are some posts related to OutOfMemoryError on the solr-user list.
==============
http://www.linkedin.com/in/liferay
Quoting Doug Steigerwald <[EMAIL PROTECTED]>:
Since we pushed Solr out to production a few weeks ago, we've seen a
few issues with Solr not responding to requests (searches or admin
pages). There doesn't seem to be any reason for it from what we can
tell. We haven't seen it in QA or development.
We're running Solr with basically the example Solr setup with Jetty
(6.1.3). We package our Solr install by using 'ant example' and
replacing configs/etc. Whenever Solr stops responding, there are no
messages in the logs, nothing. Requests just time out.
We have also only seen this on our slaves. The master doesn't seem to be hitting this issue. All the boxes are the same, the Java version is the same, etc.
We don't have a stack trace, and we don't have JMX set up. Once we see this issue, our support folks just stop and start Solr on that machine.
Has anyone else run into anything like this with Solr?
Thanks.
Doug
--
--Noble Paul