Random OOM Exceptions

2014-08-14 Thread Scott Rankin
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1820)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:656)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)

Scott Rankin
Corporate Reimbursement Services, Inc.
Phone: 617-467-1931
Email: sran...@crsinc.com

This email message contains information that Corporate Reimbursement Services, 
Inc. considers confidential and/or proprietary, or may later designate as 
confidential and proprietary. It is intended only for use of the individual or 
entity named above and should not be forwarded to any other persons or entities 
without the express consent of Corporate Reimbursement Services, Inc., nor 
should it be used for any purpose other than in the course of any potential or 
actual business relationship with Corporate Reimbursement Services, Inc. If the 
reader of this message is not the intended recipient, or the employee or agent 
responsible to deliver it to the intended recipient, you are hereby notified 
that any dissemination, distribution, or copying of this communication is 
strictly prohibited. If you have received this communication in error, please 
notify sender immediately and destroy the original message.

Internal Revenue Service regulations require that certain types of written 
advice include a disclaimer. To the extent the preceding message contains 
advice relating to a Federal tax issue, unless expressly stated otherwise the 
advice is not intended or written to be used, and it cannot be used by the 
recipient or any other taxpayer, for the purpose of avoiding Federal tax 
penalties, and was not written to support the promotion or marketing of any 
transaction or matter discussed herein.


Re: Random OOM Exceptions

2014-08-14 Thread Scott Rankin
On 8/14/14, 11:22 AM, "Shawn Heisey" wrote:

>On 8/14/2014 7:46 AM, Scott Rankin wrote:
>> I'm running a Solr setup and am getting occasional periods where memory
>> usage and GC just spike out of nowhere (unrelated to traffic).  I'm
>> hoping someone can shed some light.  Here's the setup:
>>
>> - Solr 4.3.1, Oracle JDK 1.7.0_51 64 bit on CentOS 6.5
>> - We have 2 Solr servers, one acting as a master that receives all the
>> writes, and one that is replicating from the master and handles all the
>> reads
>> - JVM parameters are:
>>
>> -Djava.io.tmpdir=/usr/local/solr/solr-tomcat/temp
>> -Dcatalina.home=/usr/local/solr/solr-tomcat
>> -Dcatalina.base=/usr/local/solr/solr-tomcat
>> -Djava.endorsed.dirs=/usr/local/solr/solr-tomcat/endorsed
>> -XX:+DisableExplicitGC
>> -XX:GCTimeRatio=9
>> -XX:MaxGCPauseMillis=1500
>> -XX:+UseParallelGC
>> -Xss512k
>> -XX:MaxPermSize=256m
>> -Xmx3096m
>> -Xms64m
>>
>> -Dlog4j.configuration=file:///usr/local/solr/solr-tomcat/conf/solr-log4j.properties
>> -Dsolr.solr.home=/usr/local/solr/solr-home
>> -Duser.timezone=GMT
>> -javaagent:/usr/local/solr/solr-tomcat/newrelic/newrelic.jar
>> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
>>
>> -Djava.util.logging.config.file=/usr/local/solr/solr-tomcat/conf/logging.properties
>>
>>
>> We have one active core, the index is about 2.17 GB with around 6
>> million documents.
>>
>> The issue that we see is that every so often, heap memory will spike, GC
>> percentage will go to 100%, and we'll see OOM errors.  There's no change
>> in traffic patterns as far as we can tell.  According to New Relic, it's
>> the Old Gen that's running out of space. The OOM stack trace is below.
>> I'd be very grateful for any help you can offer!
>>
>> Thanks,
>> Scott
>>
>> ERROR - 2014-08-13 22:45:53.252; org.apache.solr.common.SolrException;
>> null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap
>> space
>
>When you actually get an OutOfMemoryError exception, it means that the
>program is trying to allocate more memory than you have told it it's
>allowed to allocate.  In your case, you've limited it to 3096
>megabytes.  This is not the same thing as garbage collection, although
>the low memory situation will usually lead to a LOT of GC activity.
>
>This specific section of the following wiki page lists some of the
>reasons that your Solr install might be using a lot of heap memory.
>Later it describes some things you can do to reduce heap requirements:
>
>http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
>
>It's a good idea to read the entire wiki page, including the parts that
>come before the Java Heap section that is directly linked above, and the
>other wiki pages linked at the top of that page.
>
>Thanks,
>Shawn
>

My question was actually more about what in Solr might cause the server to
suddenly go from a very consistent heap size of 300-400 MB to over 2 GB in
a matter of minutes with no changes in traffic.  I get why the VM is
crashing, I just don’t know why Solr is suddenly going nuts.
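
For anyone chasing a similar jump, one way to see which part of the heap is growing is to log per-pool usage from inside the JVM. The following is only a minimal sketch using the standard java.lang.management API. The class and method names (HeapPoolSnapshot, describeHeapPools) are made up for illustration and are not part of Solr. With -XX:+UseParallelGC the old generation is typically reported as "PS Old Gen", but pool names depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr: reports usage of each heap memory pool.
class HeapPoolSnapshot {

    // Returns one line per heap pool, for example "<pool name>: used=123 MB, max=3096 MB".
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long usedMb = usage.getUsed() / (1024 * 1024);
            long maxBytes = usage.getMax(); // -1 means the maximum is undefined
            String max = maxBytes < 0 ? "undefined" : (maxBytes / (1024 * 1024)) + " MB";
            lines.add(pool.getName() + ": used=" + usedMb + " MB, max=" + max);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeHeapPools()) {
            System.out.println(line);
        }
    }
}
```

Logging these lines on a timer, and comparing them with the request log around the time of a spike, may help tie the growth to a specific query or background task.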



Queries slow on replicas after replication

2014-08-26 Thread Scott Rankin
Hi all,

I have a scenario here and I'd love some advice on what my options are.  We 
have one Solr master and two read replicas.  The replicas query the master 
every 10 seconds because we need relatively quick availability of new 
documents.   We balance read queries across all three servers and send writes 
only to the master.

The problem that I'm seeing is that average query time is vastly higher on the 
replicas than on the master - 1600ms vs 150ms.  What I've noticed is that 
immediately after a replication, a query against the replica can take up to 5 
seconds.  Then subsequent queries are faster until the next replication.   On 
one level this makes sense, since when there's a change to the index I imagine 
that the caches get flushed.  But this is really bogging down performance on a 
pretty high number of queries.

We should have plenty of OS disk cache - the server has 12 GB of RAM, and the 
apps on the server are only using up about 6.  Our index is 2.7 GB, so it 
should fit in the OS disk cache.  Are there any other factors that I can look 
at to eliminate these slow queries?

Thanks,
Scott

Scott Rankin
Corporate Reimbursement Services, Inc.
Phone: 617-467-1931
Email: sran...@crsinc.com



Re: Queries slow on replicas after replication

2014-08-27 Thread Scott Rankin
Thanks for the suggestions, Erick.  I took a look in the config and it
turns out that we didn't have any auto warming going on.  I set the
filterCache to be about 75% autowarm and the document and query result
cache to be 50%.  That had a marked impact on the performance, with the
average response time going down from about 1600ms to about 800ms.  So
thank you for that.  I also bumped the replication interval up to 20
seconds to give the autowarm time to go.
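
The reasoning behind lengthening the interval can be put as rough arithmetic. The sketch below assumes the worst case, where every poll pulls a new index version and every warm-up takes the same amount of time. The class and method names are hypothetical and the 15 second warm-up in main is an invented figure, not a measurement from this setup. Solr limits concurrent warm-ups through the maxWarmingSearchers setting in solrconfig.xml, so the estimate is mainly useful for comparing against that limit.

```java
// Hypothetical helper for back-of-the-envelope planning; not a Solr API.
class WarmupOverlapEstimate {

    // Worst-case number of searchers warming at the same time, assuming every
    // poll opens a new searcher and each warm-up takes warmupMillis.
    static long maxConcurrentWarmers(long warmupMillis, long pollIntervalMillis) {
        if (warmupMillis < 0 || pollIntervalMillis <= 0) {
            throw new IllegalArgumentException("warm-up must be >= 0 and interval must be > 0");
        }
        if (warmupMillis == 0) {
            return 0;
        }
        // Ceiling division: a warm-up started k intervals ago is still running
        // as long as k * pollIntervalMillis < warmupMillis.
        return (warmupMillis + pollIntervalMillis - 1) / pollIntervalMillis;
    }

    public static void main(String[] args) {
        long assumedWarmupMillis = 15_000; // invented figure for illustration
        System.out.println("10 s interval: " + maxConcurrentWarmers(assumedWarmupMillis, 10_000));
        System.out.println("20 s interval: " + maxConcurrentWarmers(assumedWarmupMillis, 20_000));
    }
}
```

With the assumed 15 second warm-up, a 10 second interval can leave two searchers warming at once, while a 20 second interval leaves at most one.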

However - we're still looking at a 4-6x difference in performance between
the master and the replicas.  Once we get past the autowarming, is there
anything else that you would recommend us looking at?

Thanks,
Scott


Scott Rankin
Corporate Reimbursement Services, Inc.
Phone: 617-467-1931
Email: sran...@crsinc.com




On 8/26/14, 9:01 PM, "Erick Erickson" wrote:

>What are your autowarm settings? You should be able to alleviate this by
>configuring these in solrconfig.xml
>1> your cache autowarm settings, particularly filterCache and
>documentResultCache.
>2> your newSearcher settings.
>
>The point of all the autowarming is that these queries are executed after
>a commit (or replication in your situation) but _before_ the new searcher
>serves queries.
>So here's the sequence
>1> a replication (or commit) happens
>2> a new searcher is opened and autowarming occurs (cache autowarm and
>executing any newSearcher queries defined)
>3> while <2> is going on, incoming queries are served by the old searcher.
>4> when <2> is completed, incoming queries are served by the new searcher
>5> the old searcher fulfills its last query and closes itself.
>
>Don't be confused by firstSearcher and newSearcher. firstSearcher is fired
>when the entire server is started (and there are no cache entries to
>autowarm). newSearcher queries are fired whenever a commit (or
>replication) happens.
>
>HOWEVER.
>At a 10 second refresh interval, you have the chance of <2> taking more
>than 10 seconds. You'll see warnings about "too many on deck searchers" if
>this is the case. For a M/R setup, 10 seconds is _very_ aggressive.
>Seriously consider SolrCloud in this case if your latency requirements are
>truly that low.
>
>Having your master serve queries is also suspect. It's _busy_ indexing and
>you're asking it to add query load too. It may not really matter a lot,
>depending on the indexing rate...
>
>Best,
>Erick
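
The searcher hand-off described in the numbered sequence above can be sketched in a few lines. This is an illustrative model only: Searcher and SearcherSwapSketch are hypothetical stand-ins, not Solr classes, and warm() is a placeholder for cache autowarming plus any newSearcher queries.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-ins for illustration; these are not Solr classes.
class SearcherSwapSketch {

    static final class Searcher {
        final long indexVersion;
        volatile boolean warmed;

        Searcher(long indexVersion) {
            this.indexVersion = indexVersion;
        }

        // Placeholder for cache autowarming and newSearcher queries.
        void warm() {
            warmed = true;
        }
    }

    private final AtomicReference<Searcher> active;

    SearcherSwapSketch(Searcher first) {
        first.warm();
        active = new AtomicReference<>(first);
    }

    // Step 3: queries always go to whichever searcher is currently registered.
    long versionServingQueries() {
        return active.get().indexVersion;
    }

    // Step 2: open a searcher on the new index; queries cannot see it yet.
    Searcher openNewSearcher(long newVersion) {
        return new Searcher(newVersion);
    }

    // Step 4: register the warmed searcher. The previous one is returned so the
    // caller can close it after its last in-flight query finishes (step 5).
    Searcher register(Searcher candidate) {
        if (!candidate.warmed) {
            throw new IllegalStateException("searcher has not been warmed yet");
        }
        return active.getAndSet(candidate);
    }

    public static void main(String[] args) {
        SearcherSwapSketch core = new SearcherSwapSketch(new Searcher(1));
        Searcher next = core.openNewSearcher(2); // step 1 happened: new index arrived
        System.out.println("during warm-up, queries see version " + core.versionServingQueries());
        next.warm();
        core.register(next);
        System.out.println("after the swap, queries see version " + core.versionServingQueries());
    }
}
```

The point of the model is that a query never waits on a cold searcher: the swap only happens after warm-up completes, so slow first queries after replication suggest that little or no warming is configured.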
>
>
>On Tue, Aug 26, 2014 at 1:32 PM, Scott Rankin wrote:
>
>> Hi all,
>>
>> I have a scenario here and I'd love some advice on what my options are.
>> We have one Solr master and two read replicas.  The replicas query the
>> master every 10 seconds because we need relatively quick availability of
>> new documents.   We balance read queries across all three servers and
>> send writes only to the master.
>>
>> The problem that I'm seeing is that average query time is vastly higher
>> on the replicas than on the master - 1600ms vs 150ms.  What I've noticed
>> is that immediately after a replication, a query against the replica can
>> take up to 5 seconds.  Then subsequent queries are faster until the next
>> replication.   On one level this makes sense, since when there's a
>> change to the index I imagine that the caches get flushed.  But this is
>> really bogging down performance on a pretty high number of queries.
>>
>> We should have plenty of OS disk cache - the server has 12 GB of RAM,
>> and the apps on the server are only using up about 6.  Our index is
>> 2.7 GB, so it should fit in the OS disk cache.  Are there any other
>> factors that I can look at to eliminate these slow queries?
>>
>> Thanks,
>> Scott