Solr 3.6.1 stalling with high CPU and blocking on field cache
I've been tracking a problem in our Solr environment for a while with periodic stalls of Solr 3.6.1. I'm running up against a wall on ideas to try and thought I might get some insight from others on this list.

The load on the server is normally anywhere between 1-3. It's an 8-core machine with 40GB of RAM. I have about 25GB of index data that is replicated to this server every 5 minutes. It's taking about 200 connections per second, and roughly every 5-10 minutes it will stall for about 30 seconds to a minute. The stall causes the load to go as high as 90. It is all CPU bound in user space - all cores go to 99% utilization (spinlock?). When doing a thread dump, the following line is blocked in all running Tomcat threads:

org.apache.lucene.search.FieldCacheImpl$Cache.get ( FieldCacheImpl.java:230 )

Looking at the source code in 3.6.1, that line sits inside a synchronized() block, which blocks all other threads and causes the backlog. I've tried to correlate these events to the replication events - but even with replication disabled, this still happens. We run multiple data centers using Solr, and I was comparing garbage collection behavior between them and noted that the old generation is collected very differently in this data center versus the others. The old generation is collected in one massive collection event (several gigabytes' worth) - the other data center is more sawtoothed and collects only 500MB-1GB at a time. Here are my parameters to java (the same in all environments):

/usr/java/jre/bin/java \
  -verbose:gc \
  -XX:+PrintGCDetails \
  -server \
  -Dcom.sun.management.jmxremote \
  -XX:+UseConcMarkSweepGC \
  -XX:+UseParNewGC \
  -XX:+CMSIncrementalMode \
  -XX:+CMSParallelRemarkEnabled \
  -XX:+CMSIncrementalPacing \
  -XX:NewRatio=3 \
  -Xms30720M \
  -Xmx30720M \
  -Djava.endorsed.dirs=/usr/local/share/apache-tomcat/endorsed \
  -classpath /usr/local/share/apache-tomcat/bin/bootstrap.jar \
  -Dcatalina.base=/usr/local/share/apache-tomcat \
  -Dcatalina.home=/usr/local/share/apache-tomcat \
  -Djava.io.tmpdir=/tmp \
  org.apache.catalina.startup.Bootstrap start

I've tried a few GC option changes from this (it has been running this way for a couple of years now) - primarily removing CMS incremental mode, since we have 8 cores and remarks on the internet suggest it is only for smaller SMP setups. Removing it did not fix anything.

I've considered that the heap is way too large (30GB out of the 40GB) and may not leave enough memory for mmap operations (MMap appears to be used in the field cache). Based on active memory utilization in Java, it seems like I might be able to reduce it to 22GB safely - but I'm not sure if that will help with the CPU issues.

I think the field cache is used for sorting and faceting. I've started to investigate facet.method, but from what I can tell, this doesn't influence sorting at all - only facet queries. I've also tried setting useFilterForSortedQuery, and it seems to require less field cache, but it doesn't address the stalling issues.

Is there something I am overlooking? Perhaps the system is becoming oversubscribed in terms of resources? Thanks for any help that is offered.

--
Patrick O'Lone
Director of Software Development
TownNews.com

E-mail ... pol...@townnews.com
Phone 309-743-0809
Fax .. 309-743-0830
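P.S. For reference, here is roughly what the relevant pieces of our solrconfig.xml look like - the cache sizes below are illustrative rather than our exact values, and as I understand it useFilterForSortedQuery only applies to sorted queries that don't need scores:

<query>
  <!-- filter cache backing fq clauses (sizes are illustrative) -->
  <filterCache class="solr.FastLRUCache"
               size="512"
               initialSize="512"
               autowarmCount="128"/>

  <!-- the option mentioned above; lets sorted, non-scored queries be
       satisfied via the filterCache instead of the field cache -->
  <useFilterForSortedQuery>true</useFilterForSortedQuery>
</query>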
Re: Solr 3.6.1 stalling with high CPU and blocking on field cache
We do perform a lot of sorting - on multiple fields, in fact. We have different kinds of Solr configurations - our news searches do little with regard to faceting, but sort heavily. We provide classified ad searches, and those use faceting heavily. I might try reducing the JVM memory some and the amount of perm generation as suggested earlier. It feels like a GC issue, and loading the cache just happens to be the victim of a stop-the-world event at the worst possible time.

> My gut instinct is that your heap size is way too high. Try decreasing it to
> like 5-10G. I know you say it uses more than that, but that just seems
> bizarre unless you're doing something like faceting and/or sorting on every
> field.
>
> -Michael
>
> -----Original Message-----
> From: Patrick O'Lone [mailto:pol...@townnews.com]
> Sent: Tuesday, November 26, 2013 11:59 AM
> To: solr-user@lucene.apache.org
> Subject: Solr 3.6.1 stalling with high CPU and blocking on field cache
>
> [...]

--
Patrick O'Lone
Director of Software Development
TownNews.com

E-mail ... pol...@townnews.com
Phone 309-743-0809
Fax .. 309-743-0830
facet.method=fcs vs facet.method=fc on solr slaves
Is there any advantage, on a Solr slave, to receiving queries with facet.method=fcs instead of the default facet.method=fc? Most of the segment files are unchanged between replication events - but I wasn't sure if replication would cause the unchanged segments' field caches to be lost anyway.

--
Patrick O'Lone
Director of Software Development
TownNews.com

E-mail ... pol...@townnews.com
Phone 309-743-0809
Fax .. 309-743-0830
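P.S. Right now we don't set facet.method anywhere, so everything uses the default. If I understand the faceting parameters correctly, it can be set per request or overridden per field with the f.<fieldname>. prefix, something like this (the field name is just an example from a hypothetical schema):

  .../select?q=*:*&facet=true&facet.field=category&facet.method=fcs
  .../select?q=*:*&facet=true&facet.field=category&f.category.facet.method=fc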
Re: facet.method=fcs vs facet.method=fc on solr slaves
So does it make the most sense, then, to force facet.method=fcs by default on slave nodes that receive updates every 5 minutes but have large segments that don't change with every update? Right now, everything I have configured uses facet.method=fc, since we don't declare it at all. Randomly, after replication, I have several threads that will hang reading data from the field cache, and I'm trying to think of things I can do to mitigate that. Thanks for the info.

> Hello Patrick,
>
> Replication flushes the UnInvertedField cache, which impacts fc, but doesn't
> harm Lucene's FieldCache, which is what fcs uses. You can check how much time
> in millis is spent on UnInvertedField cache regeneration in INFO logs like
> "UnInverted multi-valued field ,time=### ..."
>
> On Thu, Dec 5, 2013 at 12:15 AM, Patrick O'Lone <pol...@townnews.com> wrote:
>
>> [...]
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
> http://www.griddynamics.com
> mkhlud...@griddynamics.com

--
Patrick O'Lone
Director of Software Development
TownNews.com

E-mail ... pol...@townnews.com
Phone 309-743-0809
Fax .. 309-743-0830
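P.S. If forcing it does make sense, my plan would be to set it as a default on the slaves' search handler in solrconfig.xml rather than changing every client - something along these lines (the handler name is just whatever your search handler is called):

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="facet.method">fcs</str>
  </lst>
</requestHandler>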
Re: Solr 3.6.1 stalling with high CPU and blocking on field cache
I have a new question about this issue - I create filter queries of the form:

fq=start_time:[* TO NOW/5MINUTE]

This is used to restrict the set of documents to only items that have a start time within the next 5 minutes. Most of my indexes have millions of documents, with few documents that start at some time in the future. Nearly all of my queries include this. Would this cause every other search thread to block until the filter query is re-cached every 5 minutes, and if so, is there a better way to do it? Thanks for any continued help with this issue!
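To spell out why I suspect the rounding: assuming NOW/5MINUTE rounds down to a 5 minute boundary (which matches the every-5-minutes pattern I'm seeing), the concrete date the fq resolves to - and therefore the filter cache entry it maps to - only changes when the clock crosses one of those boundaries. An illustrative timeline:

  12:04:59  start_time:[* TO NOW/5MINUTE]  ->  start_time:[* TO 2013-12-09T12:00:00Z]  (entry already cached)
  12:05:01  start_time:[* TO NOW/5MINUTE]  ->  start_time:[* TO 2013-12-09T12:05:00Z]  (new entry - must be built from scratch)

So every 5 minutes, the first requests to arrive after the boundary all need a filter that doesn't exist in the cache yet.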
> We have a webapp running with a very high HEAP size (24GB) and we have
> no problems with it AFTER we enabled the new GC that is meant to
> eventually replace the CMS GC, but you have to have Java 6 update
> "some number I couldn't find, but the latest should cover it" to be able to use:
>
> 1. Remove all GC options you have and...
> 2. Replace them with "-XX:+UseG1GC -XX:MaxGCPauseMillis=50"
>
> As a test of course; more information you can read in the following (and
> interesting) article - we also have Solr running with these options, no
> more pauses or HEAP size hitting the sky.
>
> Don't get bored reading the 1st (and small) introduction page of the
> article, pages 2 and 3 will make a lot of sense:
> http://www.drdobbs.com/jvm/g1-javas-garbage-first-garbage-collector/219401061
>
> HTH,
>
> Guido.
>
> On 26/11/13 21:59, Patrick O'Lone wrote:
>> [...]
Re: Solr 3.6.1 stalling with high CPU and blocking on field cache
Unfortunately, in a test environment, this happens in version 4.4.0 of Solr as well.

> I was trying to locate the release notes for 3.6.x - it is too old. If I
> were you I would update to 3.6.2 (from 3.6.1); it shouldn't affect you
> since it is a minor release. Locate the release notes and see if
> something that is affecting you got fixed. Also, I would be thinking of
> moving on to 4.x, which is quite stable and fast.
>
> Like anything with Java and concurrency, it will just get better (and
> faster) with bigger numbers and concurrency frameworks becoming more and
> more reliable, standard and stable.
>
> Regards,
>
> Guido.
>
> On 09/12/13 15:07, Patrick O'Lone wrote:
>> [...]
Re: Solr 3.6.1 stalling with high CPU and blocking on field cache
Yeah, I tried G1, but it did not help - I don't think it is a garbage collection issue. I've made various changes to iCMS as well, and the issue ALWAYS happens, no matter what I do. If I'm taking heavy traffic (200 requests per second), then as soon as I hit a 5 minute mark the world stops - garbage collection would be less predictable than that. Nearly all of my requests have this 5 minute windowing behavior on time, which is why I have it as a strong suspect now. If it blocks on that - even for a couple of seconds - my traffic backlog will be 600-800 requests.

> Did you add the garbage collection JVM options I suggested?
>
> -XX:+UseG1GC -XX:MaxGCPauseMillis=50
>
> Guido.
>
> On 09/12/13 16:33, Patrick O'Lone wrote:
>> [...]
Re: Solr 3.6.1 stalling with high CPU and blocking on field cache
Well, I want to include everything that will start in the next 5 minute interval and everything that came before. The query is more like:

fq=start_time:[* TO NOW+5MINUTE/5MINUTE]

so that it rounds up to the next 5 minute boundary on the right-hand side. But as soon as we're 1 second past that 5 minute window, everything pauses waiting for the filter cache (at least that's my working theory based on observation). Is it possible to do something like:

fq=start_time:[* TO NOW+1DAY/DAY]&q=start_time:[* TO NOW/MINUTE]

where it would use the filter cache to narrow down by day resolution and then filter the rest as part of the standard query, or something like that? My thought is that this would still gain a benefit from a query cache, but be somewhat slower, since it must remove results for things appearing later in the day.
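Another variant I'm wondering about, if I understand the cache=false local param correctly (I believe it has been available for fq since around Solr 3.4, so treat this as a sketch), is to keep a coarse, cache-friendly filter and layer the fine-grained window on top as a non-cached filter:

  fq=start_time:[* TO NOW+1DAY/DAY]
  fq={!cache=false}start_time:[* TO NOW+5MINUTE/5MINUTE]

The first filter only changes (and has to be rebuilt) once a day, while the second is evaluated per request and never creates a filter-cache entry.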
> If you want a start time within the next 5 minutes, I think your filter
> is not the good one.
> * will be replaced by the first date in your field
>
> Try:
> fq=start_time:[NOW TO NOW+5MINUTE]
>
> Franck Brisbart
>
> On Monday, 09 December 2013 at 09:07 -0600, Patrick O'Lone wrote:
>> [...]
Re: Solr 3.6.1 stalling with high CPU and blocking on field cache
I initially thought this was the case as well. These are slave nodes that receive updates every 5-10 minutes. However, this issue happens even if replication is turned off and no update handler is provided at all.

I have confirmed against my data that simply issuing the fq for a start_time range takes 11-13 seconds to actually populate the cache. If I make the fq not cache at all, my QTime rises by about 100ms, but it does not have the stalling effect. A purely negative query also seems to have this effect, that is:

fq=-start_time:[NOW/MINUTE TO *]

but I'm not sure if that is because it actually caches the negative query or because it discards it entirely.
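Following up on the static warming suggestion below, here is the sort of thing I have in mind for solrconfig.xml - the fq just mirrors the one my queries use, and the sort is there to pre-load the field cache for whatever fields are sorted on (start_time here is just an example), so treat it as a sketch rather than a tested config:

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="fq">start_time:[* TO NOW+5MINUTE/5MINUTE]</str>
      <str name="sort">start_time desc</str>
    </lst>
  </arr>
</listener>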
> Patrick,
>
> Are you getting these stalls following a commit? If so then the issue is
> most likely fieldCache warming pauses. To stop your users from seeing
> this pause you'll need to add static warming queries to your
> solrconfig.xml to warm the fieldCache before it's registered.
>
> On Mon, Dec 9, 2013 at 12:33 PM, Patrick O'Lone <pol...@townnews.com> wrote:
>
>> [...]
LFU cache and autowarming
If I were to use the LFU cache instead of FastLRU for the filter cache, and I enabled auto-warming on that cache type, would it warm the most frequently used fq entries in the filter cache? Thanks for any info!

--
Patrick O'Lone
Director of Software Development
TownNews.com

E-mail ... pol...@townnews.com
Phone 309-743-0809
Fax .. 309-743-0830
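P.S. For concreteness, this is the kind of change I have in mind in solrconfig.xml - the sizes are illustrative, not a recommendation:

<filterCache class="solr.LFUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>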
Re: LFU cache and autowarming
Well, I haven't tested it - if it's not ready yet, I will probably avoid it for now.

> On 12/19/2013 1:46 PM, Patrick O'Lone wrote:
>> If I was to use the LFU cache instead of FastLRU on the filter cache, if
>> I enable auto-warming on that cache type - does it warm the most
>> frequently used fq on the filter cache? Thanks for any info!
>
> I wrote that cache. It's a really, really crappy implementation - I would
> only expect it to work well if the cache is very, very small.
>
> I do have a replacement implementation that's just about ready, but I've
> not been able to find 'round tuits to work on getting it polished and
> committed.
>
> https://issues.apache.org/jira/browse/SOLR-2906
> https://issues.apache.org/jira/browse/SOLR-3393
>
> Thanks,
> Shawn

--
Patrick O'Lone
Director of Software Development
TownNews.com

E-mail ... pol...@townnews.com
Phone 309-743-0809
Fax .. 309-743-0830