Re: Very long young generation stop the world GC pause

2016-12-09 Thread Ere Maijala
Then again, if the load characteristics on the Solr instance differ e.g. 
by time of day, G1GC, in my experience, may have trouble adapting. For 
instance if your query load reduces drastically during the night, it may 
take a while for G1GC to catch up in the morning. What I've found useful 
from experience (your mileage will probably vary) is to limit the young 
generation size when using a large heap. With -Xmx31G, something like 
this could work:


-XX:+UnlockExperimentalVMOptions \
-XX:G1MaxNewSizePercent=5 \

The aim here is to only limit the maximum and still allow some adaptation.
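For concreteness, a sketch of where these flags could live with the standard Solr startup scripts (bin/solr.in.sh). The GC_TUNE value below combines the flags above with the ParallelRefProcEnabled flag mentioned elsewhere in the thread; the heap size and flag set are illustrative assumptions, not recommendations:

```shell
# bin/solr.in.sh (assuming the standard Solr startup scripts).
# Heap size and exact flag set are illustrative, not recommendations.
SOLR_HEAP="31g"
GC_TUNE="-XX:+UseG1GC \
  -XX:+ParallelRefProcEnabled \
  -XX:+UnlockExperimentalVMOptions \
  -XX:G1MaxNewSizePercent=5"
```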

--Ere

On 8.12.2016 at 16.07, Pushkar Raste wrote:

Disable all the G1GC tuning you are doing except for ParallelRefProcEnabled.

G1GC is an adaptive algorithm and will keep tuning itself to reach the default
pause goal of 250ms, which should be good for most applications.

Can you also tell us how much RAM you have on your machine and if you have
swap enabled and being used?

On Dec 8, 2016 8:53 AM, "forest_soup"  wrote:


Besides, will those JVM options make it better?
-XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=10



--
View this message in context: http://lucene.472066.n3.nabble.com/Very-long-young-generation-stop-the-world-GC-pause-tp4308911p4308937.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: "on deck" searcher vs warming searcher

2016-12-09 Thread Shawn Heisey
On 12/8/2016 6:08 PM, Brent wrote:
> Is there a difference between an "on deck" searcher and a warming
> searcher? From what I've read, they sound like the same thing. 

The on-deck searcher is the one that's active and serving queries.  A
warming searcher is one that is still coming up.  As soon as the oldest
warming searcher is ready, it becomes the on-deck searcher and the old
one is thrown away.

Thanks,
Shawn



Substitution variable and Collection api

2016-12-09 Thread Sunil Varma
Hi
I am trying to create a collection via Collection API and set  a core
property to use system substitution  variable as shown below:
http://localhost:8090/solr/admin/collections?action=
CREATE&name=ds&numShards=1&replicationFactor=1&
maxShardsPerNode=1&collection.configName=ds&property.
dataDir=${solr.data.dir}\ds

This doesn't work as the index files are getting created at the root folder
(c:/ds/).

How do I force it to accept the value as a literal string so that it is set
as "dataDir=${solr.data.dir}/ds" ?

Note: If I explicitly modify the core.properties "dataDir" to
${solr.data.dir}\ds, it works as expected and the index files get created
at this location.

This is using Solr 6.3.

Thanks
Sunil
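One thing worth ruling out (a sketch, not a confirmed fix): make sure `${solr.data.dir}` isn't being expanded by your shell or HTTP client before the request ever reaches Solr. Percent-encoding the property value at least guarantees the literal string is what Solr receives; whether Solr then stores it unresolved is a separate question. Host, port, and names below mirror the example above.

```python
from urllib.parse import urlencode

# Percent-encode the property value so the shell or HTTP client cannot
# expand ${solr.data.dir} before the request reaches Solr. Whether Solr
# stores the value unresolved is a separate question.
params = {
    "action": "CREATE",
    "name": "ds",
    "numShards": 1,
    "replicationFactor": 1,
    "maxShardsPerNode": 1,
    "collection.configName": "ds",
    "property.dataDir": "${solr.data.dir}/ds",
}
url = "http://localhost:8090/solr/admin/collections?" + urlencode(params)
print(url)
```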


Re: Very long young generation stop the world GC pause

2016-12-09 Thread Pushkar Raste
My guess is that system time is high either due to lock contention (too many
parallel threads) or page faults.

Heap size was less than 6 GB when this long pause occurred, and the young
generation was less than 2 GB. Though lowering the heap size would help, I
don't think that is the root cause here.

On Dec 9, 2016 3:02 AM, "Ere Maijala"  wrote:

> Then again, if the load characteristics on the Solr instance differ e.g.
> by time of day, G1GC, in my experience, may have trouble adapting. For
> instance if your query load reduces drastically during the night, it may
> take a while for G1GC to catch up in the morning. What I've found useful
> from experience, and your mileage will probably vary, is to limit the young
> generation size with a large heap. With Xmx31G something like these could
> work:
>
> -XX:+UnlockExperimentalVMOptions \
> -XX:G1MaxNewSizePercent=5 \
>
> The aim here is to only limit the maximum and still allow some adaptation.
>
> --Ere
>
> On 8.12.2016 at 16.07, Pushkar Raste wrote:
>
>> Disable all the G1GC tuning you are doing except for
>> ParallelRefProcEnabled
>>
>> G1GC is an adaptive algorithm and would keep tuning to reach the default
>> pause goal of 250ms which should be good for most of the applications.
>>
>> Can you also tell us how much RAM you have on your machine and if you have
>> swap enabled and being used?
>>
>> On Dec 8, 2016 8:53 AM, "forest_soup"  wrote:
>>
>> Besides, will those JVM options make it better?
>>> -XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=10
>>>
>>>
>>>
>>> --
>>> View this message in context: http://lucene.472066.n3.
>>> nabble.com/Very-long-young-generation-stop-the-world-GC-
>>> pause-tp4308911p4308937.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
>>


Re: LukeRequestHandler Error getting file length for [segments_1l]

2016-12-09 Thread Furkan KAMACI
No OOM, no corrupted index. Just a clean install with a few documents. Similar
to this:
http://lucene.472066.n3.nabble.com/NoSuchFileException-errors-common-on-version-5-5-0-td4263072.html

On Wed, Nov 30, 2016 at 3:19 AM, Shawn Heisey  wrote:

> On 11/29/2016 8:40 AM, halis Yılboğa wrote:
> > It is not normal to get that many errors, actually. The main problem
> > should be with your index. It seems to me your index is corrupted.
> >
> > On Tue, 29 Nov 2016 at 14:40, Furkan KAMACI
> > wrote:
> >
> >> On the other hand, my Solr instance stops frequently due to such errors:
> >>
> >> 2016-11-29 12:25:36.962 WARN  (qtp1528637575-14) [   x:collection1]
> >> o.a.s.h.a.LukeRequestHandler Error getting file length for [segments_c]
> >> java.nio.file.NoSuchFileException: data/index/segments_c
>
> If your Solr instance is actually stopping, I would suspect the OOM
> script, assuming a non-windows system.  On non-windows systems, recent
> versions of Solr have a script that forcibly terminates Solr in the
> event of an OutOfMemoryError.  This script has its own log, which would
> be in the same place as solr.log.
>
> I've never heard of Solr actually crashing on a normally configured
> system, and I'm reasonably sure that the message you've indicated is not
> something that would cause a crash.  In fact, I've never seen it cause
> any real issues, just the warning message.
>
> Thanks,
> Shawn
>
>


Re: "on deck" searcher vs warming searcher

2016-12-09 Thread Joel Bernstein
An on-deck searcher is not yet the active searcher. The SolrCore increments
the on-deck searcher count prior to starting the warming process. Unless
it's the first searcher, a new searcher will be warmed and then registered.
Once registered the searcher becomes active.

So, to the initial question (is an on-deck searcher a warming searcher?):
the answer is basically yes.



Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Dec 9, 2016 at 9:04 AM, Shawn Heisey  wrote:

> On 12/8/2016 6:08 PM, Brent wrote:
> > Is there a difference between an "on deck" searcher and a warming
> > searcher? From what I've read, they sound like the same thing.
>
> The on-deck searcher is the one that's active and serving queries.  A
> warming searcher is one that is still coming up.  As soon as the oldest
> warming searcher is ready, it becomes the on-deck searcher and the old
> one is thrown away.
>
> Thanks,
> Shawn
>
>


Re: "on deck" searcher vs warming searcher

2016-12-09 Thread Brent
Hmmm, conflicting answers. Given the infamous "PERFORMANCE WARNING:
Overlapping onDeckSearchers" log message, it seems like the "they're the
same" answer is probably correct, because shouldn't there only be one active
searcher at a time?

Although it makes me curious, if there's a warning about having multiple
(overlapping) warming searchers, why is there a setting
(maxWarmingSearchers) that even lets you have more than one, or at least,
why ever set it to anything other than 1?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/on-deck-searcher-vs-warming-searcher-tp4309021p4309080.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: "on deck" searcher vs warming searcher

2016-12-09 Thread Erick Erickson
bq: because shouldn't there only be one active
searcher at a time?

Kind of. This is a total nit, but there can be multiple
searchers serving queries briefly (one hopes at least).
S1 is serving some query when S2 becomes
active and starts getting new queries. Until the last
query S1 is serving is complete, they both are active.

bq: why is there a setting
(maxWarmingSearchers) that even lets
you have more than one

The contract is that when you commit (assuming
you're opening a new searcher), then all docs
indexed up to that point are visible. Therefore you
_must_ open a new searcher even if one is currently
warming or that contract would be violated. Since
warming can take minutes, not opening a new
searcher if one was currently warming could cause
quite a gap.
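For context, the searcher-opening cadence discussed above is driven by the commit settings in solrconfig.xml. A hedged sketch (the intervals are illustrative only); keeping the soft-commit interval comfortably longer than your warm time is what avoids overlapping warming searchers:

```xml
<autoCommit>
  <maxTime>60000</maxTime>          <!-- hard commit every 60s -->
  <openSearcher>false</openSearcher> <!-- hard commits don't open searchers -->
</autoCommit>
<autoSoftCommit>
  <maxTime>10000</maxTime>          <!-- new searcher at most every 10s -->
</autoSoftCommit>
```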


Best,
Erick

On Fri, Dec 9, 2016 at 7:30 AM, Brent  wrote:
> Hmmm, conflicting answers. Given the infamous "PERFORMANCE WARNING:
> Overlapping onDeckSearchers" log message, it seems like the "they're the
> same" answer is probably correct, because shouldn't there only be one active
> searcher at a time?
>
> Although it makes me curious, if there's a warning about having multiple
> (overlapping) warming searchers, why is there a setting
> (maxWarmingSearchers) that even lets you have more than one, or at least,
> why ever set it to anything other than 1?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/on-deck-searcher-vs-warming-searcher-tp4309021p4309080.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Substitution variable and Collection api

2016-12-09 Thread Erick Erickson
I took a quick look at the code and really don't see any attempt to
resolve sysvars in the collection create code so I don't think this is
supported. Certainly sysvar substitution when reading core.properties
files is supported.

I don't know what gotchas there'd be in supporting this; you could
raise a JIRA to discuss it if you'd like (even better, provide a
patch).

Erick

On Fri, Dec 9, 2016 at 6:05 AM, Sunil Varma  wrote:
> Hi
> I am trying to create a collection via Collection API and set  a core
> property to use system substitution  variable as shown below:
> http://localhost:8090/solr/admin/collections?action=
> CREATE&name=ds&numShards=1&replicationFactor=1&
> maxShardsPerNode=1&collection.configName=ds&property.
> dataDir=${solr.data.dir}\ds
>
> This doesn't work as the index files are getting created at the root folder
> (c:/ds/).
>
> How do I force it to accept the value as a literal string so that it is set
> as "dataDir=${solr.data.dir}/ds" ?
>
> Note: If I explicitly modify the core.properties "dataDir" to
> ${solr.data.dir}\ds , it works as expected and the index files gets created
> at this location.
>
> This is using Solr 6.3.
>
> Thanks
> Sunil


Re: "on deck" searcher vs warming searcher

2016-12-09 Thread Jihwan Kim
why is there a setting (maxWarmingSearchers) that even lets you have more
than one:
Isn't it also for the case of (frequent) updates? For example, one update is
committed.  During the warm-up for this commit, another update is
made.  In this case the new commit also goes through another warming.  If the
value is 1, the second warming will fail.  A higher number of concurrent
warming searchers requires more memory.


On Fri, Dec 9, 2016 at 9:14 AM, Erick Erickson 
wrote:

> bq: because shouldn't there only be one active
> searcher at a time?
>
> Kind of. This is a total nit, but there can be multiple
> searchers serving queries briefly (one hopes at least).
> S1 is serving some query when S2 becomes
> active and starts getting new queries. Until the last
> query S1 is serving is complete, they both are active.
>
> bq: why is there a setting
> (maxWarmingSearchers) that even lets
> you have more than one
>
> The contract is that when you commit (assuming
> you're opening a new searcher), then all docs
> indexed up to that point are visible. Therefore you
> _must_ open a new searcher even if one is currently
> warming or that contract would be violated. Since
> warming can take minutes, not opening a new
> searcher if one was currently warming could cause
> quite a gap.
>
>
> Best,
> Erick
>
> On Fri, Dec 9, 2016 at 7:30 AM, Brent  wrote:
> > Hmmm, conflicting answers. Given the infamous "PERFORMANCE WARNING:
> > Overlapping onDeckSearchers" log message, it seems like the "they're the
> > same" answer is probably correct, because shouldn't there only be one
> active
> > searcher at a time?
> >
> > Although it makes me curious, if there's a warning about having multiple
> > (overlapping) warming searchers, why is there a setting
> > (maxWarmingSearchers) that even lets you have more than one, or at least,
> > why ever set it to anything other than 1?
> >
> >
> >
> > --
> > View this message in context: http://lucene.472066.n3.
> nabble.com/on-deck-searcher-vs-warming-searcher-tp4309021p4309080.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: "on deck" searcher vs warming searcher

2016-12-09 Thread Erick Erickson
Jihwan:

Correct. Do note that there are two distinct warnings here:
1> "Error opening new searcher. exceeded limit of maxWarmingSearchers"
2> "PERFORMANCE WARNING: Overlapping onDeckSearchers=..."

in <1>, the new searcher is _not_ opened.
in <2>, the new searcher _is_ opened.

In practice, getting either warning is an indication of
mis-configuration. Consider a very large filterCache with large
autowarm values. Every new searcher will then allocate space for the
filterCache, so <1> is there to prevent runaway situations that
lead to OOM errors.

<2> is just letting you know that you should look at your usage of
commit so you can avoid <1>.
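For reference, the cache and autowarm settings mentioned above live in solrconfig.xml. A sketch with illustrative numbers only (solr.FastLRUCache was a common choice for the filterCache in configs of this era); large size and autowarmCount values are what make each warming searcher expensive:

```xml
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="32"/>
```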

Best,
Erick

On Fri, Dec 9, 2016 at 8:44 AM, Jihwan Kim  wrote:
> why is there a setting (maxWarmingSearchers) that even lets you have more
> than one:
> Isn't it also for a case of (frequent) update? For example, one update is
> committed.  During the warming up  for this commit, another update is
> made.  In this case the new commit also go through another warming.  If the
> value is 1, the second warming will fail.  More number of concurrent
> warming-up requires larger memory usage.
>
>
> On Fri, Dec 9, 2016 at 9:14 AM, Erick Erickson 
> wrote:
>
>> bq: because shouldn't there only be one active
>> searcher at a time?
>>
>> Kind of. This is a total nit, but there can be multiple
>> searchers serving queries briefly (one hopes at least).
>> S1 is serving some query when S2 becomes
>> active and starts getting new queries. Until the last
>> query S1 is serving is complete, they both are active.
>>
>> bq: why is there a setting
>> (maxWarmingSearchers) that even lets
>> you have more than one
>>
>> The contract is that when you commit (assuming
>> you're opening a new searcher), then all docs
>> indexed up to that point are visible. Therefore you
>> _must_ open a new searcher even if one is currently
>> warming or that contract would be violated. Since
>> warming can take minutes, not opening a new
>> searcher if one was currently warming could cause
>> quite a gap.
>>
>>
>> Best,
>> Erick
>>
>> On Fri, Dec 9, 2016 at 7:30 AM, Brent  wrote:
>> > Hmmm, conflicting answers. Given the infamous "PERFORMANCE WARNING:
>> > Overlapping onDeckSearchers" log message, it seems like the "they're the
>> > same" answer is probably correct, because shouldn't there only be one
>> active
>> > searcher at a time?
>> >
>> > Although it makes me curious, if there's a warning about having multiple
>> > (overlapping) warming searchers, why is there a setting
>> > (maxWarmingSearchers) that even lets you have more than one, or at least,
>> > why ever set it to anything other than 1?
>> >
>> >
>> >
>> > --
>> > View this message in context: http://lucene.472066.n3.
>> nabble.com/on-deck-searcher-vs-warming-searcher-tp4309021p4309080.html
>> > Sent from the Solr - User mailing list archive at Nabble.com.
>>


Unicode Character Problem

2016-12-09 Thread Furkan KAMACI
Hi,

I'm trying to index Turkish characters. These are what I see at my index (I
see both of them at different places of my content):

aç �klama
açıklama

These are the same word, but they were indexed differently (note the stray
character in the first one). There is no such character when I check the
original PDF file.

What do you think? Is this related to Solr or to Tika?

PS: I use text_general as the analyzer for the content field.

Kind Regards,
Furkan KAMACI
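A hedged illustration of one way such a replacement character can arise (an assumption about the cause, not a diagnosis): U+FFFD usually means a multi-byte UTF-8 sequence was split or decoded with the wrong charset somewhere between PDF extraction and indexing.

```python
# The dotless 'ı' in "açıklama" is a two-byte UTF-8 sequence (0xC4 0xB1).
# If the pipeline splits the byte stream mid-character and decodes each
# half independently, the decoder emits replacement characters:
original = "açıklama"
data = original.encode("utf-8")  # b'a\xc3\xa7\xc4\xb1klama'

broken = (data[:4].decode("utf-8", errors="replace")
          + data[4:].decode("utf-8", errors="replace"))
print(broken)  # aç��klama
```

Checking whether the raw Tika output (e.g. via extractOnly=true) already contains U+FFFD would tell you whether the damage happens before Solr's analysis chain.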


CREATEALIAS to non-existing collections

2016-12-09 Thread Tomás Fernández Löbbe
We currently support requests to CREATEALIAS to collections that don’t
exist. Requests to this alias later result in 404s. If the target
collection is later created, requests to the alias will begin to work. I’m
wondering if someone is relying on this behavior, or if we should validate
the existence of the target collections when creating the alias (and thus,
fail fast in cases of typos or unexpected cluster state).

Tomás


Re: CREATEALIAS to non-existing collections

2016-12-09 Thread Anshum Gupta
I think that might have just been an oversight. We shouldn't allow creation
of an alias for non-existent collections.

On a similar note, I think we should also be clearing out the aliases when
we DELETE a collection.

-Anshum

On Fri, Dec 9, 2016 at 12:57 PM Tomás Fernández Löbbe 
wrote:

> We currently support requests to CREATEALIAS to collections that don’t
> exist. Requests to this alias later result in 404s. If the target
> collection is later created, requests to the alias will begin to work. I’m
> wondering if someone is relying on this behavior, or if we should validate
> the existence of the target collections when creating the alias (and thus,
> fail fast in cases of typos or unexpected cluster state)
>
> Tomás
>


Re: [ANN] InvisibleQueriesRequestHandler

2016-12-09 Thread Otis Gospodnetić
Nice.

Here is something similar: https://github.com/sematext/solr-researcher -
hope others find it useful, too.

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Mon, Dec 5, 2016 at 4:18 AM, Andrea Gazzarini  wrote:

> Hi guys,
> I developed this handler [1] while doing some work on a Magento ->  Solr
> project.
>
> If someone is interested (this is a post [2] where I briefly explain the
> goal), or wants to contribute with some idea / improvement, feel free to
> give me a shout or a feedback.
>
> Best,
> Andrea
>
> [1] https://github.com/agazzarini/invisible-queries-request-handler
> [2]
> https://andreagazzarini.blogspot.it/2016/12/composing-
> and-reusing-request-handlers.html
>


Re: CREATEALIAS to non-existing collections

2016-12-09 Thread King Rhoton

> On Dec 9, 2016, at 1:12 PM, Anshum Gupta  wrote:
> 
> On a similar note, I think we should also be clearing out the aliases when
> we DELETE a collection.

This seems problematic since an alias can point to several collections. Maybe
the point is that the deleted collection should be removed from every alias
where it exists.

-
King Rhoton, c/o Adobe, 601 Townsend, SF, CA 94103
415-832-4480 x24480
S&P support requests should go to search-...@adobe.com



Re: "on deck" searcher vs warming searcher

2016-12-09 Thread Trey Grainger
Shawn and Joel both answered the question with seemingly opposite answers,
but Joel's should be right. "On deck", as an idiom, means "getting ready to
go next". I think it has its history in military / naval terminology (a
plane being "on deck" of an aircraft carrier was the next one to take off),
and it was later used heavily in baseball (the "on deck" batter was the one
warming up to go next) and probably elsewhere.

I've always understood the "on deck" searcher(s) to be the same as the
warming searcher(s). So you have the "active" searcher and then the warming
or on-deck searchers.

-Trey


On Fri, Dec 9, 2016 at 11:54 AM, Erick Erickson 
wrote:

> Jihwan:
>
> Correct. Do note that there are two distinct warnings here:
> 1> "Error opening new searcher. exceeded limit of maxWarmingSearchers"
> 2> "PERFORMANCE WARNING: Overlapping onDeckSearchers=..."
>
> in <1>, the new searcher is _not_ opened.
> in <2>, the new searcher _is_ opened.
>
> In practice, getting either warning is an indication of
> mis-configuration. Consider a very large filterCache with large
> autowarm values. Every new searcher will then allocate space for the
> filterCache so having <1> is there to prevent runaway situations that
> lead to OOM errors.
>
> <2> is just letting you know that you should look at your usage of
> commit so you can avoid <1>.
>
> Best,
> Erick
>
> On Fri, Dec 9, 2016 at 8:44 AM, Jihwan Kim  wrote:
> > why is there a setting (maxWarmingSearchers) that even lets you have more
> > than one:
> > Isn't it also for a case of (frequent) update? For example, one update is
> > committed.  During the warming up  for this commit, another update is
> > made.  In this case the new commit also go through another warming.  If
> the
> > value is 1, the second warming will fail.  More number of concurrent
> > warming-up requires larger memory usage.
> >
> >
> > On Fri, Dec 9, 2016 at 9:14 AM, Erick Erickson 
> > wrote:
> >
> >> bq: because shouldn't there only be one active
> >> searcher at a time?
> >>
> >> Kind of. This is a total nit, but there can be multiple
> >> searchers serving queries briefly (one hopes at least).
> >> S1 is serving some query when S2 becomes
> >> active and starts getting new queries. Until the last
> >> query S1 is serving is complete, they both are active.
> >>
> >> bq: why is there a setting
> >> (maxWarmingSearchers) that even lets
> >> you have more than one
> >>
> >> The contract is that when you commit (assuming
> >> you're opening a new searcher), then all docs
> >> indexed up to that point are visible. Therefore you
> >> _must_ open a new searcher even if one is currently
> >> warming or that contract would be violated. Since
> >> warming can take minutes, not opening a new
> >> searcher if one was currently warming could cause
> >> quite a gap.
> >>
> >>
> >> Best,
> >> Erick
> >>
> >> On Fri, Dec 9, 2016 at 7:30 AM, Brent  wrote:
> >> > Hmmm, conflicting answers. Given the infamous "PERFORMANCE WARNING:
> >> > Overlapping onDeckSearchers" log message, it seems like the "they're
> the
> >> > same" answer is probably correct, because shouldn't there only be one
> >> active
> >> > searcher at a time?
> >> >
> >> > Although it makes me curious, if there's a warning about having
> multiple
> >> > (overlapping) warming searchers, why is there a setting
> >> > (maxWarmingSearchers) that even lets you have more than one, or at
> least,
> >> > why ever set it to anything other than 1?
> >> >
> >> >
> >> >
> >> > --
> >> > View this message in context: http://lucene.472066.n3.
> >> nabble.com/on-deck-searcher-vs-warming-searcher-tp4309021p4309080.html
> >> > Sent from the Solr - User mailing list archive at Nabble.com.
> >>
>


Re: CREATEALIAS to non-existing collections

2016-12-09 Thread Jeff Wartes

I’d prefer it if the alias was required to be removed, or pointed elsewhere, 
before the collection could be deleted.

As a best practice, I encourage all SolrCloud users to configure an alias to 
each collection, and use only the alias in their clients. This allows atomic 
switching between collections with no client changes necessary. It’s pretty 
handy in the common case that you’re rebuilding a collection because you’ve 
changed how things are indexed, but you don’t want to take downtime. Variations 
include an alias per class of client, or even multiple aliases where the client 
chooses one at random per request, for moving fractional traffic between 
collections.
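The alias-switch pattern described above can be sketched as follows. All collection/alias names and the host/port are hypothetical:

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/admin/collections"

def createalias_url(alias, target):
    # Calling CREATEALIAS with an existing alias name repoints it, which
    # is what makes the switch atomic from the clients' point of view.
    return BASE + "?" + urlencode(
        {"action": "CREATEALIAS", "name": alias, "collections": target})

# Clients always query the alias "search"; a rebuild writes to a new
# collection, then the alias is repointed with no client changes:
before = createalias_url("search", "products_v1")
after = createalias_url("search", "products_v2")
print(after)
```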

In this scenario, the alias would also act as a lock, preventing removal of the 
“live” collection by testifying that some client is using it.

I considered this a first-class safety feature in the cluster management tool I 
wrote: https://github.com/whitepages/solrcloud_manager


On 12/9/16, 1:12 PM, "Anshum Gupta"  wrote:

I think that might have just been an oversight. We shouldn't allow creation
of an alias for non-existent collections.

On a similar note, I think we should also be clearing out the aliases when
we DELETE a collection.

-Anshum

On Fri, Dec 9, 2016 at 12:57 PM Tomás Fernández Löbbe 

wrote:

> We currently support requests to CREATEALIAS to collections that don’t
> exist. Requests to this alias later result in 404s. If the target
> collection is later created, requests to the alias will begin to work. I’m
> wondering if someone is relying on this behavior, or if we should validate
> the existence of the target collections when creating the alias (and thus,
> fail fast in cases of typos or unexpected cluster state)
>
> Tomás
>




Adding DocExpirationUpdateProcessorFactory causes "Overlapping onDeckSearchers" warnings

2016-12-09 Thread Brent
I'm using Solr Cloud 6.1.0, and my client application is using SolrJ 6.1.0.

Using this Solr config, I get none of the dreaded "PERFORMANCE WARNING:
Overlapping onDeckSearchers=2" log messages:
https://dl.dropboxusercontent.com/u/49733981/solrconfig-no_warnings.xml

However, I start getting them frequently after I add an expiration update
processor to the update request processor chain, as seen in this config (at
the bottom):
https://dl.dropboxusercontent.com/u/49733981/solrconfig-warnings.xml

Do I have something configured wrong in the way I've tried to add the
function of expiring documents? My client application sets the "expire_at"
field with the date to remove the document being added, so I don't need
anything on the Solr Cloud side to calculate the expiration date using a
TTL. I've confirmed that the documents are getting removed as expected after
the TTL duration.

Is it possible that the expiration processor is triggering additional
commits? Seems like the warning is usually the result of commits happening
too frequently. If the commit spacing is fine without the expiration
processor, but not okay when I add it, it seems like maybe each update is
now triggering a (soft?) commit. Although, that'd actually be crazy and I'm
sure I'd see a lot more errors if that were the case... is it triggering a
commit every 30 seconds, because that's what I have the
autoDeletePeriodSeconds set to? Maybe if I try to offset that a bit from the
10 second auto soft commit I'm using? Seems like it'd be better (if that is
the case) if the processor simply didn't have to do a commit when it expires
documents, and instead let the auto commit settings handle that.

Do I still need the line:
<requestHandler name="/update">
when I have the
<updateRequestProcessorChain default="true">
element?
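For reference, a sketch of what such a chain can look like in solrconfig.xml, adapted from the DocExpirationUpdateProcessorFactory javadocs. The field name and period mirror the setup described above, but treat the exact values as assumptions, not your actual config:

```xml
<updateRequestProcessorChain default="true">
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <!-- how often the background delete (and its commit) fires -->
    <int name="autoDeletePeriodSeconds">30</int>
    <!-- the field the client populates with the expiration date -->
    <str name="expirationFieldName">expire_at</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```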



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Adding-DocExpirationUpdateProcessorFactory-causes-Overlapping-onDeckSearchers-warnings-tp4309155.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Adding DocExpirationUpdateProcessorFactory causes "Overlapping onDeckSearchers" warnings

2016-12-09 Thread Erick Erickson
bq: ...is it triggering a commit every 30 seconds, because that's what
I have the autoDeletePeriodSeconds set to


Yep. There's this line from Chris' writeup:

After the deleteByQuery has been executed, a soft commit is also
executed using openSearcher=true so that search results will no longer
see the expired documents.


Assuming your index is changing, you'll indeed open one searcher as a
result of your autocommit and a second as a result of TTL processing.
And they'll overlap sometimes.

There's a note in the code about making the commits optional; it seems
fair to raise a JIRA about implementing this. Patches are even more
welcome ;).

Meanwhile, this performance _warning_ is benign. That is, the new
searcher is indeed opened. If you see something like "Error opening
new searcher. exceeded limit of maxWarmingSearchers" then the
newest searcher will not be opened (although the next searcher opened
will pick up any changes).

Note that there's no particular point in bumping the max warming
searchers in solrconfig.xml since the warning message is dumped
whenever there are > 1 warming searchers. If you get the "error
opening" message it's a more open question though.

So your choices are:
1> just ignore it
2> contribute a patch
3> increase the interval. That'll reduce the number of times you see
this warning, but won't eliminate them all.

Best,
Erick


On Fri, Dec 9, 2016 at 3:15 PM, Brent  wrote:
> I'm using Solr Cloud 6.1.0, and my client application is using SolrJ 6.1.0.
>
> Using this Solr config, I get none of the dreaded "PERFORMANCE WARNING:
> Overlapping onDeckSearchers=2" log messages:
> https://dl.dropboxusercontent.com/u/49733981/solrconfig-no_warnings.xml
>
> However, I start getting them frequently after I add an expiration update
> processor to the update request processor chain, as seen in this config (at
> the bottom):
> https://dl.dropboxusercontent.com/u/49733981/solrconfig-warnings.xml
>
> Do I have something configured wrong in the way I've tried to add the
> function of expiring documents? My client application sets the "expire_at"
> field with the date to remove the document being added, so I don't need
> anything on the Solr Cloud side to calculate the expiration date using a
> TTL. I've confirmed that the documents are getting removed as expected after
> the TTL duration.
>
> Is it possible that the expiration processor is triggering additional
> commits? Seems like the warning is usually the result of commits happening
> too frequently. If the commit spacing is fine without the expiration
> processor, but not okay when I add it, it seems like maybe each update is
> now triggering a (soft?) commit. Although, that'd actually be crazy and I'm
> sure I'd see a lot more errors if that were the case... is it triggering a
> commit every 30 seconds, because that's what I have the
> autoDeletePeriodSeconds set to? Maybe if I try to offset that a bit from the
> 10 second auto soft commit I'm using? Seems like it'd be better (if that is
> the case) if the processor simple didn't have to do a commit when it expires
> documents, and instead let the auto commit settings handle that.
>
> Do I still need the line:
> <requestHandler name="/update">
> when I have the
> <updateRequestProcessorChain default="true">
> element?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Adding-DocExpirationUpdateProcessorFactory-causes-Overlapping-onDeckSearchers-warnings-tp4309155.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Adding DocExpirationUpdateProcessorFactory causes "Overlapping onDeckSearchers" warnings

2016-12-09 Thread Kevin Risden
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/update/processor/DocExpirationUpdateProcessorFactory.java#L407

Based on that it looks like this would definitely trigger additional
commits. Specifically with openSearcher being true.

Not sure the best way around this.

Kevin Risden

On Fri, Dec 9, 2016 at 5:15 PM, Brent  wrote:

> I'm using Solr Cloud 6.1.0, and my client application is using SolrJ 6.1.0.
>
> Using this Solr config, I get none of the dreaded "PERFORMANCE WARNING:
> Overlapping onDeckSearchers=2" log messages:
> https://dl.dropboxusercontent.com/u/49733981/solrconfig-no_warnings.xml
>
> However, I start getting them frequently after I add an expiration update
> processor to the update request processor chain, as seen in this config (at
> the bottom):
> https://dl.dropboxusercontent.com/u/49733981/solrconfig-warnings.xml
>
> Do I have something configured wrong in the way I've tried to add the
> function of expiring documents? My client application sets the "expire_at"
> field with the date to remove the document being added, so I don't need
> anything on the Solr Cloud side to calculate the expiration date using a
> TTL. I've confirmed that the documents are getting removed as expected
> after
> the TTL duration.
>
> Is it possible that the expiration processor is triggering additional
> commits? Seems like the warning is usually the result of commits happening
> too frequently. If the commit spacing is fine without the expiration
> processor, but not okay when I add it, it seems like maybe each update is
> now triggering a (soft?) commit. Although, that'd actually be crazy and I'm
> sure I'd see a lot more errors if that were the case... is it triggering a
> commit every 30 seconds, because that's what I have the
> autoDeletePeriodSeconds set to? Maybe if I try to offset that a bit from
> the
> 10 second auto soft commit I'm using? Seems like it'd be better (if that is
> the case) if the processor simply didn't have to do a commit when it
> expires
> documents, and instead let the auto commit settings handle that.
>
> Do I still need the line:
> <requestHandler name="/update">
> when I have the
> <updateRequestProcessorChain default="true">
> element?
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Adding-
> DocExpirationUpdateProcessorFactory-causes-Overlapping-
> onDeckSearchers-warnings-tp4309155.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
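
For reference: since the setup above relies on the client writing the "expire_at" value itself (no server-side TTL computation), that value is just an ISO-8601 instant in UTC, which is what Solr date fields expect. A minimal, hedged sketch — the field name comes from the thread, while the class and helper are purely illustrative:

```java
import java.time.Duration;
import java.time.Instant;
import java.time.format.DateTimeFormatter;

public class ExpireAtExample {

    // Value a client would put into the "expire_at" date field:
    // Solr date fields expect an ISO-8601 instant in UTC, e.g. "2016-12-09T12:05:00Z".
    static String expireAt(Instant indexedAt, Duration ttl) {
        return DateTimeFormatter.ISO_INSTANT.format(indexedAt.plus(ttl));
    }

    public static void main(String[] args) {
        // Expire this document five minutes after it was indexed.
        String value = expireAt(Instant.parse("2016-12-09T12:00:00Z"),
                                Duration.ofMinutes(5));
        System.out.println(value); // 2016-12-09T12:05:00Z
    }
}
```

With SolrJ this string would simply be set on the SolrInputDocument under the "expire_at" field before the add.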


Re: Adding DocExpirationUpdateProcessorFactory causes "Overlapping onDeckSearchers" warnings

2016-12-09 Thread Brent
Okay, I created a JIRA ticket
(https://issues.apache.org/jira/servicedesk/agent/SOLR/issue/SOLR-9841) and
will work on a patch.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Adding-DocExpirationUpdateProcessorFactory-causes-Overlapping-onDeckSearchers-warnings-tp4309155p4309173.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Adding DocExpirationUpdateProcessorFactory causes "Overlapping onDeckSearchers" warnings

2016-12-09 Thread Chris Hostetter

: bq: ...is it triggering a commit every 30 seconds, because that's what
: I have the autoDeletePeriodSeconds set to

Yes, a commit is triggered each time a delete is fired.

: There's a note in the code about making the commits optional, it seems
: fair to raise a JIRA about implementing this. Patches even more
: welcome ;).

No, actually the note in the code is about whether or not there should be 
an option to force a *HARD* commit.

The fact that there is no option to prevent any commit at all was a conscious 
choice:

a) the processor is basically useless unless something does a commit -- 
there's no point in doing deletes every 30 seconds if we only want to 
bother having a new searcher every 60 seconds -- it just means we're doing 
twice the work w/o any added benefit.

b) a softCommit+openSearcher is a no-op unless there is something to 
actually commit. (see SOLR-5783 and TestIndexSearcher.testReopen)


If you are seeing an increase in "Overlapping onDeckSearchers" when using 
DocExpirationUpdateProcessorFactory, it's because you actually have docs 
expiring quite frequently relative to the autoDeletePeriodSeconds and 
the amount of time needed to warm each of the new searchers.

If you don't want the searchers to be re-opened so frequently, just 
increase the autoDeletePeriodSeconds. 


-Hoss
http://www.lucidworks.com/
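
To make Hoss's suggestion concrete: both intervals live in solrconfig.xml, and backing off the delete period relative to the soft-commit interval is a one-line change. A hedged sketch — element and parameter names follow the stock Solr 6.x examples, the values are illustrative only:

```xml
<!-- inside <updateHandler>: soft commits open a new searcher every 10s anyway -->
<autoSoftCommit>
  <maxTime>10000</maxTime>
</autoSoftCommit>

<updateRequestProcessorChain default="true">
  <!-- delete expired docs (and trigger the accompanying soft commit)
       every 60s instead of every 30s -->
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <str name="expirationFieldName">expire_at</str>
    <int name="autoDeletePeriodSeconds">60</int>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```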


Re: Adding DocExpirationUpdateProcessorFactory causes "Overlapping onDeckSearchers" warnings

2016-12-09 Thread Brent
Chris Hostetter-3 wrote
> If you are seeing an increase in "Overlapping onDeckSearchers" when using 
> DocExpirationUpdateProcessorFactory, it's because you actually have docs 
> expiring quite frequently relative to the autoDeletePeriodSeconds and 
> the amount of time needed to warm each of the new searchers.
> 
> If you don't want the searchers to be re-opened so frequently, just 
> increase the autoDeletePeriodSeconds. 

But if I increase the period, then it'll have even more docs that have
expired, and shouldn't that make the amount of time needed to warm the new
searcher even longer?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Adding-DocExpirationUpdateProcessorFactory-causes-Overlapping-onDeckSearchers-warnings-tp4309155p4309175.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Adding DocExpirationUpdateProcessorFactory causes "Overlapping onDeckSearchers" warnings

2016-12-09 Thread Chris Hostetter

: > If you are seeing an increase in "Overlapping onDeckSearchers" when using 
: > DocExpirationUpdateProcessorFactory, it's because you actually have docs 
: > expiring quite frequently relative to the autoDeletePeriodSeconds and 
: > the amount of time needed to warm each of the new searchers.
: > 
: > If you don't want the searchers to be re-opened so frequently, just 
: > increase the autoDeletePeriodSeconds. 
: 
: But if I increase the period, then it'll have even more docs that have
: expired, and shouldn't that make the amount of time needed to warm the new
: searcher even longer?

Not to the point of being significant in any practical sense ...

In crude generalizations: the largest overhead in auto-warming is the 
number of queries (ie: the size of the cache), and the main overhead on a 
per query basis is the number of docs that match that query.

So unless you're expiring (and replacing!) the majority of documents 
in your index every X seconds, but you only care about opening a new 
searcher every X*2 seconds, you shouldn't notice any observable difference 
in the time needed to do the warming if you only delete every X*2 seconds.



-Hoss
http://www.lucidworks.com/
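
Hoss's crude generalization can be put into numbers. The sketch below is a back-of-envelope model only — the cost constant and match counts are invented for illustration, not measured from Solr:

```java
public class WarmingModel {
    // Crude model from the thread: warming replays each cached query,
    // and each replay's cost scales with the docs it matches.
    static double warmingTimeSeconds(int cachedQueries, long avgDocsPerQuery) {
        double costPerDocSeconds = 1e-7; // invented constant
        return cachedQueries * avgDocsPerQuery * costPerDocSeconds;
    }

    public static void main(String[] args) {
        // 512 autowarmed queries, each matching ~100k docs.
        double base = warmingTimeSeconds(512, 100_000);
        // Doubling how many docs expire per cycle barely moves the average
        // match count, so warming time is nearly unchanged.
        double after = warmingTimeSeconds(512, 101_000);
        System.out.printf("%.2fs vs %.2fs%n", base, after);
    }
}
```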


Re: "on deck" searcher vs warming searcher

2016-12-09 Thread Joel Bernstein
The question about allowing more than one on-deck searcher is a good one.
The current behavior with maxWarmingSearcher config is to throw an
exception if searchers are being opened too frequently. There is probably a
good reason why it was done this way but I'm not sure the history behind it.

Currently I'm adding code to Alfresco's version of Solr that guards against
having more than one on-deck searcher. This allows users to set the commit
intervals low without having to worry about getting overlapping searchers.
Something like this might be useful in the standard Solr as well, if people
don't like exceptions being thrown when searchers are opened too frequently.


Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Dec 9, 2016 at 5:42 PM, Trey Grainger  wrote:

> Shawn and Joel both answered the question with seemingly opposite answers,
> but Joel's should be right. On Deck, as an idiom, means "getting ready to
> go next". I think it has its history in military / naval terminology (a
> plane being "on deck" of an aircraft carrier was the next one to take off),
> and was later used heavily in baseball (the "on deck" batter was the one
> warming up to go next) and probably elsewhere.
>
> I've always understood the "on deck" searcher(s) being the same as the
> warming searcher(s). So you have the "active" searcher and then the warming
> or on deck searchers.
>
> -Trey
>
>
> On Fri, Dec 9, 2016 at 11:54 AM, Erick Erickson 
> wrote:
>
> > Jihwan:
> >
> > Correct. Do note that there are two distinct warnings here:
> > 1> "Error opening new searcher. exceeded limit of
> maxWarmingSearchers"
> > 2> "PERFORMANCE WARNING: Overlapping onDeckSearchers=..."
> >
> > in <1>, the new searcher is _not_ opened.
> > in <2>, the new searcher _is_ opened.
> >
> > In practice, getting either warning is an indication of
> > mis-configuration. Consider a very large filterCache with large
> > autowarm values. Every new searcher will then allocate space for the
> > filterCache so having <1> is there to prevent runaway situations that
> > lead to OOM errors.
> >
> > <2> is just letting you know that you should look at your usage of
> > commit so you can avoid <1>.
> >
> > Best,
> > Erick
> >
> > On Fri, Dec 9, 2016 at 8:44 AM, Jihwan Kim  wrote:
> > > why is there a setting (maxWarmingSearchers) that even lets you have
> more
> > > than one:
> > > Isn't it also for a case of (frequent) update? For example, one update
> is
> > > committed.  During the warming up for this commit, another update is
> > > made.  In this case the new commit also goes through another warming.  If
> > the
> > > value is 1, the second warming will fail.  A larger number of concurrent
> > > warm-ups requires more memory.
> > >
> > >
> > > On Fri, Dec 9, 2016 at 9:14 AM, Erick Erickson <
> erickerick...@gmail.com>
> > > wrote:
> > >
> > >> bq: because shouldn't there only be one active
> > >> searcher at a time?
> > >>
> > >> Kind of. This is a total nit, but there can be multiple
> > >> searchers serving queries briefly (one hopes at least).
> > >> S1 is serving some query when S2 becomes
> > >> active and starts getting new queries. Until the last
> > >> query S1 is serving is complete, they both are active.
> > >>
> > >> bq: why is there a setting
> > >> (maxWarmingSearchers) that even lets
> > >> you have more than one
> > >>
> > >> The contract is that when you commit (assuming
> > >> you're opening a new searcher), then all docs
> > >> indexed up to that point are visible. Therefore you
> > >> _must_ open a new searcher even if one is currently
> > >> warming or that contract would be violated. Since
> > >> warming can take minutes, not opening a new
> > >> searcher if one was currently warming could cause
> > >> quite a gap.
> > >>
> > >>
> > >> Best,
> > >> Erick
> > >>
> > >> On Fri, Dec 9, 2016 at 7:30 AM, Brent 
> wrote:
> > >> > Hmmm, conflicting answers. Given the infamous "PERFORMANCE WARNING:
> > >> > Overlapping onDeckSearchers" log message, it seems like the "they're
> > the
> > >> > same" answer is probably correct, because shouldn't there only be
> one
> > >> active
> > >> > searcher at a time?
> > >> >
> > >> > Although it makes me curious, if there's a warning about having
> > multiple
> > >> > (overlapping) warming searchers, why is there a setting
> > >> > (maxWarmingSearchers) that even lets you have more than one, or at
> > least,
> > >> > why ever set it to anything other than 1?
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > View this message in context: http://lucene.472066.n3.
> > >> nabble.com/on-deck-searcher-vs-warming-searcher-
> tp4309021p4309080.html
> > >> > Sent from the Solr - User mailing list archive at Nabble.com.
> > >>
> >
>
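
Both of the knobs Erick describes (quoted above) live in solrconfig.xml: the cache autowarm count, which largely determines how long each new searcher takes to warm, and the warming-searcher limit, which turns runaway overlap into a hard error rather than an OOM. A hedged sketch — stock element names, illustrative values:

```xml
<!-- inside <query>: every new searcher replays up to autowarmCount cached
     filter queries from the old searcher, so a large value here directly
     lengthens warming and makes overlapping searchers more likely -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="32"/>

<!-- refuse to open yet another searcher once this many are already warming,
     rather than letting memory use grow without bound -->
<maxWarmingSearchers>2</maxWarmingSearchers>
```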


Re: "on deck" searcher vs warming searcher

2016-12-09 Thread Yonik Seeley
We've got a patch to prevent the exceptions:
https://issues.apache.org/jira/browse/SOLR-9712

-Yonik


On Fri, Dec 9, 2016 at 7:45 PM, Joel Bernstein  wrote:
> The question about allowing more than one on-deck searcher is a good one.
> The current behavior with maxWarmingSearcher config is to throw an
> exception if searchers are being opened too frequently. There is probably a
> good reason why it was done this way but I'm not sure the history behind it.
>
> Currently I'm adding code to Alfresco's version of Solr that guards against
> having more than one on-deck searcher. This allows users to set the commit
> intervals low without having to worry about getting overlapping searchers.
> Something like this might be useful in the standard Solr as well, if people
> don't like exceptions being thrown when searchers are opened too frequently.
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Fri, Dec 9, 2016 at 5:42 PM, Trey Grainger  wrote:
>
>> Shawn and Joel both answered the question with seemingly opposite answers,
>> but Joel's should be right. On Deck, as an idiom, means "getting ready to
>> go next". I think it has its history in military / naval terminology (a
>> plane being "on deck" of an aircraft carrier was the next one to take off),
>> and was later used heavily in baseball (the "on deck" batter was the one
>> warming up to go next) and probably elsewhere.
>>
>> I've always understood the "on deck" searcher(s) being the same as the
>> warming searcher(s). So you have the "active" searcher and then the warming
>> or on deck searchers.
>>
>> -Trey
>>
>>
>> On Fri, Dec 9, 2016 at 11:54 AM, Erick Erickson 
>> wrote:
>>
>> > Jihwan:
>> >
>> > Correct. Do note that there are two distinct warnings here:
>> > 1> "Error opening new searcher. exceeded limit of
>> maxWarmingSearchers"
>> > 2> "PERFORMANCE WARNING: Overlapping onDeckSearchers=..."
>> >
>> > in <1>, the new searcher is _not_ opened.
>> > in <2>, the new searcher _is_ opened.
>> >
>> > In practice, getting either warning is an indication of
>> > mis-configuration. Consider a very large filterCache with large
>> > autowarm values. Every new searcher will then allocate space for the
>> > filterCache so having <1> is there to prevent runaway situations that
>> > lead to OOM errors.
>> >
>> > <2> is just letting you know that you should look at your usage of
>> > commit so you can avoid <1>.
>> >
>> > Best,
>> > Erick
>> >
>> > On Fri, Dec 9, 2016 at 8:44 AM, Jihwan Kim  wrote:
>> > > why is there a setting (maxWarmingSearchers) that even lets you have
>> more
>> > > than one:
>> > > Isn't it also for a case of (frequent) update? For example, one update
>> is
>> > > committed.  During the warming up for this commit, another update is
>> > > made.  In this case the new commit also goes through another warming.  If
>> > the
>> > > value is 1, the second warming will fail.  A larger number of concurrent
>> > > warm-ups requires more memory.
>> > >
>> > >
>> > > On Fri, Dec 9, 2016 at 9:14 AM, Erick Erickson <
>> erickerick...@gmail.com>
>> > > wrote:
>> > >
>> > >> bq: because shouldn't there only be one active
>> > >> searcher at a time?
>> > >>
>> > >> Kind of. This is a total nit, but there can be multiple
>> > >> searchers serving queries briefly (one hopes at least).
>> > >> S1 is serving some query when S2 becomes
>> > >> active and starts getting new queries. Until the last
>> > >> query S1 is serving is complete, they both are active.
>> > >>
>> > >> bq: why is there a setting
>> > >> (maxWarmingSearchers) that even lets
>> > >> you have more than one
>> > >>
>> > >> The contract is that when you commit (assuming
>> > >> you're opening a new searcher), then all docs
>> > >> indexed up to that point are visible. Therefore you
>> > >> _must_ open a new searcher even if one is currently
>> > >> warming or that contract would be violated. Since
>> > >> warming can take minutes, not opening a new
>> > >> searcher if one was currently warming could cause
>> > >> quite a gap.
>> > >>
>> > >>
>> > >> Best,
>> > >> Erick
>> > >>
>> > >> On Fri, Dec 9, 2016 at 7:30 AM, Brent 
>> wrote:
>> > >> > Hmmm, conflicting answers. Given the infamous "PERFORMANCE WARNING:
>> > >> > Overlapping onDeckSearchers" log message, it seems like the "they're
>> > the
>> > >> > same" answer is probably correct, because shouldn't there only be
>> one
>> > >> active
>> > >> > searcher at a time?
>> > >> >
>> > >> > Although it makes me curious, if there's a warning about having
>> > multiple
>> > >> > (overlapping) warming searchers, why is there a setting
>> > >> > (maxWarmingSearchers) that even lets you have more than one, or at
>> > least,
>> > >> > why ever set it to anything other than 1?
>> > >> >
>> > >> >
>> > >> >
>> > >> > --
>> > >> > View this message in context: http://lucene.472066.n3.
>> > >> nabble.com/on-deck-searcher-vs-warming-searcher-
>> tp4309021p4309080.html
>> > >> > Sent from the Solr - User mailing list archive at Nabble.com.
>> > >