expand=true throws error

2020-03-30 Thread Szűcs Roland
Hi All,

I managed to use the edismax query parser in Solr 8.4.1 with collapse without
any problem. I tested it with the Solr admin GUI, so fq={!collapse field=title}
worked fine.

As soon as I use the example from the documentation, fq={!collapse
field=title}&expand=true, I do not get back any additional output with an
expanded section.

Any idea?

Thanks in advance,
Roland


Re: expand=true throws error

2020-03-30 Thread Munendra S N
Hey,
Could you please share the stacktrace or error message you received?

On Mon, Mar 30, 2020, 7:58 PM Szűcs Roland 
wrote:

> Hi All,
>
> I manage to use edismax queryparser in solr 8.4.1 with collapse without any
> problem. I tested it with the SOLR admin GUI. So fq={!collapse field=title}
> worked fine.
>
> As soon as I use the example from the documentation and use:  fq={!collapse
> field=title}&expand=true, I did not get back any additional output with
> section expanded.
>
> Any idea?
>
> Thanks in advance,
> Roland
>


Re: expand=true throws error

2020-03-30 Thread Szűcs Roland
Hi Munendra,
I do not get an error. The strange thing is that I get exactly the same
response with fq={!collapse field=title} versus fq={!collapse
field=title}&expand=true.
Collapse works properly as a standalone fq, but expand has no impact. How
can I access the "hidden" documents then?

Roland

Munendra S N  wrote (on Mon, Mar 30, 2020, at
16:47):

> Hey,
> Could you please share the stacktrace or error message you received?
>
> On Mon, Mar 30, 2020, 7:58 PM Szűcs Roland 
> wrote:
>
> > Hi All,
> >
> > I manage to use edismax queryparser in solr 8.4.1 with collapse without
> any
> > problem. I tested it with the SOLR admin GUI. So fq={!collapse
> field=title}
> > worked fine.
> >
> > As soon as I use the example from the documentation and use:
> fq={!collapse
> > field=title}&expand=true, I did not get back any additional output with
> > section expanded.
> >
> > Any idea?
> >
> > Thanks in advance,
> > Roland
> >
>


Re: expand=true throws error

2020-03-30 Thread Munendra S N
Please share the complete request. Also, does the number of results change
with and without collapse? Usually the title would be unique for every
document. If that is the case, there won't be anything to expand, right?

On Mon, Mar 30, 2020, 8:22 PM Szűcs Roland 
wrote:

> Hi Munendra,
> I do not get error . The strange thing is that I get exactly the same
> response with fq={!collapse field=title} versus  fq={!collapse
> field=title}&expand=true.
> Collapse works properly as a standalone fq but expand has no impact. How
> can I have access to the "hidden" documents then?
>
> Roland
>
> Munendra S N  ezt írta (időpont: 2020. márc. 30.,
> H, 16:47):
>
> > Hey,
> > Could you please share the stacktrace or error message you received?
> >
> > On Mon, Mar 30, 2020, 7:58 PM Szűcs Roland 
> > wrote:
> >
> > > Hi All,
> > >
> > > I manage to use edismax queryparser in solr 8.4.1 with collapse without
> > any
> > > problem. I tested it with the SOLR admin GUI. So fq={!collapse
> > field=title}
> > > worked fine.
> > >
> > > As soon as I use the example from the documentation and use:
> > fq={!collapse
> > > field=title}&expand=true, I did not get back any additional output with
> > > section expanded.
> > >
> > > Any idea?
> > >
> > > Thanks in advance,
> > > Roland
> > >
> >
>


DocValue field & commit

2020-03-30 Thread sujatha arun
A facet-heavy query which uses docValues fields for faceting returns about
5k results and executes in anywhere from 10 ms to 5 secs, and the 5-sec times
seem to coincide with hard commits.

Are the two related? Why the fluctuation in execution time?

Thanks,
Revas


RE: No files to download for index generation

2020-03-30 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I wanted to ask *yet again* whether anyone could please clarify what this error 
means?

The wording could be interpreted as a benign "I found that there was nothing
which needed to be done after all"; but if that were the meaning of this
error, why would it be flagged as ERROR rather than INFO or WARN?

Please advise


-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C] 
Sent: Wednesday, March 11, 2020 5:18 PM
To: solr-user@lucene.apache.org
Subject: RE: No files to download for index generation

I wanted to ask *again* whether anyone has any insight regarding this message

There seem to have been several people asking the question on this forum 
(Markus Jelsma on 8/23/19, Akreeti Agarwal on 12/27/19 and Vadim Ivanov on 
12/29/19)

The only response I have seen was five words from Erick Erickson on 12/27/19: 
"Not sure about that one"

Could someone please clarify what this error means?

The wording could be interpreted as a benign "I found that there was nothing
which needed to be done after all"; but if that were the meaning of this
error, why would it be flagged as ERROR rather than INFO or WARN?


-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C] 
Sent: Monday, June 10, 2019 9:57 AM
To: solr-user@lucene.apache.org
Subject: RE: No files to download for index generation

Does anyone yet have any insight on interpreting the severity of this message?

-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C] 
Sent: Tuesday, June 04, 2019 4:07 PM
To: solr-user@lucene.apache.org
Subject: No files to download for index generation

We have occasionally been seeing an error such as the following:
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Master's generation: 1424625
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Master's version: 1559619115480
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Slave's generation: 1424624
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Slave's version: 1559619050130
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Starting replication process
2019-06-03 23:32:45.587 ERROR (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher No files to download for index generation: 1424625

Is that last line actually an error as in "there SHOULD be files to download, 
but there are none"?

Or is it simply informative as in "there are no files to download, so we are 
all done here"?


G1 and StringDeduplication

2020-03-30 Thread Stephen Lewis Bianamara
Hi SOLR Community,

I've been looking at performance tuning Solr's GC lately. I found this
helpful article on the matter:
https://cwiki.apache.org/confluence/display/SOLR/ShawnHeisey

One thing the article does not address is G1's ability to use string
deduplication:
https://blog.gceasy.io/2018/12/23/usestringdeduplication/#more-2861

Are there scenarios where this is advisable for Solr? For example, if my
cluster gets many requests that come in with the same basic filters (things
like site, username, type filter, etc.), would you expect the performance
gain to be noticeable, since Solr may not be frequently recreating the same
dictionary keys? Or, since there is likely some overall performance cost,
would you expect that loss to outweigh any of the gains?
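
For reference, if I were to try it, the change would just be adding the G1
string-deduplication flag to GC_TUNE in solr.in.sh; a minimal sketch (keeping
whatever other GC flags you already tune, and noting that the flag only has an
effect when G1 is the collector):

  GC_TUNE="-XX:+UseG1GC -XX:+UseStringDeduplication"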

Thanks!
Stephen


Re: DocValue field & commit

2020-03-30 Thread Erick Erickson
Response spikes after commits are almost always something to do
with autowarming or docValues being set to false. So here’s what
I’d look at, in order.

1> are the fields used defined with docValues=true? They should be.
With this much variance it sounds like you don’t have that value set.
You’ll have to rebuild your entire index, first deleting all documents…

You assert that they are all docValues, but the variance is so
high that I wonder whether they _all_ are. They may very well be, but
I’ve been tripped up by things I know are true that aren’t too often ;)

You can ensure this by setting uninvertible="true" in your field type,
see: https://issues.apache.org/jira/browse/SOLR-12962 if you're on
7.6 or later. (A minimal schema sketch for <1> follows below.)

2> what are your autowarming settings for queryResultCache and/or
filterCache? Start with a relatively small number, say 16, and look at
your autowarm times to ensure they aren't excessive.

3> if autowarming doesn’t help, consider specifying a newSearcher
event in solrconfig.xml that exercises the facets.

NOTE: <2> and <3> will mask any fields that are docValues=false that
slipped through the cracks, so I’d double check <1> first.
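
For <1>, the schema sketch I mentioned would be roughly this (the field name
and type here are just placeholders, adjust to your schema), followed by a
full reindex:

  <dynamicField name="*_facet" type="string"
                indexed="false" stored="false" docValues="true"/>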

Best,
Erick

> On Mar 30, 2020, at 12:20 PM, sujatha arun  wrote:
> 
> A facet heavy query which uses docValue fields for faceting  returns about
> 5k results executes between  10ms to 5 secs and the 5 secs time seems to
> coincide with after a hard commit.
> 
> Does that have any relation? Why the fluctuation in execution time?
> 
> Thanks,
> Revas



Re: DocValue field & commit

2020-03-30 Thread Revas
Thanks, Erick.

1) We are using a dynamic string field for faceting with indexed=false and
stored=false. By default docValues are enabled for primitive fields (Solr
6.6), so it is not explicitly defined in the schema. Do you think that is a
wrong assumption? Also, I do not see this field listed in the field cache,
but I don't see any dynamic fields listed.
2) Autowarm count is at 32 for both; autowarm time is 25 for the query-result
cache and 1724 for the filter cache.
3) Can you elaborate on what you mean here?



On Mon, Mar 30, 2020 at 1:43 PM Erick Erickson 
wrote:

> Response spikes after commits are almost always something to do
> with autowarming or docValues being set to false. So here’s what
> I’d look at, in order.
>
> 1> are the fields used defined with docValues=true? They should be.
> With this much variance it sounds like you don’t have that value set.
> You’ll have to rebuild your entire index, first deleting all documents…
>
> You assert that they are all docValues, but the variance is so
> high that I wonder whether they _all_ are. They may very well be, but
> I’ve been tripped up by things I know are true that aren’t too often ;)
>
> You can insure this by setting 'uninvertible=“true” ‘ in your field type,
> see: https://issues.apache.org/jira/browse/SOLR-12962 if you’re on
> 7.6 or later.
>
> 2>what are your autowarming settings for queryResultCache and/or
> filterCache. Start with a relatively small number, say 16 and look at
> your autowarm times to insure they aren’t excessive.
>
> 3> if autowarming doesn’t help, consider specifying a newSearcher
> event in solrconfig.xml that exercises the facets.
>
> NOTE: <2> and <3> will mask any fields that are docValues=false that
> slipped through the cracks, so I’d double check <1> first.
>
> Best,
> Erick
>
> > On Mar 30, 2020, at 12:20 PM, sujatha arun  wrote:
> >
> > A facet heavy query which uses docValue fields for faceting  returns
> about
> > 5k results executes between  10ms to 5 secs and the 5 secs time seems to
> > coincide with after a hard commit.
> >
> > Does that have any relation? Why the fluctuation in execution time?
> >
> > Thanks,
> > Revas
>
>


Re: DocValue field & commit

2020-03-30 Thread Erick Erickson
OK, sounds like docValues is set.

Sure, in solrconfig.xml, there are two sections “firstSearcher” and 
“newSearcher”.
These are queries (or lists of queries) that are fired as part of autowarming
when Solr is first started (firstSearcher) or when a commit happens that opens
a new searcher (newSearcher). These are hand-crafted static queries. So
create one or more queries in the newSearcher section that exercise your
faceting and they'll be fired as part of autowarming. That should smooth out
the delay your users experience when commits happen.
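
As a sketch (the facet field name is just a placeholder), the listener in the
<query> section of solrconfig.xml looks something like:

  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">*:*</str>
        <str name="rows">0</str>
        <str name="facet">true</str>
        <str name="facet.field">your_facet_field</str>
      </lst>
    </arr>
  </listener>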

Best,
Erick

> On Mar 30, 2020, at 4:06 PM, Revas  wrote:
> 
> Thanks, Eric.
> 
> 1) We are using dynamic string field for faceting where indexing =false and
> stored=false . By default docValues are enabled for primitive fields (solr
> 6.6.), so not explicitly defined in schema. Do you think its wrong
> assumption? Also I do not this field listed in feild cache, but dont see
> any dynamic fields listed.
> 2) Autowarm count is at 32 for both and autowarm time is 25 for queryresult
> and  17
> 3)Can you elaborate what you mean here
> 
> 
> 
> On Mon, Mar 30, 2020 at 1:43 PM Erick Erickson 
> wrote:
> 
>> Response spikes after commits are almost always something to do
>> with autowarming or docValues being set to false. So here’s what
>> I’d look at, in order.
>> 
>> 1> are the fields used defined with docValues=true? They should be.
>> With this much variance it sounds like you don’t have that value set.
>> You’ll have to rebuild your entire index, first deleting all documents…
>> 
>> You assert that they are all docValues, but the variance is so
>> high that I wonder whether they _all_ are. They may very well be, but
>> I’ve been tripped up by things I know are true that aren’t too often ;)
>> 
>> You can insure this by setting 'uninvertible=“true” ‘ in your field type,
>> see: https://issues.apache.org/jira/browse/SOLR-12962 if you’re on
>> 7.6 or later.
>> 
>> 2>what are your autowarming settings for queryResultCache and/or
>> filterCache. Start with a relatively small number, say 16 and look at
>> your autowarm times to insure they aren’t excessive.
>> 
>> 3> if autowarming doesn’t help, consider specifying a newSearcher
>> event in solrconfig.xml that exercises the facets.
>> 
>> NOTE: <2> and <3> will mask any fields that are docValues=false that
>> slipped through the cracks, so I’d double check <1> first.
>> 
>> Best,
>> Erick
>> 
>>> On Mar 30, 2020, at 12:20 PM, sujatha arun  wrote:
>>> 
>>> A facet heavy query which uses docValue fields for faceting  returns
>> about
>>> 5k results executes between  10ms to 5 secs and the 5 secs time seems to
>>> coincide with after a hard commit.
>>> 
>>> Does that have any relation? Why the fluctuation in execution time?
>>> 
>>> Thanks,
>>> Revas
>> 
>> 



Re: DocValue field & commit

2020-03-30 Thread Revas
Correcting some typos ...

Thanks, Erick.

1) We are using a dynamic string field for faceting with indexed=false and
stored=false. By default docValues are enabled for primitive fields (Solr
6.6), so it is not explicitly defined in the schema. Do you think that is a
wrong assumption? Also, I do not see this field listed in the field cache,
but I don't see any dynamic fields listed.
2) Autowarm count is at 32 for both; autowarm time is 25 for the
query-result cache and 1724 for the filter cache.
3) Can you elaborate on what you mean here? We have a hard commit every 5 mins
with openSearcher=false and a soft commit every 2 secs.


On Mon, Mar 30, 2020 at 4:06 PM Revas  wrote:

> Thanks, Eric.
>
> 1) We are using dynamic string field for faceting where indexing =false
> and stored=false . By default docValues are enabled for primitive fields
> (solr 6.6.), so not explicitly defined in schema. Do you think its wrong
> assumption? Also I do not this field listed in feild cache, but dont see
> any dynamic fields listed.
> 2) Autowarm count is at 32 for both and autowarm time is 25 for
> queryresult and  17
> 3)Can you elaborate what you mean here
>
>
>
> On Mon, Mar 30, 2020 at 1:43 PM Erick Erickson 
> wrote:
>
>> Response spikes after commits are almost always something to do
>> with autowarming or docValues being set to false. So here’s what
>> I’d look at, in order.
>>
>> 1> are the fields used defined with docValues=true? They should be.
>> With this much variance it sounds like you don’t have that value set.
>> You’ll have to rebuild your entire index, first deleting all documents…
>>
>> You assert that they are all docValues, but the variance is so
>> high that I wonder whether they _all_ are. They may very well be, but
>> I’ve been tripped up by things I know are true that aren’t too often ;)
>>
>> You can insure this by setting 'uninvertible=“true” ‘ in your field type,
>> see: https://issues.apache.org/jira/browse/SOLR-12962 if you’re on
>> 7.6 or later.
>>
>> 2>what are your autowarming settings for queryResultCache and/or
>> filterCache. Start with a relatively small number, say 16 and look at
>> your autowarm times to insure they aren’t excessive.
>>
>> 3> if autowarming doesn’t help, consider specifying a newSearcher
>> event in solrconfig.xml that exercises the facets.
>>
>> NOTE: <2> and <3> will mask any fields that are docValues=false that
>> slipped through the cracks, so I’d double check <1> first.
>>
>> Best,
>> Erick
>>
>> > On Mar 30, 2020, at 12:20 PM, sujatha arun  wrote:
>> >
>> > A facet heavy query which uses docValue fields for faceting  returns
>> about
>> > 5k results executes between  10ms to 5 secs and the 5 secs time seems to
>> > coincide with after a hard commit.
>> >
>> > Does that have any relation? Why the fluctuation in execution time?
>> >
>> > Thanks,
>> > Revas
>>
>>


Re: expand=true throws error

2020-03-30 Thread Szűcs Roland
Hi Munendra,
Let's look at the 3 scenarios:
1. Query without collapse
2. Query with collapse
3. Query with collapse and expand
I made a mini book database for this:
Case 1:
{ "responseHeader":{ "status":0, "QTime":0, "params":{ "q":"author:\"William
Shakespeare\"", "_":"1585603593269"}}, "response":{"numFound":4,"start":0,"
docs":[ { "id":"1", "author":"William Shakespeare", "title":"The Taming of
the Shrew", "format":"ebook", "_version_":1662625767773700096}, { "id":"2",
"author":"William Shakespeare", "title":"The Taming of the Shrew", "format":
"paper", "_version_":1662625790857052160}, { "id":"3", "author":"William
Shakespeare", "title":"The Taming of the Shrew", "format":"audiobook", "
_version_":1662625809553162240}, { "id":"4", "author":"William Shakespeare",
"title":"Much Ado about Nothing", "format":"paper", "_version_":
1662625868323749888}] }}
As you can see, there are 3 different formats of the same book.

Case 2:
{ "responseHeader":{ "status":0, "QTime":2, "params":{ "q":"author:\"William
Shakespeare\"", "fq":"{!collapse field=title}", "_":"1585603593269"}}, "
response":{"numFound":2,"start":0,"docs":[ { "id":"1", "author":"William
Shakespeare", "title":"The Taming of the Shrew", "format":"ebook", "
_version_":1662625767773700096}, { "id":"4", "author":"William Shakespeare",
"title":"Much Ado about Nothing", "format":"paper", "_version_":
1662625868323749888}] }}
The collapse post filter worked as I expected.
Case 3: let's extend it with expand=true:
{ "responseHeader":{ "status":0, "QTime":1, "params":{ "q":"author:\"William
Shakespeare\"", "fq":"{!collapse field=title}&expand=true", "_":
"1585603593269"}}, "response":{"numFound":2,"start":0,"docs":[ { "id":"1", "
author":"William Shakespeare", "title":"The Taming of the Shrew", "format":
"ebook", "_version_":1662625767773700096}, { "id":"4", "author":"William
Shakespeare", "title":"Much Ado about Nothing", "format":"paper", "_version_
":1662625868323749888}] }}

As you can see, nothing has changed. There is no additional section in the
response.

Cheers,
Roland

Munendra S N  wrote (on Mon, Mar 30, 2020, at
17:46):

> Please share the complete request. Also, does number of results change with
> & without collapse. Usually title would be unique every document. If that
> is  the case then, there won't be anything to expand right?
>
> On Mon, Mar 30, 2020, 8:22 PM Szűcs Roland 
> wrote:
>
> > Hi Munendra,
> > I do not get error . The strange thing is that I get exactly the same
> > response with fq={!collapse field=title} versus  fq={!collapse
> > field=title}&expand=true.
> > Collapse works properly as a standalone fq but expand has no impact. How
> > can I have access to the "hidden" documents then?
> >
> > Roland
> >
> > Munendra S N  ezt írta (időpont: 2020. márc.
> 30.,
> > H, 16:47):
> >
> > > Hey,
> > > Could you please share the stacktrace or error message you received?
> > >
> > > On Mon, Mar 30, 2020, 7:58 PM Szűcs Roland <
> szucs.rol...@bookandwalk.hu>
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > I manage to use edismax queryparser in solr 8.4.1 with collapse
> without
> > > any
> > > > problem. I tested it with the SOLR admin GUI. So fq={!collapse
> > > field=title}
> > > > worked fine.
> > > >
> > > > As soon as I use the example from the documentation and use:
> > > fq={!collapse
> > > > field=title}&expand=true, I did not get back any additional output
> with
> > > > section expanded.
> > > >
> > > > Any idea?
> > > >
> > > > Thanks in advance,
> > > > Roland
> > > >
> > >
> >
>


Configuring shardhandler factory for select handler

2020-03-30 Thread Jay Potharaju
Hi,
I am trying to update the connection & socket timeout values for my `select`
handler. After updating the configs I do not see the values being set, and
they default to 60 sec.
How can I update these values?

Also, it looks like the docs have the socketTimeout & connectionTimeout values
swapped.

https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.7.2/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandlerFactory.java#L193


https://lucene.apache.org/solr/guide/7_0/distributed-requests.html#configuring-the-shardhandlerfactory

Thanks
Jay


Re: DocValue field & commit

2020-03-30 Thread Revas
Thanks, Erick,

The execution time based on debugQuery, split between the query and the
facets, is as follows:

query 10ms
facets 4900ms

Since most of the time is spent on facet processing (docValues enabled), the
query and filter caches do not apply here, correct?


   - Autowarm count is at 32 for both; autowarm time is 25 for the
   query-result cache and 1724 for the filter cache
   - We have a hard commit every 5 mins with openSearcher=false and a
   soft commit every 2 secs.
   - Facets are a mix of pivot facets, range facets, and facet queries
   - When the same facet criteria return a smaller result set, the response
   is much faster




On Mon, Mar 30, 2020 at 4:47 PM Erick Erickson 
wrote:

> OK, sounds like docValues is set.
>
> Sure, in solrconfig.xml, there are two sections “firstSearcher” and
> “newSearcher”.
> These are queries (or lists of queries) that are fired as part of
> autowarming
> when Solr is first started (firstSearcher) or when a commit happens that
> opens
> a new searcher (newSearcher). These are hand-crafted static queries. So
> create one or more newSearcher sections in that block that exercise your
> faceting and it’ll be fired as part of autowarming. That should smooth out
> the delay your user’s experience when commits happen.
>
> Best,
> Erick
>
> > On Mar 30, 2020, at 4:06 PM, Revas  wrote:
> >
> > Thanks, Eric.
> >
> > 1) We are using dynamic string field for faceting where indexing =false
> and
> > stored=false . By default docValues are enabled for primitive fields
> (solr
> > 6.6.), so not explicitly defined in schema. Do you think its wrong
> > assumption? Also I do not this field listed in feild cache, but dont see
> > any dynamic fields listed.
> > 2) Autowarm count is at 32 for both and autowarm time is 25 for
> queryresult
> > and  17
> > 3)Can you elaborate what you mean here
> >
> >
> >
> > On Mon, Mar 30, 2020 at 1:43 PM Erick Erickson 
> > wrote:
> >
> >> Response spikes after commits are almost always something to do
> >> with autowarming or docValues being set to false. So here’s what
> >> I’d look at, in order.
> >>
> >> 1> are the fields used defined with docValues=true? They should be.
> >> With this much variance it sounds like you don’t have that value set.
> >> You’ll have to rebuild your entire index, first deleting all documents…
> >>
> >> You assert that they are all docValues, but the variance is so
> >> high that I wonder whether they _all_ are. They may very well be, but
> >> I’ve been tripped up by things I know are true that aren’t too often ;)
> >>
> >> You can insure this by setting 'uninvertible=“true” ‘ in your field
> type,
> >> see: https://issues.apache.org/jira/browse/SOLR-12962 if you’re on
> >> 7.6 or later.
> >>
> >> 2>what are your autowarming settings for queryResultCache and/or
> >> filterCache. Start with a relatively small number, say 16 and look at
> >> your autowarm times to insure they aren’t excessive.
> >>
> >> 3> if autowarming doesn’t help, consider specifying a newSearcher
> >> event in solrconfig.xml that exercises the facets.
> >>
> >> NOTE: <2> and <3> will mask any fields that are docValues=false that
> >> slipped through the cracks, so I’d double check <1> first.
> >>
> >> Best,
> >> Erick
> >>
> >>> On Mar 30, 2020, at 12:20 PM, sujatha arun 
> wrote:
> >>>
> >>> A facet heavy query which uses docValue fields for faceting  returns
> >> about
> >>> 5k results executes between  10ms to 5 secs and the 5 secs time seems
> to
> >>> coincide with after a hard commit.
> >>>
> >>> Does that have any relation? Why the fluctuation in execution time?
> >>>
> >>> Thanks,
> >>> Revas
> >>
> >>
>
>


Re: Configuring shardhandler factory for select handler

2020-03-30 Thread Jay Potharaju
Figured it out; I referred to the docs here:
https://github.com/apache/lucene-solr/commit/0ce635ec01e9d3ce04a5fbf5d472ea9d5d28bfee?short_path=421a323#diff-421a323f596319f0485e0b03070d94e6
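
For reference, the per-handler config the ref guide describes ends up looking
roughly like this (the timeout values below are just example milliseconds):

  <requestHandler name="/select" class="solr.SearchHandler">
    <!-- existing defaults/invariants stay as they are -->
    <shardHandlerFactory class="HttpShardHandlerFactory">
      <int name="socketTimeout">120000</int>
      <int name="connTimeout">15000</int>
    </shardHandlerFactory>
  </requestHandler>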


Thanks
Jay



On Mon, Mar 30, 2020 at 3:38 PM Jay Potharaju  wrote:

> Hi,
> I am trying to update the connection & sockettime out value for my
> `select` handler. After updating the configs i do not see that value being
> set and it defaults to 60 sec.
> How can i update these values?
>
> Also looks like the docs have sockeTimeout & connectionTimeout values
> swapped.
>
>
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.7.2/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandlerFactory.java#L193
>
>
>
> https://lucene.apache.org/solr/guide/7_0/distributed-requests.html#configuring-the-shardhandlerfactory
>
> Thanks
> Jay
>
>


Re: DocValue field & commit

2020-03-30 Thread Erick Erickson
Oh dear. Your autowarming is almost, but not quite totally, useless given
your 2 second soft commit interval. See: 
https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

So autowarming is probably not a cure here. When you originally said "commit"
I was assuming it was one that opened a new searcher; if that's not the case,
autowarming isn't a cure.

Do you _really_ require 2 second soft commit intervals? I would not be
surprised if you also see "too many on deck searchers" warnings in your
logs at times. This is one of my hot buttons: having very short soft commit
intervals is something people do without understanding the tradeoffs, one of
which is that your caches are probably getting a poor utilization rate. Often
the recommendation for short intervals like this is to not use the caches at
all.

The newSearcher is a full query. Go ahead and add facets. But again, this 
probably
isn’t going to help much.

But really, revisit your autocommit settings. Taking 1.7 seconds to autowarm
means that you have roughly this:
- commit
- 1.7 seconds later, the new searcher is open for business.
- 0.3 seconds after that, the next searcher starts opening, which takes
another 1.7 seconds to autowarm.

I doubt your hard commit is really the culprit here _unless_ you’re running on 
an under-powered
machine. The hard commit will trigger segment merging, which is CPU and I/O 
intensive. If
you’re using a machine that can’t afford the cycles to be taken up by merging, 
that could account
for what you see, but new searchers are being opened every 2 seconds (assuming 
a relatively
constant indexing load).
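
If you do decide to lengthen the soft commit interval, the relevant bits of
solrconfig.xml are just these (the intervals are examples, in milliseconds):

  <autoCommit>
    <maxTime>300000</maxTime>        <!-- hard commit every 5 minutes -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>60000</maxTime>         <!-- visibility; e.g. 60 seconds rather than 2 -->
  </autoSoftCommit>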

Best,
Erick

> On Mar 30, 2020, at 6:42 PM, Revas  wrote:
> 
> Thanks, Erick,
> 
> The process time execution based on debugQuery between the query and facets
> is as follows
> 
> query 10ms
> facets 4900ms
> 
> since max time is spent on facet processing (docValues enabled), query and
> filter cache do no apply to this, correct?
> 
> 
>   -  Autowarm count is at 32 for both and autowarm time is 25 for
>   query-result cache and  1724 for filter cache
>   -  We have hard-commit every 5 mins with opensearcher=false and
>   soft-commit every 2 secs.
>   - facet are a mix of pivot facets,range facets and facet queries
>   - when the same facets criteria bring a smaller result set, response is
>   much faster
> 
> 
> 
> 
> On Mon, Mar 30, 2020 at 4:47 PM Erick Erickson 
> wrote:
> 
>> OK, sounds like docValues is set.
>> 
>> Sure, in solrconfig.xml, there are two sections “firstSearcher” and
>> “newSearcher”.
>> These are queries (or lists of queries) that are fired as part of
>> autowarming
>> when Solr is first started (firstSearcher) or when a commit happens that
>> opens
>> a new searcher (newSearcher). These are hand-crafted static queries. So
>> create one or more newSearcher sections in that block that exercise your
>> faceting and it’ll be fired as part of autowarming. That should smooth out
>> the delay your user’s experience when commits happen.
>> 
>> Best,
>> Erick
>> 
>>> On Mar 30, 2020, at 4:06 PM, Revas  wrote:
>>> 
>>> Thanks, Eric.
>>> 
>>> 1) We are using dynamic string field for faceting where indexing =false
>> and
>>> stored=false . By default docValues are enabled for primitive fields
>> (solr
>>> 6.6.), so not explicitly defined in schema. Do you think its wrong
>>> assumption? Also I do not this field listed in feild cache, but dont see
>>> any dynamic fields listed.
>>> 2) Autowarm count is at 32 for both and autowarm time is 25 for
>> queryresult
>>> and  17
>>> 3)Can you elaborate what you mean here
>>> 
>>> 
>>> 
>>> On Mon, Mar 30, 2020 at 1:43 PM Erick Erickson 
>>> wrote:
>>> 
 Response spikes after commits are almost always something to do
 with autowarming or docValues being set to false. So here’s what
 I’d look at, in order.
 
 1> are the fields used defined with docValues=true? They should be.
 With this much variance it sounds like you don’t have that value set.
 You’ll have to rebuild your entire index, first deleting all documents…
 
 You assert that they are all docValues, but the variance is so
 high that I wonder whether they _all_ are. They may very well be, but
 I’ve been tripped up by things I know are true that aren’t too often ;)
 
 You can insure this by setting 'uninvertible=“true” ‘ in your field
>> type,
 see: https://issues.apache.org/jira/browse/SOLR-12962 if you’re on
 7.6 or later.
 
 2>what are your autowarming settings for queryResultCache and/or
 filterCache. Start with a relatively small number, say 16 and look at
 your autowarm times to insure they aren’t excessive.
 
 3> if autowarming doesn’t help, consider specifying a newSearcher
 event in solrconfig.xml that exercises the facets.
 
 NOTE: <2> and <3> will mask any fields that are docValues=false that
 slipped through the cracks, so I’d double check <1> first.
 
 Best,
 Erick
>>

Re: expand=true throws error

2020-03-30 Thread Munendra S N
> Case 3 let;s extend it with expand=true:
> { "responseHeader":{ "status":0, "QTime":1, "params":{
> "q":"author:\"William
> Shakespeare\"", "fq":"{!collapse field=title}&expand=true", "_":
> "1585603593269"}},
>
I think it is because the expand=true parameter is not passed properly. As you
can see from the params in the responseHeader section, q and fq are separate
keys, but expand=true is appended to the fq value.

If passed correctly, it should look something like this

> { "responseHeader":{ "status":0, "QTime":1, "params":{
> "q":"author:\"William
> Shakespeare\"", "fq":"{!collapse field=title}", "expand": "true", "_":
> "1585603593269"}},
>
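
In other words, expand must be sent as its own request parameter, e.g.
something like this (expand.rows is optional, it defaults to 5):

  q=author:"William Shakespeare"&fq={!collapse field=title}&expand=true&expand.rows=5

With that, the response should also contain an additional "expanded" section
keyed by the collapsed field value, roughly:

  "expanded":{
    "The Taming of the Shrew":{ "numFound":2, "start":0, "docs":[ ... ]}}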

Regards,
Munendra S N



On Tue, Mar 31, 2020 at 3:07 AM Szűcs Roland 
wrote:

> Hi Munendra,
> Let's see the 3 scenario:
> 1. Query without collapse
> 2. Query with collapse
> 3. Query with collapse and expand
> I made a mini book database for this:
> Case 1:
> { "responseHeader":{ "status":0, "QTime":0, "params":{
> "q":"author:\"William
> Shakespeare\"", "_":"1585603593269"}}, "response":{"numFound":4,"start":0,"
> docs":[ { "id":"1", "author":"William Shakespeare", "title":"The Taming of
> the Shrew", "format":"ebook", "_version_":1662625767773700096}, { "id":"2",
> "author":"William Shakespeare", "title":"The Taming of the Shrew",
> "format":
> "paper", "_version_":1662625790857052160}, { "id":"3", "author":"William
> Shakespeare", "title":"The Taming of the Shrew", "format":"audiobook", "
> _version_":1662625809553162240}, { "id":"4", "author":"William
> Shakespeare",
> "title":"Much Ado about Nothing", "format":"paper", "_version_":
> 1662625868323749888}] }}
> As you can see there are 3 different format from the same book.
>
> Case 2:
> { "responseHeader":{ "status":0, "QTime":2, "params":{
> "q":"author:\"William
> Shakespeare\"", "fq":"{!collapse field=title}", "_":"1585603593269"}}, "
> response":{"numFound":2,"start":0,"docs":[ { "id":"1", "author":"William
> Shakespeare", "title":"The Taming of the Shrew", "format":"ebook", "
> _version_":1662625767773700096}, { "id":"4", "author":"William
> Shakespeare",
> "title":"Much Ado about Nothing", "format":"paper", "_version_":
> 1662625868323749888}] }}
> Collapse post filter worked as I expected.
> Case 3 let;s extend it with expand=true:
> { "responseHeader":{ "status":0, "QTime":1, "params":{
> "q":"author:\"William
> Shakespeare\"", "fq":"{!collapse field=title}&expand=true", "_":
> "1585603593269"}}, "response":{"numFound":2,"start":0,"docs":[ { "id":"1",
> "
> author":"William Shakespeare", "title":"The Taming of the Shrew", "format":
> "ebook", "_version_":1662625767773700096}, { "id":"4", "author":"William
> Shakespeare", "title":"Much Ado about Nothing", "format":"paper",
> "_version_
> ":1662625868323749888}] }}
>
> As you can see nothing as changed. There is no additional section of the
> response.
>
> Cheers,
> Roland
>
> Munendra S N  ezt írta (időpont: 2020. márc. 30.,
> H, 17:46):
>
> > Please share the complete request. Also, does number of results change
> with
> > & without collapse. Usually title would be unique every document. If that
> > is  the case then, there won't be anything to expand right?
> >
> > On Mon, Mar 30, 2020, 8:22 PM Szűcs Roland 
> > wrote:
> >
> > > Hi Munendra,
> > > I do not get error . The strange thing is that I get exactly the same
> > > response with fq={!collapse field=title} versus  fq={!collapse
> > > field=title}&expand=true.
> > > Collapse works properly as a standalone fq but expand has no impact.
> How
> > > can I have access to the "hidden" documents then?
> > >
> > > Roland
> > >
> > > Munendra S N  ezt írta (időpont: 2020. márc.
> > 30.,
> > > H, 16:47):
> > >
> > > > Hey,
> > > > Could you please share the stacktrace or error message you received?
> > > >
> > > > On Mon, Mar 30, 2020, 7:58 PM Szűcs Roland <
> > szucs.rol...@bookandwalk.hu>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I manage to use edismax queryparser in solr 8.4.1 with collapse
> > without
> > > > any
> > > > > problem. I tested it with the SOLR admin GUI. So fq={!collapse
> > > > field=title}
> > > > > worked fine.
> > > > >
> > > > > As soon as I use the example from the documentation and use:
> > > > fq={!collapse
> > > > > field=title}&expand=true, I did not get back any additional output
> > with
> > > > > section expanded.
> > > > >
> > > > > Any idea?
> > > > >
> > > > > Thanks in advance,
> > > > > Roland
> > > > >
> > > >
> > >
> >
>


Re: Solrcloud 7.6 OOM due to unable to create native threads

2020-03-30 Thread Raji N
Thanks, Erick. I don't see anywhere that CDCR is not recommended for
production use. I took a thread dump and am seeing about 140 CDCR threads.


cdcr-replicator-219-thread-8" #787 prio=5 os_prio=0 tid=0x7f7c34009000
nid=0x50a waiting on condition [0x7f7ec871b000]

   java.lang.Thread.State: WAITING (parking)

at sun.misc.Unsafe.park(Native Method)

- parking to wait for  <0x0001da724ca0> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)

at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)

at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)

at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)

at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

at java.lang.Thread.run(Thread.java:748)




cdcr-update-log-synchronizer-157-thread-1" #240 prio=5 os_prio=0
tid=0x7f8782543800 nid=0x2e5 waiting on condition [0x7f82ad99c000]

   java.lang.Thread.State: WAITING (parking)

at sun.misc.Unsafe.park(Native Method)

- parking to wait for  <0x0001d7f9e8e8> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)

at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)

at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)

at
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)

at
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)

at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

at java.lang.Thread.run(Thread.java:748)
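
For reference, the ~140 figure is just the count of dump entries whose thread
names start with "cdcr-", e.g. something like the following (the dump file
name here is only an example):

  grep -c '"cdcr-' solr-thread-dump.txt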


Thanks,

Raji

On Sun, Mar 29, 2020 at 5:18 AM Erick Erickson 
wrote:

> What that error usually means is that there are a zillion threads running.
>
> Try taking a thread dump. It’s _probable_ that it’s CDCR, but
> take a look at the thread dump to see if you have lots of
> threads that are running. Any by “lots” here, I mean 100s of threads
> that reference the same component, in this case that have cdcr in
> the stack trace.
>
> CDCR is not getting active work at this point, you might want to
> consider another replication strategy if you’re not willing to fix
> the code.
>
> Best,
> Erick
>
> > On Mar 29, 2020, at 4:17 AM, Raji N  wrote:
> >
> > Hi All,
> >
> > We running solrcloud 7.6  (with the patch #
> >
> https://issues.apache.org/jira/secure/attachment/12969150)/SOLR-11724.patchon
> > production on 7 hosts in  containers. The container memory is 48GB , heap
> > is 24GB.
> > ulimit -v
> >
> > unlimited
> >
> > ulimit -m
> >
> > unlimited
> > We don't have any custom code in solr. We have set up  bidirectional CDCR
> > between primary and secondary Datacenter. Our secondary DC is very
> unstable
> > and many times many instances are down.
> >
> > We get below exception quite often. Is this because the CDCR connection
> is
> > broken.
> >
> > WARN  (cdcr-update-log-synchronizer-80-thread-1) [   ]
> > o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception
> >
> > java.lang.OutOfMemoryError: unable to create new native thread
> >
> >   at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
> >
> >   at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
> >
> >   at
> >
> org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96)
> > ~[httpclient-4.5.3.jar:4.5.3]
> >
> >   at
> >
> org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219)
> > ~[httpclient-4.5.3.jar:4.5.3]
> >
> >   at
> >
> org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319)
> > ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f
> > - nknize - 2018-12-07 14:47:53]
> >
> >   at
> >
> org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330)
> > ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f
> > - nknize - 2018-12-07 14:47:53]
> >
> >   at
> >
> org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268)
> > ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f
> > - nknize - 2018-12-07 14:47:53]
> >
> >   at
> >
> org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255)
> > ~[solr-solrj-7.6.0.jar:7.6.0 719

Re: Solrcloud 7.6 OOM due to unable to create native threads

2020-03-30 Thread Raji N
Hi Erick,

What are your recommendations for a SolrCloud DR strategy?

Thanks,
Raji

On Sun, Mar 29, 2020 at 6:25 PM Erick Erickson 
wrote:

> I don’t recommend CDCR at this point, I think there better approaches.
>
> The root problem is that CDCR uses tlog files as a queueing mechanism.
> If the connection between the DCs is broken for any reason, the tlogs grow
> without limit. This could probably be fixed, but a better alternative is to
> use something designed to insure messages (updates) are delivered to
> separate DCs rathe than try to have CDCR re-invent that wheel.
>
> Best,
> Erick
>
> > On Mar 29, 2020, at 6:47 PM, S G  wrote:
> >
> > Is CDCR even recommended to be used in production?
> > Or it was abandoned before it could become production ready ?
> >
> > Thanks
> > SG
> >
> >
> > On Sun, Mar 29, 2020 at 5:18 AM Erick Erickson 
> > wrote:
> >
> >> What that error usually means is that there are a zillion threads
> running.
> >>
> >> Try taking a thread dump. It’s _probable_ that it’s CDCR, but
> >> take a look at the thread dump to see if you have lots of
> >> threads that are running. Any by “lots” here, I mean 100s of threads
> >> that reference the same component, in this case that have cdcr in
> >> the stack trace.
> >>
> >> CDCR is not getting active work at this point, you might want to
> >> consider another replication strategy if you’re not willing to fix
> >> the code.
> >>
> >> Best,
> >> Erick
> >>
> >>> On Mar 29, 2020, at 4:17 AM, Raji N  wrote:
> >>>
> >>> Hi All,
> >>>
> >>> We running solrcloud 7.6  (with the patch #
> >>>
> >>
> https://issues.apache.org/jira/secure/attachment/12969150)/SOLR-11724.patchon
> >>> production on 7 hosts in  containers. The container memory is 48GB ,
> heap
> >>> is 24GB.
> >>> ulimit -v
> >>>
> >>> unlimited
> >>>
> >>> ulimit -m
> >>>
> >>> unlimited
> >>> We don't have any custom code in solr. We have set up  bidirectional
> CDCR
> >>> between primary and secondary Datacenter. Our secondary DC is very
> >> unstable
> >>> and many times many instances are down.
> >>>
> >>> We get below exception quite often. Is this because the CDCR connection
> >> is
> >>> broken.
> >>>
> >>> WARN  (cdcr-update-log-synchronizer-80-thread-1) [   ]
> >>> o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception
> >>>
> >>> java.lang.OutOfMemoryError: unable to create new native thread
> >>>
> >>>  at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
> >>>
> >>>  at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
> >>>
> >>>  at
> >>>
> >>
> org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96)
> >>> ~[httpclient-4.5.3.jar:4.5.3]
> >>>
> >>>  at
> >>>
> >>
> org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219)
> >>> ~[httpclient-4.5.3.jar:4.5.3]
> >>>
> >>>  at
> >>>
> >>
> org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319)
> >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f
> >>> - nknize - 2018-12-07 14:47:53]
> >>>
> >>>  at
> >>>
> >>
> org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330)
> >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f
> >>> - nknize - 2018-12-07 14:47:53]
> >>>
> >>>  at
> >>>
> >>
> org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268)
> >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f
> >>> - nknize - 2018-12-07 14:47:53]
> >>>
> >>>  at
> >>>
> >>
> org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255)
> >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f
> >>> - nknize - 2018-12-07 14:47:53]
> >>>
> >>>  at
> >>>
> >>
> org.apache.solr.client.solrj.impl.HttpSolrClient.(HttpSolrClient.java:200)
> >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f
> >>> - nknize - 2018-12-07 14:47:53]
> >>>
> >>>  at
> >>>
> >>
> org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:957)
> >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f
> >>> - nknize - 2018-12-07 14:47:53]
> >>>
> >>>  at
> >>>
> >>
> org.apache.solr.handler.CdcrUpdateLogSynchronizer$UpdateLogSynchronisation.run(CdcrUpdateLogSynchronizer.java:139)
> >>> [solr-core-7.6.0.jar:7.6.0-SNAPSHOT
> >>> 34d82ed033cccd8120431b73e93554b85b24a278 - i843100 - 2019-09-30
> >>> 14:02:46]
> >>>
> >>>  at
> >>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >>> [?:1.8.0_211]
> >>>
> >>>  at
> >>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> >>> [?:1.8.0_211]
> >>>
> >>>  at
> >>>
> >>
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> >>> [?:1.8.0_