expand=true throws error
Hi All, I managed to use the edismax query parser in Solr 8.4.1 with collapse without any problem. I tested it with the Solr Admin UI, so fq={!collapse field=title} worked fine. As soon as I follow the example from the documentation and add expand=true, i.e. fq={!collapse field=title}&expand=true, I do not get back any additional output with an expanded section. Any idea? Thanks in advance, Roland
Re: expand=true throws error
Hey, Could you please share the stacktrace or error message you received? On Mon, Mar 30, 2020, 7:58 PM Szűcs Roland wrote: > Hi All, > > I manage to use edismax queryparser in solr 8.4.1 with collapse without any > problem. I tested it with the SOLR admin GUI. So fq={!collapse field=title} > worked fine. > > As soon as I use the example from the documentation and use: fq={!collapse > field=title}&expand=true, I did not get back any additional output with > section expanded. > > Any idea? > > Thanks in advance, > Roland >
Re: expand=true throws error
Hi Munendra, I do not get an error. The strange thing is that I get exactly the same response with fq={!collapse field=title} versus fq={!collapse field=title}&expand=true. Collapse works properly as a standalone fq, but expand has no impact. How can I then get access to the "hidden" documents? Roland Munendra S N wrote (on Mon, Mar 30, 2020, 16:47): > Hey, > Could you please share the stacktrace or error message you received? > > On Mon, Mar 30, 2020, 7:58 PM Szűcs Roland > wrote: > > > Hi All, > > > > I manage to use edismax queryparser in solr 8.4.1 with collapse without > any > > problem. I tested it with the SOLR admin GUI. So fq={!collapse > field=title} > > worked fine. > > > > As soon as I use the example from the documentation and use: > fq={!collapse > > field=title}&expand=true, I did not get back any additional output with > > section expanded. > > > > Any idea? > > > > Thanks in advance, > > Roland > >
Re: expand=true throws error
Please share the complete request. Also, does the number of results change with and without collapse? Usually the title would be unique for every document. If that is the case, then there won't be anything to expand, right? On Mon, Mar 30, 2020, 8:22 PM Szűcs Roland wrote: > Hi Munendra, > I do not get error . The strange thing is that I get exactly the same > response with fq={!collapse field=title} versus fq={!collapse > field=title}&expand=true. > Collapse works properly as a standalone fq but expand has no impact. How > can I have access to the "hidden" documents then? > > Roland > > Munendra S N ezt írta (időpont: 2020. márc. 30., > H, 16:47): > > > Hey, > > Could you please share the stacktrace or error message you received? > > > > On Mon, Mar 30, 2020, 7:58 PM Szűcs Roland > > wrote: > > > > > Hi All, > > > > > > I manage to use edismax queryparser in solr 8.4.1 with collapse without > > any > > > problem. I tested it with the SOLR admin GUI. So fq={!collapse > > field=title} > > > worked fine. > > > > > > As soon as I use the example from the documentation and use: > > fq={!collapse > > > field=title}&expand=true, I did not get back any additional output with > > > section expanded. > > > > > > Any idea? > > > > > > Thanks in advance, > > > Roland > > > > > >
DocValue field & commit
A facet-heavy query that uses docValue fields for faceting and returns about 5k results executes in anywhere between 10 ms and 5 secs, and the 5-sec times seem to coincide with the period right after a hard commit. Is there any relation? Why the fluctuation in execution time? Thanks, Revas
RE: No files to download for index generation
I wanted to ask *yet again* whether anyone could please clarify what this error means? The wording could be interpreted as a benign "I found that there was nothing which needed to be done after all"; but were that to be the meaning of this error, why would it be flagged as an ERROR rather than as INFO or WARN ? Please advise -Original Message- From: Oakley, Craig (NIH/NLM/NCBI) [C] Sent: Wednesday, March 11, 2020 5:18 PM To: solr-user@lucene.apache.org Subject: RE: No files to download for index generation I wanted to ask *again* whether anyone has any insight regarding this message There seem to have been several people asking the question on this forum (Markus Jelsma on 8/23/19, Akreeti Agarwal on 12/27/19 and Vadim Ivanov on 12/29/19) The only response I have seen was five words from Erick Erickson on 12/27/19: "Not sure about that one" Could someone please clarify what this error means? The wording could be interpreted as a benign "I found that there was nothing which needed to be done after all"; but were that to be the meaning of this error, why would it be flagged as an ERROR rather than as INFO or WARN ? -Original Message- From: Oakley, Craig (NIH/NLM/NCBI) [C] Sent: Monday, June 10, 2019 9:57 AM To: solr-user@lucene.apache.org Subject: RE: No files to download for index generation Does anyone yet have any insight on interpreting the severity of this message? -Original Message- From: Oakley, Craig (NIH/NLM/NCBI) [C] Sent: Tuesday, June 04, 2019 4:07 PM To: solr-user@lucene.apache.org Subject: No files to download for index generation We have occasionally been seeing an error such as the following: 2019-06-03 23:32:45.583 INFO (indexFetcher-45-thread-1) [ ] o.a.s.h.IndexFetcher Master's generation: 1424625 2019-06-03 23:32:45.583 INFO (indexFetcher-45-thread-1) [ ] o.a.s.h.IndexFetcher Master's version: 1559619115480 2019-06-03 23:32:45.583 INFO (indexFetcher-45-thread-1) [ ] o.a.s.h.IndexFetcher Slave's generation: 1424624 2019-06-03 23:32:45.583 INFO (indexFetcher-45-thread-1) [ ] o.a.s.h.IndexFetcher Slave's version: 1559619050130 2019-06-03 23:32:45.583 INFO (indexFetcher-45-thread-1) [ ] o.a.s.h.IndexFetcher Starting replication process 2019-06-03 23:32:45.587 ERROR (indexFetcher-45-thread-1) [ ] o.a.s.h.IndexFetcher No files to download for index generation: 1424625 Is that last line actually an error as in "there SHOULD be files to download, but there are none"? Or is it simply informative as in "there are no files to download, so we are all done here"?
G1 and StringDeduplication
Hi Solr Community, I've been looking at performance tuning Solr's GC lately and found this helpful article on the matter: https://cwiki.apache.org/confluence/display/SOLR/ShawnHeisey One thing the article does not address is G1's ability to use string deduplication: https://blog.gceasy.io/2018/12/23/usestringdeduplication/#more-2861 Are there scenarios where this is advisable for Solr? For example, if my cluster gets many requests that come in with the same basic filters (things like site, username, type filter, etc.), would you expect the performance gain to be noticeable, since Solr may not be frequently recreating the same dictionary keys, etc.? Or, since there is likely some overall loss in performance, would you expect the loss to outweigh any of the gains? Thanks! Stephen
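For anyone wanting to try this: string deduplication is only available with the G1 collector, and on JDK 8u20+ it is enabled with JVM options along these lines. This is just a sketch; the statistics flag is optional and only useful for measuring actual savings, and newer JDKs expose the same statistics through unified logging instead.

  -XX:+UseG1GC
  -XX:+UseStringDeduplication
  -XX:+PrintStringDeduplicationStatistics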
Re: DocValue field & commit
Response spikes after commits are almost always something to do with autowarming or docValues being set to false. So here’s what I’d look at, in order. 1> Are the fields used defined with docValues=true? They should be. With this much variance it sounds like you don’t have that value set. You’ll have to rebuild your entire index, first deleting all documents… You assert that they are all docValues, but the variance is so high that I wonder whether they _all_ are. They may very well be, but I’ve been tripped up too often by things I "know" are true that aren’t ;) You can ensure this by setting uninvertible="false" in your field type (see https://issues.apache.org/jira/browse/SOLR-12962) if you’re on 7.6 or later. 2> What are your autowarming settings for queryResultCache and/or filterCache? Start with a relatively small number, say 16, and look at your autowarm times to ensure they aren’t excessive. 3> If autowarming doesn’t help, consider specifying a newSearcher event in solrconfig.xml that exercises the facets. NOTE: <2> and <3> will mask any fields that are docValues=false that slipped through the cracks, so I’d double check <1> first. Best, Erick > On Mar 30, 2020, at 12:20 PM, sujatha arun wrote: > > A facet heavy query which uses docValue fields for faceting returns about > 5k results executes between 10ms to 5 secs and the 5 secs time seems to > coincide with after a hard commit. > > Does that have any relation? Why the fluctuation in execution time? > > Thanks, > Revas
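For illustration, a faceting field defined so that it carries docValues and errors out instead of silently uninverting might look roughly like this in the schema. The field and type names are placeholders, and the uninvertible attribute requires Solr 7.6 or later:

  <!-- placeholder names: docValues holds the facet data; uninvertible="false" surfaces any field that would otherwise fall back to the FieldCache -->
  <dynamicField name="*_facet" type="string" indexed="false" stored="false" docValues="true" uninvertible="false"/>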
Re: DocValue field & commit
Thanks, Eric. 1) We are using dynamic string field for faceting where indexing =false and stored=false . By default docValues are enabled for primitive fields (solr 6.6.), so not explicitly defined in schema. Do you think its wrong assumption? Also I do not this field listed in feild cache, but dont see any dynamic fields listed. 2) Autowarm count is at 32 for both and autowarm time is 25 for queryresult and 17 3)Can you elaborate what you mean here On Mon, Mar 30, 2020 at 1:43 PM Erick Erickson wrote: > Response spikes after commits are almost always something to do > with autowarming or docValues being set to false. So here’s what > I’d look at, in order. > > 1> are the fields used defined with docValues=true? They should be. > With this much variance it sounds like you don’t have that value set. > You’ll have to rebuild your entire index, first deleting all documents… > > You assert that they are all docValues, but the variance is so > high that I wonder whether they _all_ are. They may very well be, but > I’ve been tripped up by things I know are true that aren’t too often ;) > > You can insure this by setting 'uninvertible=“true” ‘ in your field type, > see: https://issues.apache.org/jira/browse/SOLR-12962 if you’re on > 7.6 or later. > > 2>what are your autowarming settings for queryResultCache and/or > filterCache. Start with a relatively small number, say 16 and look at > your autowarm times to insure they aren’t excessive. > > 3> if autowarming doesn’t help, consider specifying a newSearcher > event in solrconfig.xml that exercises the facets. > > NOTE: <2> and <3> will mask any fields that are docValues=false that > slipped through the cracks, so I’d double check <1> first. > > Best, > Erick > > > On Mar 30, 2020, at 12:20 PM, sujatha arun wrote: > > > > A facet heavy query which uses docValue fields for faceting returns > about > > 5k results executes between 10ms to 5 secs and the 5 secs time seems to > > coincide with after a hard commit. > > > > Does that have any relation? Why the fluctuation in execution time? > > > > Thanks, > > Revas > >
Re: DocValue field & commit
OK, sounds like docValues is set. Sure, in solrconfig.xml, there are two sections “firstSearcher” and “newSearcher”. These are queries (or lists of queries) that are fired as part of autowarming when Solr is first started (firstSearcher) or when a commit happens that opens a new searcher (newSearcher). These are hand-crafted static queries. So create one or more newSearcher sections in that block that exercise your faceting and it’ll be fired as part of autowarming. That should smooth out the delay your user’s experience when commits happen. Best, Erick > On Mar 30, 2020, at 4:06 PM, Revas wrote: > > Thanks, Eric. > > 1) We are using dynamic string field for faceting where indexing =false and > stored=false . By default docValues are enabled for primitive fields (solr > 6.6.), so not explicitly defined in schema. Do you think its wrong > assumption? Also I do not this field listed in feild cache, but dont see > any dynamic fields listed. > 2) Autowarm count is at 32 for both and autowarm time is 25 for queryresult > and 17 > 3)Can you elaborate what you mean here > > > > On Mon, Mar 30, 2020 at 1:43 PM Erick Erickson > wrote: > >> Response spikes after commits are almost always something to do >> with autowarming or docValues being set to false. So here’s what >> I’d look at, in order. >> >> 1> are the fields used defined with docValues=true? They should be. >> With this much variance it sounds like you don’t have that value set. >> You’ll have to rebuild your entire index, first deleting all documents… >> >> You assert that they are all docValues, but the variance is so >> high that I wonder whether they _all_ are. They may very well be, but >> I’ve been tripped up by things I know are true that aren’t too often ;) >> >> You can insure this by setting 'uninvertible=“true” ‘ in your field type, >> see: https://issues.apache.org/jira/browse/SOLR-12962 if you’re on >> 7.6 or later. >> >> 2>what are your autowarming settings for queryResultCache and/or >> filterCache. Start with a relatively small number, say 16 and look at >> your autowarm times to insure they aren’t excessive. >> >> 3> if autowarming doesn’t help, consider specifying a newSearcher >> event in solrconfig.xml that exercises the facets. >> >> NOTE: <2> and <3> will mask any fields that are docValues=false that >> slipped through the cracks, so I’d double check <1> first. >> >> Best, >> Erick >> >>> On Mar 30, 2020, at 12:20 PM, sujatha arun wrote: >>> >>> A facet heavy query which uses docValue fields for faceting returns >> about >>> 5k results executes between 10ms to 5 secs and the 5 secs time seems to >>> coincide with after a hard commit. >>> >>> Does that have any relation? Why the fluctuation in execution time? >>> >>> Thanks, >>> Revas >> >>
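To make the newSearcher suggestion concrete, an entry that exercises faceting could look roughly like this in solrconfig.xml; the query and facet field below are placeholders for whatever your real queries facet on:

  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <!-- placeholder warming query: facet on the fields your users actually hit -->
      <lst>
        <str name="q">*:*</str>
        <str name="facet">true</str>
        <str name="facet.field">category_facet</str>
      </lst>
    </arr>
  </listener>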
Re: DocValue field & commit
Correcting some typos ... Thanks, Eric. 1) We are using dynamic string field for faceting where indexing =false and stored=false . By default docValues are enabled for primitive fields (solr 6.6.), so not explicitly defined in schema. Do you think its wrong assumption? Also I do not see this field listed in feild cache, but don't see any dynamic fields listed. 2) Autowarm count is at 32 for both and autowarm time is 25 for query-result cache and 1724 for filter cache 3)Can you elaborate what you mean here. We have hard-commit every 5 mins with opensearcher=false and soft-commit every 2 secs. On Mon, Mar 30, 2020 at 4:06 PM Revas wrote: > Thanks, Eric. > > 1) We are using dynamic string field for faceting where indexing =false > and stored=false . By default docValues are enabled for primitive fields > (solr 6.6.), so not explicitly defined in schema. Do you think its wrong > assumption? Also I do not this field listed in feild cache, but dont see > any dynamic fields listed. > 2) Autowarm count is at 32 for both and autowarm time is 25 for > queryresult and 17 > 3)Can you elaborate what you mean here > > > > On Mon, Mar 30, 2020 at 1:43 PM Erick Erickson > wrote: > >> Response spikes after commits are almost always something to do >> with autowarming or docValues being set to false. So here’s what >> I’d look at, in order. >> >> 1> are the fields used defined with docValues=true? They should be. >> With this much variance it sounds like you don’t have that value set. >> You’ll have to rebuild your entire index, first deleting all documents… >> >> You assert that they are all docValues, but the variance is so >> high that I wonder whether they _all_ are. They may very well be, but >> I’ve been tripped up by things I know are true that aren’t too often ;) >> >> You can insure this by setting 'uninvertible=“true” ‘ in your field type, >> see: https://issues.apache.org/jira/browse/SOLR-12962 if you’re on >> 7.6 or later. >> >> 2>what are your autowarming settings for queryResultCache and/or >> filterCache. Start with a relatively small number, say 16 and look at >> your autowarm times to insure they aren’t excessive. >> >> 3> if autowarming doesn’t help, consider specifying a newSearcher >> event in solrconfig.xml that exercises the facets. >> >> NOTE: <2> and <3> will mask any fields that are docValues=false that >> slipped through the cracks, so I’d double check <1> first. >> >> Best, >> Erick >> >> > On Mar 30, 2020, at 12:20 PM, sujatha arun wrote: >> > >> > A facet heavy query which uses docValue fields for faceting returns >> about >> > 5k results executes between 10ms to 5 secs and the 5 secs time seems to >> > coincide with after a hard commit. >> > >> > Does that have any relation? Why the fluctuation in execution time? >> > >> > Thanks, >> > Revas >> >>
Re: expand=true throws error
Hi Munendra,
Let's look at 3 scenarios:
1. Query without collapse
2. Query with collapse
3. Query with collapse and expand
I made a mini book database for this.

Case 1:
{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"author:\"William Shakespeare\"",
      "_":"1585603593269"}},
  "response":{"numFound":4,"start":0,"docs":[
      {
        "id":"1",
        "author":"William Shakespeare",
        "title":"The Taming of the Shrew",
        "format":"ebook",
        "_version_":1662625767773700096},
      {
        "id":"2",
        "author":"William Shakespeare",
        "title":"The Taming of the Shrew",
        "format":"paper",
        "_version_":1662625790857052160},
      {
        "id":"3",
        "author":"William Shakespeare",
        "title":"The Taming of the Shrew",
        "format":"audiobook",
        "_version_":1662625809553162240},
      {
        "id":"4",
        "author":"William Shakespeare",
        "title":"Much Ado about Nothing",
        "format":"paper",
        "_version_":1662625868323749888}]
  }}
As you can see, there are 3 different formats of the same book.

Case 2:
{
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "q":"author:\"William Shakespeare\"",
      "fq":"{!collapse field=title}",
      "_":"1585603593269"}},
  "response":{"numFound":2,"start":0,"docs":[
      {
        "id":"1",
        "author":"William Shakespeare",
        "title":"The Taming of the Shrew",
        "format":"ebook",
        "_version_":1662625767773700096},
      {
        "id":"4",
        "author":"William Shakespeare",
        "title":"Much Ado about Nothing",
        "format":"paper",
        "_version_":1662625868323749888}]
  }}
The collapse post filter worked as I expected.

Case 3, let's extend it with expand=true:
{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"author:\"William Shakespeare\"",
      "fq":"{!collapse field=title}&expand=true",
      "_":"1585603593269"}},
  "response":{"numFound":2,"start":0,"docs":[
      {
        "id":"1",
        "author":"William Shakespeare",
        "title":"The Taming of the Shrew",
        "format":"ebook",
        "_version_":1662625767773700096},
      {
        "id":"4",
        "author":"William Shakespeare",
        "title":"Much Ado about Nothing",
        "format":"paper",
        "_version_":1662625868323749888}]
  }}
As you can see, nothing has changed. There is no additional section in the response.

Cheers,
Roland

Munendra S N wrote (on Mon, Mar 30, 2020, 17:46): > Please share the complete request. Also, does number of results change with > & without collapse. Usually title would be unique every document. If that > is the case then, there won't be anything to expand right? > > On Mon, Mar 30, 2020, 8:22 PM Szűcs Roland > wrote: > > > Hi Munendra, > > I do not get error . The strange thing is that I get exactly the same > > response with fq={!collapse field=title} versus fq={!collapse > > field=title}&expand=true. > > Collapse works properly as a standalone fq but expand has no impact. How > > can I have access to the "hidden" documents then? > > > > Roland > > > > Munendra S N ezt írta (időpont: 2020. márc. > 30., > > H, 16:47): > > > Hey, > > > Could you please share the stacktrace or error message you received? > > > > > > On Mon, Mar 30, 2020, 7:58 PM Szűcs Roland < > szucs.rol...@bookandwalk.hu> > > > wrote: > > > > > > > Hi All, > > > > > > > > I manage to use edismax queryparser in solr 8.4.1 with collapse > without > > > any > > > > problem. I tested it with the SOLR admin GUI. So fq={!collapse > > > field=title} > > > > worked fine. > > > > > > > > As soon as I use the example from the documentation and use: > > > fq={!collapse > > > > field=title}&expand=true, I did not get back any additional output > with > > > > section expanded. > > > > > > > > Any idea? > > > > > > > > Thanks in advance, > > > > Roland > > > > > > > > > >
Configuring shardhandler factory for select handler
Hi, I am trying to update the connection and socket timeout values for my `select` handler. After updating the configs I do not see the values being set; they default to 60 sec. How can I update these values? Also, it looks like the docs have the socketTimeout & connectionTimeout values swapped. https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.7.2/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandlerFactory.java#L193 https://lucene.apache.org/solr/guide/7_0/distributed-requests.html#configuring-the-shardhandlerfactory Thanks Jay
Re: DocValue field & commit
Thanks, Erick, The process time execution based on debugQuery between the query and facets is as follows query 10ms facets 4900ms since max time is spent on facet processing (docValues enabled), query and filter cache do no apply to this, correct? - Autowarm count is at 32 for both and autowarm time is 25 for query-result cache and 1724 for filter cache - We have hard-commit every 5 mins with opensearcher=false and soft-commit every 2 secs. - facet are a mix of pivot facets,range facets and facet queries - when the same facets criteria bring a smaller result set, response is much faster On Mon, Mar 30, 2020 at 4:47 PM Erick Erickson wrote: > OK, sounds like docValues is set. > > Sure, in solrconfig.xml, there are two sections “firstSearcher” and > “newSearcher”. > These are queries (or lists of queries) that are fired as part of > autowarming > when Solr is first started (firstSearcher) or when a commit happens that > opens > a new searcher (newSearcher). These are hand-crafted static queries. So > create one or more newSearcher sections in that block that exercise your > faceting and it’ll be fired as part of autowarming. That should smooth out > the delay your user’s experience when commits happen. > > Best, > Erick > > > On Mar 30, 2020, at 4:06 PM, Revas wrote: > > > > Thanks, Eric. > > > > 1) We are using dynamic string field for faceting where indexing =false > and > > stored=false . By default docValues are enabled for primitive fields > (solr > > 6.6.), so not explicitly defined in schema. Do you think its wrong > > assumption? Also I do not this field listed in feild cache, but dont see > > any dynamic fields listed. > > 2) Autowarm count is at 32 for both and autowarm time is 25 for > queryresult > > and 17 > > 3)Can you elaborate what you mean here > > > > > > > > On Mon, Mar 30, 2020 at 1:43 PM Erick Erickson > > wrote: > > > >> Response spikes after commits are almost always something to do > >> with autowarming or docValues being set to false. So here’s what > >> I’d look at, in order. > >> > >> 1> are the fields used defined with docValues=true? They should be. > >> With this much variance it sounds like you don’t have that value set. > >> You’ll have to rebuild your entire index, first deleting all documents… > >> > >> You assert that they are all docValues, but the variance is so > >> high that I wonder whether they _all_ are. They may very well be, but > >> I’ve been tripped up by things I know are true that aren’t too often ;) > >> > >> You can insure this by setting 'uninvertible=“true” ‘ in your field > type, > >> see: https://issues.apache.org/jira/browse/SOLR-12962 if you’re on > >> 7.6 or later. > >> > >> 2>what are your autowarming settings for queryResultCache and/or > >> filterCache. Start with a relatively small number, say 16 and look at > >> your autowarm times to insure they aren’t excessive. > >> > >> 3> if autowarming doesn’t help, consider specifying a newSearcher > >> event in solrconfig.xml that exercises the facets. > >> > >> NOTE: <2> and <3> will mask any fields that are docValues=false that > >> slipped through the cracks, so I’d double check <1> first. > >> > >> Best, > >> Erick > >> > >>> On Mar 30, 2020, at 12:20 PM, sujatha arun > wrote: > >>> > >>> A facet heavy query which uses docValue fields for faceting returns > >> about > >>> 5k results executes between 10ms to 5 secs and the 5 secs time seems > to > >>> coincide with after a hard commit. > >>> > >>> Does that have any relation? Why the fluctuation in execution time? 
> >>> > >>> Thanks, > >>> Revas > >> > >> > >
Re: Configuring shardhandler factory for select handler
Figured it out; I referred to the docs here: https://github.com/apache/lucene-solr/commit/0ce635ec01e9d3ce04a5fbf5d472ea9d5d28bfee?short_path=421a323#diff-421a323f596319f0485e0b03070d94e6 Thanks Jay On Mon, Mar 30, 2020 at 3:38 PM Jay Potharaju wrote: > Hi, > I am trying to update the connection & sockettime out value for my > `select` handler. After updating the configs i do not see that value being > set and it defaults to 60 sec. > How can i update these values? > > Also looks like the docs have sockeTimeout & connectionTimeout values > swapped. > > > https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.7.2/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandlerFactory.java#L193 > > > > https://lucene.apache.org/solr/guide/7_0/distributed-requests.html#configuring-the-shardhandlerfactory > > Thanks > Jay > >
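For anyone landing on this thread later, the per-handler configuration described in those docs looks roughly like the sketch below in solrconfig.xml. The timeout values (in milliseconds) are illustrative only; check the ref guide page linked above for the exact element name and parameter list on your version.

  <requestHandler name="/select" class="solr.SearchHandler">
    <!-- timeouts for the inter-shard requests this handler makes -->
    <shardHandlerFactory class="HttpShardHandlerFactory">
      <int name="socketTimeout">10000</int>
      <int name="connTimeout">2000</int>
    </shardHandlerFactory>
  </requestHandler>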
Re: DocValue field & commit
Oh dear. Your autowarming is almost, but not quite totally, useless given your 2 second soft commit interval. See: https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ So autowarming is probably not a cure, when you originally said “commit” I was assuming that was one that opened a new searcher, if that’s not true autowarming isn’t a cure. Do you _really_ require 2 second soft commit intervals? I would not be surprised if you also see “too many on deck searcher” warnings in your logs at times. This is one of my hot buttons, having very short soft commit intervals is something people do without understanding the tradeoffs, one of which is that your caches are probably getting a poor utilization rate. Often the recommendation for short intervals like this is to not use the caches at all. The newSearcher is a full query. Go ahead and add facets. But again, this probably isn’t going to help much. But really, revisit your autocommit settings. Taking 1.7 seconds to autowarm means that you have roughly this. - commit - 1.7 seconds later, the new searcher is open for business. - 0.3 seconds after that a new searcher is open, which takes another 1.7 seconds to autowarm. I doubt your hard commit is really the culprit here _unless_ you’re running on an under-powered machine. The hard commit will trigger segment merging, which is CPU and I/O intensive. If you’re using a machine that can’t afford the cycles to be taken up by merging, that could account for what you see, but new searchers are being opened every 2 seconds (assuming a relatively constant indexing load). Best, Erick > On Mar 30, 2020, at 6:42 PM, Revas wrote: > > Thanks, Erick, > > The process time execution based on debugQuery between the query and facets > is as follows > > query 10ms > facets 4900ms > > since max time is spent on facet processing (docValues enabled), query and > filter cache do no apply to this, correct? > > > - Autowarm count is at 32 for both and autowarm time is 25 for > query-result cache and 1724 for filter cache > - We have hard-commit every 5 mins with opensearcher=false and > soft-commit every 2 secs. > - facet are a mix of pivot facets,range facets and facet queries > - when the same facets criteria bring a smaller result set, response is > much faster > > > > > On Mon, Mar 30, 2020 at 4:47 PM Erick Erickson > wrote: > >> OK, sounds like docValues is set. >> >> Sure, in solrconfig.xml, there are two sections “firstSearcher” and >> “newSearcher”. >> These are queries (or lists of queries) that are fired as part of >> autowarming >> when Solr is first started (firstSearcher) or when a commit happens that >> opens >> a new searcher (newSearcher). These are hand-crafted static queries. So >> create one or more newSearcher sections in that block that exercise your >> faceting and it’ll be fired as part of autowarming. That should smooth out >> the delay your user’s experience when commits happen. >> >> Best, >> Erick >> >>> On Mar 30, 2020, at 4:06 PM, Revas wrote: >>> >>> Thanks, Eric. >>> >>> 1) We are using dynamic string field for faceting where indexing =false >> and >>> stored=false . By default docValues are enabled for primitive fields >> (solr >>> 6.6.), so not explicitly defined in schema. Do you think its wrong >>> assumption? Also I do not this field listed in feild cache, but dont see >>> any dynamic fields listed. 
>>> 2) Autowarm count is at 32 for both and autowarm time is 25 for >> queryresult >>> and 17 >>> 3)Can you elaborate what you mean here >>> >>> >>> >>> On Mon, Mar 30, 2020 at 1:43 PM Erick Erickson >>> wrote: >>> Response spikes after commits are almost always something to do with autowarming or docValues being set to false. So here’s what I’d look at, in order. 1> are the fields used defined with docValues=true? They should be. With this much variance it sounds like you don’t have that value set. You’ll have to rebuild your entire index, first deleting all documents… You assert that they are all docValues, but the variance is so high that I wonder whether they _all_ are. They may very well be, but I’ve been tripped up by things I know are true that aren’t too often ;) You can insure this by setting 'uninvertible=“true” ‘ in your field >> type, see: https://issues.apache.org/jira/browse/SOLR-12962 if you’re on 7.6 or later. 2>what are your autowarming settings for queryResultCache and/or filterCache. Start with a relatively small number, say 16 and look at your autowarm times to insure they aren’t excessive. 3> if autowarming doesn’t help, consider specifying a newSearcher event in solrconfig.xml that exercises the facets. NOTE: <2> and <3> will mask any fields that are docValues=false that slipped through the cracks, so I’d double check <1> first. Best, Erick >>
Re: expand=true throws error
> Case 3 let;s extend it with expand=true: > { "responseHeader":{ "status":0, "QTime":1, "params":{ > "q":"author:\"William > Shakespeare\"", "fq":"{!collapse field=title}&expand=true", "_": > "1585603593269"}}, > I think it is because, expand=true parameter is not passed properly. As you can see from the params in the responseHeader section, q , fq are separate keys but expand=true is appended to fq value. If passed correctly, it should look something like this > { "responseHeader":{ "status":0, "QTime":1, "params":{ > "q":"author:\"William > Shakespeare\"", "fq":"{!collapse field=title}", "expand": "true", "_": > "1585603593269"}}, > Regards, Munendra S N On Tue, Mar 31, 2020 at 3:07 AM Szűcs Roland wrote: > Hi Munendra, > Let's see the 3 scenario: > 1. Query without collapse > 2. Query with collapse > 3. Query with collapse and expand > I made a mini book database for this: > Case 1: > { "responseHeader":{ "status":0, "QTime":0, "params":{ > "q":"author:\"William > Shakespeare\"", "_":"1585603593269"}}, "response":{"numFound":4,"start":0," > docs":[ { "id":"1", "author":"William Shakespeare", "title":"The Taming of > the Shrew", "format":"ebook", "_version_":1662625767773700096}, { "id":"2", > "author":"William Shakespeare", "title":"The Taming of the Shrew", > "format": > "paper", "_version_":1662625790857052160}, { "id":"3", "author":"William > Shakespeare", "title":"The Taming of the Shrew", "format":"audiobook", " > _version_":1662625809553162240}, { "id":"4", "author":"William > Shakespeare", > "title":"Much Ado about Nothing", "format":"paper", "_version_": > 1662625868323749888}] }} > As you can see there are 3 different format from the same book. > > Case 2: > { "responseHeader":{ "status":0, "QTime":2, "params":{ > "q":"author:\"William > Shakespeare\"", "fq":"{!collapse field=title}", "_":"1585603593269"}}, " > response":{"numFound":2,"start":0,"docs":[ { "id":"1", "author":"William > Shakespeare", "title":"The Taming of the Shrew", "format":"ebook", " > _version_":1662625767773700096}, { "id":"4", "author":"William > Shakespeare", > "title":"Much Ado about Nothing", "format":"paper", "_version_": > 1662625868323749888}] }} > Collapse post filter worked as I expected. > Case 3 let;s extend it with expand=true: > { "responseHeader":{ "status":0, "QTime":1, "params":{ > "q":"author:\"William > Shakespeare\"", "fq":"{!collapse field=title}&expand=true", "_": > "1585603593269"}}, "response":{"numFound":2,"start":0,"docs":[ { "id":"1", > " > author":"William Shakespeare", "title":"The Taming of the Shrew", "format": > "ebook", "_version_":1662625767773700096}, { "id":"4", "author":"William > Shakespeare", "title":"Much Ado about Nothing", "format":"paper", > "_version_ > ":1662625868323749888}] }} > > As you can see nothing as changed. There is no additional section of the > response. > > Cheers, > Roland > > Munendra S N ezt írta (időpont: 2020. márc. 30., > H, 17:46): > > > Please share the complete request. Also, does number of results change > with > > & without collapse. Usually title would be unique every document. If that > > is the case then, there won't be anything to expand right? > > > > On Mon, Mar 30, 2020, 8:22 PM Szűcs Roland > > wrote: > > > > > Hi Munendra, > > > I do not get error . The strange thing is that I get exactly the same > > > response with fq={!collapse field=title} versus fq={!collapse > > > field=title}&expand=true. > > > Collapse works properly as a standalone fq but expand has no impact. > How > > > can I have access to the "hidden" documents then? 
> > > > > > Roland > > > > > > Munendra S N ezt írta (időpont: 2020. márc. > > 30., > > > H, 16:47): > > > > > > > Hey, > > > > Could you please share the stacktrace or error message you received? > > > > > > > > On Mon, Mar 30, 2020, 7:58 PM Szűcs Roland < > > szucs.rol...@bookandwalk.hu> > > > > wrote: > > > > > > > > > Hi All, > > > > > > > > > > I manage to use edismax queryparser in solr 8.4.1 with collapse > > without > > > > any > > > > > problem. I tested it with the SOLR admin GUI. So fq={!collapse > > > > field=title} > > > > > worked fine. > > > > > > > > > > As soon as I use the example from the documentation and use: > > > > fq={!collapse > > > > > field=title}&expand=true, I did not get back any additional output > > with > > > > > section expanded. > > > > > > > > > > Any idea? > > > > > > > > > > Thanks in advance, > > > > > Roland > > > > > > > > > > > > > > >
Re: Solrcloud 7.6 OOM due to unable to create native threads
Thanks Eric. I don't seeing anywhere that CDCR is not recommended for production use. Took the thread dump. Seeing about 140 CDCR threads cdcr-replicator-219-thread-8" #787 prio=5 os_prio=0 tid=0x7f7c34009000 nid=0x50a waiting on condition [0x7f7ec871b000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0001da724ca0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) cdcr-update-log-synchronizer-157-thread-1" #240 prio=5 os_prio=0 tid=0x7f8782543800 nid=0x2e5 waiting on condition [0x7f82ad99c000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0001d7f9e8e8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Thanks, Raji On Sun, Mar 29, 2020 at 5:18 AM Erick Erickson wrote: > What that error usually means is that there are a zillion threads running. > > Try taking a thread dump. It’s _probable_ that it’s CDCR, but > take a look at the thread dump to see if you have lots of > threads that are running. Any by “lots” here, I mean 100s of threads > that reference the same component, in this case that have cdcr in > the stack trace. > > CDCR is not getting active work at this point, you might want to > consider another replication strategy if you’re not willing to fix > the code. > > Best, > Erick > > > On Mar 29, 2020, at 4:17 AM, Raji N wrote: > > > > Hi All, > > > > We running solrcloud 7.6 (with the patch # > > > https://issues.apache.org/jira/secure/attachment/12969150)/SOLR-11724.patchon > > production on 7 hosts in containers. The container memory is 48GB , heap > > is 24GB. > > ulimit -v > > > > unlimited > > > > ulimit -m > > > > unlimited > > We don't have any custom code in solr. We have set up bidirectional CDCR > > between primary and secondary Datacenter. Our secondary DC is very > unstable > > and many times many instances are down. > > > > We get below exception quite often. Is this because the CDCR connection > is > > broken. 
> > > > WARN (cdcr-update-log-synchronizer-80-thread-1) [ ] > > o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception > > > > java.lang.OutOfMemoryError: unable to create new native thread > > > > at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211] > > > > at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211] > > > > at > > > org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) > > ~[httpclient-4.5.3.jar:4.5.3] > > > > at > > > org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) > > ~[httpclient-4.5.3.jar:4.5.3] > > > > at > > > org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) > > ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f > > - nknize - 2018-12-07 14:47:53] > > > > at > > > org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) > > ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f > > - nknize - 2018-12-07 14:47:53] > > > > at > > > org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) > > ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f > > - nknize - 2018-12-07 14:47:53] > > > > at > > > org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) > > ~[solr-solrj-7.6.0.jar:7.6.0 719
Re: Solrcloud 7.6 OOM due to unable to create native threads
Hi Eric, What are you recommendations for SolrCloud DR strategy. Thanks, Raji On Sun, Mar 29, 2020 at 6:25 PM Erick Erickson wrote: > I don’t recommend CDCR at this point, I think there better approaches. > > The root problem is that CDCR uses tlog files as a queueing mechanism. > If the connection between the DCs is broken for any reason, the tlogs grow > without limit. This could probably be fixed, but a better alternative is to > use something designed to insure messages (updates) are delivered to > separate DCs rathe than try to have CDCR re-invent that wheel. > > Best, > Erick > > > On Mar 29, 2020, at 6:47 PM, S G wrote: > > > > Is CDCR even recommended to be used in production? > > Or it was abandoned before it could become production ready ? > > > > Thanks > > SG > > > > > > On Sun, Mar 29, 2020 at 5:18 AM Erick Erickson > > wrote: > > > >> What that error usually means is that there are a zillion threads > running. > >> > >> Try taking a thread dump. It’s _probable_ that it’s CDCR, but > >> take a look at the thread dump to see if you have lots of > >> threads that are running. Any by “lots” here, I mean 100s of threads > >> that reference the same component, in this case that have cdcr in > >> the stack trace. > >> > >> CDCR is not getting active work at this point, you might want to > >> consider another replication strategy if you’re not willing to fix > >> the code. > >> > >> Best, > >> Erick > >> > >>> On Mar 29, 2020, at 4:17 AM, Raji N wrote: > >>> > >>> Hi All, > >>> > >>> We running solrcloud 7.6 (with the patch # > >>> > >> > https://issues.apache.org/jira/secure/attachment/12969150)/SOLR-11724.patchon > >>> production on 7 hosts in containers. The container memory is 48GB , > heap > >>> is 24GB. > >>> ulimit -v > >>> > >>> unlimited > >>> > >>> ulimit -m > >>> > >>> unlimited > >>> We don't have any custom code in solr. We have set up bidirectional > CDCR > >>> between primary and secondary Datacenter. Our secondary DC is very > >> unstable > >>> and many times many instances are down. > >>> > >>> We get below exception quite often. Is this because the CDCR connection > >> is > >>> broken. 
> >>> > >>> WARN (cdcr-update-log-synchronizer-80-thread-1) [ ] > >>> o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception > >>> > >>> java.lang.OutOfMemoryError: unable to create new native thread > >>> > >>> at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211] > >>> > >>> at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211] > >>> > >>> at > >>> > >> > org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) > >>> ~[httpclient-4.5.3.jar:4.5.3] > >>> > >>> at > >>> > >> > org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) > >>> ~[httpclient-4.5.3.jar:4.5.3] > >>> > >>> at > >>> > >> > org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) > >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f > >>> - nknize - 2018-12-07 14:47:53] > >>> > >>> at > >>> > >> > org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) > >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f > >>> - nknize - 2018-12-07 14:47:53] > >>> > >>> at > >>> > >> > org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) > >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f > >>> - nknize - 2018-12-07 14:47:53] > >>> > >>> at > >>> > >> > org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) > >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f > >>> - nknize - 2018-12-07 14:47:53] > >>> > >>> at > >>> > >> > org.apache.solr.client.solrj.impl.HttpSolrClient.(HttpSolrClient.java:200) > >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f > >>> - nknize - 2018-12-07 14:47:53] > >>> > >>> at > >>> > >> > org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:957) > >>> ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f > >>> - nknize - 2018-12-07 14:47:53] > >>> > >>> at > >>> > >> > org.apache.solr.handler.CdcrUpdateLogSynchronizer$UpdateLogSynchronisation.run(CdcrUpdateLogSynchronizer.java:139) > >>> [solr-core-7.6.0.jar:7.6.0-SNAPSHOT > >>> 34d82ed033cccd8120431b73e93554b85b24a278 - i843100 - 2019-09-30 > >>> 14:02:46] > >>> > >>> at > >>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > >>> [?:1.8.0_211] > >>> > >>> at > >>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > >>> [?:1.8.0_211] > >>> > >>> at > >>> > >> > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > >>> [?:1.8.0_