Getting rid of

 group.truncate=true
 group.facet=true
 group=true
 group.field=edition
 group.limit=30
 group.ngroups=true
 group.format=grouped

makes Solr behave again under normal load, but of course the
results are a bit messed up, with duplicates of a sort (the point of
grouping is collapsing product variants into one superproduct).
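
To narrow down which of the grouping parameters is responsible, the stress test could be run against variants of the same query with individual parameters removed. The following is only a sketch: `drop_params` is a hypothetical helper, and `base` is a shortened stand-in for the real query string, not the query used on the site.

```python
from urllib.parse import parse_qsl, urlencode

def drop_params(query_string, names):
    """Return query_string without the parameters whose names are in `names`."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    return urlencode([(k, v) for k, v in pairs if k not in names])

# The grouping parameters listed above, as one query-string fragment.
GROUPING = (
    "group=true&group.field=edition&group.limit=30&group.ngroups=true"
    "&group.format=grouped&group.truncate=true&group.facet=true"
)
# Shortened, illustrative base query (assumption, not the production query).
base = "q=&rows=12&facet=true&" + GROUPING

# Variant 1: grouping stays on, only group.facet is removed.
no_group_facet = drop_params(base, {"group.facet"})
# Variant 2: every grouping parameter is removed.
no_grouping = drop_params(base, {name for name, _ in parse_qsl(GROUPING)})

print(no_group_facet)
print(no_grouping)
```

Running the same load against `base`, `no_group_facet` and `no_grouping` would show whether group.facet alone accounts for the slowdown or whether the other grouping parameters contribute as well.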

Cheers
 Marek 
> I'm attaching the gc log, looks ok at the beginning and then within 10
> minutes starts stopping everything for 20 sec or so.
> (sorry for pasting in, wouldn't get through as an attachment)
>
> Cheers
>  Marek
>
> Hi,
> thanks for the quick response.
> We have meanwhile tried to remove the group.facet=true from the set of
> parameters and couldn't reproduce the problem using the same stress
> test, so I think 80% chance this is the root cause.
> We have tried solr 6.4.1, same problem occurs.
> There is only a small number of documents (100 - 200K); the index on
> disk is only 544 MB. Very low traffic - definitely less than 10 qps.
> No OOM errors.
> GC log shows
> 2017-03-04T18:40:48.056+0000: 407.215: Total time for which application
> threads were stopped: 19.2522394 seconds, Stopping threads took:
> 0.0003600 seconds
> 2017-03-04T18:40:49.135+0000: 408.294: Total time for which application
> threads were stopped: 0.0256060 seconds, Stopping threads took:
> 0.0240290 seconds
> 2017-03-04T18:40:50.146+0000: 409.305: Total time for which application
> threads were stopped: 0.0106780 seconds, Stopping threads took:
> 0.0090890 seconds
>
> 19 seconds is worrying.
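
Long stops like the one above can be pulled out of a GC log automatically. This is a minimal sketch, assuming each entry sits on a single line in the log file itself (the lines above were wrapped by the mail client); `long_pauses` and the one-second threshold are illustrative choices, not part of Solr or the JVM.

```python
import re

# Matches "<timestamp>: <uptime>: Total time for which application threads
# were stopped: <seconds> seconds" at the start of a line.
PAUSE = re.compile(
    r"^(\S+): [\d.]+: Total time for which application threads were stopped: "
    r"([\d.]+) seconds",
    re.MULTILINE,
)

def long_pauses(gc_log_text, threshold_s=1.0):
    """Return (timestamp, seconds) for every stop longer than threshold_s."""
    return [
        (timestamp, float(seconds))
        for timestamp, seconds in PAUSE.findall(gc_log_text)
        if float(seconds) > threshold_s
    ]

# The three entries quoted above, unwrapped to one line each.
sample = (
    "2017-03-04T18:40:48.056+0000: 407.215: Total time for which application "
    "threads were stopped: 19.2522394 seconds, Stopping threads took: 0.0003600 seconds\n"
    "2017-03-04T18:40:49.135+0000: 408.294: Total time for which application "
    "threads were stopped: 0.0256060 seconds, Stopping threads took: 0.0240290 seconds\n"
    "2017-03-04T18:40:50.146+0000: 409.305: Total time for which application "
    "threads were stopped: 0.0106780 seconds, Stopping threads took: 0.0090890 seconds\n"
)

print(long_pauses(sample))
```

On the sample, only the 19-second stop is reported; the two stops of a few hundredths of a second fall below the threshold.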
>
> I will try some traces when I'm not under stress test myself.
>
> Thanks
>  Marek
>  
>
>
>
>>> The "Unable to write response, client closed connection or we are
>>> shutting down" bits mean you're timing out. Or maybe something much
>>> more serious. You can up the timeouts, but that's not particularly
>>> useful since the response is so long anyway.
>>>
>>> Before jumping to conclusions, I'd _really_ recommend you figure out
>>> the root cause. First set up jmeter or the like so you can create a
>>> stress test and reproduce this at will on a test machine.
>>>
>>> Things I'd check:
>>>
>>> - At what point do things get slow? 10 QPS? 100 QPS? 1,000 QPS? Let's get a
>>>   benchmark here for a reality check. If you're throwing 1,000 QPS at a
>>>   single Solr instance that's simply unrealistic. 100 QPS/node is on the
>>>   high side of what I'd expect.
>>> - How many docs do you have on a node?
>>> - Look at your Solr logs for any anomalies, particularly OOM errors.
>>> - Turn on GC logs and see if you're spending an inordinate amount of time in
>>>   GC. Note you can get a clue if this is the issue by just increasing the
>>>   JVM heap as a quick test. Not conclusive, but if you give the app another
>>>   4G and your timings change radically, problem identified.
>>> - That JIRA you pointed to is unlikely to be the real issue since your
>>>   performance is OK to start. It's still possible, but..
>>> - Attach a profiler to see where the time is being spent. Must be on a test
>>>   machine since profilers are generally intrusive.
>>> - Grab a couple of stack traces and see if that sheds a clue.
>>>
>>> I really have to emphasize, though, that until you do a Root Cause
>>> Analysis, you're just guessing. Going to 6.4 and using JSON facets is a
>>> shot in the dark.
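
The "at what point do things get slow" question can be approximated with a stepped load test. The sketch below is not a substitute for jmeter: it sends requests one at a time, so the rate it achieves drops below the target once responses get slow. All names (`timed_get`, `run_step`, `first_slow_step`) and the URL in the usage note are hypothetical.

```python
import statistics
import time
import urllib.request

def timed_get(url, timeout_s=30.0):
    """Send one GET request and return its wall-clock latency in milliseconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        response.read()
    return (time.perf_counter() - start) * 1000.0

def run_step(send, qps, seconds, sleep=time.sleep):
    """Call send() qps * seconds times, pausing 1/qps between calls.

    Returns the list of latencies (in ms) that send() reported.
    """
    latencies = []
    for _ in range(int(qps * seconds)):
        latencies.append(send())
        sleep(1.0 / qps)
    return latencies

def first_slow_step(send, qps_steps, seconds, limit_ms, sleep=time.sleep):
    """Return the first QPS level whose median latency exceeds limit_ms, else None."""
    for qps in qps_steps:
        latencies = run_step(send, qps, seconds, sleep)
        if statistics.median(latencies) > limit_ms:
            return qps
    return None
```

A run against a test machine might look like `first_slow_step(lambda: timed_get("http://localhost:8983/solr/kcore/select?q=..."), [1, 5, 10, 50], seconds=60, limit_ms=1000)`, where the host, port and query are placeholders to be replaced with the real test setup.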
>>>
>>> Best,
>>> Erick
>>>
>>>
>>>
>>> On Sat, Mar 4, 2017 at 8:45 AM, Marek Tichy <ma...@gn.apc.org> wrote:
>>>> Hi,
>>>>
>>>> I'm in a bit of a crisis here. Trying to deploy a new search on an
>>>> ecommerce website which has been tested (but not stress tested). The
>>>> core has been running for years without any performance problems but we
>>>> have now changed two things:
>>>>
>>>> 1) started using group.facet=true in a rather complicated query - see below
>>>>
>>>> 2) added a new core with suggester component
>>>>
>>>> Solr version was 5.2, upgraded to 5.5.4 to try, no improvement.
>>>>
>>>> What happens under real load is that query response times climb
>>>> above 10000 ms and most requests end up like this:
>>>> org.apache.solr.servlet.HttpSolrCall; Unable to write response, client
>>>> closed connection or we are shutting down
>>>>
>>>> Could it be this issue: https://issues.apache.org/jira/browse/SOLR-4763 ?
>>>> And if so, would upgrading to 6.4 or changing the app to start
>>>> using JSON.facet help?
>>>>
>>>> Any help would be greatly appreciated.
>>>>
>>>> Thanks
>>>>
>>>> Marek
>>>>
>>>>
>>>> INFO  - 2017-03-04 16:04:42.619; [   x:kcore]
>>>> org.apache.solr.core.SolrCore; [kcore]  webapp=/solr path=/select
>>>> params={f.ebook_formats.facet.mincount=1&f.languageid.facet.limit=10&f.ebook_formats.facet.limit=10&fq=((type:knihy)+OR+(type:defekty))&fq=authorid:(27544)&f.thematicgroupid.facet.mincount=1&group.ngroups=true&group.ngroups=true&f.type.facet.limit=10&group.facet=true&f.articleparts.facet.mincount=1&f.articleparts.facet.limit=10&group.field=edition&group=true&facet.field=categoryid&facet.field={!ex%3Dat}articletypeid_grouped&facet.field={!ex%3Dat}type&facet.field={!ex%3Dsw}showwindow&facet.field={!ex%3Dtema}thematicgroupid&facet.field={!ex%3Dformat}articleparts&facet.field={!ex%3Dformat}ebook_formats&facet.field={!ex%3Dlang}languageid&f.categoryid.facet.mincount=1&group.limit=30&start=0&f.type.facet.mincount=1&f.thematicgroupid.facet.limit=10&sort=score+desc&rows=12&version=2.2&f.languageid.facet.mincount=1&q=&group.truncate=false&group.format=grouped&f.showwindow.facet.mincount=1&f.articletypeid_grouped.facet.mincount=1&f.categoryid.facet.limit=100&f.showwindow.facet.limit=10&f.articletypeid_grouped.facet.limit=10&facet=true}
>>>> hits=1 status=0 QTime=19214
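
Request log entries like the one above can be parsed to find slow queries and see which parameters they carried. This is a rough sketch, assuming the entry format shown here (`params={...} hits=N status=N QTime=N`, with QTime in milliseconds); `parse_request` is a hypothetical helper and `entry` is a shortened version of the logged request.

```python
import re
from urllib.parse import parse_qs

# The params block may itself contain braces (local params such as {!ex%3Dat}),
# so the pattern ends it at the "}" that is followed by " hits=".
REQUEST = re.compile(
    r"params=\{(.*?)\}\s+hits=(\d+)\s+status=(\d+)\s+QTime=(\d+)",
    re.DOTALL,
)

def parse_request(entry):
    """Return params, hits, status and QTime of a request log entry, or None."""
    match = REQUEST.search(entry)
    if match is None:
        return None
    return {
        "params": parse_qs(match.group(1)),  # URL-decoded; values are lists
        "hits": int(match.group(2)),
        "status": int(match.group(3)),
        "qtime_ms": int(match.group(4)),
    }

# Shortened version of the entry above (most facet parameters omitted).
entry = (
    "org.apache.solr.core.SolrCore; [kcore]  webapp=/solr path=/select "
    "params={fq=authorid:(27544)&group.facet=true&group.field=edition"
    "&facet.field={!ex%3Dat}type&group=true&facet=true} "
    "hits=1 status=0 QTime=19214"
)

request = parse_request(entry)
print(request["qtime_ms"], request["params"]["group.facet"])
```

Filtering a whole log on `qtime_ms` and comparing the parameters of slow and fast requests is one way to check whether the slow ones all carry group.facet=true.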
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>

