Re: EXT: Re: Solr Query Performance benchmarking

2017-05-05 Thread Suresh Pendap
Thanks everyone for taking the time to respond to my email. I think you are correct in that the query results might be coming from main memory as I only had around 7k queries. However it is still not clear to me, given that everything was being served from main memory, why it is that I am not able to push

Re: Solr Query Performance benchmarking

2017-04-28 Thread Shawn Heisey
On 4/28/2017 12:43 PM, Toke Eskildsen wrote: > Shawn Heisey wrote: >> Adding more shards as Toke suggested *might* help,[...] > I seem to have phrased my suggestion poorly. What I meant to suggest > was a switch to a single shard (with 4 replicas) setup, instead of the > current 2 shards (with 2

RE: Solr Query Performance benchmarking

2017-04-28 Thread Davis, Daniel (NIH/NLM) [C]
Beautiful, thank you. -Original Message- From: Walter Underwood [mailto:wun...@wunderwood.org] Sent: Friday, April 28, 2017 3:07 PM To: solr-user@lucene.apache.org Subject: Re: Solr Query Performance benchmarking I use the JMeter plugins. They’ve been reorganized recently, so they

Re: Solr Query Performance benchmarking

2017-04-28 Thread Walter Underwood
niel (NIH/NLM) [C] > wrote: > > Walter, > > If you can share a pointer to that JMeter add-on, I'd love it. > > -Original Message- > From: Walter Underwood [mailto:wun...@wunderwood.org] > Sent: Friday, April 28, 2017 2:53 PM > To: solr-user@lucene

RE: Solr Query Performance benchmarking

2017-04-28 Thread Davis, Daniel (NIH/NLM) [C]
Walter, If you can share a pointer to that JMeter add-on, I'd love it. -Original Message- From: Walter Underwood [mailto:wun...@wunderwood.org] Sent: Friday, April 28, 2017 2:53 PM To: solr-user@lucene.apache.org Subject: Re: Solr Query Performance benchmarking I use production

Re: Solr Query Performance benchmarking

2017-04-28 Thread Walter Underwood
I use production logs to get a mix of common and long-tail queries. It is very hard to get a realistic distribution with synthetic queries. A benchmark run goes like this, with a big shell script driving it. 1. Reload the collection to clear caches. 2. Split the log into a cache warming set (usu
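A minimal sketch of the log-splitting step described above, assuming the production queries have already been extracted into a list of strings. The function name `split_query_log` and the 20% warming fraction are hypothetical; the message is cut off before it gives the actual proportion.

```python
def split_query_log(queries, warm_fraction=0.2):
    """Split logged queries into (warming, measurement) lists.

    Log order is preserved so the measurement set keeps the production
    mix of common and long-tail queries. warm_fraction is an assumed
    value, not one taken from the thread.
    """
    if not 0.0 <= warm_fraction <= 1.0:
        raise ValueError("warm_fraction must be between 0 and 1")
    cut = int(len(queries) * warm_fraction)
    return queries[:cut], queries[cut:]


if __name__ == "__main__":
    log = [f"q=term{i}" for i in range(10)]
    warming, measurement = split_query_log(log)
    print(len(warming), len(measurement))
```

The warming list would be replayed first without recording timings, and only the measurement list would count toward the reported latencies.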

Re: Solr Query Performance benchmarking

2017-04-28 Thread Toke Eskildsen
Shawn Heisey wrote: > Adding more shards as Toke suggested *might* help,[...] I seem to have phrased my suggestion poorly. What I meant to suggest was a switch to a single shard (with 4 replicas) setup, instead of the current 2 shards (with 2 replicas). - Toke

Re: Solr Query Performance benchmarking

2017-04-28 Thread Erick Erickson
Well, the best way to get no cache hits is to set the cache sizes to zero ;). That provides worst-case scenarios and tells you exactly how much you're relying on caches. I'm not talking the lower-level Lucene caches here. One thing I've done is use the TermsComponent to generate a list of terms ac
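One way the TermsComponent idea could be sketched, assuming the collection exposes the `/terms` request handler and that the returned terms have already been parsed into a Python list. The helper names, the base URL, and the collection and field names are all hypothetical; only the `terms`, `terms.fl`, and `terms.limit` parameters come from Solr itself.

```python
import random
from urllib.parse import urlencode


def terms_request_url(base_url, collection, field, limit=1000):
    """Build a URL asking the TermsComponent for indexed terms of one field."""
    params = {"terms": "true", "terms.fl": field, "terms.limit": limit, "wt": "json"}
    return f"{base_url}/{collection}/terms?{urlencode(params)}"


def random_term_queries(terms, field, count, seed=42):
    """Pick terms at random to form single-term queries.

    A fixed seed keeps a benchmark run repeatable. Terms containing
    query-syntax characters would need escaping, which is omitted here.
    """
    rng = random.Random(seed)
    return [f"{field}:{rng.choice(terms)}" for _ in range(count)]


if __name__ == "__main__":
    print(terms_request_url("http://localhost:8983/solr", "mycollection", "title"))
    print(random_term_queries(["solr", "lucene", "shard"], "title", 5))
```

Drawing from a term list much larger than the configured cache sizes makes repeated hits less likely, though it does not guarantee zero cache hits the way setting the cache sizes to zero does.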

Re: Solr Query Performance benchmarking

2017-04-28 Thread Rick Leir
(aside: Using Gatling or JMeter?) Question: How can you easily randomize something in the query so you get no cache hits? I think there are several levels of caching. -- Sorry for being brief. Alternate email is rickleir at yahoo dot com

Re: Solr Query Performance benchmarking

2017-04-28 Thread Erick Erickson
re: the q vs. fq question. My claim (not verified) is that the fastest of all would be q=*:*&fq={!cache=false}. That would bypass the scoring that putting it in the "q" clause would entail as well as bypass the filter cache. But I have to agree with Walter, this is very suspicious IMO. Here's what
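A small sketch of how the request form mentioned above could be assembled, with the restriction moved into an uncached `fq` and `q` left as match-all. The helper name and the example clause `category:books` are hypothetical, and the claim that this is the fastest form is, as the message says, not verified.

```python
from urllib.parse import parse_qs, urlencode


def uncached_filter_params(filter_clause):
    """Return a URL-encoded query string with q=*:* and a non-cached filter.

    The {!cache=false} local parameter asks Solr not to store this
    filter in the filter cache.
    """
    return urlencode({"q": "*:*", "fq": "{!cache=false}" + filter_clause})


if __name__ == "__main__":
    encoded = uncached_filter_params("category:books")
    # Decode again to show what Solr would receive after URL decoding.
    print(parse_qs(encoded))
```

Encoding through `urlencode` avoids hand-escaping the braces, the exclamation mark, and the colons in the request URL.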

Re: Solr Query Performance benchmarking

2017-04-28 Thread Walter Underwood
More “unrealistic” than “amazing”. I bet the set of test queries is smaller than the query result cache size. Results from cache are about 2 ms, but network communication to the shards would add enough overhead to reach 40 ms. wunder Walter Underwood wun...@wunderwood.org http://observer.wunder

Re: Solr Query Performance benchmarking

2017-04-28 Thread Shawn Heisey
On 4/27/2017 5:20 PM, Suresh Pendap wrote: > Max throughput that I get: 12000 to 12500 reqs/sec > 95 percentile query latency: 30 to 40 msec These numbers are *amazing* ... far better than I would have expected to see on a 27GB index, even in a situation where it fits entirely into available memor

Re: Solr Query Performance benchmarking

2017-04-28 Thread Toke Eskildsen
On Thu, 2017-04-27 at 23:20 +, Suresh Pendap wrote: > Number of Solr Nodes: 4 > Number of shards: 2 > replication-factor:  2 > Index size: 55 GB > Shard/Core size: 27.7 GB > maxConnsPerHost: 1000 The overhead of sharding is not trivial. Your overall index size is fairly small, relative to your