Re: SolrCloud - Query performance degrades with multiple servers(Shards)

2016-07-19 Thread Erick Erickson
15M docs may still comfortably fit in a single shard! I've seen up to 300M docs fit on a shard. Then again I've seen 10M docs make things unacceptably slow. You simply cannot extrapolate from 10K to 5M reliably. Put all 5M docs on the stand-alone servers and test _that_. Whenever I see numbers lik

Re: SolrCloud - Query performance degrades with multiple servers(Shards)

2016-07-19 Thread Susheel Kumar
You may want to use the document routing (_route_) option to have your queries served faster. But above you are comparing apples with oranges: your performance test numbers have to be based on either your actual numbers, like 3-5 million docs per shard, or sufficient enough to see advantag
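
A routed query of the kind mentioned above could be built roughly like this. It is only a sketch: the host, port, collection name and the helper `build_select_url` are hypothetical, and it assumes the implicit router, where the `_route_` value names the target shard.

```python
from typing import Optional
from urllib.parse import urlencode, urlsplit, parse_qs


def build_select_url(base: str, query: str, route: Optional[str] = None, rows: int = 10) -> str:
    """Build a /select URL; add _route_ only when a route is given."""
    params = {"q": query, "rows": rows}
    if route is not None:
        # Assumption: implicit router, so the route value is a shard name.
        params["_route_"] = route
    return f"{base}/select?{urlencode(params)}"


# Hypothetical base URL, for illustration only.
BASE = "http://localhost:8983/solr/mycollection"

routed_url = build_select_url(BASE, "*:*", route="shard1")  # restricted to one shard
fanout_url = build_select_url(BASE, "*:*")                  # fans out to every shard

# Show the decoded parameters of the routed request.
print(parse_qs(urlsplit(routed_url).query))
```

Without `_route_` the request is distributed to all shards, which is the slower case discussed in this thread.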

Re: SolrCloud - Query performance degrades with multiple servers(Shards)

2016-07-19 Thread kasimjinwala
This is just for performance testing; we have taken 10K records per shard. In the live scenario it would be 30L-50L (3-5 million) per shard. I want to search documents across all shards, but that slows down and takes too long. I know that SolrCloud will query every shard node and then return the result. Is ther

Re: SolrCloud - Query performance degrades with multiple servers(Shards)

2016-07-18 Thread Erick Erickson
+1 to Susheel's question. Sharding inevitably adds overhead. Roughly each shard is queried for its top N docs (10 if, say, rows=10). The doc ID and sort criteria (score by default) are returned to the node that originally got the request. That node then sorts the lists into the real top 10 to retur
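
The merge step described above can be sketched roughly as follows. This is a minimal illustration, not Solr's actual code: the function name `merge_top_n` and the `(score, doc_id)` tuple layout are assumptions made for this example.

```python
import heapq
from typing import List, Tuple

# Each shard returns its own top N as (score, doc_id) pairs.
ShardHits = List[Tuple[float, str]]


def merge_top_n(shard_results: List[ShardHits], rows: int = 10) -> ShardHits:
    """Merge the per-shard top-N lists into the overall top `rows` by score."""
    all_hits = (hit for hits in shard_results for hit in hits)
    return heapq.nlargest(rows, all_hits, key=lambda hit: hit[0])


# Three shards, each contributing its best hits (made-up scores and IDs).
shard1 = [(0.9, "a1"), (0.4, "a2")]
shard2 = [(0.8, "b1"), (0.7, "b2")]
shard3 = [(0.95, "c1"), (0.1, "c2")]

top3 = merge_top_n([shard1, shard2, shard3], rows=3)
print(top3)
```

The coordinating node has to wait for every shard before it can merge, so each extra shard adds a network round trip and more entries to sort, even when the shards themselves are tiny.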

Re: SolrCloud - Query performance degrades with multiple servers(Shards)

2016-07-18 Thread Susheel Kumar
Hello, Question: Do you really need sharding, or can you live without it, since you mentioned only 10K records in one shard? What's your index/document size? Thanks, Susheel On Mon, Jul 18, 2016 at 2:08 AM, kasimjinwala wrote: > currently I am using solrCloud 5.0 and I am facing query performanc

Re: SolrCloud - Query performance degrades with multiple servers(Shards)

2016-07-18 Thread kasimjinwala
Currently I am using SolrCloud 5.0 and I am facing a query performance issue while using 3 implicit shards; each shard contains around 10K records. When I specify the shards parameter (*shards=shard1*) in the query it gives 30K-35K qps, but when I remove the shards parameter from the query it gives *1000-1500qp

Re: SolrCloud - Query performance degrades with multiple servers

2013-01-09 Thread Shawn Heisey
On 1/9/2013 7:01 PM, sausarkar wrote: > Hi Yonik, Could you merge this feature into the 4.0 branch? We tried to use 4.1; it did solve the CPU spike but we did get other issues. As we are very tight on schedule, it would be very beneficial if you could merge this feature into the 4.0 branch. 4.1 *is* the

Re: SolrCloud - Query performance degrades with multiple servers

2013-01-09 Thread sausarkar
Hi Yonik, Could you merge this feature into the 4.0 branch? We tried to use 4.1; it did solve the CPU spike but we did get other issues. As we are very tight on schedule, it would be very beneficial if you could merge this feature into the 4.0 branch. Let me know. Thanks -- View this message in contex

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-12 Thread Yonik Seeley
On Wed, Dec 12, 2012 at 5:03 PM, sausarkar wrote: > We still could replicate the issue in 4.1 branch i.e. queries going to one > server (numShards=1) is being distributed among all the servers which is > creating CPU spikes in all the servers in the cloud. Do you think this > behavior is as expect

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-12 Thread sausarkar
We could still replicate the issue in the 4.1 branch, i.e. queries going to one server (numShards=1) are being distributed among all the servers, which is creating CPU spikes on all the servers in the cloud. Do you think this behavior is as expected or will be fixed in the 4.1 release?

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-11 Thread Yonik Seeley
OK, I tried to reproduce it on trunk, and I can't (i.e. everything is looking fine).

    rm -rf example/solr/zoo_data
    cp -rp example example2
    cp -rp example example3
    cd example
    java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
    cd exa

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-11 Thread Mark Miller
I'm still looking into this - didn't have a lot of luck seeing it with a test and am going to look at it manually. I'm hoping 4.1 by xmas! We will see though...need to get others on board. - Mark On Tue, Dec 11, 2012 at 2:40 PM, sausarkar wrote: > Do you know when will 4.1 be released or will

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-11 Thread Yonik Seeley
On Thu, Dec 6, 2012 at 8:08 PM, sausarkar wrote: > Ok we think we found out the issue here. When solrcloud is started without > specifying numShards argument solrcloud starts with a single shard but still > thinks that there are multiple shards, so it forwards every single query to > all the nodes

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-11 Thread sausarkar
Do you know when will 4.1 be released or will there be a 4.0.1 release with bug fixes from 4.0? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4026139.html Sent from the Solr - User mailing list ar

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-10 Thread Mark Miller
iller-3 [via Lucene]" > mailto:ml-node+s472066n4025457...@n3.nabble.com>> > Date: Saturday, December 8, 2012 11:08 PM > To: "Sarkar, Sauvik" mailto:sausar...@ebay.com>> > Subject: Re: SolrCloud - Query performance degrades with multiple servers > > If

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-09 Thread sausarkar
tyId&df=EntityId&q.op=AND&q.alt=*:*&qs=10&stopwords=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&*isShard=true&*shard.url=*server1*:9090/solr/core0/|*server2*:9090/solr/core0/|*server3*:9090/solr/core0/&NOW=1354918880447&am

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-08 Thread Mark Miller
s=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&*isShard=true&*shard.url=*server1*:9090/solr/core0/|*server2*:9090/solr/core0/|*server3*:9090/solr/core0/&NOW=1354918880447&wt=javabin&version=2 > > > Re: SolrCloud - Quer

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-08 Thread sausarkar
%&fl=EntityId&df=EntityId&q.op=AND&q.alt=*:*&qs=10&stopwords=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&*isShard=true&*shard.url=*server1*:9090/solr/core0/|*server2*:9090/solr/core0/|*server3*:9090/solr/core0/&NOW=135491

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-06 Thread Mark Miller
On Dec 6, 2012, at 5:08 PM, sausarkar wrote: > We solved the issue by explicitly adding numShards=1 argument to the solr > start up script. Is this a bug? Sounds like it…perhaps related to SOLR-3971…not sure though. - Mark

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-06 Thread Mark Miller
Ryan, my new best friend! Please, file JIRA issue(s) for these items! I'm sure you will get some feedback. - Mark On Dec 6, 2012, at 5:09 PM, Ryan Zezeski wrote: > There are some gains to be made in Solr's distributed search code. A few > weeks ago I spent time profiling dist search using d

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-06 Thread Ryan Zezeski
There are some gains to be made in Solr's distributed search code. A few weeks ago I spent time profiling dist search using dtrace/btrace and found some areas for improvement. I planned on writing up some blog posts and providing patches but I'll list them off now in case others have input. 1)

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-06 Thread sausarkar
is a CPU spike each minute. Did you also do this test on SolrCloud? Any observations or suggestions? In Reply To Re: SolrCloud - Query performance degrades with multiple servers Dec 05, 2012; 7:59pm, by Mark Miller-3 This is just the std scatter gather distrib search stuff solr has bee

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-06 Thread sausarkar
? In Reply To Re: SolrCloud - Query performance degrades with multiple servers Dec 05, 2012; 7:59pm — by Mark Miller-3 This is just the std scatter gather distrib search stuff solr has been using since around 1.4. There is some overhead to that, but generally not much. I've measured it at a

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-06 Thread sausarkar
We measured that for just 3 nodes the overhead is around 100ms. We also noticed that CPU spikes to 100% and some queries get blocked; this happens only when the cloud has multiple nodes and does not happen on a single node. All the nodes have the exact same configuration, JVM settings, and hardware configu

RE: SolrCloud - Query performance degrades with multiple servers

2012-12-05 Thread Michael Ryan
M To: solr-user@lucene.apache.org Subject: Re: SolrCloud - Query performance degrades with multiple servers This is just the std scatter gather distrib search stuff solr has been using since around 1.4. There is some overhead to that, but generally not much. I've measured it at around 30-50ms for

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-05 Thread Mark Miller
This is just the std scatter gather distrib search stuff solr has been using since around 1.4. There is some overhead to that, but generally not much. I've measured it at around 30-50ms for 100 machines, each with 10 million docs, a few years ago. So…that doesn't help you much…but FYI… - Mark