Re: Ideas for debugging poor SolrCloud scalability

2014-11-07 Thread Erick Erickson
Ian: Thanks much for the writeup! It's always good to have real-world documentation! Best, Erick On Fri, Nov 7, 2014 at 8:26 AM, Shawn Heisey wrote: > On 11/7/2014 7:17 AM, Ian Rose wrote: >> *tl;dr: *Routing updates to a random Solr node (and then letting it forward >> the docs to where they n

Re: Ideas for debugging poor SolrCloud scalability

2014-11-07 Thread Shawn Heisey
On 11/7/2014 7:17 AM, Ian Rose wrote: > *tl;dr: *Routing updates to a random Solr node (and then letting it forward > the docs to where they need to go) is very expensive, more than I > expected. Using a "smart" router that uses the cluster config to route > documents directly to their shard resul

Re: Ideas for debugging poor SolrCloud scalability

2014-11-07 Thread Ian Rose
Hi again, all - Since several people were kind enough to jump in to offer advice on this thread, I wanted to follow up in case anyone finds this useful in the future. *tl;dr: *Routing updates to a random Solr node (and then letting it forward the docs to where they need to go) is very expensive,

Re: Ideas for debugging poor SolrCloud scalability

2014-11-01 Thread Erick Erickson
bq: but it should be more or less a constant factor no matter how many Solr nodes you are using, right? Not really. You've stated that you're not driving Solr very hard in your tests. Therefore you're waiting on I/O. Therefore your tests just aren't going to scale linearly with the number of shard

Re: Ideas for debugging poor SolrCloud scalability

2014-11-01 Thread Shawn Heisey
On 11/1/2014 9:52 AM, Ian Rose wrote: > Just to make sure I am thinking about this right: batching will certainly > make a big difference in performance, but it should be more or less a > constant factor no matter how many Solr nodes you are using, right? Right > now in my load tests, I'm not actu

Re: Ideas for debugging poor SolrCloud scalability

2014-11-01 Thread Ian Rose
Erick, Just to make sure I am thinking about this right: batching will certainly make a big difference in performance, but it should be more or less a constant factor no matter how many Solr nodes you are using, right? Right now in my load tests, I'm not actually that concerned about the absolute

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Peter Keegan
Yes, I was inadvertently sending them to a replica. When I sent them to the leader, the leader reported (1000 adds) and the replica reported only 1 add per document. So, it looks like the leader forwards the batched jobs individually to the replicas. On Fri, Oct 31, 2014 at 3:26 PM, Erick Erickson

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Erick Erickson
Internally, the docs are batched up into smaller buckets (10 as I remember) and forwarded to the correct shard leader. I suspect that's what you're seeing. Erick On Fri, Oct 31, 2014 at 12:20 PM, Peter Keegan wrote: > Regarding batch indexing: > When I send batches of 1000 docs to a standalone S

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Peter Keegan
Regarding batch indexing: When I send batches of 1000 docs to a standalone Solr server, the log file reports "(1000 adds)" in LogUpdateProcessor. But when I send them to the leader of a replicated index, the leader log file reports much smaller numbers, usually "(12 adds)". Why do the batches appea

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Erick Erickson
NP, just making sure. I suspect you'll get lots more bang for the buck, and results much more closely matching your expectations if 1> you batch up a bunch of docs at once rather than sending them one at a time. That's probably the easiest thing to try. Sending docs one at a time is something of
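For anyone reading this later, the batching suggestion above can be sketched on the client side roughly as follows. This is only a minimal illustration: the helper name `batches` and the batch size are assumptions of mine, not anything from Solr or from the thread, and how each batch is actually posted to Solr is deliberately left out.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        # Pull up to `size` items; an empty list means the input is exhausted.
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch
```

Each yielded list would then go out as one update request instead of one request per document.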

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Ian Rose
Hi Erick - Thanks for the detailed response and apologies for my confusing terminology. I should have said "WPS" (writes per second) instead of QPS but I didn't want to introduce a weird new acronym since QPS is well known. Clearly a bad decision on my part. To clarify: I am doing *only* writes

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Erick Erickson
I'm really confused: bq: I am not issuing any queries, only writes (document inserts) bq: It's clear that once the load test client has ~40 simulated users bq: A cluster of 3 shards over 3 Solr nodes *should* support a higher QPS than 2 shards over 2 Solr nodes, right QPS is usually used to mea

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Ian Rose
Thanks for the suggestions so far, all. 1) We are not using SolrJ on the client (not using Java at all) but I am working on writing a "smart" router so that we can always send to the correct node. I am certainly curious to see how that changes things. Nonetheless even with the overhead of extra r

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Erick Erickson
Your indexing client, if written in SolrJ, should use CloudSolrServer which is, in Matt's terms "leader aware". It divides up the documents to be indexed into packets where each doc in the packet belongs on the same shard, and then sends the packet to the shard leader. This avoids a lot of re-
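For clients not written in Java, the "packets per shard" idea described above might look something like the sketch below. Everything here is an assumption for illustration: the shard table layout, the function names, and especially the hash. `zlib.crc32` is only a stand-in; Solr's own router uses a different hash function, so this would not route correctly against a real cluster without swapping in a compatible hash and the hash ranges from the actual cluster state.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical shard table: shard name -> inclusive (low, high) hash range.
ShardRanges = Dict[str, Tuple[int, int]]


def stand_in_hash(doc_id: str) -> int:
    """Stand-in hash (NOT Solr's); returns a value in 0 .. 2**32 - 1."""
    return zlib.crc32(doc_id.encode("utf-8"))


def shard_for(doc_id: str, ranges: ShardRanges) -> str:
    """Return the name of the shard whose range contains the doc's hash."""
    h = stand_in_hash(doc_id)
    for name, (low, high) in ranges.items():
        if low <= h <= high:
            return name
    raise LookupError(f"no shard range covers hash {h} for id {doc_id!r}")


def group_by_shard(docs: Iterable[dict], ranges: ShardRanges) -> Dict[str, List[dict]]:
    """Split docs into one packet per shard, keyed by shard name."""
    packets: Dict[str, List[dict]] = defaultdict(list)
    for doc in docs:
        packets[shard_for(doc["id"], ranges)].append(doc)
    return dict(packets)
```

Each packet would then be sent straight to the leader of its shard, so no node has to forward documents elsewhere.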

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Shawn Heisey
On 10/30/2014 2:56 PM, Ian Rose wrote: > I think this is true only for actual queries, right? I am not issuing > any queries, only writes (document inserts). In the case of writes, > increasing the number of shards should increase my throughput (in > ops/sec) more or less linearly, right? No, that

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Matt Hilt
If you are issuing writes to shard non-leaders, then there is a large overhead for the eventual redirect to the leader. I noticed a 3-5 times performance increase by making my write client leader aware. On Oct 30, 2014, at 2:56 PM, Ian Rose wrote: >> >> If you want to increase QPS, you shoul

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Ian Rose
> > If you want to increase QPS, you should not be increasing numShards. > You need to increase replicationFactor. When your numShards matches the > number of servers, every single server will be doing part of the work > for every query. I think this is true only for actual queries, right? I a

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Shawn Heisey
On 10/30/2014 2:23 PM, Ian Rose wrote: > My methodology is as follows. > 1. Start up K Solr servers. > 2. Remove all existing collections. > 3. Create N collections, with numShards=K for each. > 4. Start load testing. Every minute, print the number of successful > updates and the number of faile

Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Ian Rose
Howdy all - The short version is: We are not seeing Solr Cloud performance scale (even close to) linearly as we add nodes. Can anyone suggest good diagnostics for finding scaling bottlenecks? Are there known 'gotchas' that make Solr Cloud fail to scale? In detail: We have used Solr (in non-Clou