Updates are currently applied locally and then sent concurrently to all
replicas - so on a single update, you can expect roughly 2x the latency just
from that.
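
As a rough illustration of that 2x estimate, here is a minimal Python sketch of
the latency model. It assumes the total is the local indexing time plus the
slowest replica round trip (replicas are contacted concurrently, so only the
slowest one counts). The function name and the millisecond figures are made up
for illustration; they are not measurements from Solr.

```python
def expected_update_latency(local_ms, replica_ms):
    """Rough model: index locally first, then fan out to all replicas at once."""
    # Replicas are contacted concurrently, so only the slowest one adds time.
    fan_out_ms = max(replica_ms, default=0.0)
    return local_ms + fan_out_ms


# Hypothetical figures: 10 ms to index locally, one replica that also takes 10 ms.
print(expected_update_latency(10.0, []))      # no replicas: 10.0
print(expected_update_latency(10.0, [10.0]))  # one replica: 20.0, i.e. about 2x
```

Under this model, adding more replicas does not add further time unless one of
them is slower than the rest. It does not account for network or forwarding
overhead.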

As for your results, it sounds like there may be more overhead than we would
like in the code that forwards updates and sends them to replicas. Someone
would have to dig in to really know, I think. I doubt it's a configuration
issue, but you never know.

-- 
Mark Miller
about.me/markrmiller

On July 8, 2014 at 9:18:28 AM, Ian Williams (NWIS - Applications Design) 
(ian.willi...@wales.nhs.uk) wrote:

Hi  

I'm encountering a surprisingly high increase in response times when I insert 
new documents into a SolrCloud, compared with a standalone Solr instance.  

I have a SolrCloud set up for test and evaluation purposes. I have four shards, 
each with a leader and a replica, distributed over four Windows virtual 
servers. I have ZooKeeper running on three of the four servers. There are not 
many documents in my SolrCloud (just a few hundred). I am using composite id 
routing, specifying a prefix to my document ids which is then used by Solr to 
determine which shard the document should be stored on.  

I work out beforehand which shard a document with a given id prefix will end
up in by trying it out. I then run the following scenarios, using inserts
without commits. For example:
curl http://servername:port/solr/update -H "Content-Type: text/xml" \
  --data-binary @test.txt

1. Insert a document, sending it to the server hosting the correct shard, with 
replicas turned off (response time <20ms)  
I find that if I 'switch off' the replicas for my shard (by shutting down Solr 
for the replicas), and then I send the new document to the server hosting the 
leader for the correct shard, then I get a very fast response, i.e. under 10ms, 
which is similar to the performance I get when not using SolrCloud. This is 
expected, as I've removed any overhead to do with replicas or routing to the 
correct shard.  

2. Insert a document, sending it to the server hosting the correct shard, but 
with replicas turned on (response time approx 250ms)  
If I switch on the replica for that shard, then my average response time for an 
insert increases from <10ms to around 250ms. Now I expect an overhead, because 
the leader has to find out where the replica is (from ZooKeeper?) and then 
forward the request to that replica, then wait for a reply - but an increase 
from <20ms to 250ms seems very high?  

3. Insert a document, sending it to a server hosting the incorrect shard, with 
replicas turned on (response time approx 500ms)  
If I do the same thing again but this time send to the server hosting a 
different shard to the shard my document will end up in, the average response 
times increase again to around 500ms. Again, I'd expect an increase because of 
the extra step of needing to forward to the correct shard, but the increase 
seems very high?  
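
The three scenarios above could be timed repeatably with a small script instead
of one-off curl calls. The following is a minimal Python sketch, not a
definitive benchmark: `post_update` and `time_inserts` are hypothetical helper
names, and the URL and file name in the usage comment are the placeholders from
the curl example. It measures wall-clock time on the client, so it includes
connection setup as well as Solr's own processing time.

```python
import statistics
import time
import urllib.request


def post_update(url, body):
    """POST an XML update without a commit, like the curl command above."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "text/xml"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        response.read()  # wait for the full reply before stopping the clock


def time_inserts(send, runs=20):
    """Call send() `runs` times; return (mean, median) wall-clock time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        send()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.median(samples)


# Usage sketch (placeholders; substitute a real host, port and payload):
#   body = open("test.txt", "rb").read()
#   url = "http://servername:port/solr/update"
#   print(time_inserts(lambda: post_update(url, body)))
```

Running it once per scenario (leader only, leader plus replica, wrong shard)
and comparing medians rather than single requests should show whether the
250ms and 500ms figures are steady costs or occasional outliers.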


Should I expect this much of an overhead for shard routing and replicas, or 
might this indicate a problem in my configuration?  

Many thanks  
Ian  
