I'm not reporting a problem here, just curious: how are batches of updates
handled on the server side in SolrCloud?

Going over the code for DistributedUpdateProcessor and
SolrCmdDistributor, it appears that the batch is broken down and the docs
are processed one-by-one. By processed, I mean that each doc in the
client's batch is forwarded to the replicas individually.

This makes sense, but could the forwarding to the replicas be done in
sub-batches instead? For instance, if the client sends a batch of 100
documents to a cluster with 4 shards, would it be more efficient to
calculate the shard assignments up front, build 4 sub-batches, and then
forward those sub-batches to their respective leaders? Something like the
sketch below. Maybe I'm overthinking it ;-)
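
To make the idea concrete, here is a rough sketch of the grouping step I
have in mind. It is illustration only, not how DistributedUpdateProcessor
actually works: the class name SubBatchSketch and the assignShard helper
are made up, and the hash-mod shard assignment is a placeholder for
Solr's real DocRouter hash-range routing on the uniqueKey.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.solr.common.SolrInputDocument;

public class SubBatchSketch {

    // Hypothetical shard assignment: hash of the uniqueKey modulo shard
    // count. Solr really uses the collection's DocRouter / hash ranges.
    static int assignShard(SolrInputDocument doc, int numShards) {
        String id = (String) doc.getFieldValue("id");
        return Math.floorMod(id.hashCode(), numShards);
    }

    // Split one incoming client batch into per-shard sub-batches so each
    // leader receives a single forward instead of one request per doc.
    static Map<Integer, List<SolrInputDocument>> toSubBatches(
            List<SolrInputDocument> batch, int numShards) {
        Map<Integer, List<SolrInputDocument>> subBatches = new HashMap<>();
        for (SolrInputDocument doc : batch) {
            int shard = assignShard(doc, numShards);
            subBatches.computeIfAbsent(shard, s -> new ArrayList<>()).add(doc);
        }
        return subBatches;
    }

    public static void main(String[] args) {
        // 100 toy documents, 4 shards: each leader would get one sub-batch
        // rather than up to 100 individual forwards.
        List<SolrInputDocument> batch = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-" + i);
            batch.add(doc);
        }
        Map<Integer, List<SolrInputDocument>> subBatches = toSubBatches(batch, 4);
        subBatches.forEach((shard, docs) ->
            System.out.println("shard" + (shard + 1) + ": " + docs.size() + " docs"));
    }
}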

Cheers,
Tim
