Hi,

I'm looking for a blog post or documentation that gives a good overview of the
architecture of SolrCloud (as opposed to the user experience and use of Solr).
The audience would be engineers who are knowledgeable about distributed systems
but know nothing about SolrCloud (side note: I looked for something similar for
Elasticsearch and didn't find anything either).

Ideally the documentation would touch on the index structure
(collection/shard/replica), the way these are materialized as cores, the
coordination done by ZooKeeper (shard leader elections), and how replicas
interact with each other and the update log in steady state and in recovery
scenarios.
It should also cover the Overseer role (I wrote the detailed Overseer doc at
https://github.com/apache/solr/blob/main/dev-docs/overseer/overseer.adoc)
and how it interacts with ZooKeeper, the storage of all metadata in
ZooKeeper, the way nodes start, and the way cluster state is propagated and
managed (watches for collections with replicas on a node vs. all other
collections), etc.

If anybody knows of a doc that gives a complete overview, I'm very
interested.
Otherwise I might end up writing it myself 🤓

Thanks,
Ilan
