Should circuit breakers only kill external search requests and not cluster-internal requests to shards?
Circuit breakers can kill any request, whether it is a client request from outside the cluster or an internal distributed request to a shard. Killing a portion of a distributed request will affect the main request. I'm not sure whether a 503 from a shard will kill the whole request or cause partial results, but either way it isn’t good.

We run with 8 shards. If a circuit breaker is killing 10% of requests on each host, that will hit 57% of all external requests (0.9^8 = 0.43, so only 43% of requests get through all 8 shards untouched). That seems like “overkill” to me. If it only kills external requests, then 10% means 10%.

Killing only external requests requires that external requests go roughly equally to all hosts in the cluster, or at least to all NRT or PULL replicas.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
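
P.S. For anyone who wants to check the 57% figure for a different shard count, here is a minimal sketch of the arithmetic. It assumes each shard's breaker trips independently at the same rate and that one tripped shard counts as a hit on the external request. The function name is made up for illustration and is not anything in Solr.

```python
def fraction_of_requests_hit(per_host_kill_rate: float, shards: int) -> float:
    """Fraction of external requests that lose at least one shard request.

    Assumes independent breaker trips on each shard (an assumption, not
    something measured on a real cluster).
    """
    # Probability that every one of the shard requests survives.
    survive_all = (1.0 - per_host_kill_rate) ** shards
    return 1.0 - survive_all


if __name__ == "__main__":
    # 10% kill rate per host, for a few shard counts including our 8.
    for shards in (1, 2, 4, 8, 16):
        hit = fraction_of_requests_hit(0.10, shards)
        print(f"{shards:2d} shards: {hit:.1%} of external requests affected")
```

With one shard, 10% means 10%. The number climbs quickly as the shard count goes up, which is the whole problem.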