Intra-cluster communication; too many ways to do it

David Smiley Thu, 05 Sep 2024 12:10:28 -0700

There are too many ways to submit an intra-cluster request within Solr.
It’s been gnawing at me so I did an audit and made recommendations:


There are too many ways to submit an intra-cluster request within Solr.
It’s been gnawing at me so I did an audit and made recommendations:

A. Creating new Http SolrClient (with or without Cloud) from
org.apache.solr.update.UpdateShardHandler#getDefaultHttpClient — very
popular, 14 usages.
Recommendation: This is a poor place for such a client; will be moved
shortly in SOLR-16503 to CoreContainer and return an Http2SolrClient
instead of HttpClient.

B. org.apache.solr.client.solrj.cloud.SolrCloudManager#request  (default
impl is essentially a CloudLegacySolrClient).  3 usages.
I’m suspicious we need this; either embrace or abandon IMO.
Recommendation: embrace but migrate to getCloudSolrClient and add
getHttpSolrClient too, the latter sourced from CoreContainer.  Then allows
requestAsync usage and hopefully will also have a URL-specific method for
SOLR-17256
right next to it, there’s an httpRequest method that is now unused.
Recommendation: remove

C. org.apache.solr.update.UpdateShardHandler#getUpdateOnlyHttpClient for
“updates” only.  2 usages

D. org.apache.solr.update.UpdateShardHandler#getRecoveryOnlyHttpClient for
“recovery” only.  4 usages.

E. org.apache.solr.client.solrj.io.SolrClientCache wow, 86 usages!!  68 are
in solrj-streaming since this is where this thing originated from, thus
leaving 18 usages elsewhere but I’m over-counting.  The unique classes
using (excluding streaming, excluding a benchmark that tests streaming too)
are 6 usages (classes).

* Some usages are generic to a configurable SolrCloud cluster
(solrj-streaming definitely):
 ReindexCollection and SolrReporter (metrics).
This is the primary purpose/value of SolrClientCache; should continue.
* Some of the other cases just need a cluster-local CloudSolrClient:
SchemaDesigner (2 classes), CollectionsRepairEventListener.
Recommendation: Use SolrCloudManager.
* The other 4 usages are to get an Http SolrClient for a URL.
Recommendation: Use CoreContainer.getDefaultHttpSolrClient moving from
UpdateShardHandler (or similar recommended in SolrCloudManager).  Requires
solution to SOLR-17256 like an Http2SolrClient specific request method
specifying an additional URL parameter.

F. org.apache.solr.handler.component.HttpShardHandler  / ShardHandler.
Originally built for distributed-search, which requires supporting multiple
replicas that are equivalent to serve a request for a shard.  Later a
number of Overseer/cluster commands started using/abusing it.  There are
some API tech-debt / issues I see here as a result of this.  Now that
Http2SolrClient has async support (it didn’t when HttpShardHandler was
first created), I wonder if some Overseer usages should use that instead.
I think it could be simpler.
Recommendation: Add javadocs warning users to avoid using it outside of
distributed search.  Point out Http2SolrClient.requestAsync.

As an aside, I want to mention that EmbeddedSolrServer is pretty cool; I
could imagine expanding its use within Solr in some cases to avoid an HTTP
layer and disconnected caller chain.  Like to issue an admin call, even a
SolrCloud one (e.g. to move a shard.  But it doesn't do collection
routing.  It could be neat if the node-local CloudSolrClient proposed above
could somehow detect such an admin call and route through locally to the
node.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

Intra-cluster communication; too many ways to do it

Reply via email to