Replicas-on-demand

David Smiley Tue, 23 Apr 2024 22:45:42 -0700

I’m soliciting interest / feedback on something.

Maybe some of you are familiar with transient-cores (SOLR-1028), an
LRU-cache mechanism for SolrCores that lets a node hold an almost
unlimited number of cores in name only, with a limited number
actually loaded at any one time.  It’s for use-cases where
the working set is significantly smaller than the total possible
addressable set.  Transient-cores only works in standalone Solr; I
tried to get it to work in SolrCloud but it proved difficult / buggy,
especially with leader election entanglements.  Furthermore, if we
imagine tens of thousands of replicas on a node, maintaining all of
them in SolrCloud / ZooKeeper means a ton of state, book-keeping,
and watches.
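
For anyone who hasn't looked at transient-cores, the LRU idea can be
sketched roughly as below.  This is only an illustration, not Solr's
actual code; the class name LoadedCoreCache and the maxLoaded
parameter are made up for this email.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not Solr's transient-core implementation.
// Maps a core name to its loaded core object and keeps at most maxLoaded entries.
class LoadedCoreCache<V> extends LinkedHashMap<String, V> {
  private final int maxLoaded; // assumed tunable: how many cores may be loaded at once

  LoadedCoreCache(int maxLoaded) {
    // accessOrder=true: iteration order runs from least- to most-recently used
    super(16, 0.75f, true);
    this.maxLoaded = maxLoaded;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
    // Called after each put; returning true evicts the least-recently-used entry.
    // A real implementation would also close the evicted core here.
    return size() > maxLoaded;
  }
}
```

With maxLoaded=2, loading cores "a" and "b", touching "a", and then
loading "c" evicts "b", since "b" is the least recently used.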


I am imagining another approach where replicas, and thus the
underlying cores, are created and removed on-demand.  This would
happen at a higher level of abstraction (the request level), which
can make more informed decisions than the “SolrCores”/transient-cores
mechanism can.  If a request comes in and we have 0 replicas and a
shared file system, create one, similar to autoAddReplicas [1].  If
we have 1 replica, maybe
we should asynchronously arrange for another to maintain good
availability.  If the core seems saturated with query activity, create
another (to have more than 2 total).  That might depend on /select vs
/update and be replica-type specific.  Meanwhile a node listener can
remove replicas that have not been used recently, especially to limit
the number of replicas per node.  It can consider the number of
*other* replicas that exist for the same shard, as well as leadership
and replica type, in its decision.  Of course different users/apps might
want to tune such settings differently, and it doesn’t imply a shared
file system to be useful either.
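
To make the rules above concrete, here is a rough sketch of what such
a policy could look like.  Everything in it is an assumption for
discussion: the names (ReplicaDemandPolicy, Action, onRequest,
shouldRemove) are hypothetical, and the "remove idle followers before
leaders" rule is just one possible reading of "consider leadership".

```java
import java.time.Duration;

// Hypothetical policy sketch; none of these names exist in Solr.
class ReplicaDemandPolicy {
  enum Action { CREATE_BLOCKING, ADD_ASYNC, NONE }

  // Decide what to do when a request arrives for a shard.
  static Action onRequest(int replicas, boolean sharedFileSystem, boolean saturated) {
    if (replicas == 0) {
      // With no replica and no shared file system there is nothing to create from.
      return sharedFileSystem ? Action.CREATE_BLOCKING : Action.NONE;
    }
    if (replicas == 1 || saturated) {
      // One replica: add another for availability. Saturated: add one for capacity.
      return Action.ADD_ASYNC;
    }
    return Action.NONE;
  }

  // Decide whether a node listener should remove a replica.
  static boolean shouldRemove(Duration idle, Duration idleLimit,
                              int otherReplicas, boolean isLeader) {
    if (idle.compareTo(idleLimit) <= 0) {
      return false; // used recently enough; keep it
    }
    // Assumed rule: idle followers go first, so removal never forces an election.
    return !(isLeader && otherReplicas > 0);
  }
}
```

A real policy would presumably also look at /select vs /update and
the replica type, and the idle limit would be tunable per user/app.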

To support such a feature, I don’t think much is needed of Solr.  The
request “demand driven” aspect suggests a new plugin type within
HttpSolrCall that resolves a request to a SolrCore, perhaps called a
“RequestToCoreResolver”, or we make HttpSolrCall itself more
extensible.  For the 0-replica scenario, CloudSolrClient probably
needs a little more tolerance to just get the request to Solr instead
of prematurely failing.  HttpShardHandler (for distributed search)
might need similar tolerance.  There is no core-listener today; one
could be added, or we could poll, or we could extend SolrCores as a
collaborating plugin.
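
As a strawman for the plugin type mentioned above, something like the
following might do.  To be clear, this interface does not exist in
Solr; the method signature, the OnDemandResolver class, and the
replicaCreator callback (a stand-in for whatever actually creates the
replica) are all assumptions, and the sketch deliberately avoids real
Solr classes so that it compiles on its own.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical plugin contract; nothing like this exists in Solr today.
interface RequestToCoreResolver {
  // Returns the name of a local core that can serve the request, if any.
  Optional<String> resolveCore(String collection, String handlerPath);
}

// Toy implementation: creates a local replica the first time a collection is requested.
class OnDemandResolver implements RequestToCoreResolver {
  private final Map<String, String> localCores = new ConcurrentHashMap<>();
  private final Function<String, String> replicaCreator; // stand-in for replica creation

  OnDemandResolver(Function<String, String> replicaCreator) {
    this.replicaCreator = replicaCreator;
  }

  @Override
  public Optional<String> resolveCore(String collection, String handlerPath) {
    // handlerPath is unused here; a real policy might treat /select and /update differently.
    // computeIfAbsent invokes the creator at most once per collection; if the creator
    // returns null, no mapping is recorded and the result is empty.
    return Optional.ofNullable(localCores.computeIfAbsent(collection, replicaCreator));
  }
}
```

In this toy version creation happens inline on the request thread; a
real one would need timeouts and would have to coordinate with the
cluster state rather than a local map.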

One risk/concern is ensuring the core data is retained after the
replica is removed.  Retaining it isn’t strictly necessary, but if
the data is removed, then re-creating it is expensive & slow, which
is a problem if there are no replicas to serve a request.  I haven’t
yet checked on
the feasibility of keeping data lying around, and using it again when
re-creating the replica.

A significant motivation of mine with this proposal is to help SIP-20
“Zero Replicas” [2].  The biggest obstacle I see with it is that
unused cores are empty until queried, yet are still in state ACTIVE
the whole time (they don’t even use the RECOVERING state).  A hack in
SearchHandler throws an exception and gets the data.  This stuff
doesn’t *need* to be in that branch (it’s not fundamental to Zero,
even if it’s fundamental to how we scale with Zero right now), but a
replicas-on-demand approach would make it obsolete.

[1]:  Solr used to have an “autoAddReplicas” feature prior to Solr 9.
In Solr 9 a substitute was developed — CollectionsRepairEventListener.
Regardless, the idea is to create replicas automatically in response
to nodes going away (or maybe other circumstances) in order to
maintain a replicationFactor.  There are many references to
autoAddReplicas in CHANGES.txt; originally (SOLR-5656) it was
intended for a shared file system but was later expanded to be more
general (SOLR-10397).  In the case of a shared file system, you can even
reach 0 replicas and nonetheless create more later.

[2]: SIP-20 https://cwiki.apache.org/confluence/x/8YokEQ and branch
https://github.com/apache/solr/tree/jira/solr-17125-zero-replicas

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
