Re: [DISCUSS] SIP-16: Distinguishing Single Node and Cluster Operation Modes at API level

Jason Gerlowski Sat, 22 Mar 2025 00:11:26 -0700

Hey Christos,

Sorry again for the delay; some replies inline.


> For '/api/node' how can I tell which node I am interacting with when
> running in cloud mode? Am I connecting to a specific node via a different
> hostname + port, or am I connecting with a node through a load balancer /
> zookeeper?

There's no load-balancing or distribution of `/api/node` APIs within
Solr, even in SolrCloud mode.  So if your `/api/node` request goes
right to Solr, you can be pretty certain that it'll be served by
whatever host+port you put in the URL.

Of course, administrators often put an external load-balancer in front
of Solr - which really complicates the use of these non-proxied,
non-distributed `/api/node`-style APIs.  But there's no
distribution of `/api/node` requests by Solr itself.
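
To make that concrete, here's a rough sketch (Python, purely
illustrative) of what "the URL picks the node" means for a client.
The hostnames are made up, and the helper names `node_api_url` and
`per_node_urls` are hypothetical; the only path used is the
`/api/node/logging/messages` endpoint mentioned earlier in this
thread.

```python
from typing import Dict, Iterable


def node_api_url(base_url: str, path: str) -> str:
    """Build a URL for a node-level API on one specific Solr node."""
    return f"{base_url.rstrip('/')}/api/node/{path.lstrip('/')}"


def per_node_urls(base_urls: Iterable[str], path: str) -> Dict[str, str]:
    """Solr won't fan the request out, so address every node explicitly."""
    return {base: node_api_url(base, path) for base in base_urls}


# Hypothetical hosts; a client that wants data from both nodes needs both URLs.
urls = per_node_urls(
    ["http://solr1:8983", "http://solr2:8983/"],
    "logging/messages",
)
```

If a load-balancer sits between the client and these hosts, the
mapping from URL to node no longer holds, which is exactly the
complication described above.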

> In my suggestions before I try to eliminate the
> difference between the two modes, standalone and cloud, at API level, so
> that clients always interact the same way with Solr regardless of the mode.

That makes sense to me.  And I *think* it's possible, at least at the
API level.  That is - I can't think of any functionality offered by
both modes that is exposed through different APIs depending on
"mode".  Of course, there are a lot of APIs that will error out if you
try them in standalone.  We can definitely be more consistent in how
that is surfaced - I love your suggestion of using the "501" status
code as a way to indicate those cases.
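
As a sketch of how a client might consume that convention (assuming
Solr adopted it; it does not respond this way consistently today),
something like the following would let a UI tell "not offered in this
mode" apart from a real failure.  The `Availability` enum and
`classify` function are hypothetical names for illustration only.

```python
from enum import Enum
from http import HTTPStatus


class Availability(Enum):
    AVAILABLE = "available"
    UNSUPPORTED_IN_MODE = "unsupported-in-mode"
    ERROR = "error"


def classify(status: int) -> Availability:
    """Map an HTTP status code to what the client should do with the feature."""
    # Assumption: 501 means "this endpoint isn't offered in the current mode".
    if status == HTTPStatus.NOT_IMPLEMENTED:
        return Availability.UNSUPPORTED_IN_MODE
    if 200 <= status < 300:
        return Availability.AVAILABLE
    # Anything else (4xx, other 5xx, redirects) is treated as a plain error here.
    return Availability.ERROR
```

A client could then hide or grey out a feature on
`UNSUPPORTED_IN_MODE` without ever asking which mode Solr is in.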

I have a feeling I'm missing your main concern/point though.  If I'm
right about that, feel free to pick a specific example to lead the
discussion - that might be a good way to proceed.

> What I try to figure out is how a client like the new Admin UI could
> interact with the API in both scenarios, standalone and cloud mode, without
> having to handle each mode separately

I might be reading too much into what you mean by "without having to
handle each mode separately"...so if I'm reading that too literally,
just ignore my comments below.

As I said above, you shouldn't have huge issues with this at the API
syntax level.  But on the "conceptual" front, I'm less sure.  The
modes share a lot, but ultimately differ hugely in the abstractions
that they offer, the limitations that they have, etc.

Take "cores" as an example.  The "list-cores" API has the same syntax
in both modes.  But the meaning of a core itself is hugely different
between the two: in standalone it's the main abstraction, in SolrCloud
it's essentially an implementation detail.

And on the "limitation" side of that: standalone nodes only know about
themselves; they have no way to know of other nodes in the cluster.
So in standalone mode there's no way to know about cores on other
nodes in the cluster, whereas SolrCloud doesn't have that limitation
at all and could paint a much fuller picture by sending "list-cores"
calls to all of the nodes.
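
A minimal sketch of that fan-out, in Python.  Everything here is an
assumption for illustration: `cores_by_node` is a hypothetical helper,
the `list_cores` callable stands in for whatever per-node "list-cores"
request a real client would make, and the node URLs and core names
are invented.  How the client learns the node list is left out.

```python
from typing import Callable, Dict, Iterable, List


def cores_by_node(
    node_urls: Iterable[str],
    list_cores: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Ask each node for its cores and collect the answers per node."""
    return {url: sorted(list_cores(url)) for url in node_urls}


# Stand-in for the real per-node HTTP call (hypothetical data).
fake_cluster = {
    "http://solr1:8983": ["techproducts_shard1_replica_n1"],
    "http://solr2:8983": [
        "techproducts_shard2_replica_n2",
        "films_shard1_replica_n1",
    ],
}
overview = cores_by_node(fake_cluster, fake_cluster.__getitem__)
```

In standalone mode the same helper only ever sees the one node it was
given, which is the limitation in a nutshell.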

That's all to say - the modes are just very very different.  I'm all
for avoiding special-handling, but it might not always be
possible/practical : (

Best,

Jason

On Wed, Mar 5, 2025 at 1:35 PM Christos Malliaridis
<malliari...@apache.org> wrote:
>
> >
> > '/api/node' is reserved for APIs that only impact the receiving node
> > (and aren't otherwise proxied or distributed)
>
>
> That makes sense to me. In my suggestions before I try to eliminate the
> difference between the two modes, standalone and cloud, at API level, so
> that clients always interact the same way with Solr regardless of the mode.
>
> > removing the "/cluster" bit of the path might
> > mislead as many users as it helps.
>
>
> Eliminating '/api/cluster' may not be necessary, if we consider "cluster" a
> resource. By my definition a cluster is just a collection of nodes, so the
> same as '/api/nodes'. But having a cluster as an explicit resource in our
> RESTful API would still make sense, since interacting with the nodes
> resource collection (like with '/api/nodes/properties') could introduce
> potential naming conflicts. That's why I was considering only
> '/api/properties'. But I believe '/api/cluster/properties' could work as
> well, and having "cluster" and "node" as resources is fine too. Not sure if
> there are also cases where there could be multiple clusters under the same
> hostname in Solr?
>
> > But there are some obstacles IMO - the biggest one being
> > limitations in Solr's featureset as it stands today.
>
>
> I believe this could easily be handled by responses like "501 Not
> Implemented" if an endpoint is not supported in a specific mode. This would
> also not influence a different structure of the endpoints I believe?
>
> For '/api/node' how can I tell which node I am interacting with when
> running in cloud mode? Am I connecting to a specific node via a different
> hostname + port, or am I connecting with a node through a load balancer /
> zookeeper?
>
> What I try to figure out is how a client like the new Admin UI could
> interact with the API in both scenarios, standalone and cloud mode, without
> having to handle each mode separately or rely on implementation details
> like zookeeper.
>
>
> On Mon, Feb 17, 2025 at 6:01 PM Jason Gerlowski <gerlowsk...@gmail.com>
> wrote:
>
> > Hey Christos,
> >
> > Thanks for raising this!
> >
> > > without having worked on the API before and without participating
> > > in any prior discussions
> >
> > Quick summary of past discussions and decisions - not defending them
> > necessarily, but important context:
> >
> > '/api/node' is reserved for APIs that only impact the receiving node
> > (and aren't otherwise proxied or distributed): node-healthcheck,
> > status-checking on node-level asynchronous operations, fetching
> > node-specific info (like the node's public-key, environment variables,
> > etc.), debug operations like triggering a thread-dump.  '/api/cluster'
> > has APIs that are only available in "SolrCloud" mode, and that
> > (secondarily) have to do with cluster topology/state: cluster
> > properties, setting node-roles, cross-node rebalancing, package and
> > filestore operations, etc.
> >
> > That's not to say that we've gotta stick with those semantics; if
> > there's consensus in another direction we should act while v2 is still
> > "experimental".
> >
> > > What prevents us from moving towards that direction?
> >
> > I love these API suggestions, from a purely aesthetic/cosmetic
> > perspective.  But there are some obstacles IMO - the biggest one being
> > limitations in Solr's featureset as it stands today.
> >
> > Take "cluster properties" and "node roles" as examples.  I agree that
> > it'd be great to offer them in standalone as well as SolrCloud, and to
> > change the API path to suit.  But that'd be a massive effort, and
> > while those gaps exist removing the "/cluster" bit of the path might
> > mislead as many users as it helps.
> >
> > Best,
> >
> > Jason
> >
> > On Fri, Feb 14, 2025 at 12:15 PM Christos Malliaridis
> > <malliari...@apache.org> wrote:
> > >
> > > Hello everyone,
> > >
> > > I am looking into the v2 API and I was wondering what our final
> > > design will look like in terms of single- and multi-node setups.
> > >
> > > The main question I am trying to answer for myself is "Do we need to
> > > distinguish between the operation mode at API endpoints"?
> > >
> > > From what I can see in the API proposals and current state, some API
> > > endpoints operate inside a cluster context (like
> > > /api/cluster/properties), some inside a node context (like
> > > /api/node/logging), some other in cluster node context (like
> > > /api/cluster/nodes/{nodeName}/roles), and some in no context (which
> > > is I believe cluster and node, depending on operation mode?, like in
> > > /api/aliases/{aliasName}/properties).
> > >
> > > From a consumer's point of view, this may be a bit irritating, and
> > > without having worked on the API before and without participating in
> > > any prior discussions, I would believe that it could be simplified.
> > >
> > > Looking into where we are now and what we may expect from Solr in the
> > > future, we may not have to distinguish between the operation modes at the
> > > API endpoints. I am not aware of the historical background or any
> > > constraints that probably apply, so please educate me.
> > >
> > > From what little I know, the following changes would make sense to me:
> > > - GET /api/cluster/properties could be just GET /api/properties
> > >   - it would get the properties of Solr. If it is a cluster, whether
> > >     it is a single node or multiple, it should not make a difference
> > > - GET /api/node/logging/messages could be
> > >   /api/nodes/{nodeId}/logging/messages
> > >   - It would get the log messages of a specific node. For single node
> > >     setups, the node ID is always the same, for multi-nodes it would
> > >     have different node IDs
> > > - PUT /api/logging/levels could be added to reflect a cluster-wide log
> > >   level configuration, which seems to be missing in the v2 API at the
> > >   moment of writing
> > > - GET /api/cluster/nodes/{nodeId}/roles could be
> > >   /api/nodes/{nodeName}/roles
> > >   - it would return the roles of a specific node (if the roles are per
> > >     node configured)
> > > - GET /api/aliases/{aliasName}/properties would stay as is, as it is
> > >   node-independent and therefore a nice and simple endpoint
> > >
> > > This way we would reduce the complexity of our API (for the consumers)
> > > and make it more intuitive. Additionally, the consumers would not need
> > > to know whether there are multiple nodes or a single node running Solr,
> > > and will always have a "collection of nodes", even if that collection
> > > contains only a single node at times. And when scaling from one node to
> > > multiple nodes, no changes at the consumer's side are required (which
> > > I'm not sure if this is currently the case).
> > >
> > > What prevents us from moving towards that direction?
> > >
> > > Best,
> > > Christos
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > For additional commands, e-mail: dev-h...@solr.apache.org
> >
> >
