Re: PRS, important changes needed

2024-05-03 Thread Cassandra Targett
 I haven’t followed what PRS even is or what its benefits are supposed to
be, so I can’t comment on whether it should/could be a default. I may be alone
in that since I haven’t had bandwidth to pay attention to very many recent
Solr changes, but I do note that there is only one entry in the Solr Ref
Guide for “perReplicaState” (and none for “PRS” or alternate spellings I
could come up with), which is a single sentence describing it as a
parameter one can set when creating a collection.

https://solr.apache.org/guide/solr/latest/deployment-guide/collection-management.html#create-parameters
(scroll down a ways)

I think integrating some overview information of this mode and how it’s
expected to work in the Ref Guide would help adoption before v10 (to find
bugs in more use cases) and expose it to more people who aren’t staying up
to date with the flow of issues and commits.
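
For readers who have not seen the parameter in use, here is a minimal
sketch of what setting it at collection creation could look like through
the v1 Collections API. The host, port, and collection name are
placeholders chosen for illustration, and the helper only builds the URL;
it does not contact a Solr server.

```python
from urllib.parse import urlencode


def build_create_url(base_url, name, num_shards, replication_factor,
                     per_replica_state=False):
    """Build a v1 Collections API CREATE URL (nothing is sent anywhere)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    # perReplicaState is the create parameter the Ref Guide mentions;
    # it is only added when explicitly requested.
    if per_replica_state:
        params["perReplicaState"] = "true"
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder host and collection name, for illustration only.
url = build_create_url("http://localhost:8983/solr", "prs_demo", 2, 1,
                       per_replica_state=True)
print(url)
```

Because the setting is per collection, a cluster can hold PRS and non-PRS
collections side by side while the mode is being evaluated.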

On May 2, 2024 at 4:22:03 PM, David Smiley wrote:

> Thanks for the update and willingness to help on this journey, Justin!
> Maybe an umbrella JIRA would make sense for tracking purposes, and
> link child issues or do sub-tasks.  Perhaps the goal is "PRS enabled
> by default".
>
> I don't know if this thread is the best place to discuss it but having
> the PRS znode be ephemeral would be really beneficial to dramatically
> strengthen the goals of PRS by efficiently handling restarts.  No need
> to mark a node's replicas as down!
>
> On Thu, May 2, 2024 at 3:45 PM Justin Sweeney wrote:
>
> We (Fullstory) have been running PRS for quite a while now with great
> stability and a huge performance benefit for us particularly in terms of
> cluster restarts. That said, our use case certainly isn't everyone's use
> case. We run large clusters with lots of cores so we get a particular
> benefit. My expectation in the current state is that as far as performance
> it will help some use cases without hurting any use case.
>
> I don't know what timing will look like but I am +1 with David on moving to
> PRS in Solr 10 as it would make code maintenance much better going forward.
> Both myself and other devs at Fullstory can definitely make contributions
> toward getting PRS to a state where others also are getting the performance
> benefits and feel comfortable with this decision.
>
> I can look at adding some Jira's along these lines and would be happy to
> discuss more along the way.
>
> On Thu, May 2, 2024 at 1:59 PM Houston Putman wrote:
>
> > I'm all for moving towards it if it has both (or at least a good tradeoff
> > between):
> >
> > - A proven stability, like the current implementation
> > - A noted increase in performance for common use cases
> >
> > It seems to me that without the performance benefits, the loss in
> > stability (PRS has had a few bad bugs in 9x releases) is worrisome.
> > I'd be very happy to move to PRS if we can improve it to give us concrete
> > benefits, but until then I'm not in favor of making it the default.
> >
> > Maybe the ephemeral node for replica state is the
> > logic we really need to make PRS "pop", but I haven't thought about it
> > a ton.
> >
> > - Houston
> >
> > On Thu, May 2, 2024 at 12:52 PM David Smiley wrote:
> >
> > > Note that PRS has existed for all of the 9x series.  I say in 10,
> > > let's finally move on.  Be bold.
> > >
> > > On Thu, May 2, 2024 at 1:19 PM Ilan Ginzburg wrote:
> > > >
> > > > There is no plan to remove the non PRS way to manage replica state
> > > > before making PRS the default way to manage replica state (in
> > > > addition to the current state.json option) then letting PRS bake for
> > > > a while with all new deployments (for example a whole release), right?
> > > >
> > > > Ilan
> > > >
> > > > On Thu, May 2, 2024 at 6:25 PM David Smiley wrote:
> > > >
> > > > > In the meetup, my colleague Aparna shared her explorative findings
> > > > > of enabling PRS, which uncovered 2 matters that seem to defeat much
> > > > > of PRS's idealized benefits:
> > > > > * Shard leader elections still touch state.json
> > > > > * Replica state is still in state.json
> > > > > Given those two matters, we didn't notice any improvement (or
> > > > > regression).  Some FullStory devs, who use this mode in production,
> > > > > shared that the first matter wasn't noticed by them because they
> > > > > only run with one replica per shard.  The other... was unclear why;
> > > > > maybe for backwards-compatibility?  In my mind, there shouldn't be
> > > > > such a concern as it's enabled per-collection and you'd only do this
> > > > > once all servers are known to be PRS-enabled (e.g. have a modern
> > > > > Solr version).
> > > > >
> > > > > If we can identify JIRA issues to capture the work involved, we
> > > > > could converse more and track

Re: PRS, important changes needed

2024-05-03 Thread Justin Sweeney
I've got an umbrella Jira started here:
https://issues.apache.org/jira/browse/SOLR-17270. Feel free to add to it
as you come across things.


Re: SolrJ collection creation API, replica type specificity

2024-05-03 Thread Jason Gerlowski
You didn't mention it by name, but it sounds like you're talking about
the v1 API's "replicationFactor" parameter (which has defaulted to
creating NRT replicas for a while now)?

Personally, I'd rather see that parameter (and corresponding SolrJ
code) go away altogether.  Some things (e.g. the configset name, the
number of shards) are important enough that we force users to be
explicit about them...IMO the number and type of replicas fall into
that category.

But while the ambiguous "replicationFactor" parameter exists I think
some sort of "default replica type" concept makes sense.  (Granted
that we find a way to handle certain complications, like how "PULL"
replicas *must* be used in conjunction with replicas of other
leader-eligible replica types.)
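
To make the PULL complication concrete, here is a minimal client-side
sketch of being explicit about replica counts per type. The parameter
names nrtReplicas, tlogReplicas and pullReplicas are the v1 Collections
API create parameters; the helper function and its check are illustrative
assumptions, not existing SolrJ or Solr code.

```python
def replica_params(nrt=0, tlog=0, pull=0):
    """Return explicit per-type replica counts for a CREATE request.

    Illustrative check only: PULL replicas cannot become leader, so at
    least one NRT or TLOG replica per shard must be requested.
    """
    if min(nrt, tlog, pull) < 0:
        raise ValueError("replica counts must not be negative")
    if nrt + tlog == 0:
        raise ValueError("need at least one leader-eligible (NRT or TLOG) replica")
    counts = {"nrtReplicas": nrt, "tlogReplicas": tlog, "pullReplicas": pull}
    # Only send the types that were actually asked for.
    return {key: value for key, value in counts.items() if value > 0}


print(replica_params(tlog=2, pull=3))
```

A "default replica type" setting would need an equivalent rule somewhere,
since a default of PULL alone could never yield a usable shard.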

On Wed, May 1, 2024 at 2:32 PM David Smiley wrote:
>
> In the interests of supporting different replica types better, I'd
> like our SolrJ CollectionAdminRequest methods to not *locally* assume
> NRT when creating a request.  Calls like createCollection(collection
> name, numShards, numReplicas) are nicely ambiguous as to the type, nor
> do javadocs indicate what the type is.  This is good, I think.  Yet
> the default behavior is to create a v1 API call that specifies how
> many NRT replicas (yes of this specific type) to make.  Instead, I'd
> like to see the replica type decision made by the server (Solr).
> Today it also assumes NRT but I could imagine something as simple as
> EnvUtils (env var / sys prop) deciding what the default type should
> be.  So far this is merely changing CollectionAdminRequest and
> consequently the specificity of its v1 requests.  It'd be followed up
> by improving a number of tests to be less specific to NRT.  Any
> concerns here?
>
> *Actually* using other replica types (like TLOG or ZERO) may raise
> issues for some tests beyond this, sure.  In particular, many tests
> assume read-your-write (index, commit, query --> find it) but this
> won't hold if the query randomly routes to a non-leader.  For this I'm
> thinking an automatically applied
> shards.preference=replica.leader:true
> https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#shards-preference-parameter
> -- only when the default replica type isn't NRT.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> For additional commands, e-mail: dev-h...@solr.apache.org
>

Re: SolrJ collection creation API, replica type specificity

2024-05-03 Thread David Smiley
I totally understand that the client should be empowered to be
specific, and it is right now.  But I also think we should support the
client being unspecific, and instead allow Solr service owners to
choose what makes sense via Solr-side configuration.  Where I work,
different teams own the client and the server.  The people/service on
the client side don't care about Solr infrastructure specifics such as
replica types and new-fangled options (PRS being another).  Updating
their client to tweak options around this is annoying.  They just want
a collection to be created, even with an assumed configSet, as this
Solr cluster is only for servicing the needs of that client.  The Solr
service owner (me) is responsible for Solr specifics.  One could
imagine assuming TLOG and PULL for one app and both NRT for another,
or whatever really.
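
As a rough sketch of the server-side idea (an env var / sys prop deciding
the default type, plus a leader preference on queries when that default is
not NRT), something like the following captures the logic. The variable
name SOLR_DEFAULT_REPLICA_TYPE and both helper functions are hypothetical
and do not exist in Solr today; only shards.preference=replica.leader:true
comes from the proposal itself.

```python
import os

# Hypothetical setting name, invented for this sketch.
DEFAULT_TYPE_KEY = "SOLR_DEFAULT_REPLICA_TYPE"
# Only leader-eligible types are accepted as a default in this sketch.
LEADER_ELIGIBLE = ("NRT", "TLOG")


def default_replica_type(env=None):
    """Resolve the default replica type, falling back to NRT as today."""
    env = os.environ if env is None else env
    value = env.get(DEFAULT_TYPE_KEY, "NRT").strip().upper()
    if value not in LEADER_ELIGIBLE:
        raise ValueError(f"unsupported default replica type: {value}")
    return value


def with_leader_preference(default_type, params=None):
    """Add a leader preference so read-your-write tests still pass."""
    merged = dict(params or {})
    if default_type != "NRT":
        # Respect a preference the caller already set.
        merged.setdefault("shards.preference", "replica.leader:true")
    return merged


replica_type = default_replica_type({DEFAULT_TYPE_KEY: "tlog"})
print(replica_type, with_leader_preference(replica_type, {"q": "*:*"}))
```

With this split, a client that says nothing about replica types gets
whatever the service owner configured, while an explicit client request
would still win.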
