Thanks, Chris. I think I should stop trying to do this inside Solr. Anyway, I was just trying to see how far I could go.
On Wed, Mar 4, 2020 at 11:50 PM Chris Hostetter <hossman_luc...@fucit.org> wrote:

> : So, I thought it can be simplified by moving these state transitions and
> : processing logic into Solr by writing a custom update processor. The idea
> : occurred to me when I was thinking about Solr serializing multiple
> : concurrent requests for a document on the leader replica. So, my thought
> : process was if I am getting this serialization for free I can implement the
> : entire processing inside Solr and a dumb client to push records to Solr
> : would be sufficient. But that's not working. Perhaps the point I missed is
> : that even though this processing is moved inside Solr I still have a race
> : condition because of the time-of-check to time-of-update gap.
>
> Correct. Solr is (hand wavy) "locking" updates to documents by id on the
> leader node to ensure they are transactional, but that locking happens
> inside DistributedUpdateProcessor; other update processors don't run
> "inside" that lock.

Understood. I was not thinking clearly about locking.

> : While writing this it just occurred to me that I'm running my custom update
> : processor before DistributedUpdateProcessor. I'm committing the same XY
> : crime again, but if I run it after DistributedUpdateProcessor can this race
> : condition be avoided?
>
> No. That would just introduce a whole new host of problems that are a
> much more involved conversation to get into (remember: the processors after
> DistributedUpdateProcessor run on every replica, after the leader has already
> assigned a version and said this update should go through ... so now imagine
> what your error-handling logic has to look like?)

I completely missed that post-processors run on every replica. It would be too convoluted to implement.

> Ultimately, the goal that you're talking about really feels like "business
> logic that requires synchronizing/blocking updates", but you're trying to
> avoid writing a synchronized client to do that synchronization and error
> handling before forwarding those updates to Solr.
>
> I mean -- even with your explanation of your goal, there is a whole host
> of nuance / use-case-specific logic that has to go into "based on various
> conflicts it modifies the records for which the update failed" -- and that
> logic seems like it would affect the locking: if you get a request that
> violates the legal state transition because of another request that
> (blocked it until it) just finished ... now what? Fail? Apply some new
> rules?
>
> This seems like logic you should really want in a "middleware" layer that
> your clients talk to and that sends docs to Solr.
>
> If you *REALLY* want to try and piggyback this logic into Solr, then
> there is _one_ place I can think of where you can "hook in" to the logic
> DistributedUpdateProcessor does while "locking" an id on the leader, and
> that would be extending the AtomicUpdateDocumentMerger...
>
> It's marked experimental, and I don't really understand the use cases
> for why it exists, and in order to customize this you would have to
> also subclass DistributedUpdateProcessorFactory to build your custom
> instance and pass it to the DistributedUpdateProcessor constructor, but
> then -- in theory -- you could intercept any document update *after* the
> RTG, and before it's written to the TLOG, and apply some business logic.
>
> But I wouldn't recommend this ... "there be Dragons!"

Thanks for this explanation. Yes, that's too dangerous and really not worth the effort. I think I am concluding this exercise now.
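For the archives, this is roughly the shape I had pictured for that hook before deciding against it. It is a rough, untested sketch: it assumes the experimental DistributedUpdateProcessor constructor that accepts a custom AtomicUpdateDocumentMerger, and StateCheckingMerger / isLegalTransition are made-up names standing in for my business rules:

import java.util.Map;

import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.processor.AtomicUpdateDocumentMerger;
import org.apache.solr.update.processor.DistributedUpdateProcessor;
import org.apache.solr.update.processor.DistributedUpdateProcessorFactory;
import org.apache.solr.update.processor.UpdateRequestProcessor;

public class StateCheckingUpdateProcessorFactory extends DistributedUpdateProcessorFactory {

  @Override
  public DistributedUpdateProcessor getInstance(SolrQueryRequest req,
      SolrQueryResponse rsp, UpdateRequestProcessor next) {
    // Pass a custom merger so the check runs inside the leader's per-id
    // "lock", after the RTG and before the update hits the TLOG.
    // NOTE: in SolrCloud mode the stock factory actually builds a
    // DistributedZkUpdateProcessor; this sketch glosses over that.
    return new DistributedUpdateProcessor(req, rsp, new StateCheckingMerger(req), next);
  }

  private static class StateCheckingMerger extends AtomicUpdateDocumentMerger {
    StateCheckingMerger(SolrQueryRequest req) {
      super(req);
    }

    @Override
    public SolrInputDocument merge(SolrInputDocument fromDoc, SolrInputDocument toDoc) {
      // fromDoc = the incoming atomic update, toDoc = the current doc from the RTG.
      Object requested = fromDoc.getFieldValue("state");
      // For an atomic update this value is a Map of ops, e.g. {set=NEW}; unwrap it.
      if (requested instanceof Map) {
        requested = ((Map<?, ?>) requested).get("set");
      }
      Object current = toDoc.getFieldValue("state");
      if (!isLegalTransition(current, requested)) {
        throw new SolrException(SolrException.ErrorCode.CONFLICT,
            "illegal state transition: " + current + " -> " + requested);
      }
      return super.merge(fromDoc, toDoc);
    }

    private boolean isLegalTransition(Object from, Object to) {
      return true; // placeholder for the real transition rules
    }
  }
}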
I will stick to my older implementation, where I handle state transitions on the client side using optimistic locking. -- Sachin
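P.S. For anyone finding this thread later: by "optimistic locking" I mean Solr's documented optimistic concurrency via the _version_ field -- do a real-time get, validate the transition, then send the update carrying the _version_ you read, and retry on a 409 conflict. A minimal, untested sketch in SolrJ (the collection name, the "state" field, and isLegalTransition are placeholders for my actual logic):

import java.util.Collections;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;

public class StateTransitionClient {

  // Retry loop: read the doc (and its _version_) with a real-time get,
  // check the transition, then send an atomic update carrying that
  // _version_. Solr rejects it with HTTP 409 if the doc changed meanwhile.
  public void transition(SolrClient client, String id, String newState) throws Exception {
    while (true) {
      SolrDocument current = client.getById("mycollection", id);
      String oldState = (String) current.getFirstValue("state");
      if (!isLegalTransition(oldState, newState)) {
        throw new IllegalStateException(oldState + " -> " + newState);
      }
      SolrInputDocument update = new SolrInputDocument();
      update.addField("id", id);
      update.addField("state", Collections.singletonMap("set", newState));
      // Exact-match version check: the update only succeeds if the doc
      // still has this version on the leader.
      update.addField("_version_", current.getFirstValue("_version_"));
      try {
        client.add("mycollection", update);
        return; // won the race
      } catch (SolrException e) {
        if (e.code() == 409) {
          continue; // version conflict: re-read and re-validate
        }
        throw e;
      }
    }
  }

  private boolean isLegalTransition(String from, String to) {
    return true; // placeholder for the real transition rules
  }
}

(As documented, a _version_ greater than 1 must match the doc's current version exactly; 1 means the doc must exist; a negative value means it must not exist.)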