Thanks, Chris. I think I should stop trying to do this inside Solr. Anyway, I was just trying to see how far I could go.
On Wed, Mar 4, 2020 at 11:50 PM Chris Hostetter <hossman_luc...@fucit.org> wrote:

> : So, I thought it can be simplified by moving these state transitions and
> : processing logic into Solr by writing a custom update processor. The idea
> : occurred to me when I was thinking about Solr serializing multiple
> : concurrent requests for a document on the leader replica. So, my thought
> : process was if I am getting this serialization for free I can implement the
> : entire processing inside Solr and a dumb client to push records to Solr
> : would be sufficient. But that's not working. Perhaps the point I missed is
> : that even though this processing is moved inside Solr I still have a race
> : condition because of the time-of-check to time-of-update gap.
>
> Correct. Solr is (hand wavy) "locking" updates to documents by id on the
> leader node to ensure they are transactional, but that locking happens
> inside DistributedUpdateProcessor; other update processors don't run
> "inside" that lock.

Understood. I was not thinking clearly about locking.

> : While writing this it just occurred to me that I'm running my custom update
> : processor before DistributedUpdateProcessor. I'm committing the same XY
> : crime again, but if I run it after DistributedUpdateProcessor can this race
> : condition be avoided?
>
> No. That would just introduce a whole new host of problems that are a
> much more involved conversation to get into (remember: the processors after
> DistributedUpdateProcessor run on every replica, after the leader has already
> assigned a version and said this update should go through ... so now imagine
> what your error-handling logic has to look like?)

I completely missed that post-processors run on every replica. It would be too convoluted to implement.

> Ultimately, the goal that you're talking about really feels like "business
> logic that requires synchronizing/blocking updates", but you're trying to
> avoid writing a synchronized client to do that synchronization and error
> handling before forwarding those updates to Solr.
>
> I mean -- even with your explanation of your goal, there is a whole host
> of nuance / use-case-specific logic that has to go into "based on various
> conflicts it modifies the records for which the update failed" -- and that
> logic seems like it would affect the locking: if you get a request that
> violates the legal state transition because of another request that
> (blocked it until it) just finished ... now what? Fail? Apply some new
> rules?
>
> This seems like logic you should really want in a "middleware" layer that
> your clients talk to and that sends docs to Solr.
>
> If you *REALLY* want to try and piggyback this logic into Solr, then
> there is _one_ place I can think of where you can "hook in" to the logic
> DistributedUpdateProcessor does while "locking" an id on the leader, and
> that would be extending the AtomicUpdateDocumentMerger...
>
> It's marked experimental, and I don't really understand the use cases
> for why it exists, and in order to customize this you would have to
> also subclass DistributedUpdateProcessorFactory to build your custom
> instance and pass it to the DistributedUpdateProcessor constructor, but
> then -- in theory -- you could intercept any document update *after* the
> RTG, and before it's written to the TLOG, and apply some business logic.
>
> But I wouldn't recommend this ... "there be Dragons!"

Thanks for this explanation. Yes, that's too dangerous and really not worth the effort. I think I am concluding this exercise now.
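For the archives, this is roughly the shape I had pictured for that hook before deciding against it. It is a rough, untested sketch: it assumes the experimental DistributedUpdateProcessor constructor that accepts a custom AtomicUpdateDocumentMerger, and StateCheckingMerger / isLegalTransition are made-up names standing in for my business rules:

import java.util.Map;

import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.processor.AtomicUpdateDocumentMerger;
import org.apache.solr.update.processor.DistributedUpdateProcessor;
import org.apache.solr.update.processor.DistributedUpdateProcessorFactory;
import org.apache.solr.update.processor.UpdateRequestProcessor;

public class StateCheckingUpdateProcessorFactory extends DistributedUpdateProcessorFactory {

  @Override
  public DistributedUpdateProcessor getInstance(SolrQueryRequest req,
      SolrQueryResponse rsp, UpdateRequestProcessor next) {
    // Pass a custom merger so the check runs inside the leader's per-id
    // "lock", after the RTG and before the update hits the TLOG.
    // NOTE: in SolrCloud mode the stock factory actually builds a
    // DistributedZkUpdateProcessor; this sketch glosses over that.
    return new DistributedUpdateProcessor(req, rsp, new StateCheckingMerger(req), next);
  }

  private static class StateCheckingMerger extends AtomicUpdateDocumentMerger {
    StateCheckingMerger(SolrQueryRequest req) {
      super(req);
    }

    @Override
    public SolrInputDocument merge(SolrInputDocument fromDoc, SolrInputDocument toDoc) {
      // fromDoc = the incoming atomic update, toDoc = the current doc from the RTG.
      Object requested = fromDoc.getFieldValue("state");
      // For an atomic update this value is a Map of ops, e.g. {set=NEW}; unwrap it.
      if (requested instanceof Map) {
        requested = ((Map<?, ?>) requested).get("set");
      }
      Object current = toDoc.getFieldValue("state");
      if (!isLegalTransition(current, requested)) {
        throw new SolrException(SolrException.ErrorCode.CONFLICT,
            "illegal state transition: " + current + " -> " + requested);
      }
      return super.merge(fromDoc, toDoc);
    }

    private boolean isLegalTransition(Object from, Object to) {
      return true; // placeholder for the real transition rules
    }
  }
}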
I will stick to my older implementation, where I handle state transitions on the client side using optimistic locking. -- Sachin
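P.S. For anyone finding this thread later: by "optimistic locking" I mean Solr's documented optimistic concurrency via the _version_ field -- do a real-time get, validate the transition, then send the update carrying the _version_ you read, and retry on a 409 conflict. A minimal, untested sketch in SolrJ (the collection name, the "state" field, and isLegalTransition are placeholders for my actual logic):

import java.util.Collections;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;

public class StateTransitionClient {

  // Retry loop: read the doc (and its _version_) with a real-time get,
  // check the transition, then send an atomic update carrying that
  // _version_. Solr rejects it with HTTP 409 if the doc changed meanwhile.
  public void transition(SolrClient client, String id, String newState) throws Exception {
    while (true) {
      SolrDocument current = client.getById("mycollection", id);
      String oldState = (String) current.getFirstValue("state");
      if (!isLegalTransition(oldState, newState)) {
        throw new IllegalStateException(oldState + " -> " + newState);
      }
      SolrInputDocument update = new SolrInputDocument();
      update.addField("id", id);
      update.addField("state", Collections.singletonMap("set", newState));
      // Exact-match version check: the update only succeeds if the doc
      // still has this version on the leader.
      update.addField("_version_", current.getFirstValue("_version_"));
      try {
        client.add("mycollection", update);
        return; // won the race
      } catch (SolrException e) {
        if (e.code() == 409) {
          continue; // version conflict: re-read and re-validate
        }
        throw e;
      }
    }
  }

  private boolean isLegalTransition(String from, String to) {
    return true; // placeholder for the real transition rules
  }
}

(As documented, a _version_ greater than 1 must match the doc's current version exactly; 1 means the doc must exist; a negative value means it must not exist.)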