Thanks you guys for the review. I will revert the GEODE-3520 ticket to reflect that invalidate should happen for both synchronous and asynchronous index maintenance. Also, I will resend the PR which reflects the changes mentioned in the ticket
Regards Nabarun Nag On Thu, Sep 14, 2017 at 5:55 PM Anilkumar Gingade <aging...@pivotal.io> wrote: > Dan, you are right...Thanks to Jason, myself and Jason looked into the code > to see if index is updated before the event/change is saved/injected into > the region...It looks like the index update are happening after the region > change/update is saved. Moving the index update before that is not an easy > task... > > For time, when there is any problem with index update, we can proceed with > invalidating the indexes...But we really need to look at making region and > index updates in a transactional way, silently invalidating indexes may not > be acceptable... > > -Anil. > > > > > On Thu, Sep 14, 2017 at 1:12 PM, Dan Smith <dsm...@pivotal.io> wrote: > > > I'm still going to push that we stick with Naba's original proposal. > > > > The current behavior is clearly broken. If one index update fails, an > > exception gets thrown to the user (nice!) but it leaves the put in a > > partially completed state - some other indexes may not have been updated, > > WAN/AEQs may not have been notified, etc. > > > > We should never leave the system in this corrupted state. It would be > nice > > to be able to cleanly rollback the put, but we don't have that capability > > especially considering that cache writers have already been invoked. So > the > > next best thing is to invalidate the index that failed to update. > > > > Logging an error an allowing the put to succeed does match what we do > with > > CacheListeners. Exceptions from CacheListeners do not fail the put. > > > > -Dan > > > > On Mon, Sep 11, 2017 at 3:29 PM, Jason Huynh <jhu...@pivotal.io> wrote: > > > > > Anil, we actually do have a case where the index is out of sync with > the > > > region currently. It's just not likely to happen but if there is an > > > exception from an index, the end result is that certain indexes get > > updated > > > and the region has already been updated. > > > However the exception is thrown back to the putter, so it becomes very > > > obvious something is wrong but I believe Naba has updated the ticket to > > > show a test that reproduces the problem... > > > > > > > > > On Mon, Sep 11, 2017 at 2:50 PM Anilkumar Gingade <aging...@pivotal.io > > > > > wrote: > > > > > > > The other way to look at it is; what happens to a cache op; when > there > > is > > > > an exception after Region.Entry is created? can it happen? In that > > case, > > > do > > > > we stick the entry into the Cache or not? If an exception is handled, > > how > > > > is it done, can we look at using the same for Index... > > > > > > > > Also previously, once the valid index is created (verified during > > create > > > or > > > > first put into the cache); we never had any issue where index is out > of > > > > sync with cache...If that changes with new futures (security?) then > we > > > may > > > > have to change the expectation with indexing... > > > > > > > > -Anil. > > > > > > > > > > > > > > > > On Mon, Sep 11, 2017 at 2:16 PM, Anthony Baker <aba...@pivotal.io> > > > wrote: > > > > > > > > > I’m confused. Once a cache update has been distributed to other > > > members > > > > > it can’t be undone. That update could have triggered myriad other > > > > > application behaviors. > > > > > > > > > > Anthony > > > > > > > > > > > On Sep 11, 2017, at 2:04 PM, Michael Stolz <mst...@pivotal.io> > > > wrote: > > > > > > > > > > > > Great, that's exactly the behavior I would expect. > > > > > > > > > > > > Thanks. > > > > > > > > > > > > -- > > > > > > Mike Stolz > > > > > > Principal Engineer, GemFire Product Manager > > > > > > Mobile: +1-631-835-4771 <(631)%20835-4771> > > > > > > > > > > > > On Mon, Sep 11, 2017 at 4:34 PM, Jason Huynh <jhu...@pivotal.io> > > > > wrote: > > > > > > > > > > > >> Hi Mike, I think the concern was less about the security portion > > but > > > > > rather > > > > > >> if any exception occurs during index update, right now, the > region > > > > gets > > > > > >> updated and the rest of the system (index/wan/callbacks) may or > > may > > > > not > > > > > be > > > > > >> updated. I think Naba just tried to provide an example where > this > > > > might > > > > > >> occur, but that specific scenario is invalid. > > > > > >> > > > > > >> I believe Nabarun has opened a ticket for rolling back the put > > > > operation > > > > > >> when an index exception occurs. GEODE-3589. It can probably be > > > > > modified to > > > > > >> state any exception instead of index exceptions. > > > > > >> > > > > > >> To summarize my understanding: > > > > > >> -Someone will need to implement the rollback for GEODE-3589. > This > > > > means > > > > > >> that if any exception occurs during a put, geode it will > propagate > > > > back > > > > > to > > > > > >> the user and it is expected the rollback mechanism will clean up > > any > > > > > >> partial put. > > > > > >> > > > > > >> GEODE-3520 should be modified to: > > > > > >> -Add the isValid() api to index interface > > > > > >> -Mark an index as invalid during async index updates but not for > > > > > >> synchronous index updates. The synchronous index updates will > > rely > > > > on a > > > > > >> rollback mechanism > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> On Mon, Sep 11, 2017 at 1:23 PM Michael Stolz < > mst...@pivotal.io> > > > > > wrote: > > > > > >> > > > > > >>> I think there was an intention of having CREATION of an index > > > > require a > > > > > >>> higher privilege than DATA:WRITE, but it shouldn't affect > > applying > > > > the > > > > > >>> index on either of put or get operations. > > > > > >>> > > > > > >>> If we are requiring something like CLUSTER:MANAGE for put on an > > > > indexed > > > > > >>> region, that is an incorrect requirement. Only DATA:WRITE > should > > be > > > > > >>> required to put an entry and have it be indexed if an index is > > > > present. > > > > > >>> > > > > > >>> -- > > > > > >>> Mike Stolz > > > > > >>> Principal Engineer, GemFire Product Manager > > > > > >>> Mobile: +1-631-835-4771 <(631)%20835-4771> <(631)%20835-4771> > > > > > >>> > > > > > >>> On Fri, Sep 8, 2017 at 6:04 PM, Anilkumar Gingade < > > > > aging...@pivotal.io > > > > > > > > > > > >>> wrote: > > > > > >>> > > > > > >>>> Indexes are critical for querying; most of the databases > doesn't > > > > allow > > > > > >>>> insert/update if there is any failure with index > maintenance... > > > > > >>>> > > > > > >>>> As Geode OQL supports two ways (sync and async) to maintain > the > > > > > >> indexes, > > > > > >>> we > > > > > >>>> need be careful about the error handling in both cases... > > > > > >>>> > > > > > >>>> My take is: > > > > > >>>> 1. For synchronous index maintenance: > > > > > >>>> If there is any failure in updating any index (security/auth > or > > > > > logical > > > > > >>>> error) on the region; throw an exception and rollback the > cache > > > > > >> update/op > > > > > >>>> (index management id done under region.entry lock - we should > be > > > > able > > > > > >> to > > > > > >>>> revert the op). If index or cache is left in bad state, then > > its a > > > > bug > > > > > >>> that > > > > > >>>> needs to be addressed. > > > > > >>>> > > > > > >>>> Most of the time, If there is any logical error in index, it > > will > > > be > > > > > >>>> detected as soon as index is created (on existing data) or > when > > > > first > > > > > >>>> update is done to the cache. > > > > > >>>> > > > > > >>>> 2. For Asynchronous index maintenance: > > > > > >>>> As this is async (assuming) user has good understanding of the > > > risk > > > > > >>>> involved with async, any error with index maintenance, the > index > > > > > should > > > > > >>> be > > > > > >>>> invalidated... > > > > > >>>> > > > > > >>>> About the security/auth, the user permission with region > > > read/write > > > > > >>> needs > > > > > >>>> to be applied for index updates, there should not be different > > > > > >> permission > > > > > >>>> on index. > > > > > >>>> > > > > > >>>> -Anil. > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> On Fri, Sep 8, 2017 at 2:01 PM, Nabarun Nag <n...@pivotal.io> > > > > wrote: > > > > > >>>> > > > > > >>>>> Hi Mike, > > > > > >>>>> > > > > > >>>>> Please do find our answers below: > > > > > >>>>> *Question:* What if there were multiple indices that were in > > > flight > > > > > >> and > > > > > >>>>> only the third > > > > > >>>>> one errors out, will they all be marked invalid? > > > > > >>>>> > > > > > >>>>> *Answer:* Only the third will be marked invalid and only the > > > third > > > > > >> one > > > > > >>>> will > > > > > >>>>> not be used for query execution. > > > > > >>>>> > > > > > >>>>> *Question/Statement:* If anything goes wrong with the put it > > > should > > > > > >>>>> probably still throw back to > > > > > >>>>> the caller. Silent invalidation of the index is probably not > > > > > >> desirable. > > > > > >>>>> > > > > > >>>>> *Answer: * > > > > > >>>>> In our current design this the flow of execution of a put > > > > operation: > > > > > >>>>> entry put into region -> update index -> other wan related > > > > > >> executions / > > > > > >>>>> callbacks etc. > > > > > >>>>> > > > > > >>>>> If an exception happens while updating the index, the cache > > gets > > > > > >> into a > > > > > >>>> bad > > > > > >>>>> state, and we may end up getting different results depending > on > > > the > > > > > >>> index > > > > > >>>>> we are using. As the failure happens half way in a put > > operation, > > > > the > > > > > >>>>> regions / cache are now in a bad state. > > > > > >>>>> -------------------------- > > > > > >>>>> We are thinking that if index is created over a method > > > invocation > > > > in > > > > > >>> an > > > > > >>>>> empty region and then we do puts, but method invocation is > not > > > > > >> allowed > > > > > >>> as > > > > > >>>>> per security policies. The puts will now be successful but > the > > > > index > > > > > >>> will > > > > > >>>>> be rendered invalid. Previously the puts will fail with > > exception > > > > and > > > > > >>> put > > > > > >>>>> the entire cache in a bad state. > > > > > >>>>> > > > > > >>>>> > > > > > >>>>> > > > > > >>>>> Regards > > > > > >>>>> Nabarun > > > > > >>>>> > > > > > >>>>> > > > > > >>>>> > > > > > >>>>> > > > > > >>>>> > > > > > >>>>> On Fri, Sep 8, 2017 at 10:43 AM Michael Stolz < > > mst...@pivotal.io > > > > > > > > > >>> wrote: > > > > > >>>>> > > > > > >>>>>> Just to help me understand, the index is corrupted in a way > > > beyond > > > > > >>> just > > > > > >>>>> the > > > > > >>>>>> field that errors out? > > > > > >>>>>> What if there were multiple indices that were in flight and > > only > > > > > >> the > > > > > >>>>> third > > > > > >>>>>> one errors out, will they all be marked invalid? > > > > > >>>>>> If anything goes wrong with the put it should probably still > > > throw > > > > > >>> back > > > > > >>>>> to > > > > > >>>>>> the caller. Silent invalidation of the index is probably not > > > > > >>> desirable. > > > > > >>>>>> > > > > > >>>>>> -- > > > > > >>>>>> Mike Stolz > > > > > >>>>>> Principal Engineer, GemFire Product Manager > > > > > >>>>>> Mobile: +1-631-835-4771 <(631)%20835-4771> > <(631)%20835-4771> > > > > <(631)%20835-4771> > > > > > >>>>>> > > > > > >>>>>> On Fri, Sep 8, 2017 at 12:34 PM, Dan Smith < > dsm...@pivotal.io > > > > > > > > >>> wrote: > > > > > >>>>>> > > > > > >>>>>>> +1 > > > > > >>>>>>> > > > > > >>>>>>> -Dan > > > > > >>>>>>> > > > > > >>>>>>> On Thu, Sep 7, 2017 at 9:14 PM, Nabarun Nag < > n...@apache.org > > > > > > > > >>> wrote: > > > > > >>>>>>> > > > > > >>>>>>>> *Proposal:* > > > > > >>>>>>>> * Index interface will include an API - isValid() which > will > > > > > >>> return > > > > > >>>>>> true > > > > > >>>>>>> if > > > > > >>>>>>>> the index is still valid / uncorrupted, else will return > > false > > > > > >> if > > > > > >>>> it > > > > > >>>>>>>> corrupted / invalid. > > > > > >>>>>>>> * gfsh command "list index" will have one more column "Is > > > > > >> Valid" > > > > > >>>>> which > > > > > >>>>>>> will > > > > > >>>>>>>> state the status as "true" or "false". > > > > > >>>>>>>> * Invalid indexes will not be used during query > executions. > > > > > >>>>>>>> * In case the index is found to be invalid, the user will > be > > > > > >> able > > > > > >>>> to > > > > > >>>>>>>> remove/destroy the index. > > > > > >>>>>>>> * When a put operation corrupts an index, it will be > logged. > > > > > >>>>>>>> > > > > > >>>>>>>> *Reasoning:* > > > > > >>>>>>>> * Currently if a put operation raises an exception while > > > > > >> updating > > > > > >>>> the > > > > > >>>>>>>> index, the put operation fails with an exception to the > > > putter. > > > > > >>>>>>>> * For example, if an index is created on an object method, > > and > > > > > >>> that > > > > > >>>>>>> method > > > > > >>>>>>>> causes an exception while updating the index as a part of > a > > > put > > > > > >>>>>>> operation, > > > > > >>>>>>>> then the put operation fails for that particular entry and > > the > > > > > >>>> index > > > > > >>>>> is > > > > > >>>>>>>> left in a bad state. > > > > > >>>>>>>> * This may occur also due to lack of permission to update > > > index > > > > > >>> but > > > > > >>>>>> have > > > > > >>>>>>>> permission to do puts. > > > > > >>>>>>>> * We are proposing that in the above mentioned scenarios, > > the > > > > > >> put > > > > > >>>>>>> succeeds > > > > > >>>>>>>> in putting the entry in the region but the index which it > > was > > > > > >>>> trying > > > > > >>>>> to > > > > > >>>>>>>> update will be marked invalid and will not be used for > query > > > > > >>>>>> executions. > > > > > >>>>>>>> * This can be justified because the corrupted index will > not > > > > > >>>>> correctly > > > > > >>>>>>>> represent the region entries. > > > > > >>>>>>>> > > > > > >>>>>>>> *Status Quo:* > > > > > >>>>>>>> * Index creation will still fail if we are trying to > create > > an > > > > > >>>> index > > > > > >>>>>>> over a > > > > > >>>>>>>> region containing an entry/entries which will cause an > > > > > >> exception > > > > > >>>>> while > > > > > >>>>>>>> loading the entry into the index. > > > > > >>>>>>>> > > > > > >>>>>>>> Regards > > > > > >>>>>>>> Nabarun Nag > > > > > >>>>>>>> > > > > > >>>>>>> > > > > > >>>>>> > > > > > >>>>> > > > > > >>>> > > > > > >>> > > > > > >> > > > > > > > > > > > > > > > > > > > >