Thanks for the explanation Naba, please find my reply below: 1. Debugging: If we log a warning, that should get noticed immediately, so I don't think we need to worry about logs rolling. 2. Performance for a single put: We can always schedule an async task to drop the index.
On Wed, Sep 20, 2017 at 3:49 PM Nabarun Nag <n...@apache.org> wrote: > Hi Swapnil, > There were few factors we considered before going with just invalidating > the index rather than destroying the index. > 1. Debugging reasons : If the indexes were destroyed and logs roll over, > and suddenly we see that indexes have disappeared, it will be tough to > differentiate between whether the indexes were created improperly in the > first place [/ restarts] or if a put corrupted it hence it was destroyed. > In case of marking it invalid, we know for sure that a put has corrupted > the index and prevent confusion for the user. > > 2. Performance perspective : We were worried that if a put corrupts > multiple indexes and then that put is also responsible for destroying those > indexes may slow down the execution as it will have to acquire/release > locks to destroy indexes [especially in case of PR indexes]. We were also > worried about race conditions arising in that case. > > Regards > Nabarun > > On Wed, Sep 20, 2017 at 2:59 PM Swapnil Bawaskar <sbawas...@pivotal.io> > wrote: > > > Sorry for not reading this thread earlier, but I was wondering what is > the > > point of just invalidating the index and having it lie around if it is > not > > going to be used? > > Can we just drop the index instead, and log a warning message to that > > effect? This will free up the memory used by the index and will > proactively > > let the admin/user know what happened. > > > > On Wed, Sep 20, 2017 at 9:59 AM Nabarun Nag <n...@apache.org> wrote: > > > > > The PR #768 has been created for this issue and also GEODE-3520 has > been > > > changed to reflect this requirement. > > > > > > Regards > > > Nabarun > > > > > > On Thu, Sep 14, 2017 at 5:29 PM Nabarun Nag <n...@apache.org> wrote: > > > > > > > Thanks you guys for the review. I will revert the GEODE-3520 ticket > to > > > > reflect that invalidate should happen for both synchronous and > > > asynchronous > > > > index maintenance. > > > > Also, I will resend the PR which reflects the changes mentioned in > the > > > > ticket > > > > > > > > Regards > > > > Nabarun Nag > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 14, 2017 at 5:55 PM Anilkumar Gingade < > aging...@pivotal.io > > > > > > > wrote: > > > > > > > >> Dan, you are right...Thanks to Jason, myself and Jason looked into > the > > > >> code > > > >> to see if index is updated before the event/change is saved/injected > > > into > > > >> the region...It looks like the index update are happening after the > > > region > > > >> change/update is saved. Moving the index update before that is not > an > > > easy > > > >> task... > > > >> > > > >> For time, when there is any problem with index update, we can > proceed > > > with > > > >> invalidating the indexes...But we really need to look at making > region > > > and > > > >> index updates in a transactional way, silently invalidating indexes > > may > > > >> not > > > >> be acceptable... > > > >> > > > >> -Anil. > > > >> > > > >> > > > >> > > > >> > > > >> On Thu, Sep 14, 2017 at 1:12 PM, Dan Smith <dsm...@pivotal.io> > wrote: > > > >> > > > >> > I'm still going to push that we stick with Naba's original > proposal. > > > >> > > > > >> > The current behavior is clearly broken. If one index update fails, > > an > > > >> > exception gets thrown to the user (nice!) but it leaves the put > in a > > > >> > partially completed state - some other indexes may not have been > > > >> updated, > > > >> > WAN/AEQs may not have been notified, etc. > > > >> > > > > >> > We should never leave the system in this corrupted state. It would > > be > > > >> nice > > > >> > to be able to cleanly rollback the put, but we don't have that > > > >> capability > > > >> > especially considering that cache writers have already been > invoked. > > > So > > > >> the > > > >> > next best thing is to invalidate the index that failed to update. > > > >> > > > > >> > Logging an error an allowing the put to succeed does match what we > > do > > > >> with > > > >> > CacheListeners. Exceptions from CacheListeners do not fail the > put. > > > >> > > > > >> > -Dan > > > >> > > > > >> > On Mon, Sep 11, 2017 at 3:29 PM, Jason Huynh <jhu...@pivotal.io> > > > wrote: > > > >> > > > > >> > > Anil, we actually do have a case where the index is out of sync > > with > > > >> the > > > >> > > region currently. It's just not likely to happen but if there > is > > an > > > >> > > exception from an index, the end result is that certain indexes > > get > > > >> > updated > > > >> > > and the region has already been updated. > > > >> > > However the exception is thrown back to the putter, so it > becomes > > > very > > > >> > > obvious something is wrong but I believe Naba has updated the > > ticket > > > >> to > > > >> > > show a test that reproduces the problem... > > > >> > > > > > >> > > > > > >> > > On Mon, Sep 11, 2017 at 2:50 PM Anilkumar Gingade < > > > >> aging...@pivotal.io> > > > >> > > wrote: > > > >> > > > > > >> > > > The other way to look at it is; what happens to a cache op; > when > > > >> there > > > >> > is > > > >> > > > an exception after Region.Entry is created? can it happen? In > > that > > > >> > case, > > > >> > > do > > > >> > > > we stick the entry into the Cache or not? If an exception is > > > >> handled, > > > >> > how > > > >> > > > is it done, can we look at using the same for Index... > > > >> > > > > > > >> > > > Also previously, once the valid index is created (verified > > during > > > >> > create > > > >> > > or > > > >> > > > first put into the cache); we never had any issue where index > is > > > >> out of > > > >> > > > sync with cache...If that changes with new futures (security?) > > > then > > > >> we > > > >> > > may > > > >> > > > have to change the expectation with indexing... > > > >> > > > > > > >> > > > -Anil. > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > On Mon, Sep 11, 2017 at 2:16 PM, Anthony Baker < > > aba...@pivotal.io > > > > > > > >> > > wrote: > > > >> > > > > > > >> > > > > I’m confused. Once a cache update has been distributed to > > other > > > >> > > members > > > >> > > > > it can’t be undone. That update could have triggered myriad > > > other > > > >> > > > > application behaviors. > > > >> > > > > > > > >> > > > > Anthony > > > >> > > > > > > > >> > > > > > On Sep 11, 2017, at 2:04 PM, Michael Stolz < > > mst...@pivotal.io > > > > > > > >> > > wrote: > > > >> > > > > > > > > >> > > > > > Great, that's exactly the behavior I would expect. > > > >> > > > > > > > > >> > > > > > Thanks. > > > >> > > > > > > > > >> > > > > > -- > > > >> > > > > > Mike Stolz > > > >> > > > > > Principal Engineer, GemFire Product Manager > > > >> > > > > > Mobile: +1-631-835-4771 <(631)%20835-4771> > <(631)%20835-4771> > > <(631)%20835-4771> > > > <(631)%20835-4771> <(631)%20835-4771> > > > >> > > > > > > > > >> > > > > > On Mon, Sep 11, 2017 at 4:34 PM, Jason Huynh < > > > jhu...@pivotal.io > > > >> > > > > >> > > > wrote: > > > >> > > > > > > > > >> > > > > >> Hi Mike, I think the concern was less about the security > > > >> portion > > > >> > but > > > >> > > > > rather > > > >> > > > > >> if any exception occurs during index update, right now, > the > > > >> region > > > >> > > > gets > > > >> > > > > >> updated and the rest of the system (index/wan/callbacks) > > may > > > or > > > >> > may > > > >> > > > not > > > >> > > > > be > > > >> > > > > >> updated. I think Naba just tried to provide an example > > where > > > >> this > > > >> > > > might > > > >> > > > > >> occur, but that specific scenario is invalid. > > > >> > > > > >> > > > >> > > > > >> I believe Nabarun has opened a ticket for rolling back > the > > > put > > > >> > > > operation > > > >> > > > > >> when an index exception occurs. GEODE-3589. It can > > probably > > > be > > > >> > > > > modified to > > > >> > > > > >> state any exception instead of index exceptions. > > > >> > > > > >> > > > >> > > > > >> To summarize my understanding: > > > >> > > > > >> -Someone will need to implement the rollback for > > GEODE-3589. > > > >> This > > > >> > > > means > > > >> > > > > >> that if any exception occurs during a put, geode it will > > > >> propagate > > > >> > > > back > > > >> > > > > to > > > >> > > > > >> the user and it is expected the rollback mechanism will > > clean > > > >> up > > > >> > any > > > >> > > > > >> partial put. > > > >> > > > > >> > > > >> > > > > >> GEODE-3520 should be modified to: > > > >> > > > > >> -Add the isValid() api to index interface > > > >> > > > > >> -Mark an index as invalid during async index updates but > > not > > > >> for > > > >> > > > > >> synchronous index updates. The synchronous index updates > > > will > > > >> > rely > > > >> > > > on a > > > >> > > > > >> rollback mechanism > > > >> > > > > >> > > > >> > > > > >> > > > >> > > > > >> > > > >> > > > > >> > > > >> > > > > >> On Mon, Sep 11, 2017 at 1:23 PM Michael Stolz < > > > >> mst...@pivotal.io> > > > >> > > > > wrote: > > > >> > > > > >> > > > >> > > > > >>> I think there was an intention of having CREATION of an > > > index > > > >> > > > require a > > > >> > > > > >>> higher privilege than DATA:WRITE, but it shouldn't > affect > > > >> > applying > > > >> > > > the > > > >> > > > > >>> index on either of put or get operations. > > > >> > > > > >>> > > > >> > > > > >>> If we are requiring something like CLUSTER:MANAGE for > put > > on > > > >> an > > > >> > > > indexed > > > >> > > > > >>> region, that is an incorrect requirement. Only > DATA:WRITE > > > >> should > > > >> > be > > > >> > > > > >>> required to put an entry and have it be indexed if an > > index > > > is > > > >> > > > present. > > > >> > > > > >>> > > > >> > > > > >>> -- > > > >> > > > > >>> Mike Stolz > > > >> > > > > >>> Principal Engineer, GemFire Product Manager > > > >> > > > > >>> Mobile: +1-631-835-4771 <(631)%20835-4771> > <(631)%20835-4771> > > <(631)%20835-4771> > > > <(631)%20835-4771> > > > >> <(631)%20835-4771> <(631)%20835-4771> > > > >> > > > > >>> > > > >> > > > > >>> On Fri, Sep 8, 2017 at 6:04 PM, Anilkumar Gingade < > > > >> > > > aging...@pivotal.io > > > >> > > > > > > > > >> > > > > >>> wrote: > > > >> > > > > >>> > > > >> > > > > >>>> Indexes are critical for querying; most of the > databases > > > >> doesn't > > > >> > > > allow > > > >> > > > > >>>> insert/update if there is any failure with index > > > >> maintenance... > > > >> > > > > >>>> > > > >> > > > > >>>> As Geode OQL supports two ways (sync and async) to > > maintain > > > >> the > > > >> > > > > >> indexes, > > > >> > > > > >>> we > > > >> > > > > >>>> need be careful about the error handling in both > cases... > > > >> > > > > >>>> > > > >> > > > > >>>> My take is: > > > >> > > > > >>>> 1. For synchronous index maintenance: > > > >> > > > > >>>> If there is any failure in updating any index > > > (security/auth > > > >> or > > > >> > > > > logical > > > >> > > > > >>>> error) on the region; throw an exception and rollback > the > > > >> cache > > > >> > > > > >> update/op > > > >> > > > > >>>> (index management id done under region.entry lock - we > > > >> should be > > > >> > > > able > > > >> > > > > >> to > > > >> > > > > >>>> revert the op). If index or cache is left in bad state, > > > then > > > >> > its a > > > >> > > > bug > > > >> > > > > >>> that > > > >> > > > > >>>> needs to be addressed. > > > >> > > > > >>>> > > > >> > > > > >>>> Most of the time, If there is any logical error in > index, > > > it > > > >> > will > > > >> > > be > > > >> > > > > >>>> detected as soon as index is created (on existing data) > > or > > > >> when > > > >> > > > first > > > >> > > > > >>>> update is done to the cache. > > > >> > > > > >>>> > > > >> > > > > >>>> 2. For Asynchronous index maintenance: > > > >> > > > > >>>> As this is async (assuming) user has good understanding > > of > > > >> the > > > >> > > risk > > > >> > > > > >>>> involved with async, any error with index maintenance, > > the > > > >> index > > > >> > > > > should > > > >> > > > > >>> be > > > >> > > > > >>>> invalidated... > > > >> > > > > >>>> > > > >> > > > > >>>> About the security/auth, the user permission with > region > > > >> > > read/write > > > >> > > > > >>> needs > > > >> > > > > >>>> to be applied for index updates, there should not be > > > >> different > > > >> > > > > >> permission > > > >> > > > > >>>> on index. > > > >> > > > > >>>> > > > >> > > > > >>>> -Anil. > > > >> > > > > >>>> > > > >> > > > > >>>> > > > >> > > > > >>>> > > > >> > > > > >>>> On Fri, Sep 8, 2017 at 2:01 PM, Nabarun Nag < > > > n...@pivotal.io > > > >> > > > > >> > > > wrote: > > > >> > > > > >>>> > > > >> > > > > >>>>> Hi Mike, > > > >> > > > > >>>>> > > > >> > > > > >>>>> Please do find our answers below: > > > >> > > > > >>>>> *Question:* What if there were multiple indices that > > were > > > in > > > >> > > flight > > > >> > > > > >> and > > > >> > > > > >>>>> only the third > > > >> > > > > >>>>> one errors out, will they all be marked invalid? > > > >> > > > > >>>>> > > > >> > > > > >>>>> *Answer:* Only the third will be marked invalid and > only > > > the > > > >> > > third > > > >> > > > > >> one > > > >> > > > > >>>> will > > > >> > > > > >>>>> not be used for query execution. > > > >> > > > > >>>>> > > > >> > > > > >>>>> *Question/Statement:* If anything goes wrong with the > > put > > > it > > > >> > > should > > > >> > > > > >>>>> probably still throw back to > > > >> > > > > >>>>> the caller. Silent invalidation of the index is > probably > > > not > > > >> > > > > >> desirable. > > > >> > > > > >>>>> > > > >> > > > > >>>>> *Answer: * > > > >> > > > > >>>>> In our current design this the flow of execution of a > > put > > > >> > > > operation: > > > >> > > > > >>>>> entry put into region -> update index -> other wan > > related > > > >> > > > > >> executions / > > > >> > > > > >>>>> callbacks etc. > > > >> > > > > >>>>> > > > >> > > > > >>>>> If an exception happens while updating the index, the > > > cache > > > >> > gets > > > >> > > > > >> into a > > > >> > > > > >>>> bad > > > >> > > > > >>>>> state, and we may end up getting different results > > > >> depending on > > > >> > > the > > > >> > > > > >>> index > > > >> > > > > >>>>> we are using. As the failure happens half way in a put > > > >> > operation, > > > >> > > > the > > > >> > > > > >>>>> regions / cache are now in a bad state. > > > >> > > > > >>>>> -------------------------- > > > >> > > > > >>>>> We are thinking that if index is created over a > method > > > >> > > invocation > > > >> > > > in > > > >> > > > > >>> an > > > >> > > > > >>>>> empty region and then we do puts, but method > invocation > > is > > > >> not > > > >> > > > > >> allowed > > > >> > > > > >>> as > > > >> > > > > >>>>> per security policies. The puts will now be successful > > but > > > >> the > > > >> > > > index > > > >> > > > > >>> will > > > >> > > > > >>>>> be rendered invalid. Previously the puts will fail > with > > > >> > exception > > > >> > > > and > > > >> > > > > >>> put > > > >> > > > > >>>>> the entire cache in a bad state. > > > >> > > > > >>>>> > > > >> > > > > >>>>> > > > >> > > > > >>>>> > > > >> > > > > >>>>> Regards > > > >> > > > > >>>>> Nabarun > > > >> > > > > >>>>> > > > >> > > > > >>>>> > > > >> > > > > >>>>> > > > >> > > > > >>>>> > > > >> > > > > >>>>> > > > >> > > > > >>>>> On Fri, Sep 8, 2017 at 10:43 AM Michael Stolz < > > > >> > mst...@pivotal.io > > > >> > > > > > > >> > > > > >>> wrote: > > > >> > > > > >>>>> > > > >> > > > > >>>>>> Just to help me understand, the index is corrupted > in a > > > way > > > >> > > beyond > > > >> > > > > >>> just > > > >> > > > > >>>>> the > > > >> > > > > >>>>>> field that errors out? > > > >> > > > > >>>>>> What if there were multiple indices that were in > flight > > > and > > > >> > only > > > >> > > > > >> the > > > >> > > > > >>>>> third > > > >> > > > > >>>>>> one errors out, will they all be marked invalid? > > > >> > > > > >>>>>> If anything goes wrong with the put it should > probably > > > >> still > > > >> > > throw > > > >> > > > > >>> back > > > >> > > > > >>>>> to > > > >> > > > > >>>>>> the caller. Silent invalidation of the index is > > probably > > > >> not > > > >> > > > > >>> desirable. > > > >> > > > > >>>>>> > > > >> > > > > >>>>>> -- > > > >> > > > > >>>>>> Mike Stolz > > > >> > > > > >>>>>> Principal Engineer, GemFire Product Manager > > > >> > > > > >>>>>> Mobile: +1-631-835-4771 <(631)%20835-4771> > <(631)%20835-4771> > > <(631)%20835-4771> > > > <(631)%20835-4771> > > > >> <(631)%20835-4771> <(631)%20835-4771> > > > >> > > > <(631)%20835-4771> > > > >> > > > > >>>>>> > > > >> > > > > >>>>>> On Fri, Sep 8, 2017 at 12:34 PM, Dan Smith < > > > >> dsm...@pivotal.io > > > >> > > > > > >> > > > > >>> wrote: > > > >> > > > > >>>>>> > > > >> > > > > >>>>>>> +1 > > > >> > > > > >>>>>>> > > > >> > > > > >>>>>>> -Dan > > > >> > > > > >>>>>>> > > > >> > > > > >>>>>>> On Thu, Sep 7, 2017 at 9:14 PM, Nabarun Nag < > > > >> n...@apache.org > > > >> > > > > > >> > > > > >>> wrote: > > > >> > > > > >>>>>>> > > > >> > > > > >>>>>>>> *Proposal:* > > > >> > > > > >>>>>>>> * Index interface will include an API - isValid() > > which > > > >> will > > > >> > > > > >>> return > > > >> > > > > >>>>>> true > > > >> > > > > >>>>>>> if > > > >> > > > > >>>>>>>> the index is still valid / uncorrupted, else will > > > return > > > >> > false > > > >> > > > > >> if > > > >> > > > > >>>> it > > > >> > > > > >>>>>>>> corrupted / invalid. > > > >> > > > > >>>>>>>> * gfsh command "list index" will have one more > column > > > "Is > > > >> > > > > >> Valid" > > > >> > > > > >>>>> which > > > >> > > > > >>>>>>> will > > > >> > > > > >>>>>>>> state the status as "true" or "false". > > > >> > > > > >>>>>>>> * Invalid indexes will not be used during query > > > >> executions. > > > >> > > > > >>>>>>>> * In case the index is found to be invalid, the > user > > > >> will be > > > >> > > > > >> able > > > >> > > > > >>>> to > > > >> > > > > >>>>>>>> remove/destroy the index. > > > >> > > > > >>>>>>>> * When a put operation corrupts an index, it will > be > > > >> logged. > > > >> > > > > >>>>>>>> > > > >> > > > > >>>>>>>> *Reasoning:* > > > >> > > > > >>>>>>>> * Currently if a put operation raises an exception > > > while > > > >> > > > > >> updating > > > >> > > > > >>>> the > > > >> > > > > >>>>>>>> index, the put operation fails with an exception to > > the > > > >> > > putter. > > > >> > > > > >>>>>>>> * For example, if an index is created on an object > > > >> method, > > > >> > and > > > >> > > > > >>> that > > > >> > > > > >>>>>>> method > > > >> > > > > >>>>>>>> causes an exception while updating the index as a > > part > > > >> of a > > > >> > > put > > > >> > > > > >>>>>>> operation, > > > >> > > > > >>>>>>>> then the put operation fails for that particular > > entry > > > >> and > > > >> > the > > > >> > > > > >>>> index > > > >> > > > > >>>>> is > > > >> > > > > >>>>>>>> left in a bad state. > > > >> > > > > >>>>>>>> * This may occur also due to lack of permission to > > > update > > > >> > > index > > > >> > > > > >>> but > > > >> > > > > >>>>>> have > > > >> > > > > >>>>>>>> permission to do puts. > > > >> > > > > >>>>>>>> * We are proposing that in the above mentioned > > > scenarios, > > > >> > the > > > >> > > > > >> put > > > >> > > > > >>>>>>> succeeds > > > >> > > > > >>>>>>>> in putting the entry in the region but the index > > which > > > it > > > >> > was > > > >> > > > > >>>> trying > > > >> > > > > >>>>> to > > > >> > > > > >>>>>>>> update will be marked invalid and will not be used > > for > > > >> query > > > >> > > > > >>>>>> executions. > > > >> > > > > >>>>>>>> * This can be justified because the corrupted index > > > will > > > >> not > > > >> > > > > >>>>> correctly > > > >> > > > > >>>>>>>> represent the region entries. > > > >> > > > > >>>>>>>> > > > >> > > > > >>>>>>>> *Status Quo:* > > > >> > > > > >>>>>>>> * Index creation will still fail if we are trying > to > > > >> create > > > >> > an > > > >> > > > > >>>> index > > > >> > > > > >>>>>>> over a > > > >> > > > > >>>>>>>> region containing an entry/entries which will > cause > > an > > > >> > > > > >> exception > > > >> > > > > >>>>> while > > > >> > > > > >>>>>>>> loading the entry into the index. > > > >> > > > > >>>>>>>> > > > >> > > > > >>>>>>>> Regards > > > >> > > > > >>>>>>>> Nabarun Nag > > > >> > > > > >>>>>>>> > > > >> > > > > >>>>>>> > > > >> > > > > >>>>>> > > > >> > > > > >>>>> > > > >> > > > > >>>> > > > >> > > > > >>> > > > >> > > > > >> > > > >> > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > > > > > > > >