[ https://issues.apache.org/jira/browse/GEODE-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930801#comment-15930801 ]
ASF subversion and git services commented on GEODE-2683:
--------------------------------------------------------

Commit 1036259eca5cabd84dbe94da4043b8f266f2b6e6 in geode's branch refs/heads/develop from zhouxh
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=1036259 ]

GEODE-2683: let BR.putAll/removeAll to distribute before notify gateway, which is the same order as put/destroy

> Lucene query did not match region values
> ----------------------------------------
>
>                 Key: GEODE-2683
>                 URL: https://issues.apache.org/jira/browse/GEODE-2683
>             Project: Geode
>          Issue Type: Bug
>            Reporter: xiaojian zhou
>            Assignee: xiaojian zhou
>             Fix For: 1.2.0
>
>
> There are several root causes. This one arises because the fix for #45782
> changed the order so that the primary bucket's gateway is notified before
> the operation is distributed to the secondary.
>
> The log is at /export/buglogs_bvt/xzhou/lucene/concParRegHA-0209-235804
>
> CLIENT vm_1_thr_17_dataStore1_ip-10-32-108-36_11189
> TASK[1] parReg.ParRegTest.HydraTask_HADoEntryOps
> ERROR util.TestException: util.TestException: Lucene query did not match region values.
> missingKeys=[], extraKeys=[Object_13, Object_17, Object_952, Object_550, Object_1876, Object_2732, Object_270, Object_4722, Object_4726, Object_2537]
>     at lucene.LuceneHelper.verifyLuceneIndex(LuceneHelper.java:88)
>     at lucene.LuceneTest.verifyLuceneIndex(LuceneTest.java:128)
>     at lucene.LuceneTest.verifyFromSnapshotOnly(LuceneTest.java:79)
>     at parReg.ParRegTest.verifyFromSnapshot(ParRegTest.java:5638)
>     at parReg.ParRegTest.concVerify(ParRegTest.java:6035)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:497)
>     at util.MethodCoordinator.executeOnce(MethodCoordinator.java:68)
>     at parReg.ParRegTest.HADoEntryOps(ParRegTest.java:2273)
>     at parReg.ParRegTest.HydraTask_HADoEntryOps(ParRegTest.java:1032)
>     at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:497)
>
> The root cause is:
> T1: A putAll (or removeAll) operation arrives at the primary bucket on memberA.
> T2: BR.virtualPut() calls handleWANEvent() and creates the shadow key.
> T3: putAll invokes the callback (i.e. writes into the AEQ) before distribution.
>     (Put/Destroy do not have this problem because they distribute before the
>     callback.)
> T4: handleSuccessfulBatchDispatch sends a ParallelQueueRemovalMessage to the
>     secondary bucket on memberB.
> T5: memberB has the dataRegion's secondary bucket, but the brq has not been
>     created yet (due to rebalance). So ParallelQueueRemovalMessage.process()
>     only tries to remove the event from the tempQueue (which does not contain
>     the event, so it does nothing).
> T6: Finally, the distribution from BR.virtualPut() arrives at the user
>     region's secondary bucket on memberB. The event is added to the AEQ (or
>     to the tempQueue, as the case may be).
> T7: memberB becomes the new primary (due to rebalance) and re-dispatches the
>     shadow key (which was already processed much earlier on memberA). The
>     data mismatch occurs because the replayed event overrides a newer event.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
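
The T1-T7 ordering problem described in the issue can be illustrated with a small, deliberately simplified Java sketch. It is not Geode code: the class OrderingSketch, the Secondary type, and the methods distribute, removeDispatched and leftover are hypothetical names invented for this illustration, and the sketch ignores the brq/tempQueue distinction and all real messaging. It only shows why a removal message that reaches the secondary before the distributed event leaves a stale event behind, while the reverse order does not.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class OrderingSketch {

    /** Hypothetical stand-in for the secondary bucket's event queue (not a Geode class). */
    static final class Secondary {
        final Deque<String> queue = new ArrayDeque<>();

        /** Models the distributed operation arriving at the secondary (T6). */
        void distribute(String shadowKey) {
            queue.add(shadowKey);
        }

        /** Models the removal message after a successful dispatch (T4/T5). */
        void removeDispatched(String shadowKey) {
            // No-op when the event has not arrived yet, as in T5.
            queue.remove(shadowKey);
        }
    }

    /**
     * Returns the events still queued on the secondary after one operation.
     * notifyBeforeDistribute = true models the putAll/removeAll order from the
     * bug report; false models the put/destroy order adopted by the commit.
     */
    static List<String> leftover(boolean notifyBeforeDistribute) {
        Secondary secondary = new Secondary();
        String shadowKey = "shadowKey-1"; // illustrative value only

        if (notifyBeforeDistribute) {
            secondary.removeDispatched(shadowKey); // removal arrives first, finds nothing
            secondary.distribute(shadowKey);       // event arrives late and stays queued
        } else {
            secondary.distribute(shadowKey);       // event arrives first
            secondary.removeDispatched(shadowKey); // removal then clears it
        }
        return new ArrayList<>(secondary.queue);
    }

    public static void main(String[] args) {
        System.out.println("notify before distribute: " + leftover(true));
        System.out.println("distribute before notify: " + leftover(false));
    }
}
```

In the first ordering the leftover event is what a new primary would replay (T7); in the second the secondary's queue ends up empty, so nothing stale remains to re-dispatch.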