[ 
https://issues.apache.org/jira/browse/GEODE-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352894#comment-17352894
 ] 

Barrett Oglesby commented on GEODE-9299:
----------------------------------------

If I simulate this behavior with a sleep on key=5 in Put65, I see the same 
extra event in the queue.

Keys 0-4 are processed normally in servers 1 and 2:

Server 1:
{noformat}
ServerConnection on port 57561 Thread 1: Put65.cmdExecute processing key=0
ServerConnection on port 57561 Thread 1: 
ParallelGatewaySenderEventProcessor.enqueueEvent put dataKey=0; shadowKey=113
ServerConnection on port 57561 Thread 1: Put65.cmdExecute processing key=1
P2P message reader for 10.166.145.22(ln-2:85040)<v28>:41003 unshared ordered 
uid=8 dom #2 port=57607: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=1; shadowKey=114
ServerConnection on port 57561 Thread 1: Put65.cmdExecute processing key=2
P2P message reader for 10.166.145.22(ln-2:85040)<v28>:41003 unshared ordered 
uid=8 dom #2 port=57607: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=2; shadowKey=115
ServerConnection on port 57561 Thread 1: Put65.cmdExecute processing key=3
ServerConnection on port 57561 Thread 1: 
ParallelGatewaySenderEventProcessor.enqueueEvent put dataKey=3; shadowKey=116
ServerConnection on port 57561 Thread 1: Put65.cmdExecute processing key=4
P2P message reader for 10.166.145.22(ln-2:85040)<v28>:41003 unshared ordered 
uid=8 dom #2 port=57607: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=4; shadowKey=117
{noformat}
Server 2:
{noformat}
P2P message reader for 10.166.145.22(ln-1:85023)<v27>:41002 unshared ordered 
uid=8 dom #1 port=57606: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=0; shadowKey=113
P2P message reader for 10.166.145.22(ln-1:85023)<v27>:41002 unshared ordered 
uid=8 dom #1 port=57606: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=1; shadowKey=114
P2P message reader for 10.166.145.22(ln-1:85023)<v27>:41002 unshared ordered 
uid=8 dom #1 port=57606: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=2; shadowKey=115
P2P message reader for 10.166.145.22(ln-1:85023)<v27>:41002 unshared ordered 
uid=8 dom #1 port=57606: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=3; shadowKey=116
P2P message reader for 10.166.145.22(ln-1:85023)<v27>:41002 unshared ordered 
uid=8 dom #1 port=57606: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=4; shadowKey=117
{noformat}
The ServerConnection thread in server 1 sleeps before processing key=5:
{noformat}
ServerConnection on port 57561 Thread 1: Put65.cmdExecute processing key=5
ServerConnection on port 57561 Thread 1: Put65.cmdExecute sleeping key=5
{noformat}
The client times out and fails over to server2 and retries key=5 and continues 
with keys 6-9. Notice the event with key=5 has shadowKey=118. Thats the key in 
the queue.
{noformat}
ServerConnection on port 57587 Thread 2: Put65.cmdExecute processing retried 
key=5
P2P message reader for 10.166.145.22(ln-1:85023)<v27>:41002 unshared ordered 
uid=10 dom #2 port=57668: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=5; shadowKey=118
ServerConnection on port 57587 Thread 2: Put65.cmdExecute processing key=6
ServerConnection on port 57587 Thread 2: 
ParallelGatewaySenderEventProcessor.enqueueEvent put dataKey=6; shadowKey=119
ServerConnection on port 57587 Thread 2: Put65.cmdExecute processing key=7
P2P message reader for 10.166.145.22(ln-1:85023)<v27>:41002 unshared ordered 
uid=10 dom #2 port=57668: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=7; shadowKey=120
ServerConnection on port 57587 Thread 2: Put65.cmdExecute processing key=8
ServerConnection on port 57587 Thread 2: 
ParallelGatewaySenderEventProcessor.enqueueEvent put dataKey=8; shadowKey=121
ServerConnection on port 57587 Thread 2: Put65.cmdExecute processing key=9
P2P message reader for 10.166.145.22(ln-1:85023)<v27>:41002 unshared ordered 
uid=10 dom #2 port=57668: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=9; shadowKey=122
{noformat}
Server 1 enqueues keys 5-9:
{noformat}
P2P message reader for 10.166.145.22(ln-2:85040)<v28>:41003 unshared ordered 
uid=10 dom #1 port=57664: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=5; shadowKey=118
P2P message reader for 10.166.145.22(ln-2:85040)<v28>:41003 unshared ordered 
uid=10 dom #1 port=57664: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=6; shadowKey=119
P2P message reader for 10.166.145.22(ln-2:85040)<v28>:41003 unshared ordered 
uid=10 dom #1 port=57664: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=7; shadowKey=120
P2P message reader for 10.166.145.22(ln-2:85040)<v28>:41003 unshared ordered 
uid=10 dom #1 port=57664: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=8; shadowKey=121
P2P message reader for 10.166.145.22(ln-2:85040)<v28>:41003 unshared ordered 
uid=10 dom #1 port=57664: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=9; shadowKey=122
{noformat}
The ServerConnection thread in server 1 wakes up and processes key=5. 
DistributedEventTracker.hasSeenEvent realizes its a duplicate for the data 
region. That doesn't precent the enqueueing of the event. Notice the event is 
the same EventID, but a different shadowKey (231). So, this is a second event 
in the queue for the same event.
{noformat}
ServerConnection on port 57561 Thread 1: Put65.cmdExecute woke up key=5
ServerConnection on port 57561 Thread 1: DistributedEventTracker.hasSeenEvent 
event=EntryEventImpl[op=UPDATE;region=/__PR/_B__data_5;key=5;id=EventID[id=25 
bytes;threadID=1;sequenceID=5];tailKey=231]; highestSequenceNumber=5
ServerConnection on port 57561 Thread 1: 
ParallelGatewaySenderEventProcessor.enqueueEvent put dataKey=5; shadowKey=231
{noformat}
Server 2 processes that duplicate event the same way:
{noformat}
P2P message reader for 10.166.145.22(ln-1:85023)<v27>:41002 unshared ordered 
uid=8 dom #1 port=57606: DistributedEventTracker.hasSeenEvent 
event=EntryEventImpl[op=UPDATE;region=/__PR/_B__data_5;key=5;id=EventID[id=25 
bytes;threadID=1;sequenceID=5];routing=;tailKey=231]; highestSequenceNumber=5
P2P message reader for 10.166.145.22(ln-1:85023)<v27>:41002 unshared ordered 
uid=8 dom #1 port=57606: ParallelGatewaySenderEventProcessor.enqueueEvent put 
dataKey=5; shadowKey=231
{noformat}
Dumping the queues on each server shows 11 events in each queue with 2 events 
for key=5.

Server 1:
{noformat}
The queue for parallel GatewaySender ny contains the following 6 primary 
entries:
                queueKey=113; region=/data; operation=CREATE; eventKey=0; 
value=0; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10000|1;sequenceID=0;bucketID=0]
                queueKey=116; region=/data; operation=CREATE; eventKey=3; 
value=3; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10003|1;sequenceID=3;bucketID=3]
                queueKey=118; region=/data; operation=CREATE; eventKey=5; 
value=5; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10005|1;sequenceID=5;bucketID=5]
                queueKey=120; region=/data; operation=CREATE; eventKey=7; 
value=7; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10007|1;sequenceID=7;bucketID=7]
                queueKey=122; region=/data; operation=CREATE; eventKey=9; 
value=9; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10009|1;sequenceID=9;bucketID=9]
                queueKey=231; region=/data; operation=UPDATE; eventKey=5; 
value=5; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10005|1;sequenceID=5;bucketID=5]

The queue for parallel GatewaySender ny contains the following 5 secondary 
entries:
                queueKey=114; region=/data; operation=CREATE; eventKey=1; 
value=1; possibleDuplicate=false; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10001|1;sequenceID=1;bucketID=1]
                queueKey=115; region=/data; operation=CREATE; eventKey=2; 
value=2; possibleDuplicate=false; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10002|1;sequenceID=2;bucketID=2]
                queueKey=117; region=/data; operation=CREATE; eventKey=4; 
value=4; possibleDuplicate=false; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10004|1;sequenceID=4;bucketID=4]
                queueKey=119; region=/data; operation=CREATE; eventKey=6; 
value=6; possibleDuplicate=false; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10006|1;sequenceID=6;bucketID=6]
                queueKey=121; region=/data; operation=CREATE; eventKey=8; 
value=8; possibleDuplicate=false; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10008|1;sequenceID=8;bucketID=8]
{noformat}
Server 2:
{noformat}
The queue for parallel GatewaySender ny contains the following 5 primary 
entries:
                queueKey=114; region=/data; operation=CREATE; eventKey=1; 
value=1; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10001|1;sequenceID=1;bucketID=1]
                queueKey=115; region=/data; operation=CREATE; eventKey=2; 
value=2; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10002|1;sequenceID=2;bucketID=2]
                queueKey=117; region=/data; operation=CREATE; eventKey=4; 
value=4; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10004|1;sequenceID=4;bucketID=4]
                queueKey=119; region=/data; operation=CREATE; eventKey=6; 
value=6; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10006|1;sequenceID=6;bucketID=6]
                queueKey=121; region=/data; operation=CREATE; eventKey=8; 
value=8; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10008|1;sequenceID=8;bucketID=8]

The queue for parallel GatewaySender ny contains the following 6 secondary 
entries:
                queueKey=113; region=/data; operation=CREATE; eventKey=0; 
value=0; possibleDuplicate=false; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10000|1;sequenceID=0;bucketID=0]
                queueKey=116; region=/data; operation=CREATE; eventKey=3; 
value=3; possibleDuplicate=false; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10003|1;sequenceID=3;bucketID=3]
                queueKey=118; region=/data; operation=CREATE; eventKey=5; 
value=5; possibleDuplicate=false; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10005|1;sequenceID=5;bucketID=5]
                queueKey=120; region=/data; operation=CREATE; eventKey=7; 
value=7; possibleDuplicate=false; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10007|1;sequenceID=7;bucketID=7]
                queueKey=122; region=/data; operation=CREATE; eventKey=9; 
value=9; possibleDuplicate=false; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10009|1;sequenceID=9;bucketID=9]
                queueKey=231; region=/data; operation=UPDATE; eventKey=5; 
value=5; possibleDuplicate=true; 
eventId=EventID[10.166.145.22(:loner):57600:533faaaf;threadID=0x10005|1;sequenceID=5;bucketID=5]
{noformat}
If I start the remote site, I see logging in 
DistributedEventTracker.hasSeenEvent that ignores the duplicate event, and the 
region size is correct:
{noformat}
ServerConnection on port 5417 Thread 8: DistributedEventTracker.hasSeenEvent 
event=EntryEventImpl[op=UPDATE;region=/__PR/_B__data_5;key=5;id=EventID[id=25 
bytes;threadID=0x10005|1;sequenceID=5;bucketId=5];tailKey=231]; 
highestSequenceNumber=5
Region name=data; size=10
{noformat}
I think the way to address this issue is to change the equals check to greater 
than or equals.



> CI Failure: 
> WANRollingUpgradeSecondaryEventsNotReprocessedAfterOldSiteMemberFailover > 
> testSecondaryEventsNotReprocessedAfterOldSiteMemberFailover
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-9299
>                 URL: https://issues.apache.org/jira/browse/GEODE-9299
>             Project: Geode
>          Issue Type: Bug
>          Components: wan
>    Affects Versions: 1.15.0
>            Reporter: Hale Bales
>            Assignee: Barrett Oglesby
>            Priority: Major
>
> {code:java}
> org.apache.geode.cache.wan.WANRollingUpgradeSecondaryEventsNotReprocessedAfterOldSiteMemberFailover
>  > testSecondaryEventsNotReprocessedAfterOldSiteMemberFailover[from_v1.12.2] 
> FAILED
>     java.lang.AssertionError: expected:<100> but was:<101>
>         at org.junit.Assert.fail(Assert.java:89)
>         at org.junit.Assert.failNotEquals(Assert.java:835)
>         at org.junit.Assert.assertEquals(Assert.java:647)
>         at org.junit.Assert.assertEquals(Assert.java:633)
>         at 
> org.apache.geode.cache.wan.WANRollingUpgradeDUnitTest.stopSenderAndVerifyEvents(WANRollingUpgradeDUnitTest.java:227)
>         at 
> org.apache.geode.cache.wan.WANRollingUpgradeSecondaryEventsNotReprocessedAfterOldSiteMemberFailover.testSecondaryEventsNotReprocessedAfterOldSiteMemberFailover(WANRollingUpgradeSecondaryEventsNotReprocessedAfterOldSiteMemberFailover.java:98)
> {code}
> CI Failure: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UpgradeTestOpenJDK11/builds/229#B
> Artifacts Available here: 
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0253/test-results/upgradeTest/1621635640/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to