[ 
https://issues.apache.org/jira/browse/GEODE-8465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Shu resolved GEODE-8465.
-----------------------------
    Fix Version/s: 1.14.0
       Resolution: Fixed

> A stray event can resurrect on a client due to a race condition
> ---------------------------------------------------------------
>
>                 Key: GEODE-8465
>                 URL: https://issues.apache.org/jira/browse/GEODE-8465
>             Project: Geode
>          Issue Type: Bug
>          Components: client queues
>    Affects Versions: 1.1.0
>            Reporter: Eric Shu
>            Assignee: Eric Shu
>            Priority: Major
>              Labels: caching-applications, pull-request-available
>             Fix For: 1.14.0
>
>
> A member with primary queue could send the QRM for the event before it 
> removes the event from its HARegionQueue – this causes a race that a newly 
> started member can gii and get the events from the member with primary queue 
> after it has already sent the QRM for the event to existing members holding 
> the secondary queue.
> The following analysis showing that the race exists. 
> {noformat}
> The stray create of Object_5234on edge 10634 shows this issue with this cause.
> From edge 10634, we know the stray create is for v3 of the object and from 
> bridge 15141.
> [fine 2020/06/17 16:38:49.621 PDT edgegemfire_2_1_host1_10634 <Cache Client 
> Updater Thread on 
> rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_15141:15141)<ec><v35>:41007(version:UNKNOWN[ordinal=125])
>  port 26905> tid=0x541] VersionedThinRegionEntryHeapStringKey2@1b56c386 
> (key=Object_5234; rawValue=VMCachedDeserializable@526996532; version=
> {v3; rv28; 
> mbr=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750)<ec><v27>:41007;
>  ds=2; time=1592434556093}
> ;member=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750)<ec><v27>:41007)
>  dispatching event 
> EntryEventImpl[op=CREATE;region=/DefaultRegion;key=Object_5234;oldValue=null;newValue=VMCachedDeserializable@526996532;callbackArg=Update
>  event originated in pid 
> 10655;originRemote=true;originMember=rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655;version=
> {v3; rv28; 
> mbr=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750)<ec><v27>:41007;
>  ds=2; time=1592434556093; remote}
> ;id=EventIDid=52bytes;threadID=52;sequenceID=108];isFromServer]
> [fine 2020/06/17 16:38:49.621 PDT edgegemfire_2_1_host1_10634 <Cache Client 
> Updater Thread on 
> rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_15141:15141)<ec><v35>:41007(version:UNKNOWN[ordinal=125])
>  port 26905> tid=0x541] Put entry for region: /DefaultRegion key: Object_5234 
> callbackArgument: Update event originated in pid 10655
> ***This version (v3) is an update. The primary queue at the time is 
> bridgegemfire_2_2_host1_10538. The event is enqueued and dispatched ***
> [fine 2020/06/17 15:55:56.098 PDT bridgegemfire_2_2_host1_10538 <P2P message 
> reader for 
> rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125])
>  unshared ordered uid=358 dom #1 port=43914> tid=0x16d] DefaultRegion: 
> notifying 1 bridge servers of event: 
> EntryEventImpl[op=UPDATE;region=/DefaultRegion;key=Object_5234;oldValue=VMCachedDeserializable@1538889434;newValue=VMCachedDeserializable@2141181294;callbackArg=Update
>  event originated in pid 
> 10655;originRemote=true;originMember=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125]);version=
> {v3; rv28; 
> mbr=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125]);
>  ds=2; time=1592434556093; remote}
> ;context=identity(rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655,connection=1;id=EventIDid=52bytes;threadID=52;sequenceID=108];routing=]
> [fine 2020/06/17 15:55:56.098 PDT bridgegemfire_2_2_host1_10538 <P2P message 
> reader for 
> rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125])
>  unshared ordered uid=358 dom #1 port=43914> tid=0x16d] 
> VMStatsRegionEntryHeapLongKey@38ccc0e5 (key=3867; 
> rawValue=HAEventWrapper[refCount=2; 
> msg=ClientUpdateMessageImpl[op=AFTER_UPDATE;region=/DefaultRegion;key=Object_5234;isObject=1;cbArg=Update
>  event originated in pid 
> 10655;memberId=identity(rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655,connection=1;eventId=EventIDid=52bytes;threadID=52;sequenceID=108];shouldConflate=false;versionTag=
> {v3; rv28; 
> mbr=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125]);
>  ds=2; time=1592434556093; remote}
> ;hasCqs=false]]) dispatching event 
> EntryEventImpl[op=CREATE;region=/_gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue;key=3867;oldValue=null;newValue=HAEventWrapper[refCount=2;
>  
> msg=ClientUpdateMessageImpl[op=AFTER_UPDATE;region=/DefaultRegion;key=Object_5234;isObject=1;cbArg=Update
>  event originated in pid 
> 10655;memberId=identity(rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655,connection=1;eventId=EventIDid=52bytes;threadID=52;sequenceID=108];shouldConflate=false;versionTag=
> {v3; rv28; 
> mbr=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125]);
>  ds=2; time=1592434556093; remote}
> ;hasCqs=false]];callbackArg=null;originRemote=false;originMember=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_2_host1_10538:10538)<ec><v2>:41006]
> ***QRM is sent to the 2 peers at the time with the region ***
> [fine 2020/06/17 15:55:56.213 PDT bridgegemfire_2_2_host1_10538 <Queue 
> Removal Thread> tid=0x83] Sending 
> (QueueRemovalMessage[_gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue,
>  1, EventIDid=52bytes;threadID=52;sequenceID=108], 
> _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_3_host1_10666:10666:loner):60462:26db73c4:edgegemfire_2_3_host1_10666_1_queue,
>  1, EventIDid=52bytes;threadID=52;sequenceID=108]]) to 2 peers 
> ([rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125])@277,
>  
> rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_4_host1_23574:23574)<ec><v31>:41011(version:UNKNOWN[ordinal=125])@566])
>  via tcp/ip
> ***the newly started bridgegemfire_2_3_24320 creates the HARegionQueue and 
> gii from the member with primary queue (10538) ***
> [info 2020/06/17 15:56:08.800 PDT <Client Queue Initialization Thread 1> 
> tid=0x76] Initializing region 
> _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue
> [info 2020/06/17 15:56:09.285 PDT <Client Queue Initialization Thread 1> 
> tid=0x76] 
> _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue
>  is done getting image from 
> rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_2_host1_10538:10538)<ec><v2>:41006(version:GEODE
>  1.3.0). isDeltaGII is false
> [debug 2020/06/17 15:56:09.377 PDT <Client Queue Initialization Thread 1> 
> tid=0x76] adding haContainerKey to HARegion at 
> 1:HAEventWrapper[region=/DefaultRegion; key=Object_5234; refCount=1; 
> putInProgress=0; 
> event=EventID[rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655;threadID=52;sequenceID=108];
>  no message] for 
> _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue
> [debug 2020/06/17 15:56:09.377 PDT <Client Queue Initialization Thread 1> 
> tid=0x76] invoking listeners: 
> [org.apache.geode.internal.cache.ha.HARegionQueue$1@2e69472b]
> [debug 2020/06/17 15:56:09.377 PDT <Client Queue Initialization Thread 1> 
> tid=0x76] Adding position 1 to available IDs. Region: 
> _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue
> [debug 2020/06/17 15:56:09.377 PDT <Client Queue Initialization Thread 1> 
> tid=0x76] RegionQueue on 
> _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue(backup)
>  done putting GII data into queue
> ***Please note that the event was being removed from its primary queue after 
> QRM has been sent and bridge 24320 gii the events from it ***
> [fine 2020/06/17 15:56:09.666 PDT bridgegemfire_2_2_host1_10538 <Client 
> Message Dispatcher for 
> rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634>
>  tid=0x84] VMStatsRegionEntryHeapLongKey@38ccc0e5 (key=3867; 
> rawValue=REMOVED_PHASE2) dispatching event 
> EntryEventImpl[op=LOCAL_DESTROY;region=/_gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue;key=3867;oldValue=HAEventWrapper[region=/DefaultRegion;
>  key=Object_5234; refCount=1; inContainer=true; putInProgress=false; 
> event=EventIDid=52bytes;threadID=52;sequenceID=108]; no 
> message];newValue=null;callbackArg=null;originRemote=false;originMember=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_2_host1_10538:10538)<ec><v2>:41006]
> ***Bridge 24320 now has the event for Object_5234(v3) that has been sent to 
> the client and won't get QRM again ***
> ***Later, bridge 15141 gii from 24320 and got the event ***
> [info 2020/06/17 16:17:36.454 PDT <Client Queue Initialization Thread 2> 
> tid=0x78] Initializing region 
> _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue
> [info 2020/06/17 16:17:40.187 PDT <Client Queue Initialization Thread 2> 
> tid=0x78] 
> _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue
>  is done getting image from 
> rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_3_host1_24320:24320)<ec><v33>:41010.
>  isDeltaGII is false
> [debug 2020/06/17 16:17:40.415 PDT <Client Queue Initialization Thread 2> 
> tid=0x78] 
> _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue
>  processing queue key 1 and value HAEventWrapper[region=/DefaultRegion; 
> key=Object_5234; refCount=1; putInProgress=0; 
> event=EventID[rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655;threadID=52;sequenceID=108];
>  no message]
> ***It will send the event to client after it becomes the primary queue holder 
> for client 10634 ***
> [debug 2020/06/17 16:38:49.125 PDT <ServerConnection on port 26905 Thread 5> 
> tid=0xb4] 
> CacheClientProxy[identity(rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634,connection=1;
>  port=53006; primary=true; version=GEODE 1.3.0]: Queueing marker message. 
> <org.apache.geode.internal.cache.tier.sockets.ClientMarkerMessageImpl@75bf95c8>.
>  The queue contains 57 entries.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to