[ https://issues.apache.org/jira/browse/GEODE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201785#comment-17201785 ]

Mario Salazar de Torres edited comment on GEODE-8535 at 9/24/20, 11:53 PM:
---------------------------------------------------------------------------

My hypothesis for this case is that the problem is caused by a time precision 
misalignment. The evidence supporting it is in the attached coredump.log and 
notifications-no-massif.log files:
 * The entry causing the crash is the one whose key is *entry-505993*, as can be 
seen at notifications-no-massif.log:34119.
 * Previous mentions of this key appear at notifications-no-massif.log:24867-24872:

{code:java}
[debug 2020/09/24 21:47:40.779570 CEST DESKTOP-3SQUK3P:746832 140626765563648] 
Entered entry expiry task handler for tombstone of key [entry-505993]: 
513315836275097ns,513315826409197ns,10ms,-134100ns
[debug 2020/09/24 21:47:40.779623 CEST DESKTOP-3SQUK3P:746832 140626765563648] 
Resetting expiry task 134100ns later for key [entry-505993]
[debug 2020/09/24 21:47:40.779661 CEST DESKTOP-3SQUK3P:746832 140626765563648] 
Entered entry expiry task handler for tombstone of key [entry-505993]: 
513315836390697ns,513315826409197ns,10ms,-18500ns
[debug 2020/09/24 21:47:40.779667 CEST DESKTOP-3SQUK3P:746832 140626765563648] 
Resetting expiry task 18500ns later for key [entry-505993]
[debug 2020/09/24 21:47:40.779676 CEST DESKTOP-3SQUK3P:746832 140626765563648] 
Entered entry expiry task handler for tombstone of key [entry-505993]: 
513315836408997ns,513315826409197ns,10ms,-200ns
[debug 2020/09/24 21:47:40.779681 CEST DESKTOP-3SQUK3P:746832 140626765563648] 
Resetting expiry task 200ns later for key [entry-505993]{code}
 * As can be seen, the expiry task handler is woken up 3 times, and on the last 
wake-up, when only 200ns remain until the expiry task should execute, there is 
no sign of the task being woken up again.
 * Looking into ExpiryTaskManager::resetTask, it uses an ACE_Time_Value variable 
whose minimum precision is microseconds.

*Therefore*, my guess is that because the remaining expiry time (200ns here) is 
below that microsecond precision, the delay truncates to zero when the reset is 
performed, so the task is considered done and the handler is destroyed.
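
To illustrate this, below is a minimal, self-contained sketch (not the actual 
geode-native or ACE code; it only mimics the conversion I suspect happens inside 
resetTask) showing how a 200ns remaining delay collapses to zero once expressed 
at microsecond resolution, the finest granularity ACE_Time_Value can hold:

{code:cpp}
// Hypothetical illustration of the precision loss: ACE_Time_Value only
// stores seconds + microseconds, so a sub-microsecond delay truncates to 0.
// This is NOT the ExpiryTaskManager code, just a model of the conversion.
#include <chrono>
#include <iostream>

int main() {
  using namespace std::chrono;

  // Remaining time until expiry, as logged for key entry-505993.
  const auto remaining = nanoseconds(200);

  // duration_cast truncates toward zero, exactly like storing the value
  // at microsecond resolution would.
  const auto stored = duration_cast<microseconds>(remaining);

  std::cout << "remaining: " << remaining.count() << "ns\n";  // prints 200
  std::cout << "stored:    " << stored.count() << "us\n";     // prints 0

  // A zero-length delay makes the reset schedule the task for "now" (or
  // treat it as already elapsed), so the handler can be torn down before
  // the next wake-up ever fires.
  return 0;
}
{code}

If resetTask really ends up with a zero delay here, any tombstone whose remaining 
lifetime drops below 1µs would hit this path, which would also explain why the 
crash only shows up under a very high notification rate.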



> Coredump while putting an entry to a LocalRegion
> ------------------------------------------------
>
>                 Key: GEODE-8535
>                 URL: https://issues.apache.org/jira/browse/GEODE-8535
>             Project: Geode
>          Issue Type: Bug
>          Components: native client
>    Affects Versions: 1.13.0
>            Reporter: Mario Salazar de Torres
>            Priority: Major
>         Attachments: coredump.log, notifications-no-massif.log
>
>
> The scenario is the following:
> *GIVEN* concurrency-checks-enabled=true (the default) for the region in which 
> the put operation is happening.
> *GIVEN* tombstone-timeout=10ms
> *WHENEVER* a huge load (hundreds per second) of LOCAL_CREATE and LOCAL_DESTROY 
> notifications is received in the client for the same region and consecutive 
> keys, as the example below shows:
> {code:java}
> t_0: LOCAL_CREATE for key entry-1
> t_1: LOCAL_DESTROY for key entry-1
> t_2: LOCAL_CREATE for key entry-2
> t_3: LOCAL_DESTROY for key entry-2
> ·
> ·
> ·
> t_(2*(n-1)): LOCAL_CREATE for key entry-n
> t_(2*n-1): LOCAL_DESTROY for key entry-n{code}
> *THEN* the application crashes in many different places; in the case reported 
> here, it crashes while trying to access the virtual destructor pointer of the 
> ExpiryHandlerTask, which turns out to be nullptr.
>  
> Find the segmentation fault report attached as *coredump.log* and the 
> geode-native debug log attached as *notifications-no-massif.log*.
>  


