[ 
https://issues.apache.org/jira/browse/GEODE-8436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183938#comment-17183938
 ] 

Alberto Bustamante Reyes edited comment on GEODE-8436 at 8/25/20, 11:12 AM:
----------------------------------------------------------------------------

I think there is a problem with the test. I have just run it 10 times on my 
laptop (Ubuntu) against the current content of the develop branch (that is, 
without the reverted commit), and it passed only twice. The other eight runs 
failed with the same exception you saw on the RHEL platform.

So it seems the commit for this ticket is not the root cause of the failure.

Example:
{code}
[error 2020/08/25 12:42:20.234553 CEST alb3rtobr-XPS:26235 140597717040256] 
Execute: An exception (org.apache.geode.cache.execute.FunctionException: 
org.apache.geode.internal.cache.execute.InternalFunctionInvocationTargetException:
 memberDeparted event for < alb3rtobr-XPS(GFECS19501:26645)<ec><v2>:41006 > 
crashed, false
        at 
org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResultInternal(PRFunctionStreamingResultCollector.java:115)
        at 
org.apache.geode.internal.cache.execute.ResultCollectorHolder.getResult(ResultCollectorHolder.java:53)
        at 
org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResult(PRFunctionStreamingResultCollector.java:88)
        at 
org.apache.geode.internal.cache.tier.sockets.command.ExecuteRegionFunction66.executeFunctionWithResult(ExecuteRegionFunction66.java:406)
        at 
org.apache.geode.internal.cache.tier.sockets.command.ExecuteRegionFunction66.cmdExecute(ExecuteRegionFunction66.java:201)
        at 
org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:183)
        at 
org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:848)
        at 
org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:72)
        at 
org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1212)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at 
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:686)
        at 
org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
        at java.lang.Thread.run(Thread.java:748)
Caused by: 
org.apache.geode.internal.cache.execute.InternalFunctionInvocationTargetException:
 memberDeparted event for < alb3rtobr-XPS(GFECS19501:26645)<ec><v2>:41006 > 
crashed, false
        at 
org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.memberDeparted(PRFunctionStreamingResultCollector.java:375)
        at 
org.apache.geode.distributed.internal.ClusterDistributionManager$MemberDepartedEvent.handleEvent(ClusterDistributionManager.java:2494)
        at 
org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEvent.handleEvent(ClusterDistributionManager.java:2424)
        at 
org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEvent.handleEvent(ClusterDistributionManager.java:2413)
        at 
org.apache.geode.distributed.internal.ClusterDistributionManager.handleMemberEvent(ClusterDistributionManager.java:1401)
        at 
org.apache.geode.distributed.internal.ClusterDistributionManager.access$200(ClusterDistributionManager.java:108)
        at 
org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEventInvoker.run(ClusterDistributionManager.java:1433)
        ... 1 more
) happened at remote server.

{code}


> Several threads calling PdxInstanceFactory::create() causes seg fault
> ---------------------------------------------------------------------
>
>                 Key: GEODE-8436
>                 URL: https://issues.apache.org/jira/browse/GEODE-8436
>             Project: Geode
>          Issue Type: Bug
>          Components: native client
>            Reporter: Alberto Bustamante Reyes
>            Assignee: Alberto Bustamante Reyes
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.14.0
>
>         Attachments: main.cpp
>
>
> I have seen a problem when "PdxInstanceFactory::create()" is called by 
> several threads that are registering the same new PDX type.
> The core dump is produced here:
> {code}
> void PdxInstanceImpl::toDataMutable(PdxWriter& writer) {
>    auto pt = getPdxType();
>    std::vector<std::shared_ptr<PdxFieldType>>* pdxFieldList =
>        pt->getPdxFieldTypes();
> {code}
> The problem is that "getPdxType()" returns nullptr, so the next line 
> causes a segmentation fault when it calls "pt->getPdxFieldTypes()".
> The issue can be reproduced by running the attached client with 
> 8 threads. This is the stack trace obtained in gdb:
> {code}
> #0  apache::geode::client::PdxType::getPdxFieldTypes (this=0x0) at 
> /home/alb3rtobr/CLionProjects/Nordix/geode-native/cppcache/src/PdxType.hpp:178
> #1  0x00007f43dc4651b7 in 
> apache::geode::client::PdxInstanceImpl::toDataMutable (this=0x7f43c0001600, 
> writer=...) at 
> /home/alb3rtobr/CLionProjects/Nordix/geode-native/cppcache/src/PdxInstanceImpl.cpp:1336
> #2  0x00007f43dc4650fd in apache::geode::client::PdxInstanceImpl::toData 
> (this=0x7f43c0001600, writer=...) at 
> /home/alb3rtobr/CLionProjects/Nordix/geode-native/cppcache/src/PdxInstanceImpl.cpp:1327
> #3  0x00007f43dc444971 in apache::geode::client::PdxHelper::serializePdx 
> (output=..., pdxObject=warning: RTTI symbol not found for class 
> 'std::_Sp_counted_ptr_inplace<apache::geode::client::PdxInstanceImpl, 
> std::allocator<apache::geode::client::PdxInstanceImpl>, 
> (__gnu_cxx::_Lock_policy)2>'
> warning: RTTI symbol not found for class 
> 'std::_Sp_counted_ptr_inplace<apache::geode::client::PdxInstanceImpl, 
> std::allocator<apache::geode::client::PdxInstanceImpl>, 
> (__gnu_cxx::_Lock_policy)2>'
> std::shared_ptr<apache::geode::client::PdxSerializable> (use count 3, weak 
> count 0) = {...})
>     at 
> /home/alb3rtobr/CLionProjects/Nordix/geode-native/cppcache/src/PdxHelper.cpp:77
> #4  0x00007f43dc44b4bc in apache::geode::client::PdxInstanceFactory::create 
> (this=0x7f43c7ffecc8) at 
> /home/alb3rtobr/CLionProjects/Nordix/geode-native/cppcache/src/PdxInstanceFactory.cpp:53
> #5  0x000000000040de2f in doPut () at 
> /home/alb3rtobr/CLionProjects/dummy-client/main.cpp:60
> #6  0x0000000000427767 in std::__invoke_impl<void, void (*)()> 
> (__f=@0x2561aa8: 0x40d860 <doPut()>) at 
> /usr/bin/../lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/invoke.h:60
> #7  0x00000000004276fd in std::__invoke<void (*)()> (__fn=@0x2561aa8: 
> 0x40d860 <doPut()>) at 
> /usr/bin/../lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/invoke.h:95
> #8  0x00000000004276d5 in std::thread::_Invoker<std::tuple<void (*)()> 
> >::_M_invoke<0ul> (this=0x2561aa8) at 
> /usr/bin/../lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/thread:234
> #9  0x00000000004276a5 in std::thread::_Invoker<std::tuple<void (*)()> 
> >::operator() (this=0x2561aa8) at 
> /usr/bin/../lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/thread:243
> #10 0x0000000000427589 in 
> std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > 
> >::_M_run (this=0x2561aa0)
> {code}
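
For illustration only, here is a minimal sketch of the general pattern the description points at: several threads registering the same new type, with lookup and registration done under a single lock so that no caller gets a null pointer back. Everything in it (FakeType, FakeTypeRegistry, registerConcurrently) is hypothetical and self-contained. It is not geode-native code and not necessarily the fix applied for this ticket.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a PDX type; NOT the geode-native PdxType class.
struct FakeType {
  std::string name;
  int id;
};

// Hypothetical registry. Lookup and registration share one lock, so a caller
// either finds the stored type or stores it; it never sees a half-registered
// entry and never receives a null pointer.
class FakeTypeRegistry {
 public:
  std::shared_ptr<FakeType> getOrRegister(const std::string& name) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = types_.find(name);
    if (it != types_.end()) {
      return it->second;  // another thread already registered this type
    }
    auto type = std::make_shared<FakeType>(FakeType{name, nextId_++});
    types_.emplace(name, type);
    return type;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return types_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::map<std::string, std::shared_ptr<FakeType>> types_;
  int nextId_ = 1;
};

// Starts threadCount threads that all register the same type name and
// returns the pointer each thread received. Each thread writes only its own
// slot of the result vector, so the vector itself needs no lock.
std::vector<std::shared_ptr<FakeType>> registerConcurrently(
    FakeTypeRegistry& registry, const std::string& name, int threadCount) {
  std::vector<std::shared_ptr<FakeType>> results(threadCount);
  std::vector<std::thread> threads;
  for (int i = 0; i < threadCount; ++i) {
    threads.emplace_back([&registry, &results, &name, i] {
      results[i] = registry.getOrRegister(name);
    });
  }
  for (auto& t : threads) {
    t.join();
  }
  return results;
}
```

Calling registerConcurrently(registry, "SomeType", 8) mirrors the 8-thread reproduction above: every thread should receive the same non-null pointer, and the registry should end up holding a single entry.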



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
