[ https://issues.apache.org/jira/browse/GEODE-10148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522937#comment-17522937 ]
Barrett Oglesby commented on GEODE-10148:
-----------------------------------------

I think this is where the problem is:

{{LocalManager.startLocalManagement}} runs the {{ManagementTask}} once right when it starts. With logging added, the call to {{managementTask.get().run()}} returns right away. Even though the comment says it's a synchronous call, it isn't.
{noformat}
[vm3] [warn 2022/03/23 16:16:02.173 PDT server-3 <RMI TCP Connection(1)-127.0.0.1> tid=0x12] XXX LocalManager.startLocalManagement about to run managementTask
[vm3] [warn 2022/03/23 16:16:02.173 PDT server-3 <RMI TCP Connection(1)-127.0.0.1> tid=0x12] XXX LocalManager.startLocalManagement done managementTask
{noformat}
Then, {{LocalManager.markForFederation}} adds the mbeans to the {{federatedComponentMap}}:
{noformat}
[vm3] [warn 2022/03/23 16:16:02.209 PDT server-3 <RMI TCP Connection(1)-127.0.0.1> tid=0x12] XXX LocalManager.markForFederation about to add to federatedComponentMap objName=GemFire:type=Member,member=server-3
[vm3] [warn 2022/03/23 16:16:02.364 PDT server-3 <RMI TCP Connection(1)-127.0.0.1> tid=0x12] XXX LocalManager.markForFederation about to add to federatedComponentMap objName=GemFire:service=Region,name="/test-region-1",type=Member,member=server-3
[vm3] [warn 2022/03/23 16:16:02.437 PDT server-3 <RMI TCP Connection(1)-127.0.0.1> tid=0x12] XXX LocalManager.markForFederation about to add to federatedComponentMap objName=GemFire:service=CacheServer,port=20017,type=Member,member=server-3
{noformat}
The CacheServer mbean above is the one that is missing in the failed run.

Then, the {{Management Task}} thread runs the {{ManagementTask}} started above to put the mbeans into the region:
{noformat}
[vm3] [warn 2022/03/23 16:16:04.177 PDT server-3 <Management Task1> tid=0x46] XXX LocalManager.doManagementTask about to putAll replicaMap={GemFire:service=CacheServer,port=20017,type=Member,member=server-3=ObjectName = GemFire:service=CacheServer,port=20017,type=Member,member=server-3, GemFire:service=Region,name="/test-region-1",type=Member,member=server-3=ObjectName = GemFire:service=Region,name="/test-region-1",type=Member,member=server-3, GemFire:type=Member,member=server-3=ObjectName = GemFire:type=Member,member=server-3}
[vm3] [warn 2022/03/23 16:16:04.211 PDT server-3 <Management Task1> tid=0x46] XXX LocalManager.doManagementTask done putAll replicaMap={GemFire:service=CacheServer,port=20017,type=Member,member=server-3=ObjectName = GemFire:service=CacheServer,port=20017,type=Member,member=server-3, GemFire:service=Region,name="/test-region-1",type=Member,member=server-3=ObjectName = GemFire:service=Region,name="/test-region-1",type=Member,member=server-3, GemFire:type=Member,member=server-3=ObjectName = GemFire:type=Member,member=server-3}
{noformat}
If the {{Management Task}} thread runs after the Region mbean has been added to the {{federatedComponentMap}} but before the CacheServer mbean has been added, this issue would reproduce.
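
The ordering described above can be illustrated with a minimal, single-threaded sketch. It is not Geode's {{LocalManager}}: the class {{FederationRaceSketch}}, the method {{replicatedAfter}}, and the plain maps standing in for {{federatedComponentMap}} and the region are all hypothetical, and the asynchronous thread is replaced by an explicit call so the interleaving is deterministic. The only things taken from the logs are the three mbean names and their registration order.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the flow in the comment; not Geode code. */
class FederationRaceSketch {

    // MBean names from the log output, in the order markForFederation logged them.
    static final List<String> MBEANS = List.of(
        "GemFire:type=Member,member=server-3",
        "GemFire:service=Region,name=\"/test-region-1\",type=Member,member=server-3",
        "GemFire:service=CacheServer,port=20017,type=Member,member=server-3");

    // Assumption: plays the role of federatedComponentMap.
    private final Map<String, String> federatedComponentMap = new ConcurrentHashMap<>();

    // Assumption: plays the role of the region the task writes into.
    private final Map<String, String> replicatedRegion = new ConcurrentHashMap<>();

    void markForFederation(String objectName) {
        federatedComponentMap.put(objectName, "ObjectName = " + objectName);
    }

    // Copies only what is in the map at the moment the task runs.
    void doManagementTask() {
        replicatedRegion.putAll(federatedComponentMap);
    }

    /**
     * Runs the task once, after the given number of mbeans has been
     * registered, and returns the names that reached the region.
     */
    static Set<String> replicatedAfter(int registeredBeforeTask) {
        FederationRaceSketch manager = new FederationRaceSketch();
        for (int i = 0; i < MBEANS.size(); i++) {
            if (i == registeredBeforeTask) {
                manager.doManagementTask();
            }
            manager.markForFederation(MBEANS.get(i));
        }
        if (registeredBeforeTask >= MBEANS.size()) {
            manager.doManagementTask();
        }
        return new TreeSet<>(manager.replicatedRegion.keySet());
    }

    public static void main(String[] args) {
        // Ordering seen in the logs: the task runs after all three registrations.
        System.out.println("after all registrations: " + replicatedAfter(3));
        // Suspected failing ordering: the task runs between Region and CacheServer.
        System.out.println("between Region and CacheServer: " + replicatedAfter(2));
    }
}
```

With {{replicatedAfter(3)}} all three names are copied, which matches the logged run where the task fired at 16:16:04.177, after the last registration at 16:16:02.437. With {{replicatedAfter(2)}} the CacheServer name is absent from the copy, which is the shape of the assertion failure in the issue description.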

> [CI Failure] : JMXMBeanFederationDUnitTest > MBeanFederationAddRemoveServer > FAILED
> ----------------------------------------------------------------------------------
>
> Key: GEODE-10148
> URL: https://issues.apache.org/jira/browse/GEODE-10148
> Project: Geode
> Issue Type: Bug
> Components: jmx
> Affects Versions: 1.15.0
> Reporter: Nabarun Nag
> Priority: Major
> Labels: test-stability
>
> JMXMBeanFederationDUnitTest > MBeanFederationAddRemoveServer FAILED
> java.lang.AssertionError:
> Expecting actual:
> ["GemFire:service=AccessControl,type=Distributed",
> "GemFire:service=CacheServer,port=20842,type=Member,member=server-1",
> "GemFire:service=CacheServer,port=20846,type=Member,member=server-2",
> "GemFire:service=DiskStore,name=cluster_config,type=Member,member=locator-one",
> "GemFire:service=FileUploader,type=Distributed",
> "GemFire:service=Locator,type=Member,member=locator-one",
> "GemFire:service=LockService,name=__CLUSTER_CONFIG_LS,type=Distributed",
> "GemFire:service=LockService,name=__CLUSTER_CONFIG_LS,type=Member,member=locator-one",
> "GemFire:service=Manager,type=Member,member=locator-one",
> "GemFire:service=Region,name="/test-region-1",type=Distributed",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-1",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-2",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-3",
> "GemFire:service=System,type=Distributed",
> "GemFire:type=Member,member=locator-one",
> "GemFire:type=Member,member=server-1",
> "GemFire:type=Member,member=server-2",
> "GemFire:type=Member,member=server-3"]
> to contain exactly (and in same order):
> ["GemFire:service=AccessControl,type=Distributed",
> "GemFire:service=CacheServer,port=20842,type=Member,member=server-1",
> "GemFire:service=CacheServer,port=20846,type=Member,member=server-2",
> "GemFire:service=CacheServer,port=20850,type=Member,member=server-3",
> "GemFire:service=DiskStore,name=cluster_config,type=Member,member=locator-one",
> "GemFire:service=FileUploader,type=Distributed",
> "GemFire:service=Locator,type=Member,member=locator-one",
> "GemFire:service=LockService,name=__CLUSTER_CONFIG_LS,type=Distributed",
> "GemFire:service=LockService,name=__CLUSTER_CONFIG_LS,type=Member,member=locator-one",
> "GemFire:service=Manager,type=Member,member=locator-one",
> "GemFire:service=Region,name="/test-region-1",type=Distributed",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-1",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-2",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-3",
> "GemFire:service=System,type=Distributed",
> "GemFire:type=Member,member=locator-one",
> "GemFire:type=Member,member=server-1",
> "GemFire:type=Member,member=server-2",
> "GemFire:type=Member,member=server-3"]
> but could not find the following elements:
> ["GemFire:service=CacheServer,port=20850,type=Member,member=server-3"]
> at org.apache.geode.management.internal.JMXMBeanFederationDUnitTest.MBeanFederationAddRemoveServer(JMXMBeanFederationDUnitTest.java:130)
> 8352 tests completed, 1 failed, 414 skipped

--
This message was sent by Atlassian Jira (v8.20.1#820001)