[ https://issues.apache.org/jira/browse/GEODE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jakov Varenina updated GEODE-10123:
-----------------------------------
    Description:
We have observed that the "create region" command sometimes executes successfully during a period of server restarts even though the region is already available in the cluster configuration. When this happens, the cluster configuration contains a duplicated region and servers throw the exception *org.apache.geode.cache.RegionExistsException* when started again:
{code:java}
Exception in thread "main" org.apache.geode.cache.CacheXmlException: While parsing XML, caused by org.xml.sax.SAXException: A CacheException was thrown while parsing XML.
org.apache.geode.cache.RegionExistsException: /regionTest
    at org.apache.geode.internal.cache.xmlcache.CacheXmlParser.parse(CacheXmlParser.java:267)
    at org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4119)
    at org.apache.geode.internal.cache.ClusterConfigurationLoader.applyClusterXmlConfiguration(ClusterConfigurationLoader.java:208)
    at org.apache.geode.internal.cache.GemFireCacheImpl.applyJarAndXmlFromClusterConfig(GemFireCacheImpl.java:1426)
    at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1391)
    at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191)
    at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158)
    at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142)
    at org.apache.geode.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:52)
    at org.apache.geode.distributed.ServerLauncher.createCache(ServerLauncher.java:892)
    at org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:807)
    at org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:737)
    at org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:256)
Caused by: org.xml.sax.SAXException: A CacheException was thrown while parsing XML.
{code}
CreateRegionCommand checks whether the region already exists using the DistributedRegionMXBean service. This service is sometimes unreliable during restarts, as it takes some time for the locator to accumulate this information. So while the servers are starting up, the locator sends CreateRegionFunction to them, since it could not detect through the DistributedRegionMXBean service that the region already exists. In most scenarios, the server rejects that function. The only case in which the server executes the function is when it uses a local cache.xml configuration in addition to the cluster configuration. It seems that processing the local cache.xml opens a time gap during which the server accepts CreateRegionFunction, although the region is already available in the cluster configuration and will be applied to that server.

The solution would be to perform the required checks against the cluster configuration stored on the locators, when cluster configuration is in use, in addition to the checks done using DistributedRegionMXBean.

  was:
We have observed that the "create region" command sometimes executes successfully during a period of server restarts even though the region is already available in the cluster configuration. When this happens, the cluster configuration contains a duplicated region, so servers throw the following exception *org.apache.geode.cache.RegionExistsException* at startup:
{code:java}
Exception in thread "main" org.apache.geode.cache.CacheXmlException: While parsing XML, caused by org.xml.sax.SAXException: A CacheException was thrown while parsing XML.
org.apache.geode.cache.RegionExistsException: /regionTest
    at org.apache.geode.internal.cache.xmlcache.CacheXmlParser.parse(CacheXmlParser.java:267)
    at org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4119)
    at org.apache.geode.internal.cache.ClusterConfigurationLoader.applyClusterXmlConfiguration(ClusterConfigurationLoader.java:208)
    at org.apache.geode.internal.cache.GemFireCacheImpl.applyJarAndXmlFromClusterConfig(GemFireCacheImpl.java:1426)
    at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1391)
    at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191)
    at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158)
    at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142)
    at org.apache.geode.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:52)
    at org.apache.geode.distributed.ServerLauncher.createCache(ServerLauncher.java:892)
    at org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:807)
    at org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:737)
    at org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:256)
Caused by: org.xml.sax.SAXException: A CacheException was thrown while parsing XML.
{code}
CreateRegionCommand checks whether the region already exists using the DistributedRegionMXBean service. This service is sometimes unreliable during restarts, as it takes some time for the locator to accumulate this information. So while the servers are starting up, the locator sends CreateRegionFunction to them, since it could not detect through the DistributedRegionMXBean service that the region already exists. In most scenarios, the server rejects that function. The only case in which the server executes the function is when it uses a local cache.xml configuration in addition to the cluster configuration.
It seems that processing the local cache.xml opens a time gap during which the server accepts CreateRegionFunction, although the region is already available in the cluster configuration and will be applied to that server.

The solution would be to perform the required checks against the cluster configuration stored on the locators, when cluster configuration is in use, in addition to the checks done using DistributedRegionMXBean.


> Multiple execution of Create region command results with duplicated region in cluster configuration
> ---------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-10123
>                 URL: https://issues.apache.org/jira/browse/GEODE-10123
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Jakov Varenina
>            Assignee: Jakov Varenina
>            Priority: Major
>              Labels: needsTriage, pull-request-available
>
> We have observed that the "create region" command sometimes executes successfully during a period of server restarts even though the region is already available in the cluster configuration. When this happens, the cluster configuration contains a duplicated region and servers throw the exception *org.apache.geode.cache.RegionExistsException* when started again:
> {code:java}
> Exception in thread "main" org.apache.geode.cache.CacheXmlException: While parsing XML, caused by org.xml.sax.SAXException: A CacheException was thrown while parsing XML.
> org.apache.geode.cache.RegionExistsException: /regionTest
>     at org.apache.geode.internal.cache.xmlcache.CacheXmlParser.parse(CacheXmlParser.java:267)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4119)
>     at org.apache.geode.internal.cache.ClusterConfigurationLoader.applyClusterXmlConfiguration(ClusterConfigurationLoader.java:208)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.applyJarAndXmlFromClusterConfig(GemFireCacheImpl.java:1426)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1391)
>     at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191)
>     at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158)
>     at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142)
>     at org.apache.geode.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:52)
>     at org.apache.geode.distributed.ServerLauncher.createCache(ServerLauncher.java:892)
>     at org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:807)
>     at org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:737)
>     at org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:256)
> Caused by: org.xml.sax.SAXException: A CacheException was thrown while parsing XML.
> {code}
> CreateRegionCommand checks whether the region already exists using the DistributedRegionMXBean service. This service is sometimes unreliable during restarts, as it takes some time for the locator to accumulate this information. So while the servers are starting up, the locator sends CreateRegionFunction to them, since it could not detect through the DistributedRegionMXBean service that the region already exists. In most scenarios, the server rejects that function.
> The only case in which the server executes the function is when it uses a local cache.xml configuration in addition to the cluster configuration. It seems that processing the local cache.xml opens a time gap during which the server accepts CreateRegionFunction, although the region is already available in the cluster configuration and will be applied to that server.
>
> The solution would be to perform the required checks against the cluster configuration stored on the locators, when cluster configuration is in use, in addition to the checks done using DistributedRegionMXBean.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
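
The proposed fix, consulting the cluster configuration in addition to DistributedRegionMXBean, might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual Geode change: the class `RegionExistenceCheck`, its fields, and its methods are hypothetical names, and plain sets stand in for the management view and for the cluster configuration stored on the locators.

```java
import java.util.Set;

// Hypothetical sketch: none of these names are Geode APIs.
final class RegionExistenceCheck {
    // Regions currently visible through the management (MXBean) view.
    // During server restarts this view may lag and be temporarily empty.
    private final Set<String> regionsSeenByMBeans;
    // Regions persisted in the cluster configuration held by the locators.
    private final Set<String> regionsInClusterConfig;

    RegionExistenceCheck(Set<String> regionsSeenByMBeans, Set<String> regionsInClusterConfig) {
        this.regionsSeenByMBeans = Set.copyOf(regionsSeenByMBeans);
        this.regionsInClusterConfig = Set.copyOf(regionsInClusterConfig);
    }

    // Strip a leading "/" so that "/regionTest" and "regionTest" compare equal.
    static String normalize(String regionPath) {
        return regionPath.startsWith("/") ? regionPath.substring(1) : regionPath;
    }

    // The region counts as existing if either source knows about it, so a
    // stale management view alone can no longer let a duplicate through.
    boolean regionExists(String regionPath) {
        String name = normalize(regionPath);
        return regionsSeenByMBeans.contains(name) || regionsInClusterConfig.contains(name);
    }

    public static void main(String[] args) {
        // Restart scenario: the management view is still empty, but the
        // cluster configuration already contains the region.
        RegionExistenceCheck check =
            new RegionExistenceCheck(Set.of(), Set.of("regionTest"));
        System.out.println(check.regionExists("/regionTest")); // prints "true"
    }
}
```

In this sketch, a "create region" request for `/regionTest` would be rejected during the restart window because the second lookup succeeds even though the first one does not.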