[ 
https://issues.apache.org/jira/browse/GEODE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakov Varenina updated GEODE-10123:
-----------------------------------
    Description: 
We have observed that the "create region" command sometimes executes 
successfully while servers are restarting, even though the region already 
exists in the cluster configuration. When this happens, the cluster 
configuration ends up containing a duplicate region, and the servers throw 
*org.apache.geode.cache.RegionExistsException* on their next restart:
{code:java}
Exception in thread "main" org.apache.geode.cache.CacheXmlException: While parsing XML, caused by org.xml.sax.SAXException: A CacheException was thrown while parsing XML.
org.apache.geode.cache.RegionExistsException: /regionTest
            at org.apache.geode.internal.cache.xmlcache.CacheXmlParser.parse(CacheXmlParser.java:267)
            at org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4119)
            at org.apache.geode.internal.cache.ClusterConfigurationLoader.applyClusterXmlConfiguration(ClusterConfigurationLoader.java:208)
            at org.apache.geode.internal.cache.GemFireCacheImpl.applyJarAndXmlFromClusterConfig(GemFireCacheImpl.java:1426)
            at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1391)
            at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191)
            at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158)
            at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142)
            at org.apache.geode.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:52)
            at org.apache.geode.distributed.ServerLauncher.createCache(ServerLauncher.java:892)
            at org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:807)
            at org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:737)
            at org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:256)
Caused by: org.xml.sax.SAXException: A CacheException was thrown while parsing XML.
{code}
CreateRegionCommand checks whether the region already exists by using the 
DistributedRegionMXBean service. This service is unreliable during restarts, 
because it takes some time for the locator to accumulate this information. As a 
result, while the servers are starting up, the locator may send 
CreateRegionFunction to them because it cannot yet detect through the 
DistributedRegionMXBean service that the region already exists. In most 
scenarios, the server rejects that function. The only case in which the server 
executes the function is when a local cache.xml configuration is used in 
addition to the cluster configuration. It seems that processing the local 
cache.xml opens a time window during which the server accepts 
CreateRegionFunction, although the region is already present in the cluster 
configuration.

Solution:

In addition to the checks done against the DistributedRegionMXBean information, 
perform the same checks against the cluster configuration (if used) stored in 
the locators. This way, the "create region" command is rejected immediately by 
the locator instead of later by the servers.
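
The proposed check can be illustrated with a minimal, self-contained sketch. 
It is not Geode code: the class RegionExistenceCheck and all of its members are 
hypothetical names, one set stands in for the information reported through 
DistributedRegionMXBean, and a second set stands in for the cluster 
configuration stored in the locator. It only models the idea that the second 
lookup still rejects a duplicate while the first one is empty after a restart.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the two-step existence check; not a Geode API. */
public class RegionExistenceCheck {
    // Stand-in for region names known through DistributedRegionMXBean.
    // This view can be empty for a while after servers restart.
    private final Set<String> liveRegions = new HashSet<>();
    // Stand-in for region names persisted in the locator's cluster configuration.
    private final Set<String> configuredRegions = new HashSet<>();
    private final boolean clusterConfigEnabled;

    public RegionExistenceCheck(boolean clusterConfigEnabled) {
        this.clusterConfigEnabled = clusterConfigEnabled;
    }

    /** Returns true if the region was created, false if it was rejected as a duplicate. */
    public boolean createRegion(String name) {
        // Existing check: the live view of regions.
        if (liveRegions.contains(name)) {
            return false;
        }
        // Additional check: the persisted cluster configuration, if it is used.
        if (clusterConfigEnabled && configuredRegions.contains(name)) {
            return false;
        }
        if (clusterConfigEnabled) {
            configuredRegions.add(name);
        }
        liveRegions.add(name);
        return true;
    }

    /** Models a restart: the live view is lost, the persisted configuration is kept. */
    public void simulateServerRestart() {
        liveRegions.clear();
    }

    public static void main(String[] args) {
        RegionExistenceCheck check = new RegionExistenceCheck(true);
        System.out.println("first create accepted: " + check.createRegion("regionTest"));
        check.simulateServerRestart();
        System.out.println("create during restart accepted: " + check.createRegion("regionTest"));
    }
}
```

In this model, a check that consults only the live view would accept the second 
request after the restart, which corresponds to the duplicated region described 
above.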

  was:
We have observed that sometimes the "create region" command executes 
successfully during period when servers restart even though region is already 
available in the cluster configuration.  When this happens, cluster 
configuration contains a duplicated region and  servers throw the exception  
*org.apache.geode.cache.RegionExistsException* when restarted again:
{code:java}
Exception in thread "main" org.apache.geode.cache.CacheXmlException: While 
parsing XML, caused by org.xml.sax.SAXException: A CacheException was thrown 
while parsing XML.
org.apache.geode.cache.RegionExistsException: /regionTest
            at 
org.apache.geode.internal.cache.xmlcache.CacheXmlParser.parse(CacheXmlParser.java:267)
            at 
org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4119)
            at 
org.apache.geode.internal.cache.ClusterConfigurationLoader.applyClusterXmlConfiguration(ClusterConfigurationLoader.java:208)
            at 
org.apache.geode.internal.cache.GemFireCacheImpl.applyJarAndXmlFromClusterConfig(GemFireCacheImpl.java:1426)
            at 
org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1391)
            at 
org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191)
            at 
org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158)
            at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142)
            at 
org.apache.geode.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:52)
            at 
org.apache.geode.distributed.ServerLauncher.createCache(ServerLauncher.java:892)
            at 
org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:807)
            at 
org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:737)
            at 
org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:256)
Caused by: org.xml.sax.SAXException: A CacheException was thrown while parsing 
XML.
{code}
CreateRegionCommand checks whether there is already an existing region using 
the DistributedRegionMXBean service. This service is unreliable during 
restarts, as it takes some time for the locator to accumulate this information. 
So during the startup of the servers locator will send CreateRegionFunction to 
servers since it couldn't detect that region already exists using the 
DistributedRegionMXBean service. In most scenarios, the server will reject that 
function. The only case the server will execute the function is when using the 
local cache.xml configuration in addition to the cluster configuration. It 
seems that processing local cache.xml creates a time gap on the server that 
will accept CreateRegionFunction, although the region is already available in 
the cluster configuration.

Slution:

In addition to checks done against the DistributedRegionMXBean information, do 
the same checks against cluster configuration (if used) stored in locators. 
This way, the "create region" command is rejected immediately by the locator 
instead of later by the servers.


> Multiple execution of Create region command results with duplicated region in 
> cluster configuration
> ---------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-10123
>                 URL: https://issues.apache.org/jira/browse/GEODE-10123
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Jakov Varenina
>            Assignee: Jakov Varenina
>            Priority: Major
>              Labels: needsTriage, pull-request-available



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
