Barry Oglesby created GEODE-2848:
------------------------------------

             Summary: While destroying a LuceneIndex, the AsyncEventQueue 
region is destroyed in remote members before stopping the AsyncEventQueue
                 Key: GEODE-2848
                 URL: https://issues.apache.org/jira/browse/GEODE-2848
             Project: Geode
          Issue Type: Bug
          Components: lucene
            Reporter: Barry Oglesby


This causes a NullPointerException in BatchRemovalThread.getAllRecipients like:
{noformat}
[fine 2017/04/24 14:27:29.163 PDT gemfire4_r02-s28_3222 <BatchRemovalThread> tid=0x6b] BatchRemovalThread: ignoring exception
java.lang.NullPointerException
  at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue$BatchRemovalThread.getAllRecipients(ParallelGatewaySenderQueue.java:1776)
  at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue$BatchRemovalThread.run(ParallelGatewaySenderQueue.java:1722)
{noformat}
This message is currently only logged at fine level and doesn't cause any real 
issues.

The simple fix is to check for null in getAllRecipients like:
{noformat}
PartitionedRegion pReg = ((PartitionedRegion) (cache.getRegion((String) pr)));
if (pReg != null) {
  recipients.addAll(pReg.getRegionAdvisor().adviseDataStore());
}
{noformat}
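
The effect of the null check can be shown in isolation with a minimal sketch. It does not use Geode classes: the region lookup is modeled as a plain Map from region path to the set of data-store members, and the class and method names here are hypothetical. A destroyed region is simply a path with no entry, so the lookup returns null, as cache.getRegion does in the trace above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the recipient lookup; not Geode code.
class RecipientsSketch {

  // regions maps a region path to the members hosting its data.
  // A path with no entry models a region that has already been destroyed.
  static Set<String> getAllRecipients(Map<String, Set<String>> regions,
                                      List<String> regionPaths) {
    Set<String> recipients = new HashSet<>();
    for (String path : regionPaths) {
      Set<String> dataStores = regions.get(path);
      if (dataStores != null) { // skip regions destroyed in the meantime
        recipients.addAll(dataStores);
      }
    }
    return recipients;
  }

  public static void main(String[] args) {
    Map<String, Set<String>> regions = new HashMap<>();
    regions.put("/live", new HashSet<>(Arrays.asList("member1", "member2")));
    // "/destroyed" is deliberately missing from the map.
    System.out.println(
        getAllRecipients(regions, Arrays.asList("/live", "/destroyed")));
  }
}
```

Without the guard, the lookup for the missing path would be dereferenced and throw the same kind of NullPointerException as in the log.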
Another more complex fix is to change the destroyIndex sequence.

The current destroyIndex sequence is:

# stops and destroys the AEQ in the initiator (including the underlying PR)
# closes the repository manager in the initiator
# stops and destroys the AEQ in remote members (not including the underlying PR)
# closes the repository manager in the remote members
# destroys the fileAndChunk region in the initiator

Between steps 1 and 3, the underlying PR has already been destroyed, so the 
region is null in the remote members while their AEQ is still running, and the 
NPE can occur.

A better sequence would be:

# stops the AEQ in the initiator
# stops the AEQ in remote members
# closes the repository manager in the initiator
# closes the repository manager in the remote members
# destroys the AEQ in the initiator (including the underlying PR) 
# destroys the AEQ in the remote members (not including the underlying PR)
# destroys the fileAndChunk region in the initiator

That would require 3 messages between the members (one each for steps 2, 4 and 6).

I think that can be combined into one remote message (covering steps 3 to 5 below) like:

# stops the AEQ in the initiator
# closes the repository manager in the initiator
# stops the AEQ in remote members
# closes the repository manager in the remote members
# destroys the AEQ in the remote members (not including the underlying PR)
# destroys the AEQ in the initiator (including the underlying PR) 
# destroys the fileAndChunk region in the initiator
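
As a sanity check on the ordering, here is a rough sketch that models the current and the combined sequences as lists of steps and looks for the hazard described above: a step that destroys the underlying PR while the remote AEQ is still running. The enum, the step names and the checker are all hypothetical and only model the steps listed in this issue. They are not Geode APIs.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the destroyIndex steps; not Geode code.
class DestroySequenceSketch {

  enum Step {
    STOP_AEQ_INITIATOR,
    STOP_AEQ_REMOTE,
    STOP_AND_DESTROY_AEQ_INITIATOR, // also destroys the underlying PR
    STOP_AND_DESTROY_AEQ_REMOTE,
    CLOSE_REPO_INITIATOR,
    CLOSE_REPO_REMOTE,
    DESTROY_AEQ_INITIATOR,          // also destroys the underlying PR
    DESTROY_AEQ_REMOTE,
    DESTROY_FILE_AND_CHUNK_REGION
  }

  // Returns the 1-based number of the first step that destroys the PR
  // while the remote AEQ is still running, or -1 if there is none.
  static int firstUnsafeStep(List<Step> sequence) {
    boolean remoteAeqRunning = true;
    for (int i = 0; i < sequence.size(); i++) {
      Step step = sequence.get(i);
      if (step == Step.STOP_AEQ_REMOTE
          || step == Step.STOP_AND_DESTROY_AEQ_REMOTE) {
        remoteAeqRunning = false;
      }
      boolean destroysPr = step == Step.STOP_AND_DESTROY_AEQ_INITIATOR
          || step == Step.DESTROY_AEQ_INITIATOR;
      if (destroysPr && remoteAeqRunning) {
        return i + 1;
      }
    }
    return -1;
  }

  // The current sequence from this issue.
  static final List<Step> CURRENT = Arrays.asList(
      Step.STOP_AND_DESTROY_AEQ_INITIATOR,
      Step.CLOSE_REPO_INITIATOR,
      Step.STOP_AND_DESTROY_AEQ_REMOTE,
      Step.CLOSE_REPO_REMOTE,
      Step.DESTROY_FILE_AND_CHUNK_REGION);

  // The combined (one remote message) sequence from this issue.
  static final List<Step> COMBINED = Arrays.asList(
      Step.STOP_AEQ_INITIATOR,
      Step.CLOSE_REPO_INITIATOR,
      Step.STOP_AEQ_REMOTE,
      Step.CLOSE_REPO_REMOTE,
      Step.DESTROY_AEQ_REMOTE,
      Step.DESTROY_AEQ_INITIATOR,
      Step.DESTROY_FILE_AND_CHUNK_REGION);

  public static void main(String[] args) {
    System.out.println("current:  " + firstUnsafeStep(CURRENT));
    System.out.println("combined: " + firstUnsafeStep(COMBINED));
  }
}
```

In this model the current sequence is flagged at step 1, and the combined sequence has no unsafe step because the PR is destroyed only after the remote AEQ has been stopped.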




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
