Barry Oglesby created GEODE-2848:
------------------------------------
Summary: While destroying a LuceneIndex, the AsyncEventQueue
region is destroyed in remote members before stopping the AsyncEventQueue
Key: GEODE-2848
URL: https://issues.apache.org/jira/browse/GEODE-2848
Project: Geode
Issue Type: Bug
Components: lucene
Reporter: Barry Oglesby
This causes a NullPointerException in BatchRemovalThread.getAllRecipients like:
{noformat}
[fine 2017/04/24 14:27:29.163 PDT gemfire4_r02-s28_3222 <BatchRemovalThread> tid=0x6b] BatchRemovalThread: ignoring exception
java.lang.NullPointerException
    at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue$BatchRemovalThread.getAllRecipients(ParallelGatewaySenderQueue.java:1776)
    at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue$BatchRemovalThread.run(ParallelGatewaySenderQueue.java:1722)
{noformat}
This message is currently only logged at fine level and doesn't cause any real
issues.
The simple fix is to check for null in getAllRecipients like:
{noformat}
PartitionedRegion pReg = (PartitionedRegion) cache.getRegion((String) pr);
if (pReg != null) {
  recipients.addAll(pReg.getRegionAdvisor().adviseDataStore());
}
{noformat}
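To see the effect of the null check in isolation, here is a minimal, self-contained sketch. It does not use Geode classes: the Map stands in for the cache (a missing key plays the role of cache.getRegion returning null), and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in, not Geode code: region name -> members hosting that region.
// A missing entry models cache.getRegion(...) returning null after the region is destroyed.
class RecipientLookupSketch {

  static Set<String> getAllRecipients(Map<String, Set<String>> cache, List<String> regionNames) {
    Set<String> recipients = new HashSet<>();
    for (String name : regionNames) {
      Set<String> dataStores = cache.get(name); // null if the region is already gone
      if (dataStores != null) {                 // the guard proposed above
        recipients.addAll(dataStores);
      }
    }
    return recipients;
  }

  public static void main(String[] args) {
    Map<String, Set<String>> cache = new HashMap<>();
    cache.put("/regionA", Set.of("member1", "member2"));
    // "/regionB" is deliberately absent, as if it had been destroyed remotely.
    System.out.println(getAllRecipients(cache, List.of("/regionA", "/regionB")));
  }
}
```

Without the guard, the lookup for the missing region would throw a NullPointerException on addAll, which mirrors the failure in the stack trace.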
Another, more complex fix is to change the destroyIndex sequence.
The current destroyIndex sequence is:
# stops and destroys the AEQ in the initiator (including the underlying PR)
# closes the repository manager in the initiator
# stops and destroys the AEQ in remote members (not including the underlying PR)
# closes the repository manager in the remote members
# destroys the fileAndChunk region in the initiator
Between steps 1 and 3, the region will be null in the remote members, so the
NPE can occur.
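The window can be checked mechanically with a small model of the sequence. This is only a sketch: the enum constants are hypothetical labels for the five steps above, not Geode identifiers, and it assumes (as the summary says) that destroying the underlying PR in the initiator also removes the region in the remote members.

```java
import java.util.List;

class DestroyOrderSketch {
  // Hypothetical labels for the five steps of the current sequence.
  enum Step {
    INITIATOR_STOP_AND_DESTROY_AEQ_WITH_PR,
    INITIATOR_CLOSE_REPOSITORY_MANAGER,
    REMOTE_STOP_AND_DESTROY_AEQ,
    REMOTE_CLOSE_REPOSITORY_MANAGER,
    INITIATOR_DESTROY_FILE_AND_CHUNK_REGION
  }

  static final List<Step> CURRENT = List.of(
      Step.INITIATOR_STOP_AND_DESTROY_AEQ_WITH_PR,
      Step.INITIATOR_CLOSE_REPOSITORY_MANAGER,
      Step.REMOTE_STOP_AND_DESTROY_AEQ,
      Step.REMOTE_CLOSE_REPOSITORY_MANAGER,
      Step.INITIATOR_DESTROY_FILE_AND_CHUNK_REGION);

  // True if the PR is destroyed before the remote AEQ is stopped, i.e. a remote
  // BatchRemovalThread may still run while the region lookup returns null.
  static boolean remoteQueueOutlivesRegion(List<Step> sequence) {
    int prDestroyed = sequence.indexOf(Step.INITIATOR_STOP_AND_DESTROY_AEQ_WITH_PR);
    int remoteStopped = sequence.indexOf(Step.REMOTE_STOP_AND_DESTROY_AEQ);
    return prDestroyed >= 0 && remoteStopped >= 0 && prDestroyed < remoteStopped;
  }

  public static void main(String[] args) {
    System.out.println(remoteQueueOutlivesRegion(CURRENT)); // prints "true"
  }
}
```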
A better sequence would be:
# stops the AEQ in the initiator
# stops the AEQ in remote members
# closes the repository manager in the initiator
# closes the repository manager in the remote members
# destroys the AEQ in the initiator (including the underlying PR)
# destroys the AEQ in the remote members (not including the underlying PR)
# destroys the fileAndChunk region in the initiator
That would be 3 messages between the members.
I think that can be combined into one remote message like:
# stops the AEQ in the initiator
# closes the repository manager in the initiator
# stops the AEQ in remote members
# closes the repository manager in the remote members
# destroys the AEQ in the remote members (not including the underlying PR)
# destroys the AEQ in the initiator (including the underlying PR)
# destroys the fileAndChunk region in the initiator
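The message counts for the two proposed sequences can be reproduced with a rough model. It assumes that any run of consecutive remote steps can travel in a single message from the initiator, which is the idea behind the combined sequence. The names are hypothetical and nothing here is Geode API.

```java
import java.util.List;

class RemoteMessageCountSketch {
  enum Site { INITIATOR, REMOTE }

  // Counts runs of consecutive REMOTE steps; each run is assumed to need one message.
  static int countRemoteMessages(List<Site> sequence) {
    int messages = 0;
    Site previous = Site.INITIATOR;
    for (Site site : sequence) {
      if (site == Site.REMOTE && previous != Site.REMOTE) {
        messages++;
      }
      previous = site;
    }
    return messages;
  }

  // The "better" sequence: initiator and remote steps alternate.
  static final List<Site> BETTER = List.of(
      Site.INITIATOR, Site.REMOTE, Site.INITIATOR, Site.REMOTE,
      Site.INITIATOR, Site.REMOTE, Site.INITIATOR);

  // The combined sequence: the three remote steps are adjacent.
  static final List<Site> COMBINED = List.of(
      Site.INITIATOR, Site.INITIATOR, Site.REMOTE, Site.REMOTE,
      Site.REMOTE, Site.INITIATOR, Site.INITIATOR);

  public static void main(String[] args) {
    System.out.println(countRemoteMessages(BETTER));   // prints 3
    System.out.println(countRemoteMessages(COMBINED)); // prints 1
  }
}
```

In both orderings the underlying PR is destroyed only after every member has stopped its AEQ, so the null-region window described earlier does not arise.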