I believe this is at debug level because the exception and its text ought
to be visible to the person attempting to start the new node. If that's
not the case, we should probably change this to error/severe level, though
it likely wouldn't make it to an alert listener because the node is
still joining the system.

I see that you're using CacheServerLauncher. That API and the
ServerLauncher API both have a flaw that you should investigate; see
GEODE-4013. That flaw can cause a node to appear to have crashed and
take a while to clear from the membership view. I recently saw this same
problem of conflicting names and tracked its cause down to this flaw.
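
On the log-level point, here is a rough sketch of the catch/log/clean-up/rethrow pattern with the message logged at a level that shows up by default. It is only an illustration, not the actual Geode change: it uses java.util.logging so that it runs without any dependencies, and the Manager interface, the start method and the class name are hypothetical stand-ins rather than Geode types.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class StartupFailureLogging {
    // java.util.logging stands in for the real logger to keep the sketch dependency-free.
    static final Logger LOGGER = Logger.getLogger("startup-sketch");

    /** Hypothetical stand-in for the component being started; not a Geode type. */
    interface Manager {
        void join();
        void uncleanShutdown();
    }

    /** Joins; on failure logs at SEVERE, cleans up, and rethrows the same exception. */
    static void start(Manager manager) {
        try {
            manager.join();
        } catch (RuntimeException r) {
            // SEVERE is visible with the default configuration, unlike a debug/fine message.
            LOGGER.log(Level.SEVERE, "cleaning up incompletely started manager due to exception", r);
            manager.uncleanShutdown();
            throw r;
        }
    }

    public static void main(String[] args) {
        Manager failing = new Manager() {
            @Override public void join() {
                throw new IllegalStateException("an existing member used the same name");
            }
            @Override public void uncleanShutdown() {
                System.out.println("cleanup ran");
            }
        };
        try {
            start(failing);
        } catch (IllegalStateException e) {
            System.out.println("startup failed: " + e.getMessage());
        }
    }
}
```

One trade-off with this approach: if the caller also reports the rethrown exception, the person starting the node will see the same failure twice, which may be part of why the message sits at debug level today.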
On 11/23/17 8:43 AM, Anton Mironenko wrote:
Hello,
Currently, when I start two servers with the same name, there is no indication of what went wrong.
Only when I add --log-level=fine do I get a clue about what's going on:
[fine 2017/11/23 19:26:22.911 MSK host1-server-1 <main> tid=0x1] cleaning up incompletely started DistributionManager due to exception
org.apache.geode.IncompatibleSystemException: Member 10.50.3.14(host1-server-1:13008)<v4>:1024 could not join this distributed system because the existing member 10.50.3.38(host1-server-1:6609)<v3>:1025 used the same name. Set the "name" gemfire property to a unique value.
    at org.apache.geode.distributed.internal.DistributionManager.create(DistributionManager.java:593)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:740)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:350)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:336)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:330)
    at org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:205)
    at org.apache.geode.internal.cache.CacheServerLauncher.connect(CacheServerLauncher.java:792)
    at org.apache.geode.internal.cache.CacheServerLauncher.server(CacheServerLauncher.java:677)
    at org.apache.geode.internal.cache.CacheServerLauncher.main(CacheServerLauncher.java:214)
My question is:
Why is it a DEBUG category, and not ERROR?
https://github.com/apache/geode/blob/develop/geode-core/src/main/java/org/apache/geode/distributed/internal/DistributionManager.java#L658
    } catch (RuntimeException r) {
      if (distributionManager != null) {
        if (logger.isDebugEnabled()) {
          logger.debug("cleaning up incompletely started DistributionManager due to exception", r);
        }
        distributionManager.uncleanShutdown(beforeJoined);
      }
      throw r;
    }
Anton Mironenko
Software Architect
Amdocs ASP team