francisoliverlee opened a new issue #4949:
URL: https://github.com/apache/incubator-doris/issues/4949


   **Describe the bug**
   Version 0.12.21.release: adding an FE follower back to the cluster fails with an error.
   
   **To Reproduce**
   1. Use "kill -9" to kill one follower FE.
   2. Delete that FE's meta dir and start it; the start fails.
   3. Use "alter system" to drop the FE, clean the meta dir, then use "alter system" 
to add it back into the cluster. 
   The FE then fails with the error shown in the log below.
   
   **Screenshots**
   ```
   2020-11-23 13:13:59.722 UTC WARNING [11.11.27.153_9010_1606137216465] Exiting 
inner Replica loop with exception com.sleepycat.je.EnvironmentFailureException: 
(JE 7.3.7) 11.11.27.153_9010_1606137216465(-1):/data/doris-fe/doris-meta/bdb  
Feeder: 11.11.27.152_9010_1603703245840(14). 
com.sleepycat.je.rep.impl.RepGroupImpl$NodeConflictException: (JE 7.3.7) New or 
moved node:11.11.27.153_9010_1606137216465, is configured with the socket 
address: /11.11.27.153:9010.  It conflicts with the socket already used by the 
member: 11.11.27.153_9010_1603291427805 HANDSHAKE_ERROR: Error during the 
handshake between two nodes. Some validity or compatibility check failed, 
preventing further communication between the nodes. Environment is invalid and 
must be closed.
   com.sleepycat.je.EnvironmentFailureException: (JE 7.3.7) 
11.11.27.153_9010_1606137216465(-1):/data/doris-fe/doris-meta/bdb  Feeder: 
11.11.27.152_9010_1603703245840(14). 
com.sleepycat.je.rep.impl.RepGroupImpl$NodeConflictException: (JE 7.3.7) New or 
moved node:11.11.27.153_9010_1606137216465, is configured with the socket 
address: /11.11.27.153:9010.  It conflicts with the socket already used by the 
member: 11.11.27.153_9010_1603291427805 HANDSHAKE_ERROR: Error during the 
handshake between two nodes. Some validity or compatibility check failed, 
preventing further communication between the nodes. Environment is invalid and 
must be closed.
        at com.sleepycat.je.rep.stream.ReplicaFeederHandshake.verifyMembership(ReplicaFeederHandshake.java:334)
        at com.sleepycat.je.rep.stream.ReplicaFeederHandshake.execute(ReplicaFeederHandshake.java:259)
        at com.sleepycat.je.rep.impl.node.Replica.initReplicaLoop(Replica.java:691)
        at com.sleepycat.je.rep.impl.node.Replica.runReplicaLoopInternal(Replica.java:474)
        at com.sleepycat.je.rep.impl.node.Replica.runReplicaLoop(Replica.java:409)
        at com.sleepycat.je.rep.impl.node.RepNode.run(RepNode.java:1873)
   ```
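
The two node names in the log share the same host and port and differ only in their trailing number. The sketch below makes that comparison explicit. It assumes the names follow a `host_port_suffix` layout, which is read off the log rather than taken from the Doris source, and `FeNodeNameSketch` is a hypothetical helper, not part of Doris or BDB JE.

```java
/** Illustrative only: splits node names of the form host_port_suffix seen in the log. */
class FeNodeNameSketch {

    /** Returns {host, port, suffix}; the three-part layout is an assumption read off the log. */
    static String[] split(String nodeName) {
        int last = nodeName.lastIndexOf('_');
        int mid = (last > 0) ? nodeName.lastIndexOf('_', last - 1) : -1;
        if (mid < 0) {
            throw new IllegalArgumentException("not host_port_suffix: " + nodeName);
        }
        return new String[] {
            nodeName.substring(0, mid),
            nodeName.substring(mid + 1, last),
            nodeName.substring(last + 1)
        };
    }

    /** True when two names point at the same host and port but carry different suffixes. */
    static boolean sameSocketDifferentName(String a, String b) {
        String[] x = split(a);
        String[] y = split(b);
        return x[0].equals(y[0]) && x[1].equals(y[1]) && !x[2].equals(y[2]);
    }

    public static void main(String[] args) {
        String rejected = "11.11.27.153_9010_1606137216465"; // new node named in the log
        String existing = "11.11.27.153_9010_1603291427805"; // member it conflicts with
        System.out.println(sameSocketDifferentName(rejected, existing));
    }
}
```

If that reading is right, the restarted FE joined under a new name while the group still listed the old name on the same socket address, which matches the wording of the `NodeConflictException`.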
   
   **Additional context**
   ```
   I did some checking and found that the dropped node is still held in BDB JE's 
memory, in the following two fields of
   
/com/sleepycat/je/7.3.7/je-7.3.7.jar!/com/sleepycat/je/rep/impl/RepGroupImpl.class
   
       /* All the nodes that form the replication group, indexed by Id. */
       private final Map<Integer, RepNodeImpl> nodesById =
           new HashMap<Integer, RepNodeImpl>();
   
       /*
        * All the nodes that form the replication group, indexed by node name.
        * This map is used exclusively for efficient lookups by name. The map
        * nodesById does all the heavy lifting.
        */
       private final Map<String, RepNodeImpl> nodesByName =
           new HashMap<String, RepNodeImpl>();
   
   
   However, the dropped node is not found in the BDB image file.
   ```
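
To illustrate why a stale in-memory entry would produce this error, here is a minimal model of a membership map with a socket-address check. It is a sketch under assumptions, not the BDB JE implementation: `GroupMembershipSketch`, `Node`, and `findSocketConflict` are hypothetical names, and only the `nodesByName` field name is borrowed from the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative model only; this is not how BDB JE is actually implemented. */
class GroupMembershipSketch {

    /** Hypothetical stand-in for a group member: a node name plus its socket address. */
    static final class Node {
        final String name;
        final String hostPort;

        Node(String name, String hostPort) {
            this.name = name;
            this.hostPort = hostPort;
        }
    }

    // Members indexed by node name, loosely mirroring nodesByName in RepGroupImpl.
    private final Map<String, Node> nodesByName = new HashMap<>();

    /** Returns the name of a differently named member already using the socket, or null. */
    String findSocketConflict(Node candidate) {
        for (Node member : nodesByName.values()) {
            if (member.hostPort.equals(candidate.hostPort)
                    && !member.name.equals(candidate.name)) {
                return member.name;
            }
        }
        return null;
    }

    /** Adds a node, rejecting it if another member already owns its socket address. */
    void add(Node node) {
        String conflict = findSocketConflict(node);
        if (conflict != null) {
            throw new IllegalStateException(
                "node " + node.name + " conflicts with member " + conflict);
        }
        nodesByName.put(node.name, node);
    }

    /** Removes a member by name; returns true if it was present. */
    boolean remove(String name) {
        return nodesByName.remove(name) != null;
    }

    public static void main(String[] args) {
        GroupMembershipSketch group = new GroupMembershipSketch();
        group.add(new Node("11.11.27.153_9010_1603291427805", "11.11.27.153:9010"));

        Node restarted = new Node("11.11.27.153_9010_1606137216465", "11.11.27.153:9010");
        System.out.println("conflict before drop: " + group.findSocketConflict(restarted));

        // Once the old entry is really gone from the map, the same node is accepted.
        group.remove("11.11.27.153_9010_1603291427805");
        group.add(restarted);
        System.out.println("conflict after drop: " + group.findSocketConflict(restarted));
    }
}
```

In this model the new node is only accepted after the old name has been removed from the map, which is consistent with the observation that the dropped node is still present in memory when the handshake runs.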
   


