peterxcli commented on PR #17544:
URL: https://github.com/apache/kafka/pull/17544#issuecomment-2599646858

   After merging trunk, the test 
`QuorumControllerMetricsIntegrationTest#testFailingOverIncrementsNewActiveControllerCount(forceFailoverUsingLogLayer=true)` 
failed with the following log. The first exception is expected, because 
`logEnv.activeLogManager().get().throwOnNextAppend()` has been called. However, 
the subsequent `SnapshotRegistry` exception seems to always occur 4 times, 
which causes `NEW_ACTIVE_CONTROLLERS_COUNT` to be 6 instead of 2.
   
   ```
   org.apache.kafka.raft.errors.BufferAllocationException: Test asked to fail 
the next prepareAppend
        at 
org.apache.kafka.metalog.LocalLogManager.prepareAppend(LocalLogManager.java:743)
 ~[test/:?]
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.lambda$run$0(QuorumController.java:827)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:912)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:821)
 ~[main/:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:132)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:215)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]
   16:27:39.012 [quorum-controller-1-event-handler] ERROR 
org.apache.kafka.server.fault.MockFaultHandler - Encountered 
nonFatalFaultHandler fault: createTopics: event failed with RuntimeException 
(treated as UnknownServerException) at epoch 3 in 47070 microseconds. 
Renouncing leadership and reverting to the last committed offset 5.
   java.lang.RuntimeException: Can't create a new in-memory snapshot at epoch 3 
because there is already a snapshot with epoch 5. Snapshot epochs are 5
        at 
org.apache.kafka.timeline.SnapshotRegistry.getOrCreateSnapshot(SnapshotRegistry.java:225)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.timeline.SnapshotRegistry.idempotentCreateSnapshot(SnapshotRegistry.java:245)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.controller.OffsetControlManager.handleScheduleAppend(OffsetControlManager.java:305)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.lambda$run$0(QuorumController.java:844)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:912)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:821)
 ~[main/:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:132)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:215)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]
   16:27:48.077 [quorum-controller-2-event-handler] ERROR 
org.apache.kafka.server.fault.MockFaultHandler - Encountered 
nonFatalFaultHandler fault: maybeFenceStaleBroker: event failed with 
RuntimeException (treated as UnknownServerException) at epoch 5 in 26900 
microseconds. Renouncing leadership and reverting to the last committed offset 
9.
   java.lang.RuntimeException: Can't create a new in-memory snapshot at epoch 3 
because there is already a snapshot with epoch 9. Snapshot epochs are 9
        at 
org.apache.kafka.timeline.SnapshotRegistry.getOrCreateSnapshot(SnapshotRegistry.java:225)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.timeline.SnapshotRegistry.idempotentCreateSnapshot(SnapshotRegistry.java:245)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.controller.OffsetControlManager.handleScheduleAppend(OffsetControlManager.java:305)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.lambda$run$0(QuorumController.java:844)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:928)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:821)
 ~[main/:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:132)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:215)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]
   16:27:57.133 [quorum-controller-0-event-handler] ERROR 
org.apache.kafka.server.fault.MockFaultHandler - Encountered 
nonFatalFaultHandler fault: maybeFenceStaleBroker: event failed with 
RuntimeException (treated as UnknownServerException) at epoch 7 in 29393 
microseconds. Renouncing leadership and reverting to the last committed offset 
13.
   java.lang.RuntimeException: Can't create a new in-memory snapshot at epoch 6 
because there is already a snapshot with epoch 13. Snapshot epochs are 13
        at 
org.apache.kafka.timeline.SnapshotRegistry.getOrCreateSnapshot(SnapshotRegistry.java:225)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.timeline.SnapshotRegistry.idempotentCreateSnapshot(SnapshotRegistry.java:245)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.controller.OffsetControlManager.handleScheduleAppend(OffsetControlManager.java:305)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.lambda$run$0(QuorumController.java:844)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:928)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:821)
 ~[main/:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:132)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:215)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]
   16:28:06.204 [quorum-controller-2-event-handler] ERROR 
org.apache.kafka.server.fault.MockFaultHandler - Encountered 
nonFatalFaultHandler fault: maybeFenceStaleBroker: event failed with 
RuntimeException (treated as UnknownServerException) at epoch 9 in 32702 
microseconds. Renouncing leadership and reverting to the last committed offset 
16.
   java.lang.RuntimeException: Can't create a new in-memory snapshot at epoch 4 
because there is already a snapshot with epoch 16. Snapshot epochs are 16
        at 
org.apache.kafka.timeline.SnapshotRegistry.getOrCreateSnapshot(SnapshotRegistry.java:225)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.timeline.SnapshotRegistry.idempotentCreateSnapshot(SnapshotRegistry.java:245)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.controller.OffsetControlManager.handleScheduleAppend(OffsetControlManager.java:305)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.lambda$run$0(QuorumController.java:844)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:928)
 ~[main/:?]
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:821)
 ~[main/:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:132)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:215)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186)
 ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
        at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]
   
   Expected :2
   Actual   :6
   ```
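
To make the failure mode easier to discuss, here is a minimal sketch of the check that appears to be rejecting the append. It is a toy model, not the real `org.apache.kafka.timeline.SnapshotRegistry`: the class `ToySnapshotRegistry` and the helpers `countRejected` and `observedControllerCount` are hypothetical names, and the assumption that each rejected snapshot adds exactly one extra activation to the metric is mine, inferred only from the log above (2 expected + 4 faults = 6). The (requested, existing) epoch pairs are copied from the four `RuntimeException` messages.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (assumption): snapshots must be created with non-decreasing epochs.
final class ToySnapshotRegistry {
    private final List<Long> epochs = new ArrayList<>();

    void getOrCreateSnapshot(long epoch) {
        if (!epochs.isEmpty()) {
            long last = epochs.get(epochs.size() - 1);
            if (epoch == last) {
                return; // idempotent: a snapshot for this epoch already exists
            }
            if (epoch < last) {
                throw new RuntimeException("Can't create a new in-memory snapshot at epoch "
                    + epoch + " because there is already a snapshot with epoch " + last
                    + ". Snapshot epochs are " + last);
            }
        }
        epochs.add(epoch);
    }
}

public class SnapshotEpochSketch {
    // {requested epoch, existing snapshot epoch}, taken from the log above.
    static final long[][] OBSERVED = {{3, 5}, {3, 9}, {6, 13}, {4, 16}};

    // Counts how many of the observed requests the toy registry rejects.
    static int countRejected() {
        int rejected = 0;
        for (long[] pair : OBSERVED) {
            ToySnapshotRegistry registry = new ToySnapshotRegistry();
            registry.getOrCreateSnapshot(pair[1]); // state after reverting
            try {
                registry.getOrCreateSnapshot(pair[0]); // stale request
            } catch (RuntimeException e) {
                rejected++;
            }
        }
        return rejected;
    }

    // Assumption: every rejected request causes one extra controller activation.
    static int observedControllerCount(int expected) {
        return expected + countRejected();
    }

    public static void main(String[] args) {
        System.out.println("Rejected snapshot requests: " + countRejected());
        System.out.println("Controller count: " + observedControllerCount(2));
    }
}
```

In all four cases the requested epoch is lower than the epoch of the snapshot left behind after reverting to the last committed offset, so the question is where the stale value passed to `handleScheduleAppend` comes from.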
   
   @ahuang98,
   Do you have any thoughts on this issue? Any feedback would be greatly 
appreciated. Thank you!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
