peterxcli commented on PR #17544:
URL: https://github.com/apache/kafka/pull/17544#issuecomment-2599646858
After merging trunk, the test
`QuorumControllerMetricsIntegrationTest#testFailingOverIncrementsNewActiveControllerCount (forceFailoverUsingLogLayer=true)`
failed with the following log. The first exception is expected, since
`logEnv.activeLogManager().get().throwOnNextAppend()` has been called. But the
subsequent `SnapshotRegistry` exception seems to always happen 4 times, which
causes `NEW_ACTIVE_CONTROLLERS_COUNT` to be 6 instead of 2.
```
org.apache.kafka.raft.errors.BufferAllocationException: Test asked to fail the next prepareAppend
    at org.apache.kafka.metalog.LocalLogManager.prepareAppend(LocalLogManager.java:743) ~[test/:?]
    at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.lambda$run$0(QuorumController.java:827) ~[main/:?]
    at org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:912) ~[main/:?]
    at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:821) ~[main/:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:132) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:215) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]
16:27:39.012 [quorum-controller-1-event-handler] ERROR org.apache.kafka.server.fault.MockFaultHandler - Encountered nonFatalFaultHandler fault: createTopics: event failed with RuntimeException (treated as UnknownServerException) at epoch 3 in 47070 microseconds. Renouncing leadership and reverting to the last committed offset 5.
java.lang.RuntimeException: Can't create a new in-memory snapshot at epoch 3 because there is already a snapshot with epoch 5. Snapshot epochs are 5
    at org.apache.kafka.timeline.SnapshotRegistry.getOrCreateSnapshot(SnapshotRegistry.java:225) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.timeline.SnapshotRegistry.idempotentCreateSnapshot(SnapshotRegistry.java:245) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.controller.OffsetControlManager.handleScheduleAppend(OffsetControlManager.java:305) ~[main/:?]
    at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.lambda$run$0(QuorumController.java:844) ~[main/:?]
    at org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:912) ~[main/:?]
    at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:821) ~[main/:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:132) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:215) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]
16:27:48.077 [quorum-controller-2-event-handler] ERROR org.apache.kafka.server.fault.MockFaultHandler - Encountered nonFatalFaultHandler fault: maybeFenceStaleBroker: event failed with RuntimeException (treated as UnknownServerException) at epoch 5 in 26900 microseconds. Renouncing leadership and reverting to the last committed offset 9.
java.lang.RuntimeException: Can't create a new in-memory snapshot at epoch 3 because there is already a snapshot with epoch 9. Snapshot epochs are 9
    at org.apache.kafka.timeline.SnapshotRegistry.getOrCreateSnapshot(SnapshotRegistry.java:225) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.timeline.SnapshotRegistry.idempotentCreateSnapshot(SnapshotRegistry.java:245) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.controller.OffsetControlManager.handleScheduleAppend(OffsetControlManager.java:305) ~[main/:?]
    at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.lambda$run$0(QuorumController.java:844) ~[main/:?]
    at org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:928) ~[main/:?]
    at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:821) ~[main/:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:132) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:215) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]
16:27:57.133 [quorum-controller-0-event-handler] ERROR org.apache.kafka.server.fault.MockFaultHandler - Encountered nonFatalFaultHandler fault: maybeFenceStaleBroker: event failed with RuntimeException (treated as UnknownServerException) at epoch 7 in 29393 microseconds. Renouncing leadership and reverting to the last committed offset 13.
java.lang.RuntimeException: Can't create a new in-memory snapshot at epoch 6 because there is already a snapshot with epoch 13. Snapshot epochs are 13
    at org.apache.kafka.timeline.SnapshotRegistry.getOrCreateSnapshot(SnapshotRegistry.java:225) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.timeline.SnapshotRegistry.idempotentCreateSnapshot(SnapshotRegistry.java:245) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.controller.OffsetControlManager.handleScheduleAppend(OffsetControlManager.java:305) ~[main/:?]
    at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.lambda$run$0(QuorumController.java:844) ~[main/:?]
    at org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:928) ~[main/:?]
    at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:821) ~[main/:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:132) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:215) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]
16:28:06.204 [quorum-controller-2-event-handler] ERROR org.apache.kafka.server.fault.MockFaultHandler - Encountered nonFatalFaultHandler fault: maybeFenceStaleBroker: event failed with RuntimeException (treated as UnknownServerException) at epoch 9 in 32702 microseconds. Renouncing leadership and reverting to the last committed offset 16.
java.lang.RuntimeException: Can't create a new in-memory snapshot at epoch 4 because there is already a snapshot with epoch 16. Snapshot epochs are 16
    at org.apache.kafka.timeline.SnapshotRegistry.getOrCreateSnapshot(SnapshotRegistry.java:225) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.timeline.SnapshotRegistry.idempotentCreateSnapshot(SnapshotRegistry.java:245) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.controller.OffsetControlManager.handleScheduleAppend(OffsetControlManager.java:305) ~[main/:?]
    at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.lambda$run$0(QuorumController.java:844) ~[main/:?]
    at org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:928) ~[main/:?]
    at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:821) ~[main/:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:132) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:215) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186) ~[kafka-server-common-4.1.0-SNAPSHOT.jar:?]
    at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]
Expected :2
Actual :6
```
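For context, here is a minimal sketch of the kind of check that appears to be firing, inferred only from the exception messages in the log. It is not the actual `SnapshotRegistry` code: the class `SnapshotEpochSketch` and its internals are hypothetical, and the assumption is that a request for a snapshot older than the newest existing one is rejected, while a request for the same epoch is a no-op.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the epoch check seen in the log.
// This is NOT the real org.apache.kafka.timeline.SnapshotRegistry.
public class SnapshotEpochSketch {
    // Snapshot epochs in increasing order.
    private final List<Long> epochs = new ArrayList<>();

    // Assumed behavior: same epoch as the newest snapshot is a no-op,
    // a newer epoch is appended, an older epoch is rejected.
    public void idempotentCreateSnapshot(long epoch) {
        if (!epochs.isEmpty()) {
            long last = epochs.get(epochs.size() - 1);
            if (epoch == last) {
                return; // snapshot already exists, nothing to do
            }
            if (epoch < last) {
                throw new RuntimeException(
                    "Can't create a new in-memory snapshot at epoch " + epoch
                        + " because there is already a snapshot with epoch " + last + ".");
            }
        }
        epochs.add(epoch);
    }

    public static void main(String[] args) {
        SnapshotEpochSketch registry = new SnapshotEpochSketch();
        registry.idempotentCreateSnapshot(5); // newest snapshot is now 5
        try {
            // Mirrors the first SnapshotRegistry failure in the log:
            // a request at epoch 3 while a snapshot at epoch 5 exists.
            registry.idempotentCreateSnapshot(3);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If each of the 4 `SnapshotRegistry` failures makes the active controller renounce leadership and triggers one more election (which is how I read the "Renouncing leadership" lines), that would account for the 4 extra increments: 2 expected + 4 = 6.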
@ahuang98,
Do you have any thoughts on this issue? Any feedback would be greatly
appreciated. Thank you!