lucasbru commented on code in PR #20457:
URL: https://github.com/apache/kafka/pull/20457#discussion_r2330172947


##########
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupMetadataManager.java:
##########
@@ -6005,7 +6010,13 @@ public void replay(
 
         if (value == null)  {
             // Tombstone. Group should be removed.
-            removeGroup(groupId);
+            // In case of streams groups, which get inserted into memory 
immediately to store soft state,
+            // It may happen that the groups map contains the new streams 
groups already, and the classic group
+            // was removed already. In this case, we can ignore the tombstone.

Review Comment:
   The internal memory state of the coordinator runtime is essentially 
"snapshottable". So we can create a snapshot that corresponds to a specific 
offset of the consumer offset topic, and revert to that snapshot if something 
goes wrong.
   
   The coordinator runtime works this way:
   
    - When a heartbeat comes, we create a bunch of records. We typically do not 
update the in-memory state directly, but just create records that will update 
the in-memory state.
    - After the heartbeat handler finished, we attempt to append the records to 
the consumer offset topic.
    - Then, we replay the records up to the end, which updates our in-memory 
state.
    - If we fail to append and replay our records, we revert to the previous 
in-memory state
    - Otherwise, we create a new snapshot of the in-memory state, attached to 
the offset of the last record we replayed (the "lastCommmittedOffset").
   
   The replay obviously is also triggered when loading the coordinator.
   
   When you call `Admin.listGroups` or `Admin.describeStreamsGroup`, the 
coordinator always reads from the last snapshot. The problem in the bug was 
that we'd not create `configuredTopology` during replay, but on the first 
heartbeat (because we do not persist it in the records, but we can derive it 
from the records). So the last snapshot in memory, would not always contain the 
configuredTopology.
   
   The alternative to this fix would be to change the records in the KIP to 
also persist configuredTopology. We can discuss if this is a way we want to go 
at this point.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to