lianetm commented on code in PR #15778:
URL: https://github.com/apache/kafka/pull/15778#discussion_r1581203195
##########
tests/kafkatest/tests/client/consumer_test.py:
##########
@@ -242,16 +242,15 @@ def test_static_consumer_bounce(self, clean_shutdown,
static_membership, bounce_
self.rolling_bounce_consumers(consumer, keep_alive=num_keep_alive,
num_bounces=num_bounces)
num_revokes_after_bounce = consumer.num_revokes_for_alive() -
num_revokes_before_bounce
-
- check_condition = num_revokes_after_bounce != 0
+
# under static membership, the live consumer shall not revoke any
current running partitions,
# since there is no global rebalance being triggered.
if static_membership:
- check_condition = num_revokes_after_bounce == 0
-
- assert check_condition, \
- "Total revoked count %d does not match the expectation of having 0
revokes as %d" % \
- (num_revokes_after_bounce, check_condition)
+ assert num_revokes_after_bounce == 0, \
+ "Unexpected revocation triggered when bouncing static member.
Expecting 0 but had %d revocations" % num_revokes_after_bounce
+ elif consumer.is_eager():
+ assert num_revokes_after_bounce != 0, \
Review Comment:
> So maybe we should instead try to get the set of partitions and check that
it didn't change?
Doable, but that wouldn't truly ensure the static membership behaviour
either I guess, because the test is intentionally leaving 1 member alive that
could be the leader or not. So the assignment would remain the same regardless
of the static membership under CONSUMER (or Cooperative) if the single
partition belongs to the live member (single partition that I guess was
intentionally decided to be able to easily check the delivery semantics after
the bounces)
So as I see it this test is specifically crafted to validate the stickiness
that static membership brings into the `RangeAssignor` ( nicely explained in
[this
section](https://github.com/apache/kafka/blob/e7792258df934a5c8470c2925c5d164c7d5a8e6c/clients/src/main/java/org/apache/kafka/clients/consumer/RangeAssignor.java#L58-L78)
of the `RangeAssignor` class doc btw). We're trying to apply it to the
CONSUMER protocol but finding not much value given the purpose of the test and
the shape it has (bounce n-1 nodes, check at least 1 revocation if dynamic,
none if static, regardless of partition ownership). Listening to myself telling
you this makes me reconsider if we should just not run this test for the new
protocol, as it was truly never intended or run for CooperativeAssignor? (I
would probably rename it to something like
`test_eager_stickiness_on_static_consumer_bounce` , make the use of
RangeAssignor explicit, and then it looks clearer that we don't want to run
such test on Cooper
ativeAssignor or CONSUMER protocol.
We do have other tests that ensure that static membership behaves as
expected for the new protocol (ex. `test_fencing_static_consumer`), but agreed
that the "owned partition not re-assigned for a static member that is bounced"
is not covered in sys tests (not for CONSUMER, not for Cooperative either). We
could think of a new test to cover that. The shape would be different I expect,
because it would either have to rely on bouncing a member with assignment while
having others with none, or ensuring that all members have at least 1
partition. It would also need a different, more complex delivery semantics
validation, if any. I would just suggest a different Jira/PR for a new test, to
be able to finalize migrating the current sys tests that apply to the new
protocol. Makes sense?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]