Boyang Chen created KAFKA-9475:
----------------------------------
Summary: Replace transaction abortion scheduler with a delayed queue
Key: KAFKA-9475
URL: https://issues.apache.org/jira/browse/KAFKA-9475
Project: Kafka
Issue Type: Sub-task
Reporter: Boyang Chen
Although we could set the txn timeout to 10 seconds, the purging scheduler
runs only once per minute, so in the worst case we would still wait 1 minute
before a timed-out transaction is aborted. We are considering several
potential fixes:
# Change the interval to 10 seconds: this means 6x more frequent checking and
more read contention on txn metadata. The benefit is an easy one-line fix
with no correctness concerns.
# Use an existing delayed queue, a.k.a. purgatory. From what I heard, the
purgatory needs at least 2 extra threads to work properly, with some added
overhead in memory and complexity. The benefit is a more precise timeout
reaction, without a redundant full metadata read lock.
# Create a new delayed queue. This could be done using the Scala delayed queue;
the concern is whether this approach is production ready. The benefits are
the same as for 2, with potentially less code complexity.
This ticket tracks progress on #2, if we eventually decide to go down this
path.
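
To make the delayed-queue idea concrete, here is a minimal sketch of the general technique, not of Kafka's purgatory or of any proposed patch. It uses the JDK's java.util.concurrent.DelayQueue as a stand-in, and every class, method, and field name in it (TxnTimeoutSketch, TxnExpiration, begin, pollExpired, awaitNextExpired) is hypothetical. Each ongoing transaction gets one entry carrying its own deadline, so a consumer wakes up when the earliest deadline passes instead of scanning all txn metadata on a fixed interval.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one delayed entry per ongoing transaction. */
class TxnTimeoutSketch {

    /** A pending transaction timeout; names and fields are illustrative only. */
    static final class TxnExpiration implements Delayed {
        final String transactionalId;
        final long deadlineNanos;

        TxnExpiration(String transactionalId, long timeoutMs) {
            this.transactionalId = transactionalId;
            this.deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        }

        /** Remaining time until the deadline; zero or negative means expired. */
        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(deadlineNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        /** Orders entries so that the earliest deadline sits at the head. */
        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    private final DelayQueue<TxnExpiration> queue = new DelayQueue<>();

    /** Registers a timeout for a newly started transaction. */
    void begin(String transactionalId, long timeoutMs) {
        queue.put(new TxnExpiration(transactionalId, timeoutMs));
    }

    /** Non-blocking: returns an expired transactional id, or null if none has expired. */
    String pollExpired() {
        TxnExpiration expired = queue.poll();
        return expired == null ? null : expired.transactionalId;
    }

    /** Blocks until the earliest deadline passes; returns null if interrupted. */
    String awaitNextExpired() {
        try {
            return queue.take().transactionalId;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        TxnTimeoutSketch sketch = new TxnTimeoutSketch();
        sketch.begin("slow-txn", 300);
        sketch.begin("fast-txn", 100);
        // A real coordinator would abort the transaction here; the sketch only prints the id.
        System.out.println(sketch.awaitNextExpired());
        System.out.println(sketch.awaitNextExpired());
    }
}
```

The sketch leaves out what a real implementation would have to handle: removing the entry when a transaction commits or aborts before its deadline, re-checking the txn state under the appropriate lock before aborting, and the dedicated thread (or threads) that drains the queue.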