Matthias J. Sax created KAFKA-20438:
---------------------------------------
Summary: Allow dropping "future" records
Key: KAFKA-20438
URL: https://issues.apache.org/jira/browse/KAFKA-20438
Project: Kafka
Issue Type: Improvement
Components: streams
Reporter: Matthias J. Sax
In the Kafka Streams DSL, certain stateful operators advance their state based
on stream-time, ie, the highest observes record timestamp over all input
records.
In particular, window- and session-stores (and other segmented state stores)
roll segments base on stream-time, and also apply window-close and
grace-periods base on it.
The issue is, that a single malformed record with a timestamp far in the
future, might advance stream-time incorrectly, thus triggering closing windows
and rolling segments too early, and there is no good way to recover from it.
Currently, users could build manual filtering and compare a record timestamps
to current stream-time (which is tracked by the KS runtime) to avoid such
issues.
We propose to add a built-in mechanism (either via a global config, or a
per-window parameter like "grace period"), to let Kafka Streams drop such
malformed future records automatically.
Note: while it could also make sense to write such records into a DLQ, we
should tackle DLQ support separately.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)