Matthias J. Sax created KAFKA-20438:
---------------------------------------

             Summary: Allow dropping "future" records
                 Key: KAFKA-20438
                 URL: https://issues.apache.org/jira/browse/KAFKA-20438
             Project: Kafka
          Issue Type: Improvement
          Components: streams
            Reporter: Matthias J. Sax


In the Kafka Streams DSL, certain stateful operators advance their state based 
on stream-time, ie, the highest observes record timestamp over all input 
records.

In particular, window- and session-stores (and other segmented state stores) 
roll segments base on stream-time, and also apply window-close and 
grace-periods base on it.

The issue is, that a single malformed record with a timestamp far in the 
future, might advance stream-time incorrectly, thus triggering closing windows 
and rolling segments too early, and there is no good way to recover from it.

Currently, users could build manual filtering and compare a record timestamps 
to current stream-time (which is tracked by the KS runtime) to avoid such 
issues.

We propose to add a built-in mechanism (either via a global config, or a 
per-window parameter like "grace period"), to let Kafka Streams drop such 
malformed future records automatically.

Note: while it could also make sense to write such records into a DLQ, we 
should tackle DLQ support separately.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to