[
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jiangtao Liu updated KAFKA-8270:
--------------------------------
Description:
What's the issue?
{quote}There are log segments that cannot be deleted even after the configured retention hours have passed.
{quote}
What are the impacts?
{quote} # Log space keeps increasing and eventually causes a space shortage.
 # Many log segments are rolled at a small size, e.g. a segment may be only 50 MB instead of the expected 1 GB.
 # Kafka Streams or clients may experience missing data.
 # It can be exploited as a way to attack a Kafka server.{quote}
What's the workaround adopted to resolve this issue?
{quote} # If it has already happened on your Kafka system, you will need to follow a tricky set of steps to resolve it. (I have documented them [here|https://medium.com/@jiangtaoliu/a-kafka-pitfall-when-to-set-log-message-timestamp-type-to-createtime-c17846813ca3].)
 # If it has not happened on your Kafka system yet, evaluate whether you can switch log.message.timestamp.type to LogAppendTime.{quote}
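As a sketch, the mitigation above is a one-line broker setting (shown here as a server.properties fragment; weigh the trade-off first, since LogAppendTime changes record-timestamp semantics for consumers such as Kafka Streams):

```properties
# server.properties (broker side)
# Use the broker's receive time instead of the client-supplied CreateTime,
# so an unreliable client clock cannot pollute segment timestamps.
log.message.timestamp.type=LogAppendTime
```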
What are the reproduce steps?
{quote} # Make sure the Kafka client and server are not hosted on the same machine.
 # Configure log.message.timestamp.type as *CreateTime*, not LogAppendTime.
 # Set the Kafka client's system clock to a *future time*, e.g. 03/04/*2025*, 3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752].
 # Send a message from the Kafka client to the server.{quote}
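For reference, the future instant in the steps above corresponds to epoch second 1741130752 (from the epochconverter link). A quick sketch to verify the date and derive the millisecond value that a producer would embed as the record's CreateTime:

```python
from datetime import datetime, timezone

# Epoch seconds taken from the epochconverter link in the steps above.
future_s = 1741130752
future_ms = future_s * 1000  # Kafka record timestamps are in milliseconds

utc = datetime.fromtimestamp(future_s, tz=timezone.utc)
print(utc.isoformat())  # 2025-03-04T23:25:52+00:00, i.e. 3:25:52 PM GMT-08:00
```

Instead of changing the whole system clock, a client library that exposes the record timestamp can reproduce the same effect directly; with kafka-python, for example, `KafkaProducer.send(topic, value, timestamp_ms=future_ms)` sets an explicit CreateTime.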
What should you check after the message is handled by the Kafka server?
{quote} # Check the timestamps in the segment's *.timeindex (e.g. 00000000035957300794.timeindex) and the records in the segment's *.log. You will see that all the timestamp values in the *.timeindex are polluted with a future time at or after `03/04/*2025*, 3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`. (Let's say 00000000035957300794.log is the segment that received the record first; we will refer to it in #3.)
 # You will also see that log segments are rolled at a smaller size (e.g. 50 MB) than the configured segment max size (e.g. 1 GB).
 # None of the log segments, including 00000000035957300794.* and the newly rolled ones, will be deleted after the retention hours pass.{quote}
What's the particular logic that causes this issue?
{quote} # [private def deletableSegments(predicate: (LogSegment, Option[LogSegment]) => Boolean)|https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227] will always return an empty set of deletable log segments, because the affected segments' largest timestamps lie in the future.{quote}
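The failure mode can be sketched in a few lines. This is a simplified Python model of the behavior, not Kafka's actual Scala code, and the timestamps below are made-up values: time-based retention deletes a segment only when `now - segment.largestTimestamp > retention.ms`, and deletion proceeds contiguously from the oldest segment, so one far-future CreateTime pins its own segment and every segment after it:

```python
RETENTION_MS = 168 * 60 * 60 * 1000  # e.g. log.retention.hours=168 (7 days)

def deletable_segments(largest_timestamps, now_ms):
    """Simplified model of Log.deletableSegments with the retention-time
    predicate: walk segments oldest -> newest and stop at the first one
    whose largest timestamp is still within retention."""
    out = []
    for largest_ts in largest_timestamps:
        if now_ms - largest_ts > RETENTION_MS:
            out.append(largest_ts)
        else:
            break  # deletion must stay contiguous from the log's start
    return out

now = 1_556_000_000_000        # hypothetical "current" broker time (ms)
future = 1_741_130_752_000     # CreateTime from a client clock set to 2025

healthy = [now - 10 * RETENTION_MS, now - 5 * RETENTION_MS, now]
poisoned = [future, now - 5 * RETENTION_MS, now]  # future ts in oldest segment

print(len(deletable_segments(healthy, now)))   # 2: both old segments deletable
print(len(deletable_segments(poisoned, now)))  # 0: now - future is negative,
                                               # so nothing is ever deleted
```

Because `now - future` stays negative until the client's fake date actually arrives, the predicate fails on the very first segment and the whole log is retained indefinitely, which matches the disk-growth symptom described above.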
> Kafka timestamp-based retention policy is not working when Kafka client's
> time is not reliable.
> -----------------------------------------------------------------------------------------------
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
> Issue Type: Bug
> Components: log, log cleaner, logging
> Affects Versions: 1.1.1
> Reporter: Jiangtao Liu
> Priority: Major
> Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)