[
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jiangtao Liu updated KAFKA-8270:
--------------------------------
Description:
What's the issue?
{quote}There are log segments that cannot be deleted even after the configured retention hours have passed.
{quote}
What are the impacts?
{quote} # Log space keeps increasing and eventually causes a space shortage.
 # Many log segments are rolled at a small size, e.g. a segment may be only 50 MB instead of the expected 1 GB.
 # Kafka Streams or clients may experience missing data.
 # It can be exploited as a way to attack a Kafka server.{quote}
What's the workaround adopted to resolve this issue?
{quote} # If it has already happened on your Kafka system, you will need to follow a tricky set of steps to resolve it. (I have documented them [here|https://medium.com/@jiangtaoliu/a-kafka-pitfall-when-to-set-log-message-timestamp-type-to-createtime-c17846813ca3].)
 # If it has not happened on your Kafka system yet, evaluate whether you can switch log.message.timestamp.type to LogAppendTime.{quote}
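As a sketch, the mitigation above is a one-line broker setting (shown here as a server.properties fragment; weigh the trade-off first, since LogAppendTime changes record-timestamp semantics for consumers such as Kafka Streams):

```properties
# server.properties (broker side)
# Use the broker's receive time instead of the client-supplied CreateTime,
# so an unreliable client clock cannot pollute segment timestamps.
log.message.timestamp.type=LogAppendTime
```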
What are the reproduce steps?
{quote} # Make sure the Kafka client and server are not hosted on the same machine.
 # Configure log.message.timestamp.type as *CreateTime*, not LogAppendTime.
 # Set the Kafka client's system clock to a *future time*, e.g. 03/04/*2025*, 3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752].
 # Send a message from the Kafka client to the server.{quote}
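For reference, the future instant in the steps above corresponds to epoch second 1741130752 (from the epochconverter link). A quick sketch to verify the date and derive the millisecond value that a producer would embed as the record's CreateTime:

```python
from datetime import datetime, timezone

# Epoch seconds taken from the epochconverter link in the steps above.
future_s = 1741130752
future_ms = future_s * 1000  # Kafka record timestamps are in milliseconds

utc = datetime.fromtimestamp(future_s, tz=timezone.utc)
print(utc.isoformat())  # 2025-03-04T23:25:52+00:00, i.e. 3:25:52 PM GMT-08:00
```

Instead of changing the whole system clock, a client library that exposes the record timestamp can reproduce the same effect directly; with kafka-python, for example, `KafkaProducer.send(topic, value, timestamp_ms=future_ms)` sets an explicit CreateTime.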
What should you check after the message is handled by the Kafka server?
{quote} # Check the timestamps in the segment's *.timeindex (e.g. 00000000035957300794.timeindex) and the records in the segment's *.log. You will see that all the timestamp values in the *.timeindex are polluted with a future time at or after `03/04/*2025*, 3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`. (Let's say 00000000035957300794.log is the segment that received the record first; we will refer to it in #3.)
 # You will also see that log segments are rolled at a smaller size (e.g. 50 MB) than the configured segment max size (e.g. 1 GB).
 # None of the log segments, including 00000000035957300794.* and the newly rolled ones, will be deleted after the retention hours pass.{quote}
What's the particular logic that causes this issue?
{quote} # [private def deletableSegments(predicate: (LogSegment, Option[LogSegment]) => Boolean)|https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227] will always return an empty set of deletable log segments, because the affected segments' largest timestamps lie in the future.{quote}
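The failure mode can be sketched in a few lines. This is a simplified Python model of the behavior, not Kafka's actual Scala code, and the timestamps below are made-up values: time-based retention deletes a segment only when `now - segment.largestTimestamp > retention.ms`, and deletion proceeds contiguously from the oldest segment, so one far-future CreateTime pins its own segment and every segment after it:

```python
RETENTION_MS = 168 * 60 * 60 * 1000  # e.g. log.retention.hours=168 (7 days)

def deletable_segments(largest_timestamps, now_ms):
    """Simplified model of Log.deletableSegments with the retention-time
    predicate: walk segments oldest -> newest and stop at the first one
    whose largest timestamp is still within retention."""
    out = []
    for largest_ts in largest_timestamps:
        if now_ms - largest_ts > RETENTION_MS:
            out.append(largest_ts)
        else:
            break  # deletion must stay contiguous from the log's start
    return out

now = 1_556_000_000_000        # hypothetical "current" broker time (ms)
future = 1_741_130_752_000     # CreateTime from a client clock set to 2025

healthy = [now - 10 * RETENTION_MS, now - 5 * RETENTION_MS, now]
poisoned = [future, now - 5 * RETENTION_MS, now]  # future ts in oldest segment

print(len(deletable_segments(healthy, now)))   # 2: both old segments deletable
print(len(deletable_segments(poisoned, now)))  # 0: now - future is negative,
                                               # so nothing is ever deleted
```

Because `now - future` stays negative until the client's fake date actually arrives, the predicate fails on the very first segment and the whole log is retained indefinitely, which matches the disk-growth symptom described above.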
> Kafka timestamp-based retention policy is not working when Kafka client's
> time is not reliable.
> -----------------------------------------------------------------------------------------------
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
> Issue Type: Bug
> Components: log, log cleaner, logging
> Affects Versions: 1.1.1
> Reporter: Jiangtao Liu
> Priority: Major
> Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)