Luke Chen created KAFKA-16424:
---------------------------------
Summary: truncated logs will be left undeleted after alter dir
completion
Key: KAFKA-16424
URL: https://issues.apache.org/jira/browse/KAFKA-16424
Project: Kafka
Issue Type: Bug
Affects Versions: 3.7.0
Reporter: Luke Chen
When moving a replica between log dirs, we create a temporary future replica in
a dir named topic-partition.uniqueId-future, e.g.
t3-0.9af8e054dbe249cf9379a210ec449af8-future. After the log dir movement
completes, we rename the future log dir to the normal log dir; in the above
case, that is "t3-0".
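To make the naming pattern concrete, here is a minimal sketch of how the normal dir name can be derived from a future dir name. The class and method names (FutureDirNames, finalDirName) are hypothetical and are not Kafka's own code; the sketch only assumes the topic-partition.uniqueId-future pattern described above.

```java
// Hypothetical helper, not part of Kafka: illustrates the naming pattern only.
final class FutureDirNames {
    private static final String FUTURE_SUFFIX = "-future";

    // Maps "topic-partition.uniqueId-future" to "topic-partition".
    static String finalDirName(String futureDirName) {
        if (!futureDirName.endsWith(FUTURE_SUFFIX)) {
            throw new IllegalArgumentException("Not a future dir: " + futureDirName);
        }
        String withoutSuffix = futureDirName.substring(
            0, futureDirName.length() - FUTURE_SUFFIX.length());
        // Topic names may contain dots, so split on the LAST dot,
        // which separates topic-partition from the unique id.
        int dot = withoutSuffix.lastIndexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("Missing unique id: " + futureDirName);
        }
        return withoutSuffix.substring(0, dot);
    }
}
```

Applied to the example above, stripping the "-future" suffix and the unique id leaves the topic-partition name, "t3-0".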
So, if some log segments need to be deleted during the log dir movement, we
hand the deletion to a scheduler to run later
([here|https://github.com/apache/kafka/blob/2d4abb85bf4a3afb1e3359a05786ab8f3fda127e/core/src/main/scala/kafka/log/LocalLog.scala#L926]).
However, once the log dir movement completes and the future log dir is renamed,
the async log deletion fails because the files no longer exist at the old path:
{code:java}
[2024-03-26 17:35:10,809] INFO [LocalLog partition=t3-0,
dir=/tmp/kraft-broker-logs] Deleting segment files LogSegment(baseOffset=0,
size=0, lastModifiedTime=0, largestRecordTimestamp=-1) (kafka.log.LocalLog$)
[2024-03-26 17:35:10,810] INFO Failed to delete log
/tmp/kraft-broker-logs/t3-0.9af8e054dbe249cf9379a210ec449af8-future/00000000000000000000.log.deleted
because it does not exist. (org.apache.kafka.storage.internals.log.LogSegment)
[2024-03-26 17:35:10,811] INFO Failed to delete offset index
/tmp/kraft-broker-logs/t3-0.9af8e054dbe249cf9379a210ec449af8-future/00000000000000000000.index.deleted
because it does not exist. (org.apache.kafka.storage.internals.log.LogSegment)
[2024-03-26 17:35:10,811] INFO Failed to delete time index
/tmp/kraft-broker-logs/t3-0.9af8e054dbe249cf9379a210ec449af8-future/00000000000000000000.timeindex.deleted
because it does not exist. (org.apache.kafka.storage.internals.log.LogSegment)
{code}
I think we could consider falling back to the normal log dir if the files cannot
be found in the future log dir. That is, when a file cannot be found under the
"t3-0.9af8e054dbe249cf9379a210ec449af8-future" dir, look for it in the "t3-0"
dir and delete it there. Because the file already has the ".deleted" suffix,
it should be safe to delete.
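The proposed fallback could look roughly like the sketch below. This is an illustration only, not the actual Kafka fix: FallbackDelete and deleteWithFallback are hypothetical names, and the sketch assumes the caller already knows the current (renamed) log dir. It uses only java.nio.file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the proposed fallback; not Kafka's implementation.
final class FallbackDelete {
    private static final String DELETED_SUFFIX = ".deleted";

    // Tries the path recorded when the deletion was scheduled; if that file
    // is gone, retries under the current log dir. Returns true if a file
    // was actually deleted.
    static boolean deleteWithFallback(Path scheduledFile, Path currentLogDir)
            throws IOException {
        if (Files.deleteIfExists(scheduledFile)) {
            return true;
        }
        String fileName = scheduledFile.getFileName().toString();
        // Only fall back for files already marked for deletion, so a live
        // segment in the renamed dir is never removed by mistake.
        if (!fileName.endsWith(DELETED_SUFFIX)) {
            return false;
        }
        return Files.deleteIfExists(currentLogDir.resolve(fileName));
    }
}
```

Restricting the fallback to names ending in ".deleted" mirrors the reasoning above: such files have already been marked for deletion, so removing them from the renamed dir should not affect live segments.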