gaurav-narula commented on code in PR #15136:
URL: https://github.com/apache/kafka/pull/15136#discussion_r1549695928
##########
core/src/main/scala/kafka/log/LogManager.scala:
##########
@@ -1173,6 +1173,35 @@ class LogManager(logDirs: Seq[File],
}
}
+ def recoverAbandonedFutureLogs(brokerId: Int, newTopicsImage: TopicsImage): Unit = {
+ val abandonedFutureLogs = findAbandonedFutureLogs(brokerId, newTopicsImage)
+ abandonedFutureLogs.foreach { log =>
+ val tp = log.topicPartition
+
+ log.renameDir(UnifiedLog.logDirName(tp), shouldReinitialize = true)
+ log.removeLogMetrics()
+ futureLogs.remove(tp)
+
+ currentLogs.put(tp, log)
+ log.newMetrics()
+
+ info(s"Successfully renamed abandoned future log for $tp")
+ }
+ }
+
+ private def findAbandonedFutureLogs(brokerId: Int, newTopicsImage: TopicsImage): Iterable[UnifiedLog] = {
+ futureLogs.values.flatMap { log =>
+ val topicId = log.topicId.getOrElse {
+ throw new RuntimeException(s"The log dir $log does not have a topic ID, " +
+ "which is not allowed when running in KRaft mode.")
+ }
+ val partitionId = log.topicPartition.partition()
+ Option(newTopicsImage.getPartition(topicId, partitionId))
+ .filter(pr => directoryId(log.parentDir).contains(pr.directory(brokerId)))
+ .map(_ => log)
Review Comment:
Thanks for the feedback.
For (2), we have a couple of options. We can either:
(a) ignore the future replica (say in dir2) if the main replica exists in an
online log dir (say dir1) or,
(b) promote the future replica (in dir2) and remove the main replica (in
dir1).
Option (a) would result in ReplicaManager spawning a replicaAlterLogDir thread
for the future replica and correcting the assignment to dir1, only for it to be
changed back to dir2 once the replicaAlterLogDir thread finishes its job.
See
https://github.com/apache/kafka/blob/acecd370cc3b25f12926e7a4664a2648f08c6c9a/core/src/main/scala/kafka/server/ReplicaManager.scala#L2734
and
https://github.com/apache/kafka/blob/acecd370cc3b25f12926e7a4664a2648f08c6c9a/core/src/main/scala/kafka/server/ReplicaManager.scala#L2745
Since the future replica is almost caught up with the main replica in these
scenarios, I'm leaning towards option (b) to avoid extra reassignments.
Please let me know if you feel otherwise.
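
To make option (b) concrete, here is a minimal sketch of the decision I have in
mind. It is a simplified model and not the code in this PR: `Resolution`,
`AbandonedFutureLogSketch` and `resolveOptionB` are hypothetical names, and
directories are plain strings rather than directory IDs.

```scala
// Hypothetical, simplified model of option (b); these are not Kafka classes.
final case class Resolution(promoteFuture: Boolean, mainDirToRemove: Option[String])

object AbandonedFutureLogSketch {
  // assignedDir: directory the metadata image assigns to this broker's replica.
  // futureDir:   directory that holds the future replica.
  // mainDir:     directory that holds the main replica, if one exists.
  // onlineDirs:  log directories that are currently online.
  def resolveOptionB(assignedDir: String,
                     futureDir: String,
                     mainDir: Option[String],
                     onlineDirs: Set[String]): Resolution = {
    if (assignedDir != futureDir) {
      // The metadata does not point at the future replica's directory,
      // so the future log is not abandoned and nothing is promoted.
      Resolution(promoteFuture = false, mainDirToRemove = None)
    } else {
      // Promote the future replica; a main replica in an online directory
      // becomes redundant and is removed. An offline one cannot be touched.
      Resolution(promoteFuture = true, mainDirToRemove = mainDir.filter(onlineDirs.contains))
    }
  }
}
```

With the metadata pointing at dir2, the future replica in dir2 and the main
replica in an online dir1, this promotes the future replica and marks the copy
in dir1 for removal, so no replicaAlterLogDir thread is needed afterwards.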