[
https://issues.apache.org/jira/browse/KAFKA-17428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luke Chen updated KAFKA-17428:
------------------------------
Description:
Currently, copyLogSegment in RLMCopyTask deletes segments whose upload failed
and segments whose custom metadata size exceeded the limit. But after deletion,
these segments' states are still COPY_SEGMENT_STARTED. That "might" cause
unexpected issues in the future. We'd better move the state from
{{COPY_SEGMENT_STARTED}} -> {{DELETE_SEGMENT_STARTED}} ->
{{DELETE_SEGMENT_FINISHED}}.
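A rough sketch of the proposed transition, to make the idea concrete. This is
not the actual RLMCopyTask code: the class SegmentStateSketch, its State enum,
Segment, and cleanUpFailedCopy are hypothetical names, and the allowed
transitions are an assumption for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

final class SegmentStateSketch {
    // Simplified stand-in for the remote log segment states (not Kafka's real class).
    enum State {
        COPY_SEGMENT_STARTED,
        COPY_SEGMENT_FINISHED,
        DELETE_SEGMENT_STARTED,
        DELETE_SEGMENT_FINISHED;

        // Assumed set of states that may directly follow this one.
        Set<State> next() {
            switch (this) {
                case COPY_SEGMENT_STARTED:
                    return EnumSet.of(COPY_SEGMENT_FINISHED, DELETE_SEGMENT_STARTED);
                case COPY_SEGMENT_FINISHED:
                    return EnumSet.of(DELETE_SEGMENT_STARTED);
                case DELETE_SEGMENT_STARTED:
                    return EnumSet.of(DELETE_SEGMENT_FINISHED);
                default:
                    return EnumSet.noneOf(State.class);
            }
        }
    }

    // Hypothetical segment record that starts in COPY_SEGMENT_STARTED.
    static final class Segment {
        State state = State.COPY_SEGMENT_STARTED;

        void moveTo(State target) {
            if (!state.next().contains(target)) {
                throw new IllegalStateException(state + " -> " + target + " is not allowed");
            }
            state = target;
        }
    }

    // Cleanup after a failed upload: record the deletion in the state
    // instead of leaving the segment in COPY_SEGMENT_STARTED.
    static State cleanUpFailedCopy(Segment segment, Runnable deleteRemoteData) {
        segment.moveTo(State.DELETE_SEGMENT_STARTED);
        deleteRemoteData.run(); // if this throws, the state stays DELETE_SEGMENT_STARTED
        segment.moveTo(State.DELETE_SEGMENT_FINISHED);
        return segment.state;
    }
}
```

Note that when the delete call itself fails, the segment is left in
{{DELETE_SEGMENT_STARTED}}, so something still has to retry the deletion later.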
updated:
I thought about this when I first had a look at it, and one thing that bothered
me is that {{DELETE_SEGMENT_STARTED}} means to me that we're now in a state
where we attempt deletion. However, if the remote store is down and we fail to
both copy and delete, we will leave that segment in {{DELETE_SEGMENT_STARTED}}
and not attempt to delete it until the segment itself breaches
retention.ms/bytes.
We can probably just make it clearer, but that was my thought at the time.
So, maybe in the deletion loop, we can add {{DELETE_SEGMENT_STARTED}} segments
to the deletion list directly, but that also needs to take the retention size
calculation into account.
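One possible shape for that deletion-loop change, as a sketch only. The class
DeletionLoopSketch and its members are hypothetical, and counting only
{{COPY_SEGMENT_FINISHED}} segments toward the retention size is an assumption
about how the size calculation could be handled, not a decided design.

```java
import java.util.ArrayList;
import java.util.List;

final class DeletionLoopSketch {
    // Simplified stand-in for the remote log segment states (not Kafka's real class).
    enum State {
        COPY_SEGMENT_STARTED,
        COPY_SEGMENT_FINISHED,
        DELETE_SEGMENT_STARTED,
        DELETE_SEGMENT_FINISHED
    }

    // Hypothetical, minimal view of a remote segment's metadata.
    static final class Segment {
        final String id;
        final State state;
        final long sizeBytes;

        Segment(String id, State state, long sizeBytes) {
            this.id = id;
            this.state = state;
            this.sizeBytes = sizeBytes;
        }
    }

    // Segments whose deletion was started but never finished are picked up
    // directly, without waiting for them to breach retention.ms/bytes.
    static List<Segment> pendingDeletes(List<Segment> segments) {
        List<Segment> pending = new ArrayList<>();
        for (Segment s : segments) {
            if (s.state == State.DELETE_SEGMENT_STARTED) {
                pending.add(s);
            }
        }
        return pending;
    }

    // Assumption: only fully copied segments count toward the retention size,
    // so segments already scheduled for deletion do not inflate the total.
    static long retainedBytes(List<Segment> segments) {
        long total = 0L;
        for (Segment s : segments) {
            if (s.state == State.COPY_SEGMENT_FINISHED) {
                total += s.sizeBytes;
            }
        }
        return total;
    }
}
```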
was: Currently, copyLogSegment in RLMCopyTask deletes segments whose upload
failed and segments whose custom metadata size exceeded the limit. But after
deletion, these segments' states are still COPY_SEGMENT_STARTED. That "might"
cause unexpected issues in the future. We'd better move the state from
{{COPY_SEGMENT_STARTED}} -> {{DELETE_SEGMENT_STARTED}} ->
{{DELETE_SEGMENT_FINISHED}}.
> remote segments deleted in RLMCopyTask stay in `COPY_SEGMENT_STARTED` state
> ---------------------------------------------------------------------------
>
> Key: KAFKA-17428
> URL: https://issues.apache.org/jira/browse/KAFKA-17428
> Project: Kafka
> Issue Type: Improvement
> Reporter: Luke Chen
> Priority: Major
>