[jira] [Comment Edited] (CASSANDRA-21173) Snapshots from tables without table-id embedded in their folder name are not loaded by SnapshotLoader

Stefan Miklosovic (Jira) Mon, 09 Mar 2026 13:01:12 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-21173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18064234#comment-18064234
 ]


Stefan Miklosovic edited comment on CASSANDRA-21173 at 3/9/26 8:00 PM:
-----------------------------------------------------------------------

Sorry for asking, but for what exact purpose do you keep snapshots of dropped
tables from the 2.0 era while upgrading to 5.0? I just cannot wrap my
mind around the use case. These snapshots must be around 12 years old, and they
are of _dropped tables_. Why don't you just back them up? I am also not sure
that we should care so much about 2.0; those versions are officially
discontinued and no longer supported.

??We could I suppose go with leveraging CFS for 5.0 if we have any concerns 
with changing on disk structures in an already released version. Though I guess 
in SnapshotManifest as you mentioned previously we have ignoreUnknown = true 
and I'm not sure I can find any concrete problems.??

I am trying to keep this as frictionless as possible. If we can go without it,
we should. Why would we want to introduce it if we do not need it? That is my
line of thinking here.

We can list snapshots of dropped tables since (1). I have not tested what
happens when we try to load snapshots of dropped tables for which there is no
table id; that is a super corner case. But in that case I do not think that
adding the table id to the manifest would help anyway. The table is dropped,
so we cannot get the id via ColumnFamilyStore, and the directory name does not
contain it, so where would you actually get it from?

(1) https://issues.apache.org/jira/browse/CASSANDRA-16843
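
To make the question above concrete, here is a minimal sketch of the lookup order being discussed. It is not Cassandra code: the class TableIdLookup, the method resolve, and its three parameters are hypothetical names for this example. Each parameter stands for one place a table id could come from: the directory name, the snapshot manifest, or a live table (ColumnFamilyStore).

```java
import java.util.Optional;

public final class TableIdLookup {
    private TableIdLookup() {}

    /**
     * Returns the first table id that is available, checking the sources
     * in a fixed order. All three sources are hypothetical inputs; the
     * caller decides how each one is obtained.
     */
    public static Optional<String> resolve(Optional<String> fromDirectoryName,
                                           Optional<String> fromManifest,
                                           Optional<String> fromLiveTable) {
        if (fromDirectoryName.isPresent()) {
            return fromDirectoryName;
        }
        if (fromManifest.isPresent()) {
            return fromManifest;
        }
        // May itself be empty, e.g. when the table has been dropped.
        return fromLiveTable;
    }
}
```

For an existing snapshot of a dropped pre-2.1 table, all three inputs would be empty: the directory name has no id, the manifest was written before any id field existed, and there is no live table to ask. The result is then empty as well.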


> Snapshots from tables without table-id embedded in their folder name are not 
> loaded by SnapshotLoader
> -----------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21173
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21173
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Local/Snapshots, Local/Startup and Shutdown
>            Reporter: Matt Byrd
>            Assignee: Matt Byrd
>            Priority: Normal
>             Fix For: 5.0.x, 5.1, 6.x
>
>         Attachments: ci_summary_trunk_mbyrd_CASSANDRA-21173.html
>
>
> Tables created prior to 2.1 do not have a table-id embedded in their table 
> folder name.
> This is handled correctly in Directories.java (see the constructor). 
> Unfortunately, SnapshotLoader uses a regex which attempts to extract the 
> table-id and hence skips over any tables created prior to 2.1.
> The end result is that snapshots of these tables are not visible in nodetool 
> listsnapshots and, more importantly, cannot be cleared via nodetool 
> clearsnapshot. This was noticed upon a major upgrade to 5.0.
> I've observed this on 5.0. From reading the code, it appears likely improved 
> in 5.1, in that the problem now also requires a restart to trigger.
> Some related tickets:
> Introduction of table-id and backwards compatible handling of old folders 
> originally here:
> https://issues.apache.org/jira/browse/CASSANDRA-5202
> Machinery to list snapshots which doesn’t handle old format was added here:
> https://issues.apache.org/jira/browse/CASSANDRA-16843
> https://github.com/apache/cassandra/commit/31aa17a2a3b18bdda723123cad811f075287807d
> There was some discussion at the time of not handling pre 2.1 tables here:
> https://issues.apache.org/jira/browse/CASSANDRA-16843?focusedCommentId=17440088&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17440088
> Then nodetool clearsnapshot stopped working here with:
> https://issues.apache.org/jira/browse/CASSANDRA-17757
> Things improve a bit in 5.1 with 
> https://issues.apache.org/jira/browse/CASSANDRA-18111
> Now we no longer try to load the snapshots via SnapshotLoader in their 
> entirety before deciding if we can clear them, but instead make use of 
> SnapshotManager. Whilst snapshots taken while the JVM is running are now 
> visible and clearable, from reading the code, upon restart we lose that 
> information and cannot view/clear snapshots created before the restart.
> One solution to handle these pre-2.1 tables is to include the table-id in 
> the manifest.json; then we'll be able to read this information upon restart 
> if it is not available from the folder name.
> Another possibility, which doesn't fix as many problems, is just to expose 
> something via jmx/nodetool to allow operators to bypass the snapshot loading 
> mechanism and directly clear the old pre-2.1 snapshots.
> A more involved and risky change would be to somehow think about how we 
> migrate all this existing data in different folder structures to a new 
> consistent folder structure, but this seems quite involved and would likely 
> deserve its own JIRA at least.
> I have a local patch against trunk for the first approach (just storing the 
> tableId in the manifest.json) and will run it against CI.
> I'll have a further think about whether there are any other approaches. If 
> anyone has any ideas, let me know.
> Another thing to consider is where we should apply this change.
> Probably at a minimum 5.0, since that's where one can no longer nodetool 
> clearsnapshot on certain tables and the effect is a bit worse there than in 
> 5.1.
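
For illustration, the two folder-name layouts described above can be told apart with a small parser. This is a minimal sketch, not the code in Directories.java or SnapshotLoader: the class name TableDirNames, both method names, and the regex are assumptions made for this example. It assumes the 2.1+ layout is the table name, a hyphen, and the table id written as 32 lowercase hex characters.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class TableDirNames {
    // Assumed 2.1+ layout: "<table name>-<32 lowercase hex characters>".
    private static final Pattern WITH_ID =
        Pattern.compile("^(?<name>.+)-(?<id>[0-9a-f]{32})$");

    private TableDirNames() {}

    /** Returns the id embedded in the folder name, or empty for the old layout. */
    public static Optional<String> tableId(String dirName) {
        Matcher m = WITH_ID.matcher(dirName);
        return m.matches() ? Optional.of(m.group("id")) : Optional.empty();
    }

    /** Returns the table name, whether or not an id suffix is present. */
    public static String tableName(String dirName) {
        Matcher m = WITH_ID.matcher(dirName);
        return m.matches() ? m.group("name") : dirName;
    }
}
```

With this shape, a pre-2.1 folder such as "users" yields an empty id instead of being skipped, so the caller still sees the table and has to obtain the id from somewhere else.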



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
