Re: [DISCUSS] inotify for detection of manually removed snapshots
If we consider caching it all to be too much, we could make caching an option an admin would need to opt in to. There might be a flag in cassandra.yaml: once enabled, snapshot metadata would be held in memory; otherwise it would be loaded from disk as it is now, so people could decide whether caching is enough for them or they want the previous behavior (which would be the default). This puts additional complexity into SnapshotManager, but it should be doable in general.

Let me know what you think; I would really like to have this resolved, as CASSANDRA-18111 brings a lot of code cleanup and simplifies things considerably.

On Wed, Aug 7, 2024 at 11:30 PM Josh McKenzie wrote:

> If you have a lot of snapshots and have for example a metric monitoring them and their sizes, if you don't cache it, creating the metric can cause performance degradation. We added the cache because we saw this happen to databases more than once.
>
> I mean, I believe you, I'm just surprised that querying out metadata for files and basic computation is leading to hundreds-of-ms pause times even on systems with a lot of files. Aren't most / all of these values cached at the filesystem layer, so we're basically just tomato / tomahto caching systems, either one we maintain or one the OS maintains?
>
> Or is there really just a count of files well outside what I'm thinking here?
>
> Anyway, not trying to cause a ruckus and make needless noise, trying to learn. ;)
>
> On Wed, Aug 7, 2024, at 3:20 PM, Štefan Miklošovič wrote:
>
> On Wed, Aug 7, 2024 at 6:39 PM Yifan Cai wrote:
>
> With WatchService, when events are missed (which is to be expected), you will still need to list the files. It seems to me that WatchService doesn't offer significant benefits in this case.
>
> Yeah, I think we will leave it out eventually.
>
> Regarding listing the directory with a refresh flag, my concern is the potential for abuse. End-users might always refresh before listing, which could undermine the purpose of caching. Perhaps Jeremiah can provide more insight on this.
>
> Well, by default it would not be refreshed every single time; you would need to opt in to that. If there is a shop whose users have direct access to the disks of Cassandra nodes and are removing data manually, I do not know what to say; what are nodetool clearsnapshot and the JMX methods good for then? I do not think we can prevent people from shooting themselves in the foot if they are absolutely determined to do that.
>
> If they want to refresh every time, that would be equal to the current behavior. It would be at most as "bad" as it is now.
>
> IMO, caching is best handled internally. I have a few UX-related questions:
> - Is it valid or acceptable to return stale data? If so, end-users have to do some form of validation before consuming each snapshot to account for potential deletions.
>
> answer below
>
> - Even if listsnapshots returns the most recent data, is it possible that some of the directories get deleted while end-users are accessing them? I think it is. That, then, forces end-users to do some validation first, similar to handling stale data.
>
> I think what you are saying is that when somebody lists snapshots at time T0 and somebody removes a snapshot manually at T1, the list of snapshots is no longer accurate? Sure, that is a thing. This is how it currently works.
>
> Now, we want to cache them, so if you clear a snapshot which is not physically there because somebody removed it manually, that should be a no-op; it will just be removed from the internal tracker. So if it is on disk and in the cache and you clear it, all is fine. It is also fine if it is no longer on disk and you clear it; it is then just removed internally. It would fail only if you tried to remove a snapshot which is not cached, regardless of whether it is on disk or not. Deleting a non-existent snapshot ends in a failure; nothing changes in that regard, as this is the current behavior too.
>
> I should add that I did not state it completely correctly at the very beginning of this thread. Currently, we are caching only _expiring_ snapshots, because we need to know their time of removal so we can act on it later. _Normal_ snapshots are _not_ cached _yet_. I have spent so much time with CASSANDRA-18111 that I live in a reality where it is already merged; it is not actually in place yet, but we are very close.
>
> OK, thank you all for your insights; I will NOT use inotify.
>
> Just my 2 cents.
>
> - Yifan
>
> On Wed, Aug 7, 2024 at 6:03 AM Štefan Miklošovič wrote:
>
> Yes, for example as reported here:
>
> https://issues.apache.org/jira/browse/CASSANDRA-13338
>
> People who are charting this in monitoring dashboards might also hit this.
>
> On Wed, Aug 7, 2024 at 2:59 PM J. D. Jordan wrote:
>
> If you have a lot of snapshots and have for example a metric monitoring them and their sizes, if you don't cache it, creating the metric can cause performance degradation.
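To make the proposal above concrete, a minimal sketch of what the opt-in could look like in cassandra.yaml. The flag name and its placement are hypothetical, not the final syntax:

```yaml
# Hypothetical, illustrative option name; not the final cassandra.yaml syntax.
# false (default): snapshot listing reads the filesystem every time, as today.
# true: SnapshotManager keeps all snapshot metadata in memory and serves
#       listings from that cache.
snapshot_metadata_caching_enabled: false
```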
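On the WatchService point: the reason missed events force a re-list is the JDK's OVERFLOW event, delivered when the OS-side event queue fills up and events are dropped. A self-contained sketch under that assumption; the recovery action and printed output are illustrative only:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

public class SnapshotWatchSketch
{
    public static void watch(Path snapshotDir) throws IOException, InterruptedException
    {
        try (WatchService watcher = FileSystems.getDefault().newWatchService())
        {
            snapshotDir.register(watcher, StandardWatchEventKinds.ENTRY_DELETE);
            while (true)
            {
                WatchKey key = watcher.take(); // blocks until events arrive
                for (WatchEvent<?> event : key.pollEvents())
                {
                    if (event.kind() == StandardWatchEventKinds.OVERFLOW)
                    {
                        // The OS dropped events; the only safe recovery is a
                        // full re-listing, which is exactly the work the
                        // watcher was supposed to save us.
                        relist(snapshotDir);
                    }
                    else
                    {
                        System.out.println("deleted: " + event.context());
                    }
                }
                if (!key.reset())
                    break; // the watched directory itself is gone
            }
        }
    }

    private static void relist(Path dir) throws IOException
    {
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir))
        {
            for (Path entry : stream)
                System.out.println("still present: " + entry.getFileName());
        }
    }
}
```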
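The clearing semantics described in the quoted message can also be summarized in a short sketch. SnapshotTrackerSketch, its map, and deleteRecursively are hypothetical stand-ins for the real SnapshotManager internals, not the actual CASSANDRA-18111 API:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;

public class SnapshotTrackerSketch
{
    // tag -> snapshot directory; a stand-in for the real in-memory tracker
    private final Map<String, Path> snapshots = new ConcurrentHashMap<>();

    public void clearSnapshot(String tag)
    {
        Path dir = snapshots.remove(tag);
        if (dir == null)
            // Not cached: fails, exactly like deleting a non-existent
            // snapshot today, whether or not stray files remain on disk.
            throw new IllegalArgumentException("Unknown snapshot: " + tag);

        // Cached: delete the files if they are still there. If an operator
        // already removed them manually, this quietly becomes a no-op and
        // only the tracker entry disappears.
        if (Files.exists(dir))
            deleteRecursively(dir);
    }

    private static void deleteRecursively(Path dir)
    {
        try (Stream<Path> walk = Files.walk(dir))
        {
            // delete children before parents
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try { Files.delete(p); }
                catch (IOException e) { throw new UncheckedIOException(e); }
            });
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }
}
```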
Re: [DISCUSS] inotify for detection of manually removed snapshots
Has anyone considered simply updating the documentation to say this?

"Removing the snapshot files directly from the filesystem may break things. Always use the `nodetool` command or JMX to remove snapshots."

On 09/08/2024 09:18, Štefan Miklošovič wrote:

> If we consider caching it all to be too much, we could make caching an option an admin would need to opt in to. [...]
Re: [DISCUSS] inotify for detection of manually removed snapshots
We could indeed do that. Does your suggestion mean that there should not be a problem with caching it all once it is explicitly stated like that?

On Fri, Aug 9, 2024 at 12:01 PM Bowen Song via dev wrote:

> Has anyone considered simply updating the documentation to say this?
>
> "Removing the snapshot files directly from the filesystem may break things. Always use the `nodetool` command or JMX to remove snapshots." [...]
Re: [DISCUSS] inotify for detection of manually removed snapshots
It is also worth saying that there are plans to include more snapshot metadata in the future; for example, it would be nice if we had a way to verify the consistency of a snapshot. That leads to a manifest.json file which would contain checksums for each file of a snapshot, and similar. As a result, one could run "nodetool listsnapshots --verify", which would compute the hashes of the files on disk and compare them with the hashes in the manifest file (hence with the in-memory metadata), so we would be sure that a snapshot is not corrupted in any way before restoring it.

I measured the amount of memory the in-memory metadata would hold: currently it would take around 5 MB for 10,000 snapshots, and 20,000 snapshots would take around 10 MB, i.e. roughly 500 bytes per snapshot. If we started to include checksums as well, that would increase memory pressure, but looking at these numbers and the number of snapshots a typical node has (I do not think anybody has 10k+ snapshots under normal conditions), they are still absolutely manageable. Anyway, we could always leave the hashes out of the cache, as they are not necessary every time we list; if somebody wants to verify, it would go to disk, which I think is perfectly justified. I just wanted to put this in perspective.

(1) https://issues.apache.org/jira/browse/CASSANDRA-18111?focusedCommentId=17856211&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17856211

On Fri, Aug 9, 2024 at 12:16 PM Štefan Miklošovič wrote:

> We could indeed do that. Does your suggestion mean that there should not be a problem with caching it all once it is explicitly stated like that? [...]
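A rough sketch of what such a --verify pass could do; the manifest layout (relative file path to hex digest) and the choice of SHA-256 are assumptions, not the actual manifest.json format:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Map;

public class SnapshotVerifySketch
{
    // Returns true when every file named in the manifest exists and its
    // recomputed digest matches the recorded one.
    public static boolean verify(Path snapshotDir, Map<String, String> manifestChecksums)
            throws IOException, NoSuchAlgorithmException
    {
        for (Map.Entry<String, String> entry : manifestChecksums.entrySet())
        {
            Path file = snapshotDir.resolve(entry.getKey());
            if (!Files.exists(file) || !sha256(file).equalsIgnoreCase(entry.getValue()))
                return false; // missing or corrupted file
        }
        return true;
    }

    private static String sha256(Path file) throws IOException, NoSuchAlgorithmException
    {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file))
        {
            byte[] buffer = new byte[8192];
            for (int read; (read = in.read(buffer)) != -1; )
                digest.update(buffer, 0, read);
        }
        return HexFormat.of().formatHex(digest.digest());
    }
}
```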
Re: [DISCUSS] inotify for detection of manually removed snapshots
If we have the documentation in place, we can then consider the cache to be the master copy of the metadata and rely on it to be always accurate and up to date. If someone deletes the snapshot files from the filesystem, they can't complain about Cassandra no longer working correctly; it is the same as if they had manually deleted some SSTable files (they shouldn't).

On 09/08/2024 11:16, Štefan Miklošovič wrote:

> We could indeed do that. Does your suggestion mean that there should not be a problem with caching it all once it is explicitly stated like that? [...]
Re: [DISCUSS] inotify for detection of manually removed snapshots
I understand and agree. It is just that it would be cool to avoid the situation where there is a figurative ABC company which has "bash scripts removing snapshots from cron by rm -rf every second Sunday at 3:00 am" because "that was their workflow for ages". I am particularly sensitive to this, as Cassandra is very cautious when it comes to not disrupting the workflows already out there.

I do not know how frequent this would be, or whether somebody would start to complain. I mean ... they could still remove snapshots by hand, right? It is just that listsnapshots would no longer be accurate without a refresh. I think that might be acceptable. It would be something else if we flat out forbade manual deletion, which we do not.

On Fri, Aug 9, 2024 at 4:50 PM Bowen Song via dev wrote:

> If we have the documentation in place, we can then consider the cache to be the master copy of the metadata and rely on it to be always accurate and up to date. [...]
Re: [DISCUSS] inotify for detection of manually removed snapshots
I lean towards the documentation approach vs. complicating the implementation.

For me personally: I regularly use shell commands to operate on snapshots, including listing them. I probably should use nodetool for it all instead, though.

Jordan

On Fri, Aug 9, 2024 at 08:09 Štefan Miklošovič wrote:

> I understand and agree. It is just that it would be cool to avoid the situation where there is a figurative ABC company which has "bash scripts removing snapshots from cron by rm -rf every second Sunday at 3:00 am" because "that was their workflow for ages". [...]