Re: Database stats ( pg_stat_database.stats_reset ) get reset on daily basis - why?
Apologies for the late reply. > I'm guessing shutting down Netdata did not stop the resets? There was a hiccup yesterday, so I had to redo the test and wait until today to see if there were any stats resets ... and NONE today!! reset_status| cnt +- reset before today | 100 no reset | 57 reset today| 0 This is quite interesting, and a little unexpected... but in order to confirm this theory, I have done the following just now: 1) Restart Netdata service 2) rename the "pg_stat_reset()" function , as you suggested in another post - very clever btw! :-) Now let's wait until tomorrow to see if we have any stats reset and/or some relevant errors in the PG logs! Quick note: I did do a quick check in any user functions for any "pg_stat_reset" calls , but didn't find any. With that said, we have 150+ databases in our cluster, so perhaps I missed something. I also did a grep on the netdata directories, but nothing as well. -Steeve On Sat, Nov 23, 2024 at 10:19 AM Adrian Klaver wrote: > On 11/23/24 05:16, Steeve Boulanger wrote: > >> Here (Ireland) we sometimes say "common-or-garden variety" It means > a normal, everyday variety. :-) > > > > I'm afraid that my Irish dialect is limited to "sláinte" only ;-) In > > any case, thanks for taking the time to help with this issue. I'm > > still investigating, but I think that calling the "ghostbusters" is > > moving up the list now lol. > > I'm guessing shutting down Netdata did not stop the resets? > > You might try setting log_min_messages to info to see if that catches > anything. > > > > > > > > On Sat, Nov 23, 2024 at 7:09 AM Ray O'Donnell wrote: > >> > >> On 23/11/2024 13:06, Steeve Boulanger wrote: > >> > >> > > > -- > Adrian Klaver > adrian.kla...@aklaver.com > >
Re: Database stats ( pg_stat_database.stats_reset ) get reset on daily basis - why?
> As a superuser, rename pg_stat_reset inside one of the commonly affected databases: > alter function pg_stat_reset rename to hey_stop_running_pg_stat_reset_already; > Then see who starts complaining. Additionally, your server log will get helpful entries like this: Thanks for the tip Greg - that's very clever! I've done this just now, and will check back tomorrow to see if there's any corresponding error message. NB: I hadn't originally realized that the system functions (ie pg_*) were created for each database, so I had renamed only the one in the postgres database, thinking it was global (ie I'll blame this on my Oracle days ;-)). Now I renamed the function under one database that had been verified as getting its stats reset on a daily basis. Hopefully tomorrow we'll get more interesting info! -Steeve On Sat, Nov 23, 2024 at 11:50 AM Greg Sabino Mullane wrote: > As a superuser, rename pg_stat_reset inside one of the commonly affected > databases: > > alter function pg_stat_reset rename to > hey_stop_running_pg_stat_reset_already; > > Then see who starts complaining. Additionally, your server log will get > helpful entries like this: > > ERROR: function pg_stat_reset() does not exist > > Cheers, > Greg > >
Database stats ( pg_stat_database.stats_reset ) get reset on daily basis - why?
Hello, I have noticed recently that our databases (ie most but not all) have their stats reset on daily basis: select count(case when stats_reset > (now() - interval '1 day') then 1 else null end) db_reset_cnt, count(*) db_total_cnt from pg_stat_database; db_reset_cnt | db_total_cnt --+-- 77 | 157 Furthermore, the archiver stats also get reset: select stats_reset from pg_stat_archiver; stats_reset --- 2024-11-21 13:52:35.864855+00 Some facts: - PostgreSQL v13.16 on Linux Ubuntu 20.04 - There's no evidence in the logs that any pg_stat_reset* functions have been executed name| setting +- log_min_duration_statement | 0 log_statement | all - we don't have any user procs with a pg_stat_reset call - the timestamps of the reset across databases are sometimes close to each other, but not always Anyone has any insight on what could be doing this daily stats reset, or perhaps how to find the root cause? Regards, Steeve
Re: Database stats ( pg_stat_database.stats_reset ) get reset on daily basis - why?
> All I can think to do is look at the logs around the stats_reset times > for the databases and see if there is anything relevant. That was already done, but nothing relevant was found unfortunately. -Steeve On Thu, Nov 21, 2024 at 3:12 PM Adrian Klaver wrote: > On 11/21/24 12:57, Steeve Boulanger wrote: > > > > > Please reply to list also. > > > > My apologies - I thought I did a "Reply all", but apparently not. I'm a > > little bit of a noob with email distrib lists. > > > > > 1) What is log_min_error_statement set to? > > > >name | setting | pending_restart > > -+-+- > > log_min_error_statement | error | f > > > > > 2) Did you reload the server when changing?: > > > > yes - pg_reload_conf() > > All I can think to do is look at the logs around the stats_reset times > for the databases and see if there is anything relevant. > > > > > -Steeve > > > -- > Adrian Klaver > adrian.kla...@aklaver.com > >
Re: Database stats ( pg_stat_database.stats_reset ) get reset on daily basis - why?
> 1) Do the 77 share some trait the other 80 don't. No pattern found yet .. but still verifying a few things > 2) Do the OS system logs reveal anything? Nothing found in syslog > 3) What was happening in the databases just prior to the time the stats reset? Here's an example (log extracts) for a stats reset occurrence: select datname, stats_reset, now()-stats_reset as since_reset from pg_stat_database where ( now()-stats_reset ) < interval '1 day' order by 3 limit 1; datname | stats_reset | since_reset +---+- MyDB | *2024-11-21 13:48:34.332*785+00 | 00:00:22.266304 <--LOGS--> 2024-11-21 13:48:34.324 UTC pid=[322035][2] db=[MyDB] usr=[user1] client=[host1] app=[[unknown]]LOG: connection authorized: user=user1 database=MyDB applicatio n_name=app1 <..> <.. no calls at "2024-11-21 13:48:34.332" - WHY?? ..> 2024-11-21 13:48:34.336 UTC pid=[322035][3] db=[MyDB] usr=[user1] client=[host1] app=[app1]LOG: duration: 1.071 ms parse : SELECT <..> <--LOGS--> As you can see from above, the stats for MyDB were reset at ".332" . The only logs before/after for the db was the connection (at .324), and then the parse (at .336). NB: I also checked the logs at ".333" in case there would have been a rounding up, but nothing relevant was found. With that said, I only verified one occurence - tomorrow I'll check a few more just to validate. > 4) Do you have external tools accessing these databases? We have internal micro-services accessing the databases, as well as a monitoring tool (Netdata), and some of the Devs use pgAdmin. I discarded the scenario where someone would inadvertently do a "pg_stat_reset" via pgAdmin, just because a lot of databases have their stats reset within a short period of time. On the other hand, Netdata does connect to most (if not all) databases frequently by its nature - so as a test, I stopped the Netdata service today to see if tomorrow we're still seeing the stats reset or not. I can report back tomorrow on this. > 5) Is the cluster directly open to the world? No. It's an on-premise installation. Only local applications can connect to it. -Steeve On Thu, Nov 21, 2024 at 4:32 PM Adrian Klaver wrote: > On 11/21/24 13:31, Steeve Boulanger wrote: > > > All I can think to do is look at the logs around the stats_reset > times > > > for the databases and see if there is anything relevant. > > > > That was already done, but nothing relevant was found unfortunately. > > Unless it was not recognized as relevant. Since for the time being I am > eliminating magic as the cause, something concrete is causing this and > it should be leaving a trace. In your post you had this affecting 77 out > of 157 databases in the cluster. > > 1) Do the 77 share some trait the other 80 don't. > > 2) Do the OS system logs reveal anything? > > 3) What was happening in the databases just prior to the time the stats > reset? > > 4) Do you have external tools accessing these databases? > > 5) Is the cluster directly open to the world? > > > > > -Steeve > > > > On Thu, Nov 21, 2024 at 3:12 PM Adrian Klaver > <mailto:adrian.kla...@aklaver.com>> wrote: > > > > On 11/21/24 12:57, Steeve Boulanger wrote: > > > > > > > Please reply to list also. > > > > > > My apologies - I thought I did a "Reply all", but apparently not. > > I'm a > > > little bit of a noob with email distrib lists. > > > > > > > 1) What is log_min_error_statement set to? > > > > > >name | setting | pending_restart > > > -+-+- > > > log_min_error_statement | error | f > > > > > > > 2) Did you reload the server when changing?: > > > > > > yes - pg_reload_conf() > > > > All I can think to do is look at the logs around the stats_reset > times > > for the databases and see if there is anything relevant. > > > > > > > > -Steeve > > > > > > -- > > Adrian Klaver > > adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com> > > > > -- > Adrian Klaver > adrian.kla...@aklaver.com > >
Re: Database stats ( pg_stat_database.stats_reset ) get reset on daily basis - why?
Hello Ron, > Also, "SELECT pg_postmaster_start_time();" to show the *real* Postgresql start time, instead of what you think it is. Yes you're right - I should have been more explicit. [local]:5432 postgres@postgres=# select current_timestamp - pg_postmaster_start_time() as uptime; uptime - 15 days 10:03:44.048383 -Steeve On Thu, Nov 21, 2024 at 3:30 PM Ron Johnson wrote: > On Thu, Nov 21, 2024 at 4:12 PM Adrian Klaver > wrote: > >> On 11/21/24 12:57, Steeve Boulanger wrote: >> > >> > > Please reply to list also. >> > >> > My apologies - I thought I did a "Reply all", but apparently not. I'm a >> > little bit of a noob with email distrib lists. >> > >> > > 1) What is log_min_error_statement set to? >> > >> >name | setting | pending_restart >> > -+-+- >> > log_min_error_statement | error | f >> > >> > > 2) Did you reload the server when changing?: >> > >> > yes - pg_reload_conf() >> >> All I can think to do is look at the logs around the stats_reset times >> for the databases and see if there is anything relevant. >> > > Also, "SELECT pg_postmaster_start_time();" to show the *real* Postgresql > start time, instead of what you think it is. > > -- > Death to , and butter sauce. > Don't boil me, I'm still alive. > lobster! >
Re: Database stats ( pg_stat_database.stats_reset ) get reset on daily basis - why?
> Please reply to list also. My apologies - I thought I did a "Reply all", but apparently not. I'm a little bit of a noob with email distrib lists. > 1) What is log_min_error_statement set to? name | setting | pending_restart -+-+- log_min_error_statement | error | f > 2) Did you reload the server when changing?: yes - pg_reload_conf() -Steeve On Thu, Nov 21, 2024 at 2:49 PM Adrian Klaver wrote: > On 11/21/24 12:34, Steeve Boulanger wrote: > > Please reply to list also. > > Ccing list. > > > > Thanks Adrian for taking the time to respond. I will review the > > documentation once more, just in case I missed anything. > > > > My apologies - I forgot to mention in my original post, that our last > > cluster shutdown was over 15 days ago, thus an "unclean" > > shutdown would not explain these daily stats reset that we are seeing. > > It might also be relevant to mentioned > > Then something is resetting the statistics. > > 1) What is log_min_error_statement set to? > > 2) Did you reload the server when changing?: > > log_min_duration_statement | 0 > log_statement | all > > > > > -Steeve > > > > > > -- > Adrian Klaver > adrian.kla...@aklaver.com > >
Re: Database stats ( pg_stat_database.stats_reset ) get reset on daily basis - why?
> What is the [2] referring to? Number of the log line for each session or process, starting at 1 > My guess is the difference in time it takes to log the action and set > the log timestamp. Whereas the stats_reset value is the timestamp when > the stats system actually did the reset. Very plausible. I thought the same too. > The above is some garden variety select? Not 100% sure what the expression "garden variety select" means lol, but I'll take a guess that it means an "select from an in-house application" .. and yes it is. -Steeve On Fri, Nov 22, 2024 at 11:18 AM Adrian Klaver wrote: > On 11/21/24 15:50, Steeve Boulanger wrote: > > > 1) Do the 77 share some trait the other 80 don't. > > > > No pattern found yet .. but still verifying a few things > > > > > 2) Do the OS system logs reveal anything? > > > > Nothing found in syslog > > > > > 3) What was happening in the databases just prior to the time the > stats > > reset? > > > > Here's an example (log extracts) for a stats reset occurrence: > > > > select datname, stats_reset, now()-stats_reset as since_reset > > from pg_stat_database > > where ( now()-stats_reset ) < interval '1 day' > > order by 3 limit 1; > > > > datname | stats_reset | since_reset > > +---+- > > MyDB | *2024-11-21 13:48:34.332*785+00 | 00:00:22.266304 > > > > <--LOGS--> > > 2024-11-21 13:48:34.324 UTC pid=[322035][2] db=[MyDB] usr=[user1] > > client=[host1] app=[[unknown]]LOG: connection authorized: user=user1 > > database=MyDB applicatio > > n_name=app1 <..> > > What is the [2] referring to? > > > > > <.. no calls at "2024-11-21 13:48:34.332" - WHY?? ..> > > My guess is the difference in time it takes to log the action and set > the log timestamp. Whereas the stats_reset value is the timestamp when > the stats system actually did the reset. > > > > > 2024-11-21 13:48:34.336 UTC pid=[322035][3] db=[MyDB] usr=[user1] > > client=[host1] app=[app1]LOG: duration: 1.071 ms parse : > > SELECT <..> > > The above is some garden variety select? > > > > -- > Adrian Klaver > adrian.kla...@aklaver.com > >
Re: Database stats ( pg_stat_database.stats_reset ) get reset on daily basis - why?
> Here (Ireland) we sometimes say "common-or-garden variety" It means a > normal, everyday variety. :-) I'm afraid that my Irish dialect is limited to "sláinte" only ;-) In any case, thanks for taking the time to help with this issue. I'm still investigating, but I think that calling the "ghostbusters" is moving up the list now lol. On Sat, Nov 23, 2024 at 7:09 AM Ray O'Donnell wrote: > > On 23/11/2024 13:06, Steeve Boulanger wrote: > > > > The above is some garden variety select? > > Not 100% sure what the expression "garden variety select" means lol, but I'll > take a guess that it means an "select from an in-house application" .. and > yes it is. > > > Here (Ireland) we sometimes say "common-or-garden variety" It means a > normal, everyday variety. :-) > > Ray. > > > > > -Steeve > > On Fri, Nov 22, 2024 at 11:18 AM Adrian Klaver > wrote: >> >> On 11/21/24 15:50, Steeve Boulanger wrote: >> > > 1) Do the 77 share some trait the other 80 don't. >> > >> > No pattern found yet .. but still verifying a few things >> > >> > > 2) Do the OS system logs reveal anything? >> > >> > Nothing found in syslog >> > >> > > 3) What was happening in the databases just prior to the time the stats >> > reset? >> > >> > Here's an example (log extracts) for a stats reset occurrence: >> > >> > select datname, stats_reset, now()-stats_reset as since_reset >> > from pg_stat_database >> > where ( now()-stats_reset ) < interval '1 day' >> > order by 3 limit 1; >> > >> > datname | stats_reset | since_reset >> > +---+- >> > MyDB | *2024-11-21 13:48:34.332*785+00 | 00:00:22.266304 >> > >> > <--LOGS--> >> > 2024-11-21 13:48:34.324 UTC pid=[322035][2] db=[MyDB] usr=[user1] >> > client=[host1] app=[[unknown]]LOG: connection authorized: user=user1 >> > database=MyDB applicatio >> > n_name=app1 <..> >> >> What is the [2] referring to? >> >> > >> > <.. no calls at "2024-11-21 13:48:34.332" - WHY?? ..> >> >> My guess is the difference in time it takes to log the action and set >> the log timestamp. Whereas the stats_reset value is the timestamp when >> the stats system actually did the reset. >> >> > >> > 2024-11-21 13:48:34.336 UTC pid=[322035][3] db=[MyDB] usr=[user1] >> > client=[host1] app=[app1]LOG: duration: 1.071 ms parse : >> > SELECT <..> >> >> The above is some garden variety select? >> >> >> >> -- >> Adrian Klaver >> adrian.kla...@aklaver.com >> > > > -- > Raymond O'Donnell // Galway // Ireland > r...@rodonnell.ie